Tag: llm
-
How to write effective evals

We talk a lot about vibe coding: being able to build out product ideas quickly. However, when we deploy products built on existing AI models, we need to ensure the AI’s quality is high, consistent and scalable. Evals provide us with a system to measure quality at scale. Best to start with manual evaluations, which…
-
GPT-5 (Product Review)

The first thing that strikes me when using OpenAI’s GPT-5 is that there isn’t a model switcher anymore. You no longer have to choose between different models – GPT-5 operates as an integrated system that’s clever about the different models it needs to apply. GPT‑5 is the new default in ChatGPT, replacing GPT‑4o, OpenAI o3,…
-
Gumloop (Product Review)

My summary of Gumloop before using it: a workflow automation tool similar to UiPath, Cassidy AI, Zapier and n8n. How does Gumloop work? I started with a simple workflow to understand Gumloop’s core building blocks: YouTube to blog post. I chose the simplest task I could think of: converting a YouTube video…
-
DeepSeek R1 (Product Review)

By now, the Internet is all about DeepSeek and what makes its LLM different from other LLMs like OpenAI’s ChatGPT or Anthropic’s Claude. What stands out to me is DeepSeek’s thinking and reasoning output. As a simple example of DeepSeek’s ability to think in real-time, I asked it to tell me how to build a…
-
What is Retrieval-Augmented Generation (RAG)?

As we’re getting more used to Large Language Models (LLMs) and their applications, we’re also starting to see the gaps in the accuracy and reliability of the responses that we get from these models. LLMs can start hallucinating, which means that they provide a response that might seem accurate at first glance, but isn’t. One…
