<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/">
    <channel>
        <title>Ashpreet Bedi</title>
        <link>https://ashpreetbedi.com</link>
        <description>Ashpreet's blog</description>
        <lastBuildDate>Thu, 30 Apr 2026 19:41:08 GMT</lastBuildDate>
        <docs>https://validator.w3.org/feed/docs/rss2.html</docs>
        <generator>https://github.com/jpmonette/feed</generator>
        <image>
            <title>Ashpreet Bedi</title>
            <url>https://ashpreetbedi.com/favicon.ico</url>
            <link>https://ashpreetbedi.com</link>
        </image>
        <copyright>All rights reserved 2026</copyright>
        <item>
            <title><![CDATA[Agent Engineering 101]]></title>
            <link>https://ashpreetbedi.com/articles/agent-engineering</link>
            <guid isPermaLink="false">https://ashpreetbedi.com/articles/agent-engineering</guid>
            <pubDate>Thu, 23 Oct 2025 00:00:00 GMT</pubDate>
            <content:encoded><![CDATA[<span class="text-2xl font-semibold"><p>✨ The intersection of software, systems and security engineering.</p></span>
<p>For a moment, stop debating what an Agent should be — deterministic or autonomous, a workflow or graph. Just pause for a sec and step back.</p>
<p>Our goal is to make use of this technology, which in my opinion, lends itself to 3 major use-cases:</p>
<ol>
<li><strong>Tools</strong> that improve productivity (chatgpt, claude, cursor).</li>
<li><strong>Workflows</strong> that saves time (marketing research, report generation).</li>
<li><strong>AI products</strong> that solve user problems (eg: Notion AI).</li>
</ol>
<p>You can buy AI tools, and tools for building workflows, but building AI products is where the real engineering happens. Let's dive in.</p>
<h2>What is Agent Engineering?</h2>
<p>Agent Engineering is the practice of <strong>building</strong>, <strong>running</strong> and <strong>managing</strong> agentic systems. It sits at the intersection of software engineering, system design and security engineering.</p>
<p>In practice, if you're building an AI product, you'll need an AI backend — a system that connects to your frontend via an API. This backend is responsible for running agents (concurrently), managing memory, knowledge, state, and ensuring the security and privacy of your environment. This is Agent Engineering, which focuses on:</p>
<ul>
<li><strong>Runtime architecture</strong>: how agents are orchestrated, manage state, and handle execution loops.</li>
<li><strong>Memory systems</strong>: how agents retain and manage context, session history, memory, knowledge and culture.</li>
<li><strong>Tooling integration</strong>: how agents connect to APIs, databases, or internal functions (MCPs are popular here).</li>
<li><strong>Safety &amp; Security</strong>: how to ensure data, application and user-level security.</li>
<li><strong>Evaluation &amp; performance</strong>: measuring usefulness, latency, cost, and reliability of the agentic system.</li>
</ul>
<p>Agent Engineers are responsible for answering questions like:</p>
<ul>
<li>How do we serve our agents as an API that our frontend can call?</li>
<li>When should we use REST versus Websockets?</li>
<li>How do we handle request/response timeouts (29 seconds for aws api gateway)?</li>
<li>If tools are exposed via MCP, how should our AI backend establish and maintain a connection to the MCP server? Should it be initialized once using FastAPI lifecycle hooks, or re-established every time an agent runs (probably not)?</li>
<li>How should authentication and authorization be handled — once (probably not), per request, or through persistent sessions?</li>
<li>How do we manage concurrency and state when multiple users call the same agent? Are sessions properly isolated?</li>
<li>What is the security boundary of each request? Are agents only accessing data permitted by RBAC?</li>
<li>How do we log and monitor the agentic system? Tracing is popular, but it’s not enough. How do we capture events like “this request was made,” “this agent, via this request, accessed this data,” and the complete lifecycle of what happened during execution?</li>
</ul>
<p>Agent Engineering is not just about building agents, it's about building the system that runs them (securely). Its 40% agent development, 40% system design and 20% security engineering.</p>
<h2>How Agno helps with Agent Engineering?</h2>
<p><strong>Agno is a multi-agent framework, runtime, and control plane.</strong> It delivers a complete solution for building, deploying and managing multi-agent systems via 3 tightly coupled components:</p>
<ol>
<li><strong>Framework</strong>: for building Agents, Multi-Agent Teams and Workflows.</li>
<li><strong>Pre-built FastAPI Runtime</strong>: for deploying multi-agent systems.</li>
<li><strong>Control Plane</strong>: web interface for managing multi-agent systems.</li>
</ol>
<blockquote>
<p>One frustration I have with most frameworks is that they give you a way to build an agent, but almost no guidance on how to run it in production. Like, how do I serve this as an SSE compatible API that my frontend can call? How do I build a product out of this? This to me, is incomplete, because the real engineering happens after the agent is built. And no, logging (telemetry) and evals is not what makes a system production-grade. Since when did cloudwatch and unit-tests make a product? They're parts of it, sure, but stop selling them as the whole story.</p>
</blockquote>
<p>While Agno gives you an incredibly feature-rich agent framework — it's the pre-built FastAPI application that really sets it apart. We call this the AgentOS. This is the real advantage of Agno, the advantage of working with people who've built these types of systems before.</p>
<span class="text-teal-400">A very simple example: along with the pre-build endpoints, the AgentOS initializes MCP connections in FastAPI lifecycle hooks, and gives you a security-key for authenticating every request.</span>
<p>Next, the control plane — our web interface for managing AgentOS — connects directly to your runtime via your browser, letting you test the real performance of your system. <strong>This architecture honestly only makes sense once you test it.</strong> So give it a try.</p>
<p>It's a novel architecture that makes your setup inherently secure, since your browser connects directly to the runtime, no data is sent to agno, or any external telemetry services or stored outside your cloud, you avoid unnecessary egress and retention costs.</p>
<blockquote>
<p>Sending our AI app data to telemetry services is fundamentally broken. We don't send your app data, user data, or business data to a third-party logger — so why send our AI data? <strong>Why not just connect to the database directly to view it?</strong></p>
</blockquote>
<h2>Minimal Example</h2>
<p>Okay, let's demonstrate the power of Agno with a simple example. Here's a fully working Agent, with conversation history, access to tools via MCP, deployed as a FastAPI app - in 20 lines of code.</p>
<pre class="language-javascript"><code class="language-javascript"><span class="token keyword module">from</span> agno<span class="token punctuation">.</span><span class="token property-access">agent</span> <span class="token keyword module">import</span> <span class="token imports"><span class="token maybe-class-name">Agent</span></span>
<span class="token keyword module">from</span> agno<span class="token punctuation">.</span><span class="token property-access">db</span><span class="token punctuation">.</span><span class="token property-access">sqlite</span> <span class="token keyword module">import</span> <span class="token imports"><span class="token maybe-class-name">SqliteDb</span></span>
<span class="token keyword module">from</span> agno<span class="token punctuation">.</span><span class="token property-access">models</span><span class="token punctuation">.</span><span class="token property-access">anthropic</span> <span class="token keyword module">import</span> <span class="token imports"><span class="token maybe-class-name">Claude</span></span>
<span class="token keyword module">from</span> agno<span class="token punctuation">.</span><span class="token property-access">os</span> <span class="token keyword module">import</span> <span class="token imports"><span class="token maybe-class-name">AgentOS</span></span>
<span class="token keyword module">from</span> agno<span class="token punctuation">.</span><span class="token property-access">tools</span><span class="token punctuation">.</span><span class="token property-access">mcp</span> <span class="token keyword module">import</span> <span class="token maybe-class-name">MCPTools</span>

# <span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">*</span> <span class="token maybe-class-name">Create</span> <span class="token maybe-class-name">Agent</span> <span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">*</span>
agno_agent <span class="token operator">=</span> <span class="token function"><span class="token maybe-class-name">Agent</span></span><span class="token punctuation">(</span>
    name<span class="token operator">=</span><span class="token string">"Agno Agent"</span><span class="token punctuation">,</span>
    model<span class="token operator">=</span><span class="token function"><span class="token maybe-class-name">Claude</span></span><span class="token punctuation">(</span>id<span class="token operator">=</span><span class="token string">"claude-sonnet-4-5"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    db<span class="token operator">=</span><span class="token function"><span class="token maybe-class-name">SqliteDb</span></span><span class="token punctuation">(</span>db_file<span class="token operator">=</span><span class="token string">"agno.db"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    tools<span class="token operator">=</span><span class="token punctuation">[</span><span class="token function"><span class="token maybe-class-name">MCPTools</span></span><span class="token punctuation">(</span>url<span class="token operator">=</span><span class="token string">"https://docs.agno.com/mcp"</span><span class="token punctuation">,</span> transport<span class="token operator">=</span><span class="token string">"streamable-http"</span><span class="token punctuation">)</span><span class="token punctuation">]</span><span class="token punctuation">,</span>
    add_history_to_context<span class="token operator">=</span><span class="token maybe-class-name">True</span><span class="token punctuation">,</span>
    markdown<span class="token operator">=</span><span class="token maybe-class-name">True</span><span class="token punctuation">,</span>
<span class="token punctuation">)</span>

# <span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">*</span> <span class="token maybe-class-name">Create</span> <span class="token maybe-class-name">AgentOS</span> <span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">*</span>
agent_os <span class="token operator">=</span> <span class="token function"><span class="token maybe-class-name">AgentOS</span></span><span class="token punctuation">(</span>agents<span class="token operator">=</span><span class="token punctuation">[</span>agno_agent<span class="token punctuation">]</span><span class="token punctuation">)</span>
app <span class="token operator">=</span> agent_os<span class="token punctuation">.</span><span class="token method function property-access">get_app</span><span class="token punctuation">(</span><span class="token punctuation">)</span>
</code></pre>
<p>Run your AgentOS using <code>fastapi dev agno_agent.py</code> and chat with it on the <a target="_blank" rel="noopener noreferrer" class="" href="https://os.agno.com">AgentOS UI</a>.</p>
<video width="700" height="700" class="rounded-2xl" loop="" autoplay="" muted="" playsinline="" controls=""><source src="/videos/agentos-chat.mp4">Your browser does not support the video tag.</video>
<p>Deploy your FastAPI app to your cloud of choice, and voilà, you're live in production. <strong>It's impossible to move this quickly without Agno.</strong></p>
<h2>Summary: The Layers of Agent Engineering</h2>
<p>Agent Engineering has three fundamental layers:</p>
<ol>
<li>The <strong>Framework</strong> (Build)</li>
</ol>
<p>This is where you define your Agents, Teams and Workflows — the schemas, memory, knowledge, and guardrails, the reasoning loop.</p>
<ol start="2">
<li>The <strong>Runtime</strong> (Run)</li>
</ol>
<p>The runtime serves (via API), scales, and orchestrates Agents in production. It handles concurrency, async execution, error recovery, and communication between agents and tools.</p>
<ol start="3">
<li>The <strong>Control Plane</strong> (Manage)</li>
</ol>
<p>The control plane provides visibility: dashboards, monitoring, debugging, and human-in-the-loop control. It's how you understand what your agents are doing — and why.</p>
<p>Agno combines all three. It's not just a framework. It's a <strong>complete runtime and control plane</strong> for multi-agent systems.</p>
<h2>Designed for Agent Engineering</h2>
<p>I'll end this article with a list of features of Agno:</p>
<table><thead><tr><th><strong>Category</strong></th><th><strong>Feature</strong></th><th><strong>Description</strong></th></tr></thead><tbody><tr><td><strong>Core Intelligence</strong></td><td><strong>Model Agnostic</strong></td><td>Works with any model provider so you can use your favorite LLMs.</td></tr><tr><td></td><td><strong>Type Safe</strong></td><td>Enforce structured I/O through <code>input_schema</code> and <code>output_schema</code> for predictable, composable behavior.</td></tr><tr><td></td><td><strong>Dynamic Context Engineering</strong></td><td>Inject variables, state, and retrieved data on the fly into context. Perfect for dependency-driven agents.</td></tr><tr><td><strong>Memory, Knowledge, and Persistence</strong></td><td><strong>Persistent Storage</strong></td><td>Give your Agents, Teams, and Workflows a database to persist session history, state, and messages.</td></tr><tr><td></td><td><strong>User Memory</strong></td><td>Built-in memory system that allows Agents to recall user-specific context across sessions.</td></tr><tr><td></td><td><strong>Agentic RAG</strong></td><td>Connect to 20+ vector stores (called <strong>Knowledge</strong> in Agno) with hybrid search + reranking out of the box.</td></tr><tr><td></td><td><strong>Culture (Collective Memory)</strong></td><td>Shared knowledge that compounds across agents and time.</td></tr><tr><td><strong>Execution &amp; Control</strong></td><td><strong>Human-in-the-Loop</strong></td><td>Native support for confirmations, manual overrides, and external tool execution.</td></tr><tr><td></td><td><strong>Guardrails</strong></td><td>Built-in safeguards for validation, security, and prompt protection.</td></tr><tr><td></td><td><strong>Agent Lifecycle Hooks</strong></td><td>Pre- and post-hooks to validate or transform inputs and outputs.</td></tr><tr><td></td><td><strong>MCP Integration</strong></td><td>First-class support for the Model Context Protocol (MCP) to connect Agents with external systems.</td></tr><tr><td></td><td><strong>Toolkits</strong></td><td>100+ built-in toolkits with thousands of tools, ready for use across data, code, web, and enterprise APIs.</td></tr><tr><td><strong>Runtime &amp; Evaluation</strong></td><td><strong>Runtime</strong></td><td>Pre-built FastAPI based runtime with SSE compatible endpoints, ready for production on day 1.</td></tr><tr><td></td><td><strong>Control Plane (UI)</strong></td><td>Integrated interface to visualize, monitor, and debug agent activity in real time.</td></tr><tr><td></td><td><strong>Natively Multimodal</strong></td><td>Agents can process and generate text, images, audio, video, and files.</td></tr><tr><td></td><td><strong>Evals</strong></td><td>Measure your Agents' Accuracy, Performance, and Reliability.</td></tr><tr><td><strong>Security &amp; Privacy</strong></td><td><strong>Private by Design</strong></td><td>Runs entirely in your cloud. The UI connects directly to your AgentOS from your browser, no data is ever sent externally.</td></tr><tr><td></td><td><strong>Data Governance</strong></td><td>Your data lives securely in your Agent database, no external data sharing or vendor lock-in.</td></tr><tr><td></td><td><strong>Access Control</strong></td><td>Role-based access (RBAC) and per-agent permissions to protect sensitive contexts and tools.</td></tr></tbody></table>
<hr>
<h2>Want to build with Agno?</h2>
<ul>
<li>
<p><strong>Agno documentation:</strong> <a target="_blank" rel="noopener noreferrer" class="" href="https://agno.link/docs">agno.link/docs</a></p>
</li>
<li>
<p><strong>Signup for the AgentOS:</strong> <a target="_blank" rel="noopener noreferrer" class="" href="https://os.agno.com">os.agno.com</a></p>
</li>
<li>
<p><strong>Star Agno on Github:</strong> <a target="_blank" rel="noopener noreferrer" class="" href="https://agno.link/gh">agno.link/gh</a></p>
</li>
</ul>
<hr>
<p>Read more on <a target="_blank" rel="noopener noreferrer" class="" href="https://www.agno.com">agno.com</a></p>]]></content:encoded>
            <author>hi@ashpreetbedi.com (Ashpreet Bedi)</author>
        </item>
        <item>
            <title><![CDATA[Agent Security 101]]></title>
            <link>https://ashpreetbedi.com/articles/agent-security</link>
            <guid isPermaLink="false">https://ashpreetbedi.com/articles/agent-security</guid>
            <pubDate>Tue, 28 Oct 2025 00:00:00 GMT</pubDate>
            <content:encoded><![CDATA[<p>PSA: If you're serious about Agent Security, stop sending your transactional data to telemetry services. <strong>Here's how to do it right:</strong></p>
<ol>
<li>Give your agents a database.</li>
<li>Store all transactions in that database.</li>
<li>Keep your data <strong>inside</strong> your system.</li>
<li>Avoid duplication across multiple systems.</li>
<li>Stop paying for egress and retention.</li>
</ol>
<h2>Transactional data ≠ Telemetry</h2>
<p>Somewhere along the way, people started treating conversational traces as logs (they're not), and started pushing <em>everything</em> (agent inputs, outputs, reasoning, memory) to telemetry vendors. It's not just bad security hygiene, it's inefficient, redundant, and expensive.</p>
<p><strong>Transactional data</strong> is what's happening in your system: inputs, outputs, tool calls, memory updates, and internal reasoning. It's the source of truth for your system and should never leave it.</p>
<p><strong>Telemetry data</strong> is system metrics and operational metadata (latency, token usage, error rates, throughput, uptime). That's the stuff you aggregate and throw in cold storage after 180 days.</p>
<p>In an agentic system, conversational traces are <strong>transactional data</strong>. They belong inside your infrastructure:</p>
<ol>
<li>They often contain <strong>PII, proprietary logic, and sensitive data</strong> and should never be sent externally.</li>
<li>They need to be <strong>re-used by your application</strong> (by future runs, for debugging and optimization), so you'll store them internally anyway.</li>
</ol>
<hr>
<span class="text-2xl font-semibold"><p>So how do you do it properly?</p></span>
<h2>1. Give your agents a database.</h2>
<p>Agents need structured storage. Sessions, runs, memory, knowledge — all of it should persist in your database. Just like any other application.</p>
<blockquote class="not-prose relative isolate pl-6 text-ink py-3 text-lg"><span aria-hidden="true" class="absolute inset-y-1 left-0 w-0.5 rounded-full bg-accent"></span><div class="flex gap-3"><div class="space-y-1"><div class="leading-relaxed">I personally use <strong>Postgres + PgVector</strong> in production, and <strong>Sqlite</strong> for demos.</div></div></div></blockquote>
<p>Here's a minimal example:</p>
<pre class="language-python"><code class="language-python"><span class="token comment"># /// script</span>
<span class="token comment"># dependencies = [</span>
<span class="token comment">#   "agno",</span>
<span class="token comment">#   "anthropic",</span>
<span class="token comment">#   "yfinance",</span>
<span class="token comment">#   "sqlalchemy",</span>
<span class="token comment"># ]</span>
<span class="token comment"># ///</span>

<span class="token keyword">from</span> agno<span class="token punctuation">.</span>agent <span class="token keyword">import</span> Agent
<span class="token keyword">from</span> agno<span class="token punctuation">.</span>db<span class="token punctuation">.</span>sqlite <span class="token keyword">import</span> SqliteDb
<span class="token keyword">from</span> agno<span class="token punctuation">.</span>models<span class="token punctuation">.</span>anthropic <span class="token keyword">import</span> Claude
<span class="token keyword">from</span> agno<span class="token punctuation">.</span>tools<span class="token punctuation">.</span>yfinance <span class="token keyword">import</span> YFinanceTools

<span class="token comment"># ************* Create Agent *************</span>
agno_agent <span class="token operator">=</span> Agent<span class="token punctuation">(</span>
    name<span class="token operator">=</span><span class="token string">"Finance Agent"</span><span class="token punctuation">,</span>
    model<span class="token operator">=</span>Claude<span class="token punctuation">(</span><span class="token builtin">id</span><span class="token operator">=</span><span class="token string">"claude-sonnet-4-5"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    db<span class="token operator">=</span>SqliteDb<span class="token punctuation">(</span>db_file<span class="token operator">=</span><span class="token string">"tmp/finance_agent.db"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    tools<span class="token operator">=</span><span class="token punctuation">[</span>YFinanceTools<span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">]</span><span class="token punctuation">,</span>
    instructions<span class="token operator">=</span><span class="token string">"Use tables to display data."</span><span class="token punctuation">,</span>
    add_history_to_context<span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">,</span>
    add_datetime_to_context<span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">,</span>
    num_history_runs<span class="token operator">=</span><span class="token number">3</span><span class="token punctuation">,</span>
    markdown<span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">,</span>
<span class="token punctuation">)</span>

<span class="token comment"># ************* Run Agent *************</span>
agno_agent<span class="token punctuation">.</span>print_response<span class="token punctuation">(</span><span class="token builtin">input</span><span class="token operator">=</span><span class="token string">"What is the stock price of Apple?"</span><span class="token punctuation">,</span> stream<span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">,</span> stream_intermediate_steps<span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">)</span>
<span class="token comment"># Run #2 that continues the conversation</span>
agno_agent<span class="token punctuation">.</span>print_response<span class="token punctuation">(</span><span class="token builtin">input</span><span class="token operator">=</span><span class="token string">"Can you write a report on it? Just give me the report, no other text."</span><span class="token punctuation">,</span> stream<span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">,</span> stream_intermediate_steps<span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">)</span>
</code></pre>
<p>Save this to a file and run it with <code>uv run finance_agent.py</code>. You can see conversation history work flawlessly because it's stored in a local sqlite database.</p>
<video width="700" height="700" class="rounded-2xl" loop="" autoplay="" muted="" playsinline="" controls=""><source src="/videos/finance-agent.mp4">Your browser does not support the video tag.</video>
<h2>2. Store all transactions in that database.</h2>
<p>When you run your agents, store all transactions in that database. Including: inputs, outputs, context, messages, tool calls, memory updates, knowledge updates, culture updates. Basically everything that happens in your agentic system should be stored in your database.</p>
<p>For enterprise workloads, this isn't just best practice, it's a requirement. You need to persist traces for <strong>compliance, auditing, debugging, and continuity</strong>.</p>
<blockquote class="not-prose relative isolate pl-6 text-ink py-3 text-lg"><span aria-hidden="true" class="absolute inset-y-1 left-0 w-0.5 rounded-full bg-accent"></span><div class="flex gap-3"><div class="space-y-1"><div class="leading-relaxed">Agno does this automatically for you.</div></div></div></blockquote>
<p>External telemetry tools were never designed for this. They're built for metrics and logs, not for sensitive, replayable transactional data. You can make the case for running the data plane inside your VPC, you still have to deal with duplicated data (and pay enterprise data license costs).</p>
<h2>3. Keep data within your system (and avoid duplication).</h2>
<p>Every time you send LLM traces to an external service, you create redundant copies of sensitive data. This violates least-privilege principles and adds unnecessary complexity, you'll have to create "linking-ids" to connect your application usage to actual traces (solving problems that shouldn't exist in the first place).</p>
<p>Anyone who's built data pipelines knows: joining transactional data from app DBs with telemetry metrics is a nightmare. Skip the headache. Keep everything in one system.</p>
<h2>4. Want a UI? No problem.</h2>
<p>Once your data lives inside your infrastructure, it's easy to visualize. You could spin up a quick Streamlit dashboard or just use the <a target="_blank" rel="noopener noreferrer" class="" href="http://os.agno.com">AgentOS UI</a>, which gives you a ready-to-use view of all your agent sessions, runs, memory, knowledge, etc.</p>
<p>Here's how:</p>
<pre class="language-python"><code class="language-python"><span class="token comment"># /// script</span>
<span class="token comment"># dependencies = [</span>
<span class="token comment">#   "agno",</span>
<span class="token comment">#   "anthropic",</span>
<span class="token comment">#   "yfinance",</span>
<span class="token comment">#   "sqlalchemy",</span>
<span class="token comment">#   "fastapi[standard]",</span>
<span class="token comment">#   "mcp",</span>
<span class="token comment"># ]</span>
<span class="token comment"># ///</span>

<span class="token keyword">from</span> agno<span class="token punctuation">.</span>agent <span class="token keyword">import</span> Agent
<span class="token keyword">from</span> agno<span class="token punctuation">.</span>db<span class="token punctuation">.</span>sqlite <span class="token keyword">import</span> SqliteDb
<span class="token keyword">from</span> agno<span class="token punctuation">.</span>models<span class="token punctuation">.</span>anthropic <span class="token keyword">import</span> Claude
<span class="token keyword">from</span> agno<span class="token punctuation">.</span>os <span class="token keyword">import</span> AgentOS
<span class="token keyword">from</span> agno<span class="token punctuation">.</span>tools<span class="token punctuation">.</span>mcp <span class="token keyword">import</span> MCPTools

<span class="token comment"># ************* Create Agent *************</span>
agno_agent <span class="token operator">=</span> Agent<span class="token punctuation">(</span>
    name<span class="token operator">=</span><span class="token string">"Agno Agent"</span><span class="token punctuation">,</span>
    model<span class="token operator">=</span>Claude<span class="token punctuation">(</span><span class="token builtin">id</span><span class="token operator">=</span><span class="token string">"claude-sonnet-4-5"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    db<span class="token operator">=</span>SqliteDb<span class="token punctuation">(</span>db_file<span class="token operator">=</span><span class="token string">"tmp/agno.db"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    tools<span class="token operator">=</span><span class="token punctuation">[</span>MCPTools<span class="token punctuation">(</span>transport<span class="token operator">=</span><span class="token string">"streamable-http"</span><span class="token punctuation">,</span> url<span class="token operator">=</span><span class="token string">"https://docs.agno.com/mcp"</span><span class="token punctuation">)</span><span class="token punctuation">]</span><span class="token punctuation">,</span>
    add_history_to_context<span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">,</span>
    add_datetime_to_context<span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">,</span>
    num_history_runs<span class="token operator">=</span><span class="token number">3</span><span class="token punctuation">,</span>
    markdown<span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">,</span>
<span class="token punctuation">)</span>

<span class="token comment"># ************* Create AgentOS *************</span>
agent_os <span class="token operator">=</span> AgentOS<span class="token punctuation">(</span>agents<span class="token operator">=</span><span class="token punctuation">[</span>agno_agent<span class="token punctuation">]</span><span class="token punctuation">)</span>
app <span class="token operator">=</span> agent_os<span class="token punctuation">.</span>get_app<span class="token punctuation">(</span><span class="token punctuation">)</span>

<span class="token comment"># ************* Run AgentOS *************</span>
<span class="token keyword">if</span> __name__ <span class="token operator">==</span> <span class="token string">"__main__"</span><span class="token punctuation">:</span>
    agent_os<span class="token punctuation">.</span>serve<span class="token punctuation">(</span>app<span class="token operator">=</span><span class="token string">"basic_demo:app"</span><span class="token punctuation">,</span> <span class="token builtin">reload</span><span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">)</span>
</code></pre>
<p>Run this file using <code>uv run basic_agentos.py</code> and connect to it on the <a target="_blank" rel="noopener noreferrer" class="" href="http://os.agno.com">AgentOS UI</a>.</p>
<video width="700" height="700" class="rounded-2xl" loop="" autoplay="" muted="" playsinline="" controls=""><source src="/videos/basic-agentos.mp4">Your browser does not support the video tag.</video>
<h2>5. Finally, stop paying for egress and retention.</h2>
<p>Shipping full traces to third parties is expensive. Text is ok but when it comes to images, audio, video, files, etc., you're looking at a lot of bandwidth that is leaving your VPC. Egress fees, retention costs, and redundant storage add up — fast. Keeping data in your own infrastructure saves both <strong>money</strong> and <strong>risk</strong>.</p>
<span class="text-2xl font-semibold flex justify-center"><p><strong>Own your data, control your costs.</strong></p></span>
<h2>Why Agno?</h2>
<p>Agno was designed from the ground up for building private, secure, high-performance, agentic systems.</p>
<ol>
<li>Every Agent comes with its own database.</li>
<li>All data stays within your system.</li>
<li>Private. Secure. Open Source.</li>
</ol>
<p><strong>Agno documentation:</strong> <a target="_blank" rel="noopener noreferrer" class="" href="https://agno.link/docs">agno.link/docs</a></p>
<p><strong>Signup for the AgentOS:</strong> <a target="_blank" rel="noopener noreferrer" class="" href="https://os.agno.com">os.agno.com</a></p>
<p><strong>Star Agno on Github:</strong> <a target="_blank" rel="noopener noreferrer" class="" href="https://agno.link/gh">agno.link/gh</a></p>
<hr>
<p>I know mentioning Agno here seems like a plug, it's not. The architecture is simple: you should own your data. You don't have to use Agno for that. You can build it yourself. The difference is that with most telemetry providers, your data stays locked with them forever. With Agno, it stays with you.</p>]]></content:encoded>
            <author>hi@ashpreetbedi.com (Ashpreet Bedi)</author>
        </item>
        <item>
            <title><![CDATA[Agentic Culture]]></title>
            <link>https://ashpreetbedi.com/articles/agentic-culture</link>
            <guid isPermaLink="false">https://ashpreetbedi.com/articles/agentic-culture</guid>
            <pubDate>Tue, 21 Oct 2025 00:00:00 GMT</pubDate>
            <content:encoded><![CDATA[<p>Andrej Karpathy shared on the <a target="_blank" rel="noopener noreferrer" class="" href="https://youtu.be/lXUZvyajciY?si=i1Q-eOCBUEWmMGh7&amp;t=6034">Dwarkesh Podcast</a> that LLMs don't have the equivalent of "culture".</p>
<p>So we built the scaffolding for them to develop one.</p>
<h2>Why Culture?</h2>
<p>Every Agent learns from its own interactions — the tasks it runs, the conversations it has, the errors it fixes. But that knowledge is siloed. It disappears when the session ends or the user changes.</p>
<p>Humans solved this problem a long time ago. We call it <strong>culture</strong> — the consolidation of shared knowledge that compounds over time.</p>
<p>With Agno, you can now give your Agents the same ability to <strong>learn collectively</strong>.</p>
<hr>
<h2>Introducing Agentic Culture</h2>
<p><strong>Agentic Culture</strong> is an open-source experiment in <strong>collective memory</strong> and <strong>in-context cultural</strong> for multi-agent systems.</p>
<p>It provides a shared cultural database where Agents can store and retrieve knowledge that persists beyond individual sessions, users, or memories. Culture becomes a living, evolving layer of context that shapes Agent reasoning and behavior over time.</p>
<p>Agents can now create, read, explore, and learn from their collective experience. See the <a target="_blank" rel="noopener noreferrer" class="" href="https://agno.link/agentic-culture">Agentic Culture</a> cookbook for example code.</p>
<blockquote>
<p>“Culture is how intelligence compounds”</p>
</blockquote>
<hr>
<h2>How It Works</h2>
<p>Culture acts as a shared database where Agents can save reusable knowledge that benefits all interactions.</p>
<p>While <strong>Memory</strong> captures user-specific details (e.g. "Sarah prefers email"), <strong>Culture</strong> captures universal principles that benefit all interactions (e.g. "Always provide actionable next steps").</p>
<p>You can use Agno’s <code>CultureManager</code> to create and manage cultural knowledge entries. These are stored in your chosen database and automatically retrieved by your Agents for contextual grounding.</p>
<pre class="language-python"><code class="language-python"><span class="token triple-quoted-string string">"""Demonstrates how to create and persist shared cultural knowledge with Agno's `CultureManager`."""</span>

<span class="token keyword">from</span> agno<span class="token punctuation">.</span>culture<span class="token punctuation">.</span>manager <span class="token keyword">import</span> CultureManager
<span class="token keyword">from</span> agno<span class="token punctuation">.</span>db<span class="token punctuation">.</span>sqlite <span class="token keyword">import</span> SqliteDb
<span class="token keyword">from</span> agno<span class="token punctuation">.</span>models<span class="token punctuation">.</span>anthropic <span class="token keyword">import</span> Claude
<span class="token keyword">from</span> rich<span class="token punctuation">.</span>pretty <span class="token keyword">import</span> pprint

<span class="token comment"># Step 1. Initialize the database</span>
db <span class="token operator">=</span> SqliteDb<span class="token punctuation">(</span>db_file<span class="token operator">=</span><span class="token string">"tmp/demo.db"</span><span class="token punctuation">)</span>

<span class="token comment"># Step 2. Create the Culture Manager</span>
culture_manager <span class="token operator">=</span> CultureManager<span class="token punctuation">(</span>
    db<span class="token operator">=</span>db<span class="token punctuation">,</span>
    model<span class="token operator">=</span>Claude<span class="token punctuation">(</span><span class="token builtin">id</span><span class="token operator">=</span><span class="token string">"claude-sonnet-4-5"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
<span class="token punctuation">)</span>

<span class="token comment"># Step 3. Create cultural knowledge from a message</span>
message <span class="token operator">=</span> <span class="token punctuation">(</span>
    <span class="token string">"All technical guidance should follow the 'Operational Thinking' principle:\n"</span>
    <span class="token string">"1. **State the Objective** — What outcome are we trying to achieve and why.\n"</span>
    <span class="token string">"2. **Show the Procedure** — List clear, reproducible steps (commands/configs).\n"</span>
    <span class="token string">"3. **Surface Pitfalls** — What usually fails and how to detect it early.\n"</span>
    <span class="token string">"4. **Define Validation** — How to confirm it’s working (logs, tests, metrics).\n"</span>
    <span class="token string">"5. **Close the Loop** — Suggest next iterations or improvements."</span>
<span class="token punctuation">)</span>

culture_manager<span class="token punctuation">.</span>create_cultural_knowledge<span class="token punctuation">(</span>message<span class="token operator">=</span>message<span class="token punctuation">)</span>

<span class="token comment"># Step 4. Retrieve and inspect stored knowledge</span>
pprint<span class="token punctuation">(</span>culture_manager<span class="token punctuation">.</span>get_all_knowledge<span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>
</code></pre>
<p>Now give your agents access to the shared culture by setting <code>add_culture_to_context=True</code>. That's it. Your Agents now learn from shared cultural knowledge.</p>
<pre class="language-python"><code class="language-python"><span class="token triple-quoted-string string">"""Use cultural knowledge with your Agents."""</span>

<span class="token keyword">from</span> agno<span class="token punctuation">.</span>agent <span class="token keyword">import</span> Agent
<span class="token keyword">from</span> agno<span class="token punctuation">.</span>db<span class="token punctuation">.</span>sqlite <span class="token keyword">import</span> SqliteDb
<span class="token keyword">from</span> agno<span class="token punctuation">.</span>models<span class="token punctuation">.</span>anthropic <span class="token keyword">import</span> Claude

db <span class="token operator">=</span> SqliteDb<span class="token punctuation">(</span>db_file<span class="token operator">=</span><span class="token string">"tmp/demo.db"</span><span class="token punctuation">)</span>

agent <span class="token operator">=</span> Agent<span class="token punctuation">(</span>
    model<span class="token operator">=</span>Claude<span class="token punctuation">(</span><span class="token builtin">id</span><span class="token operator">=</span><span class="token string">"claude-sonnet-4-5"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    db<span class="token operator">=</span>db<span class="token punctuation">,</span>
    add_culture_to_context<span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">,</span>
    <span class="token comment"># optional: run culture manager after each run</span>
    <span class="token comment"># update_cultural_knowledge=True,</span>
<span class="token punctuation">)</span>

agent<span class="token punctuation">.</span>print_response<span class="token punctuation">(</span>
    <span class="token string">"How do I set up a FastAPI service using Docker?"</span><span class="token punctuation">,</span>
    stream<span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">,</span>
    markdown<span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">,</span>
<span class="token punctuation">)</span>
</code></pre>
<hr>
<h2>What You Can Do With It</h2>
<p>The current <strong>v0.1</strong> release focuses on helping Agents stay consistent in tone, reasoning, and behavior. Over time, the goal is to transform isolated Agents into a living, evolving system of intelligence.</p>
<p>With Culture, you can:</p>
<ul>
<li>Accumulate learnings and behavioral patterns from successful runs</li>
<li>Use that collective context to guide future decisions</li>
<li>Observe how "culture" evolves across teams, orgs, and domains</li>
</ul>
<hr>
<h2>Examples</h2>
<p>The <a target="_blank" rel="noopener noreferrer" class="" href="https://agno.link/agentic-culture">Agentic Culture</a> cookbook includes several runnable recipes:</p>
<table><thead><tr><th>File</th><th>Description</th></tr></thead><tbody><tr><td>01_create_cultural_knowledge.py</td><td>Create cultural knowledge using a model.</td></tr><tr><td>02_use_cultural_knowledge_in_agent.py</td><td>Use cultural knowledge inside Agents.</td></tr><tr><td>03_automatic_cultural_management.py</td><td>Let Agents autonomously update culture over time.</td></tr><tr><td>04_manually_add_culture.py</td><td>Manually seed culture for tone guides or org-wide principles.</td></tr><tr><td>05_test_agent_with_cultural_knowledge.py</td><td>Freestyle testing — see culture in action.</td></tr></tbody></table>
<p>Each builds on the previous one, so you can run them in sequence.</p>
<p>Agno is open-source, so you can contribute to the cookbook or build your own recipes. Here's the github repository: <a target="_blank" rel="noopener noreferrer" class="" href="https://agno.link/gh">agno.link/gh</a></p>
<hr>
<h2>Future Work</h2>
<p>This is early, but promising. We're exploring how to:</p>
<ul>
<li>Integrate culture across multi-agent teams.</li>
<li>Sync or version cultural knowledge programmatically</li>
<li>Store culture in Postgres, Redis, or your own backend</li>
<li>Let Agents evolve shared norms collectively, like emergent civilizations</li>
</ul>
<p>Karpathy describes a future where LLMs have a "giant scratchpad" — a shared space to think, write, and build on each other's ideas.</p>
<p>Agno is providing the scaffolding for developing that culture.</p>
<hr>
<h2>Explore &amp; Build</h2>
<ul>
<li><strong>Explore Agentic Culture:</strong> <a target="_blank" rel="noopener noreferrer" class="" href="https://agno.link/agentic-culture">agno.link/agentic-culture</a></li>
<li><strong>Agno on GitHub:</strong> <a target="_blank" rel="noopener noreferrer" class="" href="https://agno.link/gh">agno.link/gh</a></li>
<li><strong>Documentation:</strong> <a target="_blank" rel="noopener noreferrer" class="" href="https://agno.link/docs">agno.link/docs</a></li>
<li><strong>Agno Website:</strong> <a target="_blank" rel="noopener noreferrer" class="" href="https://www.agno.com">agno.com</a></li>
</ul>]]></content:encoded>
            <author>hi@ashpreetbedi.com (Ashpreet Bedi)</author>
        </item>
        <item>
            <title><![CDATA[Agentic Software Engineering]]></title>
            <link>https://ashpreetbedi.com/articles/agentic-software-engineering</link>
            <guid isPermaLink="false">https://ashpreetbedi.com/articles/agentic-software-engineering</guid>
            <pubDate>Sun, 01 Mar 2026 00:00:00 GMT</pubDate>
            <content:encoded><![CDATA[<blockquote>
<p>Note: this post is about building your own agents (agentic software engineering), not about using coding agents.</p>
</blockquote>
<p>By now you've probably used a few agents, or at least heard of Claude Code, Codex, or OpenClaw. Ever wondered what it takes to build your own?</p>
<p>Most people think of agents as prompts + tools in a loop. That's a reasonable assumption, but it's not production architecture.</p>
<p>The moment your agent needs to know who it's talking to, maintain state, handle concurrent requests, take sensitive actions like refunds, and survive failing tool calls, it stops being an "LLM + tools in a loop" and becomes a distributed system.</p>
<p>Building agents is the easy part. There are 75 frameworks that help you do that. The hard part is the runtime: the harness around the agent that makes it work in the real world. That's what agentic software engineering is all about.</p>
<h2>Build. Serve. Connect.</h2>
<p>Here's how I think about shipping agentic software.</p>
<p><strong>Build</strong> the agent. Define the model, tools, knowledge base, memory, storage, and guardrails. This is the layer that most frameworks give you.</p>
<p><strong>Serve</strong> it as an API. User-scoped, session-scoped, horizontally scalable. Add persistent storage, streaming, background execution, retry semantics. This is where most agentic products stall. Not because the agent doesn't work, but because it doesn't have the infrastructure to work reliably at scale.</p>
<p><strong>Connect</strong> it to where users live. Your product, Slack, Discord, MCP, wherever. An agent in a notebook is an experiment. An agent where your users are is a product.</p>
<h2>The 6 Pillars of Agentic Software</h2>
<p>Building an agent is AI engineering. Running it in production is software engineering. Together, they form agentic software engineering: the practice of building, running, and scaling agents as production services.</p>
<p>Here are the six pillars that hold it up:</p>
<p><strong>Durability</strong>. Agents reason across multiple steps, call tools that time out, and fail halfway through. If your agent crashes on step 12 of 15, restarting might duplicate a side effect or lose critical context. Agentic software needs to pause, resume, checkpoint, and recover gracefully. Durability turns failure into resumption, not a full restart.</p>
<p><strong>Isolation</strong>. Agentic software serves thousands of users simultaneously. Each user needs their own session, their own memory, their own context. Passing a user_id with each request is easy. Isolating every resource the agent touches is where the engineering comes in. Your database, your vector store, your model provider, all need to respect user boundaries. One missing filter becomes a data breach.</p>
<p><strong>Governance</strong>. Agents that can act can also cause damage. Looking up a record is harmless. Deleting a record or issuing a refund needs approval. Agentic software needs layered authority: what runs automatically, what needs human approval, and what needs admin sign-off. Today, most agents auto-execute with minimal oversight. As they get more capable, governance becomes the product.</p>
<p><strong>Persistence</strong>. An agent without persistent storage can't learn, can't build context, can't improve. We need to store sessions, memory, knowledge in a database. Persistent state is what turns a chatbot into a product. Every conversation makes the next one better.</p>
<p><strong>Scale</strong>. A thousand users hit your agent at the same time. Requests queue, you hit model rate limits, and tool calls compete for resources. Traditional services call your own backends. Agentic software calls external model APIs and third-party tools, which means you inherit their rate limits, latency, and downtime. Scaling agentic software means scaling around dependencies you don't control.</p>
<p><strong>Composability</strong>. When an agent is a service, other agents can call it. Your frontend can call it. Your Slack bot can call it. MCP clients can discover it. It becomes a building block in your architecture, and every new integration becomes a standard API call. That's how single-agent tools become multi-agent systems.</p>
<p>None of this is new. We've been building reliable distributed systems for decades. The AI industry just hasn't brought those lessons along yet, and we're feeling it in every failed deployment.</p>
<h2>From Theory to Practice</h2>
<p>As always, I come bearing code. Here's how you can start building your own agentic service today.</p>
<pre class="language-bash"><code class="language-bash"><span class="token comment"># 1. Clone the repo</span>
<span class="token function">git</span> clone <span class="token punctuation">\</span>
    https://github.com/agno-agi/agentos-docker-template.git <span class="token punctuation">\</span>
    agentos

<span class="token builtin class-name">cd</span> agentos

<span class="token comment"># 2. Set your model provider key</span>
<span class="token function">cp</span> example.env .env
<span class="token comment"># Edit .env and add OPENAI_API_KEY</span>

<span class="token comment"># 3. Start the application</span>
<span class="token function">docker</span> compose up -d --build

<span class="token comment"># 4. Optional: Load documents for the knowledge agent</span>
<span class="token function">docker</span> <span class="token builtin class-name">exec</span> -it agentos-api python -m agents.knowledge_agent
</code></pre>
<p>This gives you a containerized service with persistent storage (Postgres), two starter agents (a knowledge agent using Agentic RAG and an MCP agent for external tool use), and a REST API you can connect to from anywhere.</p>
<p>I'm using Docker for this template because Docker runs everywhere: your laptop, AWS, GCP, Azure, Railway. The same container you develop locally is the one you deploy to production. The <a target="_blank" rel="noopener noreferrer" class="" href="https://github.com/agno-agi/agentos-docker-template">README</a> covers everything you need to get started.</p>
<p>After running the service:</p>
<ol>
<li>Open <a target="_blank" rel="noopener noreferrer" class="" href="http://localhost:8000/docs">localhost:8000/docs</a> to see your API.</li>
<li>Connect to the web UI at <a target="_blank" rel="noopener noreferrer" class="" href="https://os.agno.com">os.agno.com</a> where you can chat with your agents, trace runs, manage knowledge, create schedules and approve sensitive tool calls. One UI for your agentic software.</li>
</ol>
<video width="700" height="700" class="rounded-2xl" loop="" autoplay="" muted="" playsinline="" controls=""><source src="/videos/agentos-docker-template.mp4">Your browser does not support the video tag.</video>
<p>Adding your own agent is a few lines of Python and a restart. Swap models with a one-line change. Add tools from 100+ integrations. The template is a starting point. Read the <a target="_blank" rel="noopener noreferrer" class="" href="https://docs.agno.com/introduction">Agno docs</a> to learn more.</p>
<h2>Governance &amp; Elicitation</h2>
<p>Most agents run tool calls with minimal oversight or auditability. In practice, we need layered authority:</p>
<ol>
<li>Tools that run freely</li>
<li>Tools that need user approval</li>
<li>Tools that need admin approval</li>
</ol>
<p>Agents also need to ask questions (often called elicitation). The Claude Code team shared a <a target="_blank" rel="noopener noreferrer" class="" href="https://x.com/trq212/status/2027463795355095314">great article</a> on the AskUserQuestion tool used by Claude.</p>
<p>This is available in Agno as <code>UserFeedbackTools</code>. Here's a support agent that can look up orders freely, ask the customer structured questions when it needs more information, and wait for admin approval before issuing a refund:</p>
<pre class="language-python"><code class="language-python">support <span class="token operator">=</span> Agent<span class="token punctuation">(</span>
    <span class="token builtin">id</span><span class="token operator">=</span><span class="token string">"support"</span><span class="token punctuation">,</span>
    name<span class="token operator">=</span><span class="token string">"Support"</span><span class="token punctuation">,</span>
    model<span class="token operator">=</span>OpenAIResponses<span class="token punctuation">(</span><span class="token builtin">id</span><span class="token operator">=</span><span class="token string">"gpt-5.2"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    db<span class="token operator">=</span>agent_db<span class="token punctuation">,</span>
    tools<span class="token operator">=</span><span class="token punctuation">[</span>
        lookup_order<span class="token punctuation">,</span>             <span class="token comment"># auto-execute</span>
        search_help_docs<span class="token punctuation">,</span>         <span class="token comment"># auto-execute</span>
        issue_refund<span class="token punctuation">,</span>             <span class="token comment"># requires user confirmation</span>
        UserFeedbackTools<span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>      <span class="token comment"># structured questions</span>
    <span class="token punctuation">]</span><span class="token punctuation">,</span>
    instructions<span class="token operator">=</span>instructions<span class="token punctuation">,</span>
    enable_agentic_memory<span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">,</span>
<span class="token punctuation">)</span>
</code></pre>
<p>Watch what happens when a customer asks for a refund.</p>
<ul>
<li>The agent looks up the order on its own, no permission needed.</li>
<li>Then it hits a decision point: why does the customer want the refund?</li>
<li>Instead of guessing, it presents a structured question with clear options: defective, wrong item, changed mind.</li>
<li>The customer picks one. Now the agent calls the refund tool, but because refunds carry real consequences, it pauses for user approval.</li>
<li>Once approved, the agent runs the refund tool.</li>
</ul>
<p>Three levels of agency in one conversation. You can view the <a target="_blank" rel="noopener noreferrer" class="" href="https://github.com/agno-agi/demo/tree/main/agents/support">full code here</a>.</p>
<video width="700" height="700" class="rounded-2xl" loop="" autoplay="" muted="" playsinline="" controls=""><source src="/videos/approvals-flow.mp4">Your browser does not support the video tag.</video>
<p>The agent knows when to act, when to ask, and when to wait. That's what governance looks like in practice. The runtime has to support all three modes, and the transitions between them have to feel natural.</p>
<blockquote>
<p>Note: the approvals flow on the UI is actively being developed. The refund should wait for admin approval, not user approval. This is implemented on the SDK but not the UI yet. This is being fixed this week.</p>
</blockquote>
<h2>Agents are distributed systems</h2>
<p>The <a target="_blank" rel="noopener noreferrer" class="" href="https://x.com/ashpreetbedi/status/2024885969250394191">5 Levels</a> describe how agentic software grows in capability (and complexity). The <a target="_blank" rel="noopener noreferrer" class="" href="https://x.com/ashpreetbedi/status/2026708881972535724">7 Sins</a> describe how they fail in production. The 6 Pillars describe what it takes to build them right.</p>
<p>The consistent message across all three: agentic software engineering is a discipline. The teams that internalize this early will ship great products. The teams that keep treating agents as scripts will continue to miss the mark.</p>
<p>Clone the <a target="_blank" rel="noopener noreferrer" class="" href="https://github.com/agno-agi/agentos-docker-template">repo</a>. Build your first agent. Ship it where your users are.</p>
<hr>
<p>Links:</p>
<ul>
<li>
<a target="_blank" rel="noopener noreferrer" class="" href="https://docs.agno.com/">Agno Docs</a>
</li>
<li>
<a target="_blank" rel="noopener noreferrer" class="" href="https://github.com/agno-agi/agno">Agno Github</a>
</li>
<li>
<a target="_blank" rel="noopener noreferrer" class="" href="https://docs.agno.com/deploy/introduction">AgentOS Templates</a>
</li>
<li>
<a target="_blank" rel="noopener noreferrer" class="" href="https://github.com/agno-agi/agentos-docker-template">AgentOS Docker Template</a>
</li>
</ul>]]></content:encoded>
            <author>hi@ashpreetbedi.com (Ashpreet Bedi)</author>
        </item>
        <item>
            <title><![CDATA[Becoming AI-first]]></title>
            <link>https://ashpreetbedi.com/articles/becoming-ai-first</link>
            <guid isPermaLink="false">https://ashpreetbedi.com/articles/becoming-ai-first</guid>
            <pubDate>Sun, 19 Oct 2025 00:00:00 GMT</pubDate>
            <content:encoded><![CDATA[<span class="text-2xl font-semibold"><p>✨ Lessons from 100s of conversations on AI products and how teams are adopting AI.</p></span>
<p>Every tuesday and thursday, I take 3–5 calls with builders, CTOs, and CEOs of companies. One question on every CEO's mind is:</p>
<blockquote class="not-prose relative isolate pl-6 text-ink py-3 text-lg"><span aria-hidden="true" class="absolute inset-y-1 left-0 w-0.5 rounded-full bg-accent"></span><div class="flex gap-3"><div class="space-y-1"><div class="leading-relaxed">"How do I make my company AI-first?"</div></div></div></blockquote>
<p>Common variations include: how can we use AI better? should we be building Agents? how do we add AI to our products?</p>
<p>Over time, I've identified patterns in how leading companies are approaching this question, and what separates the ones making real progress from those still in exploration mode.</p>
<h2>What "AI-first" really means</h2>
<p>Being AI-first doesn't mean using AI everywhere, or re-architecting you entire company or product around ChatGPT.</p>
<p>It means understanding where intelligence creates leverage for your team, your operations, and your product. If you can identify where AI genuinely moves the needle, you're already halfway there.</p>
<p>Broadly, I've found three high-leverage entry points:</p>
<ol>
<li><strong>Internal tools</strong> that improve productivity and decision-making.</li>
<li><strong>Workflow automation</strong> that saves time and reduces operational load.</li>
<li><strong>User-facing products</strong> that create revenue and differentiation.</li>
</ol>
<p>Each represents a layer in your company's AI maturity. Let's dig in.</p>
<span class="text-xl font-semibold"><p><strong>1. Internal Tools</strong></p></span>
<p>These tools help your team save time, become more productive, and build intuition around AI. <strong>General-purpose agents</strong> (ChatGPT, Claude), <strong>coding assistants</strong> (Cursor, Claude Code), or <strong>vertical agents</strong> (legal, sales, marketing) all fit here. I'm yet to meet a team that's not all-in here.</p>
<p>These don't require a polished UX or commercial rollout — just curiosity and experimentation. The payoff is your team becoming AI-native faster than your competitors.</p>
<blockquote>
<p><strong>Most teams I speak to give everyone access to a multitude of AI tools. The cost is trivial compared to the learning dividend.</strong></p>
</blockquote>
<p>If you're not doing this already, get your team a ChatGPT subscription and cursor/CC for coding. Connect these tools to your company knowledge, databases, and documents. Let your team explore, learn, and build intuition.</p>
<span class="text-xl font-semibold"><p><strong>2. Workflow Automation</strong></p></span>
<p>Once your team sees what's possible, you'll start spotting repeatable patterns ripe for automation. This is where AI turns mundane tasks into automated processes that can run in the background.</p>
<p>Examples: invoice classification, market research, sales prep, support summarization, or daily reporting.</p>
<p>That said, the highest-ROI workflows are almost always specific to your team. They take effort to design — and while "no-code" tools like N8N or Zapier can help, most serious setups eventually involve code. Frameworks like Agno can help here if you have engineering resources.</p>
<blockquote>
<p><strong>Treat automation as part of your system design, not a side project. Its ok to invest in it, if only to learn and build intuition.</strong></p>
</blockquote>
<span class="text-xl font-semibold"><p><strong>3. User-Facing AI Products</strong></p></span>
<p>This is where AI creates compounding value — by improving the product your users already love. You can:</p>
<ol>
<li>Buy off-the-shelf products that add AI-powered features to existing products (e.g., a support agent). I highly recommend this as a starting point, its easy to get started and you start seeing immediate value.</li>
<li>Build new AI features specific to your product. The goal here is to make your product smarter, faster, and more delightful.</li>
</ol>
<p>Your goal here isn't to "add AI" — it's to make the experience better. The best AI features often don't look like AI at all.</p>
<blockquote>
<p><strong>Our most successful case studies are ones where users don't even realize AI is at work, they just notice things getting smarter, forms getting filled automatically, and buttons that automate what was previously a 10-step manual process.</strong></p>
</blockquote>
<p>So general recommendation is to start with off-the-shelf products that add AI-powered features to your product. But once you need to build AI-features that are specific to your product, here's how to do it.</p>
<ol>
<li>
<p>Add small, reliable AI features - ideally as "magic buttons" or "magic interactions". Reliability is the keyword here.</p>
</li>
<li>
<p>Automate targeted, well-defined problems - solve one painful step at a time. Serve the AI application as a RestAPI, which your product can call when the user clicks the "magic button".</p>
</li>
<li>
<p>Avoid generic chatbots - they shift the cognitive load to the user and expose an incredibly vast surface area, which is bound to disappoint. Instead, build clear, purposeful interfaces that do the work for them. This will also force you to think about the user experience and how to make it more intuitive and delightful.</p>
</li>
</ol>
<p>Each of these "magic moments" compounds. Over time, your product becomes AI-first not by branding, but by behavior.</p>
<blockquote class="not-prose relative isolate pl-6 text-ink py-2 text-base"><span aria-hidden="true" class="absolute inset-y-1 left-0 w-0.5 rounded-full bg-accent"></span><div class="flex gap-3"><div class="space-y-1"><div class="leading-relaxed"><p><strong>Start simple, focus on clarity and reliability over complexity</strong>.</p></div></div></div></blockquote>
<h2>From exploration to execution</h2>
<p>If you want to accelerate this journey, <a target="_blank" rel="noopener noreferrer" class="" href="https://agno.com">Agno</a> is a starting point.</p>
<p>It will give you the right primitives for building AI features and a FastAPI application that you can deploy in your cloud (for privacy and security). Your product can easily integrate with this API and before you know it, you'll be serving AI features to your users.</p>
<hr>
<h2>Want to build with Agno?</h2>
<ul>
<li>
<p><strong>Agno documentation:</strong> <a target="_blank" rel="noopener noreferrer" class="" href="https://agno.link/docs">agno.link/docs</a></p>
</li>
<li>
<p><strong>Signup for the AgentOS:</strong> <a target="_blank" rel="noopener noreferrer" class="" href="https://os.agno.com">os.agno.com</a></p>
</li>
<li>
<p><strong>Star Agno on Github:</strong> <a target="_blank" rel="noopener noreferrer" class="" href="https://agno.link/gh">agno.link/gh</a></p>
</li>
</ul>
<hr>
<p>Read more on <a target="_blank" rel="noopener noreferrer" class="" href="https://www.agno.com">agno.com</a></p>]]></content:encoded>
            <author>hi@ashpreetbedi.com (Ashpreet Bedi)</author>
        </item>
        <item>
            <title><![CDATA[Dash: The Data Agent Every Company Needs]]></title>
            <link>https://ashpreetbedi.com/articles/dash-v2</link>
            <guid isPermaLink="false">https://ashpreetbedi.com/articles/dash-v2</guid>
            <pubDate>Wed, 15 Apr 2026 00:00:00 GMT</pubDate>
            <content:encoded><![CDATA[<p>Every company with 30+ people should have an internal data agent and today I'm making ours open-source: take <a target="_blank" rel="noopener noreferrer" class="" href="https://github.com/agno-agi/dash">Dash</a>, run it in your cloud, and give your team access via Slack.</p>
<p>Most AI-forward companies have in-house data agents:</p>
<ul>
<li>OpenAI: <a target="_blank" rel="noopener noreferrer" class="" href="https://openai.com/index/inside-our-in-house-data-agent/">Inside OpenAI's in-house data agent</a></li>
<li>Vercel: <a target="_blank" rel="noopener noreferrer" class="" href="https://vercel.com/blog/we-removed-80-percent-of-our-agents-tools">d0</a>, <a target="_blank" rel="noopener noreferrer" class="" href="https://vercel.com/blog/anyone-can-build-agents-but-it-takes-a-platform-to-run-them">another post</a></li>
<li>Uber: <a target="_blank" rel="noopener noreferrer" class="" href="https://www.uber.com/gb/en/blog/query-gpt/">QueryGPT</a> (creative name)</li>
<li>LinkedIn: <a target="_blank" rel="noopener noreferrer" class="" href="https://www.linkedin.com/blog/engineering/ai/practical-text-to-sql-for-data-analytics">SQLBot</a> (absolutely LinkedIn-coded name for the agent)</li>
<li>Salesforce: <a target="_blank" rel="noopener noreferrer" class="" href="https://www.salesforce.com/blog/text-to-sql-agent/">Horizon Agent</a></li>
<li>DoorDash: <a target="_blank" rel="noopener noreferrer" class="" href="https://careersatdoordash.com/blog/beyond-single-agents-doordash-building-collaborative-ai-ecosystem/">How to use every buzzword in a blog post</a></li>
</ul>
<p>This post will show you how to build a best-in-class data system and make it available to your team over Slack. If you do this well, Dash should handle roughly 80% of routine data questions, send daily reports, and catch metric anomalies before anyone asks.</p>
<h2>What is Dash?</h2>
<p>Dash is a self-learning data system made of 3 agents: Dash (the team leader), a Data Analyst and a Data Engineer.</p>
<img alt="Dash AgentOS" loading="lazy" width="700" height="700" decoding="async" data-nimg="1" class="rounded-2xl" style="color:transparent" srcset="/_next/image?url=%2Fimages%2Fdash-agentos.png&amp;w=750&amp;q=75 1x, /_next/image?url=%2Fimages%2Fdash-agentos.png&amp;w=1920&amp;q=75 2x" src="/_next/image?url=%2Fimages%2Fdash-agentos.png&amp;w=1920&amp;q=75">
<p>It uses a dual-tier knowledge and learning system to deliver an incredible work-with-your-data experience.</p>
<p>You can chat with it via Slack or the AgentOS UI.</p>
<p>It writes SQL, runs it, and tells you what the numbers mean. More importantly, when it makes a mistake or gets corrected, it learns from it. When your team keeps asking the same question, it builds infrastructure so the answer is faster next time.</p>
<p><strong>A self-learning data system, not a data agent.</strong></p>
<p>Dash uses its own PostgreSQL database. You don't point it at your production database. You progressively load the tables you want it to work with, along with the context it needs to be useful. This is the part most people skip. This is the part that makes it special.</p>
<p>Here's how it looks in Slack (8x speedup when waiting):</p>
<video width="700" height="700" class="rounded-2xl" loop="" autoplay="" muted="" playsinline="" controls=""><source src="/videos/dash-in-slack.mp4">Your browser does not support the video tag.</video>
<p>And on the AgentOS UI:</p>
<video width="700" height="700" class="rounded-2xl" loop="" autoplay="" muted="" playsinline="" controls=""><source src="/videos/dash-agentos-ui.mp4">Your browser does not support the video tag.</video>
<p>Using the AgentOS UI, you can chat with your agents, view sessions, traces, metrics, and schedules.</p>
<p>AgentOS is the agent platform you didn't know you needed.</p>
<h2>How It Works</h2>
<h3>1. Context is everything</h3>
<p>Most data agents get a schema dump and the impossible task of writing SQL from business logic that only lives in the data engineer's head. That's why they're bad. Column names and types tell you nothing about the data. They don't tell you that <code>ended_at IS NULL</code> means a subscription is active. That annual billing gets a 10% discount. That usage metrics are sampled 3-5 days per month, so summing them gives you garbage.</p>
<p>I wrote about this problem in detail in my <a target="_blank" rel="noopener noreferrer" class="" href="https://www.ashpreetbedi.com/articles/sql-agent">Self-Improving Text-to-SQL Agent</a> post. The core insight holds: <strong>the biggest improvement you can make to your data agent is giving it the same tribal knowledge that human engineers have.</strong></p>
<p>Dash uses a carefully curated knowledge system backed by PgVector. It contains:</p>
<p><strong>Table metadata.</strong> Table schema, column types, what they mean, what to use each table for, the gotchas. Every table ships with use cases and data quality notes. Example: status is 'active', 'churned', or 'trial'; always check against subscriptions for ground truth.</p>
<p><strong>Validated queries (must have).</strong> Battle-tested SQL with the right JOINs, the right NULL handling, the right edge cases. When the Analyst gets your question, it searches knowledge first. Before it writes a line of SQL, it already knows the shape of the data and which traps to avoid.</p>
<p><strong>Business rules.</strong> How MRR is calculated, what NRR means, that a customer can have multiple subscription records because upgrades close the old row and open a new one. This is the context that separates a correct answer from a plausible-looking wrong one.</p>
<blockquote>
<p>This knowledge is curated by the user. What makes Dash special is its ability to learn on its own.</p>
</blockquote>
<h3>2. Self-learning loop</h3>
<p>Separate from knowledge, Dash captures what it learns automatically (via tool calls). When the Analyst hits a type error and fixes it, the fix gets saved. When a user corrects a result, that correction is recorded. When the system discovers a data quirk, it notes it.</p>
<p>Next time anyone asks a similar question, the Analyst checks learnings before writing SQL. Dash gets better the more it's used.</p>
<p>I've been developing this pattern since December 2025, first as <a target="_blank" rel="noopener noreferrer" class="" href="https://www.ashpreetbedi.com/articles/gpu-poor-continuous-learning">GPU Poor Continuous Learning</a> and then refined through <a target="_blank" rel="noopener noreferrer" class="" href="https://www.ashpreetbedi.com/articles/dash">Dash v1</a>. The approach is simple: the model stays frozen. The system gets smarter. Learning happens in retrieval, not in weights. It's auditable, reversible, and requires zero training compute.</p>
<h3>3. Three agents, two schemas</h3>
<p>Dash is three agents. <strong>Leader</strong> routes requests and synthesizes answers. <strong>Analyst</strong> writes and runs SQL. <strong>Engineer</strong> builds views, summary tables, and computed data. They work together, sharing knowledge and learnings.</p>
<p><strong>The Leader</strong> has no SQL tools. It cannot touch the database.</p>
<p><strong>The Analyst</strong> is read-only. Not "read-only because the prompt says so." Read-only because the PostgreSQL connection is configured with <code>default_transaction_read_only=on</code>. The database itself rejects writes. No prompt injection or clever jailbreak changes this. The database says no.</p>
<p><strong>The Engineer</strong> can write, but only to the <code>dash</code> schema. A SQLAlchemy event listener intercepts every SQL statement before execution and blocks anything targeting the <code>public</code> schema. Your company data is untouchable.</p>
<p>This gives you two schemas with a hard boundary:</p>
<ul>
<li><strong>public schema:</strong> your company data. You load it. Agents read it.</li>
<li><strong>dash schema:</strong> views, summary tables, computed data. The Engineer owns and maintains it.</li>
</ul>
<p>There's also an <code>ai</code> schema where Dash stores its sessions, learnings, knowledge vectors, and other operational data. It powers the AgentOS UI and the self-improvement loop.</p>
<p>I covered the security model in depth in my <a target="_blank" rel="noopener noreferrer" class="" href="https://www.ashpreetbedi.com/articles/systems-engineering">Systems Engineering</a> post. The key principle: security is a system property enforced by configuration, tested across layers.</p>
<h3>The part nobody else has</h3>
<p>When the Leader notices your team keeps asking the same expensive question (MRR by plan, churn by segment, revenue waterfall) it asks the Engineer to build a view.</p>
<p>The Engineer creates <code>dash.monthly_mrr_by_plan</code>. A SQL view joining the right tables, handling all edge cases, producing a clean result. Then it does the critical thing: it calls <code>update_knowledge</code> to record the view in the knowledge base. What it contains, what columns it has, example queries.</p>
<p>Next time someone asks about MRR by plan, the Analyst searches knowledge, finds the view, and queries it directly. No complex join. No risk of getting NULL handling wrong. Faster. Pre-validated. Consistent.</p>
<p>The agents build on each other's work. The Engineer creates infrastructure. The Analyst discovers and uses it. The Leader notices patterns and triggers the cycle. Over time, the <code>dash</code> schema fills with views and summary tables that nobody manually created. An analytics layer the system built for itself, shaped by what your team actually asks about.</p>
<h3>The full loop</h3>
<ol>
<li>You ask a question. Leader delegates.</li>
<li>The Analyst searches knowledge, writes correct SQL, returns an insight.</li>
<li>Good queries get saved to knowledge. Errors become learnings.</li>
<li>Repeated patterns become views. Views get recorded to knowledge.</li>
<li>Next time, the Analyst uses the view. Faster, pre-validated, consistent.</li>
</ol>
<p>Dash accumulates institutional knowledge about your data and compounds with use.</p>
<h2>Build Your Own</h2>
<p>Dash is free and open-source. Check out the <a target="_blank" rel="noopener noreferrer" class="" href="https://github.com/agno-agi/dash">GitHub repo</a> and follow the README for in-depth instructions.</p>
<h3>Quick Start</h3>
<pre class="language-bash"><code class="language-bash"><span class="token function">git</span> clone https://github.com/agno-agi/dash <span class="token operator">&amp;&amp;</span> <span class="token builtin class-name">cd</span> dash
<span class="token function">cp</span> example.env .env  <span class="token comment"># Add OPENAI_API_KEY</span>

<span class="token function">docker</span> compose up -d --build

<span class="token function">docker</span> <span class="token builtin class-name">exec</span> -it dash-api python scripts/generate_data.py
<span class="token function">docker</span> <span class="token builtin class-name">exec</span> -it dash-api python scripts/load_knowledge.py
</code></pre>
<p>This starts Dash with a synthetic dataset (~900 customers, 6 tables) and loads the knowledge base (table metadata, validated queries, business rules). You can demo the entire system without connecting any real data.</p>
<h3>Connect to the Web UI</h3>
<ol>
<li>Open <a target="_blank" rel="noopener noreferrer" class="" href="https://os.agno.com">os.agno.com</a></li>
<li>Add OS → Local → <code>http://localhost:8000</code></li>
<li>Connect</li>
</ol>
<h2>Connect to Slack</h2>
<p>Dash lives in Slack. You can DM it or mention it in a channel with @Dash. Each thread maps to one session, so every conversation gets its own context.</p>
<ol>
<li>Run Dash and give it a public URL (use ngrok for local, or your deployed domain).</li>
<li>Follow instructions in <code>docs/SLACK_CONNECT</code> to create and install the Slack app from the manifest.</li>
<li>Set <code>SLACK_TOKEN</code> and <code>SLACK_SIGNING_SECRET</code>, then restart Dash.</li>
</ol>
<video width="700" height="700" class="rounded-2xl" loop="" autoplay="" muted="" playsinline="" controls=""><source src="/videos/dash-in-slack.mp4">Your browser does not support the video tag.</video>
<h2>Adding Your Own Data</h2>
<p>Once you have Dash running, making it your own is straightforward. Replace the sample dataset with your data and give Dash the context it needs.</p>
<h3>1. Load your tables into the <code>public</code> schema</h3>
<p>Use whatever pipeline you already have. <code>pg_dump</code>, a Python script, dbt, Airbyte. Dash reads from <code>public</code> and never writes to it. You can use your existing workflow orchestration tools (Airflow, Dagster), or use Dash's built-in scheduler.</p>
<h3>2. Add table knowledge</h3>
<p>For each table, create a JSON file in <code>knowledge/tables/</code>:</p>
<pre class="language-json"><code class="language-json"><span class="token punctuation">{</span>
  <span class="token property">"table_name"</span><span class="token operator">:</span> <span class="token string">"customers"</span><span class="token punctuation">,</span>
  <span class="token property">"table_description"</span><span class="token operator">:</span> <span class="token string">"B2B SaaS customer accounts with company info and lifecycle status"</span><span class="token punctuation">,</span>
  <span class="token property">"use_cases"</span><span class="token operator">:</span> <span class="token punctuation">[</span><span class="token string">"Churn analysis"</span><span class="token punctuation">,</span> <span class="token string">"Cohort segmentation"</span><span class="token punctuation">,</span> <span class="token string">"Acquisition reporting"</span><span class="token punctuation">]</span><span class="token punctuation">,</span>
  <span class="token property">"data_quality_notes"</span><span class="token operator">:</span> <span class="token punctuation">[</span>
    <span class="token string">"signup_date is DATE (not TIMESTAMP) — no time component"</span><span class="token punctuation">,</span>
    <span class="token string">"status values: active, churned, trial"</span><span class="token punctuation">,</span>
    <span class="token string">"company_size is self-reported"</span>
  <span class="token punctuation">]</span><span class="token punctuation">,</span>
  <span class="token property">"table_columns"</span><span class="token operator">:</span> <span class="token punctuation">[</span>
    <span class="token punctuation">{</span><span class="token property">"name"</span><span class="token operator">:</span> <span class="token string">"id"</span><span class="token punctuation">,</span> <span class="token property">"type"</span><span class="token operator">:</span> <span class="token string">"SERIAL"</span><span class="token punctuation">,</span> <span class="token property">"description"</span><span class="token operator">:</span> <span class="token string">"Primary key"</span><span class="token punctuation">}</span><span class="token punctuation">,</span>
    <span class="token punctuation">{</span><span class="token property">"name"</span><span class="token operator">:</span> <span class="token string">"company_name"</span><span class="token punctuation">,</span> <span class="token property">"type"</span><span class="token operator">:</span> <span class="token string">"TEXT"</span><span class="token punctuation">,</span> <span class="token property">"description"</span><span class="token operator">:</span> <span class="token string">"Company name"</span><span class="token punctuation">}</span><span class="token punctuation">,</span>
    <span class="token punctuation">{</span><span class="token property">"name"</span><span class="token operator">:</span> <span class="token string">"status"</span><span class="token punctuation">,</span> <span class="token property">"type"</span><span class="token operator">:</span> <span class="token string">"TEXT"</span><span class="token punctuation">,</span> <span class="token property">"description"</span><span class="token operator">:</span> <span class="token string">"Current status: active, churned, trial"</span><span class="token punctuation">}</span>
  <span class="token punctuation">]</span>
<span class="token punctuation">}</span>
</code></pre>
<blockquote class="not-prose relative isolate pl-6 text-ink py-3 text-lg"><span aria-hidden="true" class="absolute inset-y-1 left-0 w-0.5 rounded-full bg-accent"></span><div class="flex gap-3"><div class="space-y-1"><div class="leading-relaxed"><p>This is the single highest-leverage thing you can do. The better your knowledge, the better Dash performs.</p></div></div></div></blockquote>
<h3>3. Add validated queries</h3>
<p>For your most common questions, write the SQL that gives the correct answer and save it in <code>knowledge/queries/</code>:</p>
<pre class="language-sql"><code class="language-sql"><span class="token comment">-- &lt;query current_mrr&gt;</span>
<span class="token comment">-- &lt;description&gt;Current total MRR from active subscriptions&lt;/description&gt;</span>
<span class="token comment">-- &lt;query&gt;</span>
<span class="token keyword">SELECT</span>
    <span class="token function">SUM</span><span class="token punctuation">(</span>mrr<span class="token punctuation">)</span> <span class="token keyword">AS</span> total_mrr<span class="token punctuation">,</span>
    <span class="token function">COUNT</span><span class="token punctuation">(</span><span class="token operator">*</span><span class="token punctuation">)</span> <span class="token keyword">AS</span> active_subscriptions
<span class="token keyword">FROM</span> subscriptions
<span class="token keyword">WHERE</span> <span class="token keyword">status</span> <span class="token operator">=</span> <span class="token string">'active'</span><span class="token punctuation">;</span>
<span class="token comment">-- &lt;/query&gt;</span>
</code></pre>
<p>This is the easiest way to make sure Dash uses your internal semantics for answering routine questions. Your job is to deliver the best work-with-your-data experience for your team. This makes it possible.</p>
<h3>4. Add business rules</h3>
<p>Document your metrics, definitions, and gotchas in <code>knowledge/business/</code>:</p>
<pre class="language-json"><code class="language-json"><span class="token punctuation">{</span>
  <span class="token property">"metrics"</span><span class="token operator">:</span> <span class="token punctuation">[</span>
    <span class="token punctuation">{</span>
      <span class="token property">"name"</span><span class="token operator">:</span> <span class="token string">"MRR"</span><span class="token punctuation">,</span>
      <span class="token property">"definition"</span><span class="token operator">:</span> <span class="token string">"Sum of active subscriptions excluding trials"</span><span class="token punctuation">,</span>
      <span class="token property">"calculation"</span><span class="token operator">:</span> <span class="token string">"SUM(mrr) FROM subscriptions WHERE status = 'active'"</span>
    <span class="token punctuation">}</span>
  <span class="token punctuation">]</span><span class="token punctuation">,</span>
  <span class="token property">"common_gotchas"</span><span class="token operator">:</span> <span class="token punctuation">[</span>
    <span class="token punctuation">{</span>
      <span class="token property">"issue"</span><span class="token operator">:</span> <span class="token string">"Active subscription detection"</span><span class="token punctuation">,</span>
      <span class="token property">"solution"</span><span class="token operator">:</span> <span class="token string">"Filter on ended_at IS NULL, not status column"</span>
    <span class="token punctuation">}</span>
  <span class="token punctuation">]</span>
<span class="token punctuation">}</span>
</code></pre>
<p>Helpful context for Dash. You can skip if it's too much work up front.</p>
<h3>5. Load knowledge</h3>
<pre class="language-bash"><code class="language-bash">python scripts/load_knowledge.py             <span class="token comment"># Upsert changes</span>
python scripts/load_knowledge.py --recreate  <span class="token comment"># Fresh start</span>
</code></pre>
<h2>Scheduled Tasks</h2>
<p>Dash ships with a built-in scheduler. You can schedule any type of task that your container can handle.</p>
<p>Out of the box, Dash comes with a pre-built schedule that re-indexes your knowledge base every night at 4am UTC:</p>
<pre class="language-python"><code class="language-python">mgr<span class="token punctuation">.</span>create<span class="token punctuation">(</span>
    name<span class="token operator">=</span><span class="token string">"knowledge-refresh"</span><span class="token punctuation">,</span>
    cron<span class="token operator">=</span><span class="token string">"0 4 * * *"</span><span class="token punctuation">,</span>
    endpoint<span class="token operator">=</span><span class="token string">"/knowledge/reload"</span><span class="token punctuation">,</span>
    payload<span class="token operator">=</span><span class="token punctuation">{</span><span class="token punctuation">}</span><span class="token punctuation">,</span>
    timezone<span class="token operator">=</span><span class="token string">"UTC"</span><span class="token punctuation">,</span>
    description<span class="token operator">=</span><span class="token string">"Daily knowledge file re-index"</span><span class="token punctuation">,</span>
<span class="token punctuation">)</span>
</code></pre>
<p>Same pattern for anything else: daily metric summaries posted to Slack, anomaly detection runs, weekly email digests, automated data quality checks. Register a schedule, point it at an endpoint, Dash handles the rest.</p>
<p>The best agents are proactive. Scheduled tasks are the first step in that direction.</p>
<h2>Run Evals</h2>
<p>Dash ships with five eval categories:</p>
<ul>
<li><strong>Accuracy:</strong> correct data and meaningful insights</li>
<li><strong>Routing:</strong> team routes to the correct agent</li>
<li><strong>Security:</strong> no credential or secret leaks</li>
<li><strong>Governance:</strong> refuses destructive SQL operations</li>
<li><strong>Boundaries:</strong> schema access boundaries respected</li>
</ul>
<pre class="language-bash"><code class="language-bash">python -m evals                      <span class="token comment"># Run all</span>
python -m evals --category accuracy  <span class="token comment"># Run one category</span>
python -m evals --verbose            <span class="token comment"># Show response details</span>
</code></pre>
<h2>Deploy to Production</h2>
<p>You can deploy Dash to Railway with one command:</p>
<pre class="language-bash"><code class="language-bash"><span class="token function">cp</span> example.env .env.production
<span class="token comment"># Edit .env.production — set OPENAI_API_KEY</span>

railway login
./scripts/railway_up.sh
</code></pre>
<p>Railway is fine for getting started. Eventually you'd want it wherever your existing data infrastructure lives. Everything is containerized so deployment should be straightforward. Be mindful of egress costs.</p>
<blockquote class="not-prose relative isolate pl-6 text-ink py-3 text-lg"><span aria-hidden="true" class="absolute inset-y-1 left-0 w-0.5 rounded-full bg-sky-500 dark:bg-sky-400"></span><div class="flex gap-3"><div class="space-y-1"><div class="leading-relaxed"><p>Production requires a <code>JWT_VERIFICATION_KEY</code> from os.agno.com for RBAC. It would be insane to expose Dash on a public endpoint.</p></div></div></div></blockquote>
<h2>What's Next</h2>
<p>Dash is built with <a target="_blank" rel="noopener noreferrer" class="" href="https://www.ashpreetbedi.com/articles/systems-engineering">systems engineering principles</a>. Five layers: agent, data, security, interface, infrastructure. Each layer affects the others. Design them together and the system compounds.</p>
<p>If there's interest, I'll do deep dives on each layer:</p>
<ul>
<li><strong>Agent Engineering:</strong> The business logic. Model, instructions, tools, knowledge, and the self-learning loop.</li>
<li><strong>Data Engineering:</strong> The context layer. Memory, knowledge, learnings, storage. Why the data layer is the most underinvested part of the stack.</li>
<li><strong>Security Engineering:</strong> Auth, RBAC, governance, data isolation, and audit trails designed into the system as core primitives.</li>
<li><strong>Interface Engineering:</strong> Turning an agent into a product. REST APIs, web UIs, Slack, MCP, and how one agent serves multiple surfaces.</li>
<li><strong>Infrastructure Engineering:</strong> How to deploy and scale Dash. Containers, deployment, scheduling.</li>
</ul>
<h2>TLDR</h2>
<p>Every company with 30+ people should have an internal data agent. Dash is a free, open-source, self-learning data system made of 3 agents. It uses curated knowledge and continuous learning to get better with every query. Three agents (Leader, Analyst, Engineer) share knowledge and build on each other's work. Security is enforced by the system: read-only connections, schema-level isolation, eval-tested boundaries. Runs in your cloud, lives in Slack. Clone it, run <code>docker compose up</code>, and have the entire system running in minutes.</p>
<hr>
<ul>
<li>
<a target="_blank" rel="noopener noreferrer" class="" href="https://github.com/agno-agi/dash">GitHub: agno-agi/dash</a>
</li>
<li>
<a target="_blank" rel="noopener noreferrer" class="" href="https://openai.com/index/inside-our-in-house-data-agent/">OpenAI's data agent</a>
</li>
<li>
<a target="_blank" rel="noopener noreferrer" class="" href="https://www.ashpreetbedi.com/articles/systems-engineering">Systems Engineering</a>
</li>
<li>
<a target="_blank" rel="noopener noreferrer" class="" href="https://www.ashpreetbedi.com/articles/gpu-poor-continuous-learning">GPU Poor Continuous Learning</a>
</li>
<li>
<a target="_blank" rel="noopener noreferrer" class="" href="https://www.ashpreetbedi.com/articles/dash">Dash v1</a>
</li>
</ul>
<p>Built with <a target="_blank" rel="noopener noreferrer" class="" href="https://github.com/agno-agi/agno">Agno</a>.</p>]]></content:encoded>
            <author>hi@ashpreetbedi.com (Ashpreet Bedi)</author>
        </item>
        <item>
            <title><![CDATA[Dash: Self-learning data agent]]></title>
            <link>https://ashpreetbedi.com/articles/dash</link>
            <guid isPermaLink="false">https://ashpreetbedi.com/articles/dash</guid>
            <pubDate>Mon, 02 Feb 2026 00:00:00 GMT</pubDate>
            <content:encoded><![CDATA[<blockquote>
<p>Here's a link to the <a target="_blank" rel="noopener noreferrer" class="" href="https://github.com/agno-agi/dash">GitHub repo</a> if you want to dive right in.</p>
</blockquote>
<p>OpenAI shared <a target="_blank" rel="noopener noreferrer" class="" href="https://openai.com/index/inside-our-in-house-data-agent/">how they built their internal data agent</a>. 6 layers of context, a self-learning memory system, and real lessons from running it in production. The best enterprise data agent out there.</p>
<p>I've been working on <a target="_blank" rel="noopener noreferrer" class="" href="https://www.ashpreetbedi.com/articles/sql-agent">a similar agent</a> and their architecture validates the gpu-poor continuous learning approach I've been testing.</p>
<p>Today I'm open-sourcing my version. It's called Dash.</p>
<p><a target="_blank" rel="noopener noreferrer" class="" href="https://github.com/agno-agi/dash">Dash</a> is a self-learning data agent that grounds its answers in 6 layers of context and improves with every run.</p>
<ul>
<li><strong>Table Usage:</strong> schema, columns, relationships</li>
<li><strong>Human Annotations:</strong> metrics, definitions, gotchas</li>
<li><strong>Query Patterns:</strong> SQL that's known to work</li>
<li><strong>Institutional Knowledge:</strong> external docs, research</li>
<li><strong>Memory:</strong> error patterns, discovered fixes</li>
<li><strong>Runtime Context:</strong> live schema when things change</li>
</ul>
<h2>The 6 Layers of Context</h2>
<p>OpenAI's insight: context is everything. Without it, even strong models hallucinate column names, miss type quirks, and ignore tribal knowledge.</p>
<p>Another problem is that most Text-to-SQL agents are stateless, they make mistakes, you fix them, then they make the same mistake again because every session starts fresh.</p>
<p>Dash fixes this by implementing 6 layers of context:</p>
<table><thead><tr><th>Layer</th><th>What it provides</th><th>Source</th></tr></thead><tbody><tr><td><strong>Table Usage</strong></td><td>Schema, columns, relationships</td><td><code>knowledge/tables/*.json</code></td></tr><tr><td><strong>Human Annotations</strong></td><td>Metrics, definitions, gotchas</td><td><code>knowledge/business/*.json</code></td></tr><tr><td><strong>Query Patterns</strong></td><td>SQL that's known to work</td><td><code>knowledge/queries/*.sql</code></td></tr><tr><td><strong>Institutional Knowledge</strong></td><td>External docs, research</td><td>MCP (optional)</td></tr><tr><td><strong>Memory</strong></td><td>Error patterns, discovered fixes</td><td><code>LearningMachine</code></td></tr><tr><td><strong>Runtime Context</strong></td><td>Live schema when things change</td><td><code>introspect_schema</code> tool</td></tr></tbody></table>
<p>The agent retrieves relevant context at runtime via hybrid search, uses this to generate grounded SQL, then uses the results to deliver insights.</p>
<video width="700" height="700" class="rounded-2xl" loop="" autoplay="" muted="" playsinline="" controls=""><source src="/videos/dash-context-retrieval.mp4">Your browser does not support the video tag.</video>
<p><a target="_blank" rel="noopener noreferrer" class="" href="https://openai.com/index/inside-our-in-house-data-agent/">OpenAI's post</a> goes into more detail about each layer.</p>
<h2>The Self-Learning Loop</h2>
<p>Instead of fine-tuning or retraining, Dash learns through two complementary systems:</p>
<p><strong>Static Knowledge:</strong> Validated queries, business context, table schemas, data quality notes, metric definitions, tribal knowledge and gotchas. These are curated by your team and maintained alongside Dash (it also updates successful queries as it comes across them).</p>
<p><strong>Continuous Learning:</strong> Patterns that Dash discovers through trial and error. The more you use Dash, the better it gets. Eg: Columns named <code>state</code> in one table map to <code>status</code> in another. It also learns what your team is focused on: preparing for an IPO? Dash learns that S-1 metrics live in a separate dataset, that "revenue" means ARR not bookings, and that the board wants cohort retention broken out by enterprise vs SMB. Every learning becomes a data point that improves Dash.</p>
<p>I call this gpu-poor continuous learning (no GPUs are harmed in these experiments) and it's literally 5 lines of code:</p>
<pre class="language-python"><code class="language-python">
learning<span class="token operator">=</span>LearningMachine<span class="token punctuation">(</span>
    knowledge<span class="token operator">=</span>data_agent_learnings<span class="token punctuation">,</span>
    user_profile<span class="token operator">=</span>UserProfileConfig<span class="token punctuation">(</span>mode<span class="token operator">=</span>LearningMode<span class="token punctuation">.</span>AGENTIC<span class="token punctuation">)</span><span class="token punctuation">,</span>
    user_memory<span class="token operator">=</span>UserMemoryConfig<span class="token punctuation">(</span>mode<span class="token operator">=</span>LearningMode<span class="token punctuation">.</span>AGENTIC<span class="token punctuation">)</span><span class="token punctuation">,</span>
    learned_knowledge<span class="token operator">=</span>LearnedKnowledgeConfig<span class="token punctuation">(</span>mode<span class="token operator">=</span>LearningMode<span class="token punctuation">.</span>AGENTIC<span class="token punctuation">)</span><span class="token punctuation">,</span>
<span class="token punctuation">)</span>
</code></pre>
<h2>Build your own</h2>
<p>Follow the <a target="_blank" rel="noopener noreferrer" class="" href="https://github.com/agno-agi/dash">README</a> for an in-depth guide. Here's a quick start:</p>
<pre class="language-bash"><code class="language-bash"><span class="token comment"># Clone the repo and export your OpenAI API key</span>
<span class="token function">git</span> clone https://github.com/agno-agi/dash <span class="token operator">&amp;&amp;</span> <span class="token builtin class-name">cd</span> dash
<span class="token function">cp</span> example.env .env  <span class="token comment"># Add OPENAI_API_KEY</span>

<span class="token comment"># Start dash</span>
<span class="token function">docker</span> compose up -d --build

<span class="token comment"># Load data and knowledge</span>
<span class="token function">docker</span> <span class="token builtin class-name">exec</span> -it dash-api python -m dash.scripts.load_data
<span class="token function">docker</span> <span class="token builtin class-name">exec</span> -it dash-api python -m dash.scripts.load_knowledge
</code></pre>
<p>This loads sample data (F1 race data from 1950-2020) and the knowledge base (table metadata, validated queries, business rules).</p>
<h2>Connect to the UI</h2>
<p>Dash comes with a UI out of the box (via Agno). Use it to interact with Dash, view sessions and traces:</p>
<ol>
<li>Open <a target="_blank" rel="noopener noreferrer" class="" href="https://os.agno.com">os.agno.com</a></li>
<li>Add OS → Local → <code>http://localhost:8000</code></li>
<li>Connect</li>
</ol>
<video width="700" height="700" class="rounded-2xl" loop="" autoplay="" muted="" playsinline="" controls=""><source src="/videos/dash-ui-demo.mp4">Your browser does not support the video tag.</video>
<p>Try these on the F1 dataset:</p>
<pre><code>Who won the most F1 World Championships?
How many races has Lewis Hamilton won?
Compare Ferrari vs Mercedes points 2015-2020
</code></pre>
<h2>Run evals</h2>
<p>Dash ships with an extensive evaluation suite. String matching, LLM grading, and golden SQL comparison. Extend and add your own, this is one of those projects where evals work surprisingly well.</p>
<pre class="language-bash"><code class="language-bash"><span class="token function">docker</span> <span class="token builtin class-name">exec</span> -it dash-api python -m dash.evals.run_evals         <span class="token comment"># string matching</span>
<span class="token function">docker</span> <span class="token builtin class-name">exec</span> -it dash-api python -m dash.evals.run_evals -g      <span class="token comment"># LLM grader</span>
<span class="token function">docker</span> <span class="token builtin class-name">exec</span> -it dash-api python -m dash.evals.run_evals -g -r   <span class="token comment"># both + golden SQL</span>
</code></pre>
<h2>Closing thoughts</h2>
<p>Data agents are one of the best enterprise use cases for AI right now. Every company (over a certain size) should have one. Vercel has D0, OpenAI built one.</p>
<h3>Dash is my attempt to make that accessible to everyone.</h3>
<hr>
<h2>Learn More</h2>
<ul>
<li>
<a target="_blank" rel="noopener noreferrer" class="" href="https://github.com/agno-agi/dash">GitHub: agno-agi/dash</a>
</li>
<li>
<a target="_blank" rel="noopener noreferrer" class="" href="https://openai.com/index/inside-our-in-house-data-agent/">OpenAI's post: Inside OpenAI's In-House Data Agent</a>
</li>
<li>
<a target="_blank" rel="noopener noreferrer" class="" href="https://www.ashpreetbedi.com/articles/sql-agent">Previous work: Self-Improving SQL Agent</a>
</li>
</ul>
<p>Built with <a target="_blank" rel="noopener noreferrer" class="" href="https://github.com/agno-agi/agno">Agno</a>. Give it a ⭐️</p>]]></content:encoded>
            <author>hi@ashpreetbedi.com (Ashpreet Bedi)</author>
        </item>
        <item>
            <title><![CDATA[Dynamic Software]]></title>
            <link>https://ashpreetbedi.com/articles/dynamic-software</link>
            <guid isPermaLink="false">https://ashpreetbedi.com/articles/dynamic-software</guid>
            <pubDate>Thu, 30 Apr 2026 00:00:00 GMT</pubDate>
            <content:encoded><![CDATA[<p>For fifty years, software has been static.</p>
<p>Every program you've ever used is a collection of functions run through a hard-coded control flow: If, else, while, for. The functions do the work. Reading from databases. Calling APIs. Transforming data.</p>
<p>Same Input = Same Output. This was the contract for fifty years.</p>
<p>Then 2024 happened. The control flow came alive and created a new category of software. Software that is alive, dynamic, on-demand.</p>
<h2>Software is dead, long live Software</h2>
<p>Static software is a recording. You press play and you get back exactly what was captured. Same notes, same order, every time. The performance happened once, in a devbox, and now it plays the same tune every time.</p>
<p>Dynamic Software is a live orchestra.</p>
<p>The score exists. The instruments exist. The musicians exist. But what happens in the room tonight depends on the maestro, the players, the moment. The model is the maestro. The tools are the instruments. The control flow is the performance, not the recording.</p>
<p>This is what people feel when they use a great agent and can't quite explain why it feels different. They've spent their whole lives interacting with buttons. Now they're in a room with a live performance for the first time. The software is responding to them, here, now, with judgment and presence. It's listening. It's adjusting. It's alive in a way software has never been alive before.</p>
<p>Recordings are perfect. Live performances aren't. A live orchestra makes choices. Sometimes it stumbles. Sometimes it surprises you. The reason we still pay to hear live music is that something different happens in the room.</p>
<p>The performance is the point.</p>
<p>Dynamic Software is alive. It's not deterministic. It's not perfect. And once you've felt the difference, recordings feel like what they always were. Frozen.</p>
<p>We're not building better recordings. We're building the first generation of software that performs.</p>
<h2>Assumptions Dynamic Software breaks</h2>
<p>When software comes alive, every assumption built on static software breaks.</p>
<p><strong>Determinism breaks.</strong> Same input no longer means same output. The model considers context, memory, learnings. The software does something different on Tuesday afternoon than it did on Monday morning. While this can be (somewhat) controlled in text, we should note that the visual era is next. Charts, dashboards, entire screens generated on-demand. Instead of forcing determinism on non-deterministic software, give in, enjoy the ride.</p>
<p><strong>State and time work differently.</strong> Static programs don't need to remember much. The control flow is the same every time, so state lives in a database and is CRUD only. In Dynamic Software, state is context. Memory of past sessions. History of what worked. Knowledge of the domain. The database stops being storage and becomes the context the software runs on.</p>
<p>Sessions follow from this. A static API endpoint is stateless by design. Each request is independent. Dynamic Software is the opposite. A session is a continuous context that spans minutes, days, sometimes weeks. The user comes back, the agent picks up where it left off. Sessions become first-class.</p>
<p>Time changes too. Static software returns in milliseconds, seconds if you don't believe in data co-location. Dynamic Software reasons. It calls tools. It waits for tools to return. It reasons again. A single request takes minutes sometimes. Streaming is the default. Background execution is a core primitive. The HTTP request/response model strains and breaks and so does the default 29s loadbalancer timeout.</p>
<p><strong>The software needs to watch itself.</strong> With static software, you can read the code and know what it does. With Dynamic Software, you can't. The control flow is a model and the model is opaque. The only way to know what your software did is to record everything it did. Every reasoning step. Every tool call. Every retrieval. Tracing goes from a debugging tool to the only way to understand your software.</p>
<p>Watching isn't enough. Static programs don't make decisions, so there's nothing to approve. Dynamic Software makes decisions, and decisions have consequences. Some can be made freely. Some need the user. Some need an admin. Your software has to express which is which, and your runtime has to enforce it.</p>
<p>Every one of these is a real engineering problem. Every team building Dynamic Software hits them all. Most spend months solving these from scratch.</p>
<h2>A new category needs a new runtime</h2>
<p>Static software has a mature runtime. You write Django or Express, deploy to a managed platform, and don't think about HTTP, sessions, scaling, or recovery. The infrastructure is solved. The platform handles it.</p>
<p>Dynamic Software has no equivalent. You write an agent. Then you build six months of infrastructure around it, fixing every edge case manually. Edge cases you only learn after running agents at scale. SSE + websockets. Streaming + background execution. Sessions that survive restarts. Storage you can actually query, not five vendors stitched together. Approval gates that wait for admin sign-off, not just user confirmation. Per-resource, per-tool RBAC. Agents available on Slack, Telegram, WhatsApp, because no one wants to use a custom UI.</p>
<p>This is why 80% of agents don't work, there's a painful amount of grind in the last mile.</p>
<p>The last shift this big was going from desktop apps to web apps. Web software needed its own runtime, its own protocols, its own infrastructure, its own developer tools. We spent two decades building all of it.</p>
<p>Dynamic Software is here. Starting from scratch. Its own runtime. Its own protocols. Its own infrastructure. Its own developer tools.</p>
<h2>The next decade</h2>
<p>Static software took fifty years to mature. Operating systems, databases, web servers, deploy platforms, observability stacks, identity providers. We forget how recent most of it is. Heroku was 2007. Kubernetes was 2014. Vercel was 2015. The infrastructure we now take for granted is younger than most of the people building on it.</p>
<p>Dynamic Software is at year one.</p>
<p>Whoever builds the runtime, the protocols, the developer tools, the platforms, defines the next era of software. The work ahead is enormous. It is also the most interesting work I've done in the past fifteen years.</p>
<p>Come build with us at <a target="_blank" rel="noopener noreferrer" class="" href="https://agno.link/gh">Agno</a>.</p>]]></content:encoded>
            <author>hi@ashpreetbedi.com (Ashpreet Bedi)</author>
        </item>
        <item>
            <title><![CDATA[Evals Don't Give You a Working Product]]></title>
            <link>https://ashpreetbedi.com/articles/evals-not-enough</link>
            <guid isPermaLink="false">https://ashpreetbedi.com/articles/evals-not-enough</guid>
            <pubDate>Sat, 10 Jan 2026 00:00:00 GMT</pubDate>
            <content:encoded><![CDATA[<p>Evals are the holy grail of AI engineering. Or so we've been told.</p>
<p>Two years. Billions in VC funding. Thousands of blog posts about "production-ready agents." An entire industry built around evaluation frameworks, observability platforms, and benchmarks.</p>
<p>The result?</p>
<ul>
<li>
<p><strong>11% of organizations have agents in production.</strong> <a target="_blank" rel="noopener noreferrer" class="" href="https://www.deloitte.com/us/en/insights/topics/technology-management/tech-trends/2026/agentic-ai-strategy.html">[Deloitte]</a></p>
</li>
<li>
<p><strong>40%+ of agentic AI projects will be cancelled by 2027.</strong> <a target="_blank" rel="noopener noreferrer" class="" href="https://www.gartner.com/en/newsroom/press-releases/2024-10-22-gartner-says-over-40-percent-of-agentic-ai-projects-will-be-abandoned-by-2027">[Gartner]</a></p>
</li>
<li>
<p><strong>80%+ never reach meaningful production.</strong> <a target="_blank" rel="noopener noreferrer" class="" href="https://www.rand.org/pubs/research_reports/RRA2680-1.html">[RAND]</a></p>
</li>
</ul>
<p>If evals were the answer, these numbers would be different.</p>
<p>Here's what I've learned after two years of shipping agents: <strong>passing evals ≠ working product.</strong> You can have a green test suite and a broken product. You can hit 95% on your benchmark and watch your agent choke the moment a real user touches it.</p>
<p>Evals don't get you to production. A working product does.</p>
<h2>The Pitch vs. The Reality</h2>
<p>Here's what the eval-industrial complex told us:</p>
<blockquote>
<p>"Evals are the key to production-ready agents" — <a target="_blank" rel="noopener noreferrer" class="" href="https://www.databricks.com/blog/key-production-ai-agents-evaluations">Databricks</a></p>
</blockquote>
<p>Here's what actually happens:</p>
<p>You build an agent in a Python script. It works. You run your eval suite. Green lights everywhere. You demo it to stakeholders. They love it. Then you try to ship it.</p>
<p>Everything falls apart.</p>
<h2>What Evals Don't Test</h2>
<p>Your eval suite said the agent was ready. Here's what it missed:</p>
<p><strong>Your agent isn't a function — it's a process.</strong> A single response might take 30 seconds. Or 3 minutes. Or 10 minutes if it's doing research. Traditional servers handle stateless request-response cycles in milliseconds. Your agent thinks, waits, calls tools, thinks again. Try fitting that into a Lambda with a 15-second timeout.</p>
<p><strong>State breaks at scale.</strong> Works great with 1 user on 1 container. Add more users? State bleeds across sessions. Add more containers? State disappears entirely. Store it in memory? Gone when the process dies. Store it in a database? Now you're building infrastructure you didn't plan for.</p>
<p><strong>Streaming is harder than it looks.</strong> In your notebook, responses just appeared. In production, users stare at a blank screen for 8 seconds wondering if the app crashed. You try SSE. Then WebSockets. Then you realize you need durable streams that survive network hiccups, handle backpressure, and resume gracefully after disconnects.</p>
<p><strong>The real world doesn't mock.</strong> Your agent calls an external API. In testing, mocks returned clean data every time. In production, the API times out. Returns malformed JSON. Hits rate limits. Requires re-authentication mid-session. Your agent chokes. Your eval suite never saw it coming.</p>
<blockquote class="not-prose relative isolate pl-6 text-ink py-3 text-lg"><span aria-hidden="true" class="absolute inset-y-1 left-0 w-0.5 rounded-full bg-accent"></span><div class="flex gap-3"><div class="space-y-1"><div class="leading-relaxed"><p>Agents fail because of an inadequate runtime, not intelligence. Evals don't measure any of it.</p></div></div></div></blockquote>
<p>We've been obsessing over the brain while ignoring the nervous system.</p>
<h2>The Trap: Evals Too Early</h2>
<p>Here's the thing that really kills projects: writing evals before you have a working product.</p>
<p>Every hour spent writing evals is an hour not spent learning what your product actually needs. You're locking yourself into test cases for a system that doesn't exist yet.</p>
<p>The agent you're building now? It's not the one that's going to ship. It's going to be the second iteration. Or the fifth. The eval suite you wrote for version one is useless for version three. Worse than useless — it's weight you're dragging around.</p>
<p>The eval-industrial complex sold you on this idea that evals-first is disciplined. It's not.</p>
<p>The right sequence:</p>
<ol>
<li>Build something that runs</li>
<li>Get it in front of real users (internal users are fine)</li>
<li>Learn what breaks, what matters, what "good" actually looks like</li>
<li><em>Then</em> write evals to lock in that understanding</li>
</ol>
<p>You can't evaluate what you can't run.</p>
<h2>What Evals Are Actually Good For</h2>
<p>I'm not saying evals are useless. They're critical — for model providers shipping foundation models. If you're training GPT-5, you need benchmarks. Even for AI engineers building products on top of those models, evals help with:</p>
<ul>
<li>Catching regressions after you change something</li>
<li>Comparing model versions</li>
<li>Compliance checkboxes</li>
</ul>
<p>That's it. They won't help you ship. They won't help you scale. They won't help you handle the thousand edge cases that only appear in production.</p>
<h2>What Actually Gets You to Production</h2>
<p>The market says: <strong>Evals → Observability → Production.</strong></p>
<p>This is backwards. Here's what actually works:</p>
<p><strong>Runtime → Production → (Evals + Observability)</strong></p>
<p>The foundation comes first. Everything else is a support layer.</p>
<p><strong>The foundation:</strong></p>
<ul>
<li>
<p><strong>A runtime that handles the weird stuff.</strong> Concurrent users. Failure recovery. Long-running stateful processes that survive container restarts. Your agent isn't a microservice — stop treating it like one.</p>
</li>
<li>
<p><strong>State management that doesn't disappear.</strong> Sessions that survive crashes. Context that carries across conversations. Memory that doesn't evaporate when Kubernetes decides to reschedule your pod.</p>
</li>
<li>
<p><strong>Storage that lives with the agent.</strong> The agent's data — sessions, memory, knowledge — stored where the agent runs. In your cloud. Under your control. Send it to a third-party service and you've lost control of your product's brain.</p>
</li>
<li>
<p><strong>Infrastructure you own.</strong> Your environment. Your data. Your competitive advantage.</p>
</li>
</ul>
<p><strong>The support layer (after you're running):</strong></p>
<ul>
<li><strong>Observability</strong> for real production behavior — not synthetic test traces.</li>
<li><strong>Evals</strong> to catch regressions — run them in CI, keep them lean.</li>
<li><strong>Tracing</strong> to debug when things go wrong.</li>
</ul>
<p>The support layer matters. But without the foundation, you're just testing in a notebook.</p>
<h2>The Questions That Actually Matter</h2>
<p>You have a working agent in a Python script. Great. Now answer these:</p>
<ul>
<li>Where will it run?</li>
<li>Can it handle 100 concurrent users? 1,000?</li>
<li>What happens when a container crashes mid-conversation?</li>
<li>Is streaming smooth or do users watch a loading spinner for 10 seconds?</li>
<li>Where does the agent's memory live? Who owns it?</li>
<li>How do you deploy updates without breaking active sessions?</li>
</ul>
<p>Evals don't answer any of these questions. The runtime does.</p>
<h2>The Path Forward</h2>
<p>I built <a target="_blank" rel="noopener noreferrer" class="" href="https://github.com/agno-agi/agno">Agno</a> because I got tired of watching good agents die in the gap between "works in a notebook" and "runs in production."</p>
<p>Agno is a runtime for agents. It handles the stuff evals can't test:</p>
<ul>
<li><strong>Concurrent execution</strong> — thousands of users, isolated state</li>
<li><strong>Persistent storage</strong> — sessions survive crashes, memory persists across conversations</li>
<li><strong>Streaming that works</strong> — SSE out of the box, handles disconnects gracefully</li>
<li><strong>Your infrastructure</strong> — runs in your cloud, data never leaves your environment</li>
</ul>
<p>The eval-industrial complex had their shot. Two years. Billions in funding. The production numbers haven't moved.</p>
<p>Maybe it's time to focus on actually shipping.</p>
<h2>Want to build with Agno?</h2>
<ul>
<li><strong>GitHub:</strong> <a target="_blank" rel="noopener noreferrer" class="" href="https://agno.link/gh">agno.link/gh</a></li>
<li><strong>Documentation:</strong> <a target="_blank" rel="noopener noreferrer" class="" href="https://agno.link/docs">agno.link/docs</a></li>
<li><strong>AgentOS:</strong> <a target="_blank" rel="noopener noreferrer" class="" href="https://os.agno.com">os.agno.com</a></li>
</ul>
<span class="text-teal-400">Production means a working product deployed to your cloud — not a green eval suite running on your laptop.</span>]]></content:encoded>
            <author>hi@ashpreetbedi.com (Ashpreet Bedi)</author>
        </item>
        <item>
            <title><![CDATA[GPU Poor Continuous Learning with Gemini 3]]></title>
            <link>https://ashpreetbedi.com/articles/gpu-poor-continuous-learning</link>
            <guid isPermaLink="false">https://ashpreetbedi.com/articles/gpu-poor-continuous-learning</guid>
            <pubDate>Thu, 18 Dec 2025 00:00:00 GMT</pubDate>
            <content:encoded><![CDATA[<p>Here's a pattern I've been using to make my agents better without fine-tuning or retraining. We'll use a simple system-level learning loop that's surprisingly effective.</p>
<h2>Table of Contents</h2>
<ol>
<li>The problem with disconnected sessions</li>
<li>What is "gpu-poor continuous learning"</li>
<li>Why Gemini 3 Flash</li>
<li>The learning loop</li>
<li>Demo</li>
<li>What we store (and what we don't)</li>
<li>How to run your own Self-Learning Agent</li>
<li>Why this pattern works</li>
</ol>
<h2>1. The problem with disconnected sessions</h2>
<p>Most agents run in independent sessions, disconnected from each other.</p>
<p>You ask a question. You get an answer. Tomorrow you ask a similar question and the agent starts from scratch. It doesn't remember what worked, what failed, or what it figured out along the way.</p>
<p>This is fine for simple tasks. But for anything complex—research, analysis, decision support—it means:</p>
<ul>
<li>Repeating the same reasoning patterns</li>
<li>Re-discovering the same gotchas</li>
<li>Never building on past success</li>
</ul>
<blockquote class="not-prose relative isolate pl-6 text-ink py-3 text-lg"><span aria-hidden="true" class="absolute inset-y-1 left-0 w-0.5 rounded-full bg-accent"></span><div class="flex gap-3"><div class="space-y-1"><div class="leading-relaxed"><p>If your agent can't learn from its own experience, you're leaving performance on the table.</p></div></div></div></blockquote>
<h2>2. What is GPU Poor Continuous Learning</h2>
<p>Let me be precise about terminology, because "continuous learning" has a specific meaning in ML.</p>
<p><strong>Traditional continuous learning:</strong></p>
<ul>
<li>Model weights update over time</li>
<li>Requires compute (GPUs, TPUs)</li>
<li>Risk of catastrophic forgetting</li>
<li>Learning happens in parameters</li>
</ul>
<p><strong>What I'm doing (GPU Poor Continuous Learning):</strong></p>
<ul>
<li>Model stays completely frozen</li>
<li>Zero training compute</li>
<li>Learning happens in retrieval</li>
<li>Knowledge is auditable and reversible</li>
</ul>
<p>The model doesn't get smarter. The <strong>system</strong> gets smarter.</p>
<p>I call it "GPU Poor" because you get continuous improvement without any of the infrastructure traditionally required for model updates. It's poor man's continuous learning—and it works surprisingly well.</p>
<h2>3. Why Gemini 3 Flash</h2>
<p>I built this with <a target="_blank" rel="noopener noreferrer" class="" href="https://blog.google/technology/developers/build-with-gemini-3-flash/">Gemini 3 Flash</a>, which launched today. Here's why:</p>
<table><thead><tr><th>Factor</th><th>Gemini 3 Flash</th></tr></thead><tbody><tr><td><strong>Cost</strong></td><td>$0.50/1M input, $3/1M output</td></tr><tr><td><strong>Speed</strong></td><td>3x faster than 2.5 Pro</td></tr><tr><td><strong>Context</strong></td><td>1M tokens input</td></tr><tr><td><strong>Agentic coding</strong></td><td>78% SWE-bench (beats Gemini 3 Pro)</td></tr><tr><td><strong>Context caching</strong></td><td>90% cost reduction for repeated tokens</td></tr></tbody></table>
<p>For a self-learning agent, you want:</p>
<ol>
<li><strong>Low cost</strong> — You're making many calls per session</li>
<li><strong>Fast inference</strong> — Tight feedback loops matter</li>
<li><strong>Large context</strong> — Prior learnings need room alongside new data</li>
<li><strong>Strong tool use</strong> — The agent needs to reliably call save/retrieve functions</li>
</ol>
<p>Gemini 3 Flash hits all four. The 1M context window is especially useful—you can include substantial prior learnings without truncating.</p>
<h2>4. The learning loop</h2>
<p>Here's the core pattern:</p>
<pre class="language-text"><code class="language-text">                         Query
                           │
                           ▼
                   Search learnings
                           │
                           ▼
                       Research
                           │
                           ▼
                      Synthesize
                           │
                           ▼
                        Reflect
                           │
              ┌────── reusable? ──────┐
              │                       │
             Yes                      No
              │                       │
              ▼                       │
        Propose to user               │
              │                       │
       ┌── approved? ──┐              │
       │               │              │
      Yes              No             │
       │               │              │
       ▼               │              │
     Save              │              │
       │               │              │
       └───────────────┴──────────────┘
                       │
                       ▼
                    Answer
</code></pre>
<p>Key details:</p>
<ol>
<li>
<p><strong>Search first</strong> — The agent must explicitly search the knowledge base before doing anything else. This isn't automatic; it's enforced through instructions.</p>
</li>
<li>
<p><strong>Most queries won't produce a learning</strong> — This is expected. Learnings should be rare and high-signal, not routine.</p>
</li>
<li>
<p><strong>Human-in-the-loop gating</strong> — The agent proposes learnings, but only saves them with explicit approval. If the user declines, the agent moves on without re-proposing.</p>
</li>
</ol>
<h2>5. Demo</h2>
<p>Here's a demo of the agent in action.</p>
<video width="700" height="700" class="rounded-2xl" loop="" autoplay="" muted="" playsinline="" controls=""><source src="/videos/gpu-poor-learning-agent.mp4">Your browser does not support the video tag.</video>
<h2>6. What we store (and what we don't)</h2>
<p>The biggest mistake is storing too much.</p>
<p>A learning is worth saving if it is:</p>
<ul>
<li><strong>Specific</strong>: "When comparing ETFs, check expense ratio AND tracking error" not "Look at ETF metrics"</li>
<li><strong>Actionable</strong>: Can be directly applied in future similar queries</li>
<li><strong>Generalizable</strong>: Useful beyond this specific question</li>
</ul>
<p>Do not save: raw facts, one-off answers, summaries, speculation, or anything unlikely to recur.</p>
<p>Each learning is structured:</p>
<pre class="language-python"><code class="language-python"><span class="token punctuation">{</span>
    <span class="token string">"title"</span><span class="token punctuation">:</span> <span class="token string">"ETF comparison checklist"</span><span class="token punctuation">,</span>
    <span class="token string">"context"</span><span class="token punctuation">:</span> <span class="token string">"When comparing similar ETFs for investment decisions"</span><span class="token punctuation">,</span>
    <span class="token string">"learning"</span><span class="token punctuation">:</span> <span class="token string">"Always check both expense ratio AND tracking error. Low expense ratio with high tracking error can cost more than a slightly more expensive fund with tight tracking."</span><span class="token punctuation">,</span>
    <span class="token string">"confidence"</span><span class="token punctuation">:</span> <span class="token string">"high"</span><span class="token punctuation">,</span>
    <span class="token string">"type"</span><span class="token punctuation">:</span> <span class="token string">"heuristic"</span><span class="token punctuation">,</span>
    <span class="token string">"created_at"</span><span class="token punctuation">:</span> <span class="token string">"2025-12-17T10:30:00Z"</span>
<span class="token punctuation">}</span>
</code></pre>
<blockquote class="not-prose relative isolate pl-6 text-ink py-3 text-lg"><span aria-hidden="true" class="absolute inset-y-1 left-0 w-0.5 rounded-full bg-accent"></span><div class="flex gap-3"><div class="space-y-1"><div class="leading-relaxed"><p>Most tasks will not produce a learning. That's expected.</p></div></div></div></blockquote>
<h2>7. How to run your own Self-Learning Agent</h2>
<p>I'm providing cookbooks for running your own self-learning agent, built using:</p>
<ul>
<li>FastAPI application for running the agent</li>
<li>Postgres database for storing sessions, memory, and knowledge</li>
</ul>
<p>Here's the <a target="_blank" rel="noopener noreferrer" class="" href="https://agno.link/gemini-agents">link to the code</a>.</p>
<blockquote class="not-prose relative isolate pl-6 text-ink py-3 text-lg"><span aria-hidden="true" class="absolute inset-y-1 left-0 w-0.5 rounded-full"></span><div class="flex gap-3"><div class="space-y-1"><div class="leading-relaxed"><p>You can wrap this up in a container and deploy it to Railway. Here's a sample <a target="_blank" rel="noopener noreferrer" class="text-teal-400 underline" href="https://github.com/agno-agi/agentos-railway">repository</a> you can use.</p></div></div></div></blockquote>
<span class="text-2xl font-semibold flex justify-center"><p><strong>Steps to run your own Self-Learning Agent</strong></p></span>
<h3>1. Clone the repo</h3>
<pre class="language-bash"><code class="language-bash"><span class="token function">git</span> clone https://github.com/agno-agi/agno.git
<span class="token builtin class-name">cd</span> agno
</code></pre>
<h3>2. Create and activate a virtual environment</h3>
<pre class="language-bash"><code class="language-bash">uv venv .gemini-agents --python <span class="token number">3.12</span>
<span class="token builtin class-name">source</span> .gemini-agents/bin/activate
</code></pre>
<h3>3. Install dependencies</h3>
<pre class="language-bash"><code class="language-bash">uv pip <span class="token function">install</span> -r cookbook/02_examples/04_gemini/requirements.txt
</code></pre>
<h3>4. Set environment variables</h3>
<pre class="language-bash"><code class="language-bash"><span class="token comment"># Required for Gemini models</span>
<span class="token builtin class-name">export</span> <span class="token assign-left variable">GOOGLE_API_KEY</span><span class="token operator">=</span>your-google-api-key

<span class="token comment"># Required for agents using parallel search</span>
<span class="token builtin class-name">export</span> <span class="token assign-left variable">PARALLEL_API_KEY</span><span class="token operator">=</span>your-parallel-api-key
</code></pre>
<h3>5. Run Postgres with PgVector</h3>
<p>Postgres stores agent sessions, memory, knowledge, and state. Install <a target="_blank" rel="noopener noreferrer" class="" href="https://www.docker.com/products/docker-desktop">Docker Desktop</a> and run:</p>
<pre class="language-bash"><code class="language-bash">./cookbook/scripts/run_pgvector.sh
</code></pre>
<p>Or run directly:</p>
<pre class="language-bash"><code class="language-bash"><span class="token function">docker</span> run -d <span class="token punctuation">\</span>
  -e <span class="token assign-left variable">POSTGRES_DB</span><span class="token operator">=</span>ai <span class="token punctuation">\</span>
  -e <span class="token assign-left variable">POSTGRES_USER</span><span class="token operator">=</span>ai <span class="token punctuation">\</span>
  -e <span class="token assign-left variable">POSTGRES_PASSWORD</span><span class="token operator">=</span>ai <span class="token punctuation">\</span>
  -e <span class="token assign-left variable">PGDATA</span><span class="token operator">=</span>/var/lib/postgresql <span class="token punctuation">\</span>
  -v pgvolume:/var/lib/postgresql <span class="token punctuation">\</span>
  -p <span class="token number">5532</span>:5432 <span class="token punctuation">\</span>
  --name pgvector <span class="token punctuation">\</span>
  agnohq/pgvector:18
</code></pre>
<h3>6. Run the Agent OS</h3>
<p>Agno provides a web interface for interacting with agents. Start the server:</p>
<pre class="language-bash"><code class="language-bash">python cookbook/02_examples/04_gemini/run.py
</code></pre>
<p>Then visit <a href="https://os.agno.com/?utm_source=github&amp;utm_medium=cookbook&amp;utm_campaign=gemini&amp;utm_content=cookbook-gemini-flash&amp;utm_term=gemini-flash">os.agno.com</a> and add <code>http://localhost:7777</code> as an endpoint.</p>
<h2>8. Why this pattern works</h2>
<p>This approach works because it separates concerns that are usually conflated:</p>
<table><thead><tr><th>Concern</th><th>Traditional</th><th>GPU Poor</th></tr></thead><tbody><tr><td><strong>Reasoning</strong></td><td>Model</td><td>Model (unchanged)</td></tr><tr><td><strong>Learning</strong></td><td>Model weights</td><td>Knowledge base</td></tr><tr><td><strong>Memory</strong></td><td>Context window</td><td>Persistent storage</td></tr></tbody></table>
<p>Benefits:</p>
<ul>
<li><strong>Auditable</strong> — You can see exactly what the agent "learned"</li>
<li><strong>Reversible</strong> — Delete a bad learning, system improves</li>
<li><strong>Fast feedback</strong> — No training cycles, immediate improvement</li>
<li><strong>No forgetting</strong> — New learnings don't overwrite capabilities</li>
</ul>
<p>The pattern generalizes beyond research. Use it for:</p>
<ul>
<li>Market analysis</li>
<li>Competitive intelligence</li>
<li>Technical support</li>
<li>Decision logging</li>
<li>Policy tracking</li>
</ul>
<p>Anywhere beliefs evolve, <strong>learnings beat stateless answers</strong>.</p>
<hr>
<p>Thank you for reading! Feel free to reach out on <a target="_blank" rel="noopener noreferrer" class="" href="https://x.com/ashpreetbedi">X</a> if you have questions or feedback.</p>]]></content:encoded>
            <author>hi@ashpreetbedi.com (Ashpreet Bedi)</author>
        </item>
        <item>
            <title><![CDATA[Introducing Agno]]></title>
            <link>https://ashpreetbedi.com/articles/introducing-agno</link>
            <guid isPermaLink="false">https://ashpreetbedi.com/articles/introducing-agno</guid>
            <pubDate>Wed, 15 Oct 2025 00:00:00 GMT</pubDate>
            <content:encoded><![CDATA[<span class="text-2xl font-semibold"><p>✨ The Multi-Agent Framework, Runtime, and UI.</p></span>
<img alt="Agno AgentOS" loading="lazy" width="700" height="700" decoding="async" data-nimg="1" class="rounded-2xl" style="color:transparent" srcset="/_next/image?url=%2Fimages%2Fintro_agno_agentos.png&amp;w=750&amp;q=75 1x, /_next/image?url=%2Fimages%2Fintro_agno_agentos.png&amp;w=1920&amp;q=75 2x" src="/_next/image?url=%2Fimages%2Fintro_agno_agentos.png&amp;w=1920&amp;q=75">
<p>Over the past 3 years, I've been obsessed with building the perfect harness for multi-agent systems. A mission to deliver the best system for building, deploying and scaling agentic software.</p>
<p>Today, Agno is used by thousands of builders at the largest companies in the world, including 3 of the fortune 5. Let's dive in.</p>
<h2>What is Agno?</h2>
<p><strong>Agno is a multi-agent framework, runtime, and UI.</strong> It takes a systems engineering approach to agent development by delivering 3 tightly coupled components:</p>
<ol>
<li><strong>Framework</strong>: for building multi-agent systems.</li>
<li><strong>Runtime</strong>: for deploying multi-agent systems.</li>
<li><strong>UI</strong>: for managing multi-agent systems.</li>
</ol>
<p>These 3 components form the harness for the perfect agentic system.</p>
<p>Can you build these yourself? Absolutely. But <strong>Agno gives you speed, speed gives you momentum, and momentum is everything.</strong></p>
<blockquote>
<p>Enough talk, let's see some code.</p>
</blockquote>
<p>Here's a fully working Agent, with conversation history, access to tools via MCP, deployed as a FastAPI app - in 20 lines of code.</p>
<pre class="language-javascript"><code class="language-javascript"><span class="token keyword module">from</span> agno<span class="token punctuation">.</span><span class="token property-access">agent</span> <span class="token keyword module">import</span> <span class="token imports"><span class="token maybe-class-name">Agent</span></span>
<span class="token keyword module">from</span> agno<span class="token punctuation">.</span><span class="token property-access">db</span><span class="token punctuation">.</span><span class="token property-access">sqlite</span> <span class="token keyword module">import</span> <span class="token imports"><span class="token maybe-class-name">SqliteDb</span></span>
<span class="token keyword module">from</span> agno<span class="token punctuation">.</span><span class="token property-access">models</span><span class="token punctuation">.</span><span class="token property-access">anthropic</span> <span class="token keyword module">import</span> <span class="token imports"><span class="token maybe-class-name">Claude</span></span>
<span class="token keyword module">from</span> agno<span class="token punctuation">.</span><span class="token property-access">os</span> <span class="token keyword module">import</span> <span class="token imports"><span class="token maybe-class-name">AgentOS</span></span>
<span class="token keyword module">from</span> agno<span class="token punctuation">.</span><span class="token property-access">tools</span><span class="token punctuation">.</span><span class="token property-access">mcp</span> <span class="token keyword module">import</span> <span class="token maybe-class-name">MCPTools</span>

# <span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">*</span> <span class="token maybe-class-name">Create</span> <span class="token maybe-class-name">Agent</span> <span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">*</span>
agno_agent <span class="token operator">=</span> <span class="token function"><span class="token maybe-class-name">Agent</span></span><span class="token punctuation">(</span>
    name<span class="token operator">=</span><span class="token string">"Agno Agent"</span><span class="token punctuation">,</span>
    model<span class="token operator">=</span><span class="token function"><span class="token maybe-class-name">Claude</span></span><span class="token punctuation">(</span>id<span class="token operator">=</span><span class="token string">"claude-sonnet-4-5"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    db<span class="token operator">=</span><span class="token function"><span class="token maybe-class-name">SqliteDb</span></span><span class="token punctuation">(</span>db_file<span class="token operator">=</span><span class="token string">"agno.db"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    tools<span class="token operator">=</span><span class="token punctuation">[</span><span class="token function"><span class="token maybe-class-name">MCPTools</span></span><span class="token punctuation">(</span>url<span class="token operator">=</span><span class="token string">"https://docs.agno.com/mcp"</span><span class="token punctuation">,</span> transport<span class="token operator">=</span><span class="token string">"streamable-http"</span><span class="token punctuation">)</span><span class="token punctuation">]</span><span class="token punctuation">,</span>
    add_history_to_context<span class="token operator">=</span><span class="token maybe-class-name">True</span><span class="token punctuation">,</span>
    markdown<span class="token operator">=</span><span class="token maybe-class-name">True</span><span class="token punctuation">,</span>
<span class="token punctuation">)</span>

# <span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">*</span> <span class="token maybe-class-name">Create</span> <span class="token maybe-class-name">AgentOS</span> <span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">*</span>
agent_os <span class="token operator">=</span> <span class="token function"><span class="token maybe-class-name">AgentOS</span></span><span class="token punctuation">(</span>agents<span class="token operator">=</span><span class="token punctuation">[</span>agno_agent<span class="token punctuation">]</span><span class="token punctuation">)</span>
app <span class="token operator">=</span> agent_os<span class="token punctuation">.</span><span class="token method function property-access">get_app</span><span class="token punctuation">(</span><span class="token punctuation">)</span>
</code></pre>
<p>Run your AgentOS using <code>fastapi dev agno_agent.py</code> and chat with it on the <a target="_blank" rel="noopener noreferrer" class="" href="https://os.agno.com">AgentOS UI</a>.</p>
<video width="700" height="700" class="rounded-2xl" loop="" autoplay="" muted="" playsinline="" controls=""><source src="/videos/agentos-chat.mp4">Your browser does not support the video tag.</video>
<p>Deploy your FastAPI app to your cloud of choice, and voilà, you're live in production. <strong>It's impossible to move this quickly without Agno.</strong></p>
<h2>✨ Part I: The Framework</h2>
<blockquote>
<p>Agent Engineering is an exercise in iteration. You can't iterate if you don't have a v0.1. A batteries included setup gets your agent in the hands of your internal team. Then you can edit in a loop.</p>
</blockquote>
<div class="flex w-full justify-end text-xs"><a target="_blank" rel="noopener noreferrer" class="" href="https://www.vtrivedy.com/posts/claude-code-sdk-haas-harness-as-a-service/"><p>[stolen from vtridvedy]</p></a></div>
<p>Agno delivers a full-featured, performance-optimized agent framework with every primitive you can think of. <strong>Session storage</strong>, <strong>memory</strong>, <strong>knowledge (RAG)</strong>, <strong>context management</strong>, <strong>tools</strong> (pre-built and MCP), <strong>guardrails</strong>, <strong>dependency injection</strong>, <strong>human in the loop</strong>, and more. Every part of agent execution is customizable via pre-hooks, post-hooks, and state management, so you're never boxed into default behavior.</p>
<p>Agents are completely type-safe, you can use them as chatbots (string input, string output) or with structured inputs and outputs. Not only that, Agents can use separate parser-models to generate structured outputs, so reasoning is not compromised (only available on Agno).</p>
<span class="text-xl font-semibold justify-center flex"><p>✨ The Multi-Agent Paradox</p></span>
<p>The big debate in multi-agent systems is whether agents should execute other sub-agents (handoff-approach), or the developer should programmatically define the flow of execution (workflow-approach).</p>
<blockquote>
<p>The answer: why not both?</p>
</blockquote>
<p>With Agno, <strong>Agents can be executed by themselves, as part of a multi-agent Team (autonomous execution) or a step-based Workflow (controlled execution)</strong>. Your use-case determines your approach.</p>
<p>Agent Teams have a shared state, agentic context management (i.e. the team leader manages the context for the team), shared memory and knowledge. Teams can also execute other teams, or workflows.</p>
<p>Workflows are deterministic, where each step can be an agent, team, workflow, or a plain old python function. Steps can be parallelized, branched, run via conditional logic or loops.</p>
<p>There's so much more I can cover here, but i'll save that for the <a target="_blank" rel="noopener noreferrer" class="" href="https://agno.link/docs">docs</a>. The gist is, when building Agents, my goal is to get to v0.1 within a few hours and iterate from there. Agno gives me that.</p>
<blockquote>
<p>New agent engineers think that building the solution is the hard part - NO. Finding the right use-case is the hard part. To do that, you need to tackle 3, 5, or 10 different problems. Agno gets you to use-case #10, which is where the magic happens.</p>
</blockquote>
<h2>✨ Part II: The Runtime</h2>
<p>Seasoned builders know that to build successful agentic products, you need to iterate on multiple variations before you hit gold. Also:</p>
<ol>
<li>You're not going to build by yourself, you need to get it in the hands of your team quickly (especially the non-technical folks).</li>
<li>You need some sort of system to test, serve and integrate with your product as quickly as possible (to get user feedback).</li>
</ol>
<p>This means you need to build an API to serve your agents, your product will integrate with this API via REST or WebSockets. You also need a UI to test, monitor, debug and manage your system.</p>
<span class="text-xl font-semibold justify-center flex"><p>✨ You need an AI backend.</p></span>
<p>This is where the AgentOS comes in. In the simplest terms, it's a FastAPI application with pre-built endpoints for serving your agents, teams and workflows. You can also manage knowledge bases, user memories, agent sessions, and evaluate your system in real-time.</p>
<p><strong>The AgentOS is a high-performance runtime for multi-agent systems. It gives you a ready-to-use FastAPI app for deploying your agents, and an integrated UI for testing, monitoring and managing them.</strong></p>
<p>Deploy your AgentOS to your cloud of choice. Session data, knowledge, memories, all live in your database. No data ever leaves your system.</p>
<blockquote>
<p>In my experience, once you have a semblence of an Agent you like, you need to get it in the hands of your team and early users quickly. The pre-built api endpoints give you such an incredible headstart that its almost a no-brainer to use.</p>
</blockquote>
<p>Here are the pre-built api endpoints, ready to use:</p>
<img alt="Agno AgentOS API" loading="lazy" width="700" height="700" decoding="async" data-nimg="1" class="rounded-2xl" style="color:transparent" srcset="/_next/image?url=%2Fimages%2Fintro_agno_agentos_api.png&amp;w=750&amp;q=75 1x, /_next/image?url=%2Fimages%2Fintro_agno_agentos_api.png&amp;w=1920&amp;q=75 2x" src="/_next/image?url=%2Fimages%2Fintro_agno_agentos_api.png&amp;w=1920&amp;q=75">
<h2>✨ Part III: The Control Plane</h2>
<p>Wait, there's more?</p>
<p>The AgentOS comes with a web interface that connects directly to the AgentOS runtime (using the pre-built api endpoints). It's an novel architecture, where the web app (running in your browser) connects directly to the AgentOS runtime. You can test (chat and run) your agents, teams and workflows, manage knowledge bases, user memories, and evaluate your system in real-time. Here's how it looks:</p>
<img alt="Agno AgentOS UI" loading="lazy" width="700" height="700" decoding="async" data-nimg="1" class="rounded-2xl" style="color:transparent" srcset="/_next/image?url=%2Fimages%2Fintro_agno_agentos_ui.png&amp;w=750&amp;q=75 1x, /_next/image?url=%2Fimages%2Fintro_agno_agentos_ui.png&amp;w=1920&amp;q=75 2x" src="/_next/image?url=%2Fimages%2Fintro_agno_agentos_ui.png&amp;w=1920&amp;q=75">
<p>If you're using a tracing service, this will change how you look at things. You're not sending any data out, you're not paying for retention costs, and
you're not worrying about data privacy. The app pulls in sessions directly from the Agent's database and show's them:</p>
<img alt="Agno AgentOS UI Sessions" loading="lazy" width="700" height="700" decoding="async" data-nimg="1" class="rounded-2xl" style="color:transparent" srcset="/_next/image?url=%2Fimages%2Fintro_agno_agentos_sessions.png&amp;w=750&amp;q=75 1x, /_next/image?url=%2Fimages%2Fintro_agno_agentos_sessions.png&amp;w=1920&amp;q=75 2x" src="/_next/image?url=%2Fimages%2Fintro_agno_agentos_sessions.png&amp;w=1920&amp;q=75">
<p>The traces and runtime data is stored in your database, and the AgentOS UI connects from your browser to the AgentOS runtime.</p>
<p>Its a novel architecture designed to give you complete data ownership:</p>
<ul>
<li><strong>Your Infrastructure, Your Data</strong>: Your AgentOS runs in your cloud.</li>
<li><strong>Zero Data Transmission</strong>: No conversations, logs, or metrics are sent to external services. They belong to you.</li>
<li><strong>Private by Default</strong>: All processing, storage, and analytics happen in your environment.</li>
</ul>
<p>Personally, I'm surprised we collectively agreed to hand over every user interaction to tracing companies. Just the retention issues are enough to make you think twice, let alone the data privacy concerns.</p>
<span class="text-xl font-semibold flex justify-center"><p>For companies building agents, Agno delivers the complete solution.</p></span>
<p>Unless you're an infra or devtools company, you're focused on solving user problems. Agno free's up your mental capacity so you can <span class="text-orange-500">a)</span> find the right problem to tackle, <span class="text-orange-500">b)</span> build your MVP quickly, and <span class="text-orange-500">c)</span> iterate and improve your product.</p>
<p>Thousands of builders choose Agno, thank you for letting us be a part of your journey ✨</p>
<hr>
<h2>Want to build with Agno?</h2>
<ul>
<li>
<p><strong>Agno documentation:</strong> <a target="_blank" rel="noopener noreferrer" class="" href="https://agno.link/docs">agno.link/docs</a></p>
</li>
<li>
<p><strong>Signup for the AgentOS:</strong> <a target="_blank" rel="noopener noreferrer" class="" href="https://os.agno.com">os.agno.com</a></p>
</li>
<li>
<p><strong>Star Agno on Github:</strong> <a target="_blank" rel="noopener noreferrer" class="" href="https://agno.link/gh">agno.link/gh</a></p>
</li>
</ul>
<hr>
<p>Read more on <a target="_blank" rel="noopener noreferrer" class="" href="https://www.agno.com">agno.com</a></p>]]></content:encoded>
            <author>hi@ashpreetbedi.com (Ashpreet Bedi)</author>
        </item>
        <item>
            <title><![CDATA[The Programming Language for Agentic Software]]></title>
            <link>https://ashpreetbedi.com/articles/language-for-agents</link>
            <guid isPermaLink="false">https://ashpreetbedi.com/articles/language-for-agents</guid>
            <pubDate>Wed, 18 Feb 2026 00:00:00 GMT</pubDate>
            <content:encoded><![CDATA[<p>Every era of computing develops its own programming language.</p>
<p>The mainframe era had COBOL and Fortran. The systems era had C. The web era had JavaScript and Python. Each emerged for the same reason, the previous generation could no longer express the new abstraction.</p>
<p>We are now in the agentic era.</p>
<p>Software is no longer just executing predefined instructions. It is reasoning over context, calling tools, retrieving knowledge, learning from past runs, and making decisions at runtime.</p>
<p>When the contract of software changes, the language must change too.</p>
<h2>What makes a programming language?</h2>
<p>A programming language is made of three things:</p>
<ol>
<li>Primitives to think and build with.</li>
<li>An engine to execute those primitives.</li>
<li>A runtime that governs memory, I/O, permissions, and interaction with the outside world.</li>
</ol>
<p>An SDK alone is not a programming language. A collection of utilities is not a programming language. Without an execution engine and a runtime that enforces behavior, you have a library, not a language.</p>
<p>Python gives you lists, functions, and classes. Its interpreter runs them. Its runtime manages memory, exceptions, and interfaces with the operating system.</p>
<p>React gives you components and state. Its reconciler computes updates. The browser handles rendering and events.</p>
<p>Applying this to agentic systems:</p>
<ul>
<li><a target="_blank" rel="noopener noreferrer" class="" href="https://github.com/agno-agi/agno">Agno</a> gives you agents, teams, workflows, memory, knowledge, tools, guardrails, and approval flows.</li>
<li>The Engine runs them: model calls, tool execution, context construction, and iteration.</li>
<li>AgentOS, the production runtime, governs execution and interfaces with the outside world via an API: streaming, request-level isolation, authentication, RBAC, monitoring, background execution.</li>
</ul>
<p>The runtime is stateless. Sessions, memory, state and traces persist in your database. Permissions are enforced at request boundaries.</p>
<p>Agno provides the SDK + Engine + Runtime for agentic software.</p>
<h2>Agents are the new programs</h2>
<p>Traditional applications are collections of deterministic programs. Every path is written in advance. The system does exactly what the developer specified.</p>
<p>Agents change that.</p>
<p>An agent reasons over context. It chooses tools dynamically. It retrieves knowledge. It remembers previous runs. It decides which path to take at runtime.</p>
<p>This is still software, but the path between input and output is no longer fixed.</p>
<p>This does not mean deterministic systems disappear. For many workloads, static pipelines are faster, cheaper, and more reliable.</p>
<p>But when the system must pause, reason, retrieve, and adapt dynamically, predefined control flow breaks down.</p>
<p>For decades, the contract was simple:</p>
<blockquote>
<p>Same input, same output.</p>
</blockquote>
<p>Agentic software breaks that contract.</p>
<p>The same input can produce different outputs depending on memory, context, retrieval, and prior state. If execution is dynamic, the language must express that natively.</p>
<h2>Agentic software needs a new contract</h2>
<p>Agentic software requires new capabilities built into its programming language:</p>
<h3>1. A new interaction model</h3>
<p>Static software receives a request and returns a response.</p>
<p>Agentic software streams reasoning, tool calls, intermediate results, and pivots in real time. The execution path can change mid run, or pause for days. The system may retrieve knowledge halfway through and completely redirect its reasoning.</p>
<p>Streaming and iteration are the default and the language for agentic software must treat them as first class behavior.</p>
<h3>2. A new governance model</h3>
<p>Traditional systems execute predefined decisions within rules written in advance. Code does not decide whether to send an email or issue a refund. It simply follows instructions.</p>
<p>Agents make decisions, and not all decisions are equal.</p>
<p>Some actions are low risk: summarizing text or searching documentation.
<strong>Some require user approval</strong>: sending emails or booking travel.
<strong>Some require admin approval</strong>: issuing refunds, deleting records, changing permissions.</p>
<p>Without runtime-enforced approval boundaries, an agent that can draft an email can also execute a payment. The difference must be enforced by the runtime, not prompt engineering.</p>
<p>Governance must be part of the agent definition itself and the runtime must enforce it.</p>
<h3>3. A new trust model</h3>
<p>Static systems are trusted because every path is written in advance.</p>
<p>Agents introduce probabilistic reasoning into the execution path.</p>
<p>If guardrails and evaluation run outside the runtime, they are advisory rather than enforceable. Unsafe output can be produced before policy checks intervene.</p>
<p>Trust must therefore be part of the runtime semantics: guardrails, evaluation, logging, pre and post-response checks integrated into execution.</p>
<p>Interaction. Governance. Trust.</p>
<p>These are language-level concerns in the agentic era.</p>
<h2>What this looks like in practice</h2>
<p>Here is a lightweight coding agent that writes, reviews, and iterates on code. It remembers project conventions, retrieves knowledge, learns from past runs, and operates within explicit governance boundaries.</p>
<p>This example is intentionally minimal but production-capable. It has persistence, memory, learning, and controlled tool execution.</p>
<pre class="language-python"><code class="language-python"><span class="token keyword">from</span> agno<span class="token punctuation">.</span>agent <span class="token keyword">import</span> Agent
<span class="token keyword">from</span> agno<span class="token punctuation">.</span>db<span class="token punctuation">.</span>sqlite <span class="token keyword">import</span> SqliteDb
<span class="token keyword">from</span> agno<span class="token punctuation">.</span>learn <span class="token keyword">import</span> LearnedKnowledgeConfig<span class="token punctuation">,</span> LearningMachine<span class="token punctuation">,</span> LearningMode
<span class="token keyword">from</span> agno<span class="token punctuation">.</span>models<span class="token punctuation">.</span>openai <span class="token keyword">import</span> OpenAIResponses
<span class="token keyword">from</span> agno<span class="token punctuation">.</span>tools<span class="token punctuation">.</span>coding <span class="token keyword">import</span> CodingTools
<span class="token keyword">from</span> agno<span class="token punctuation">.</span>tools<span class="token punctuation">.</span>reasoning <span class="token keyword">import</span> ReasoningTools

gcode <span class="token operator">=</span> Agent<span class="token punctuation">(</span>
    name<span class="token operator">=</span><span class="token string">"Gcode"</span><span class="token punctuation">,</span>
    model<span class="token operator">=</span>OpenAIResponses<span class="token punctuation">(</span><span class="token builtin">id</span><span class="token operator">=</span><span class="token string">"gpt-5.2"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    db<span class="token operator">=</span>SqliteDb<span class="token punctuation">(</span>db_file<span class="token operator">=</span><span class="token string">"agno.db"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    instructions<span class="token operator">=</span>instructions<span class="token punctuation">,</span>

    <span class="token comment"># Knowledge: searchable long-term memory</span>
    knowledge<span class="token operator">=</span>gcode_knowledge<span class="token punctuation">,</span>
    search_knowledge<span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">,</span>

    <span class="token comment"># Learning: extract and store learnings over time</span>
    learning<span class="token operator">=</span>LearningMachine<span class="token punctuation">(</span>
        knowledge<span class="token operator">=</span>gcode_learnings<span class="token punctuation">,</span>
        learned_knowledge<span class="token operator">=</span>LearnedKnowledgeConfig<span class="token punctuation">(</span>mode<span class="token operator">=</span>LearningMode<span class="token punctuation">.</span>AGENTIC<span class="token punctuation">)</span><span class="token punctuation">,</span>
    <span class="token punctuation">)</span><span class="token punctuation">,</span>

    <span class="token comment"># Tools: controlled extensions</span>
    tools<span class="token operator">=</span><span class="token punctuation">[</span>CodingTools<span class="token punctuation">(</span>base_dir<span class="token operator">=</span>workspace<span class="token punctuation">,</span> <span class="token builtin">all</span><span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">)</span><span class="token punctuation">,</span> ReasoningTools<span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">]</span><span class="token punctuation">,</span>

    <span class="token comment"># Memory: learn user preferences</span>
    enable_agentic_memory<span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">,</span>

    <span class="token comment"># Context: include prior runs</span>
    add_history_to_context<span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">,</span>
    num_history_runs<span class="token operator">=</span><span class="token number">10</span><span class="token punctuation">,</span>
    markdown<span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">,</span>
<span class="token punctuation">)</span>
</code></pre>
<p>Notice what is being defined:</p>
<ul>
<li>Knowledge as a first class primitive</li>
<li>Learning as a built in capability</li>
<li>Tools as controlled extensions</li>
<li>Memory and historical context as defaults</li>
<li>A runtime that governs how the system executes</li>
</ul>
<p>These are not utilities or third party integrations. They are the vocabulary of the agent and enforced by the runtime and execution layer.</p>
<p>That is what a programming language does. It gives you the right primitives for the era you are building in. You define the behavior. The language enforces it.</p>
<h2>Every era gets the language it needs</h2>
<p>COBOL abstracted business logic away from assembly. C abstracted system engineering without hiding it. Python abstracted memory management and low level primitives to accelerate iteration.</p>
<p>Each language captured the dominant abstraction of its era.</p>
<p>The agentic era introduces a new abstraction: systems that reason, remember, and decide at runtime.</p>
<p>The contract has changed.
The primitives have changed.
The execution model has changed.</p>
<p>The language must change too. That language is <a target="_blank" rel="noopener noreferrer" class="" href="https://github.com/agno-agi/agno">Agno</a>.</p>
<blockquote>
<p>There are many that argue that because Agno is written in Python, it cannot be a programming language.</p>
<p>If you wish to make an apple pie from scratch, you must first invent the universe.
— Carl Sagan</p>
</blockquote>]]></content:encoded>
            <author>hi@ashpreetbedi.com (Ashpreet Bedi)</author>
        </item>
        <item>
            <title><![CDATA[Learning Machines: Why AI Memory Hasn't Been Solved (Yet)]]></title>
            <link>https://ashpreetbedi.com/articles/learning-machines-v0</link>
            <guid isPermaLink="false">https://ashpreetbedi.com/articles/learning-machines-v0</guid>
            <pubDate>Wed, 07 Jan 2026 00:00:00 GMT</pubDate>
            <content:encoded><![CDATA[<p><strong>Every AI memory tool I've used is missing something.</strong></p>
<p>After reading hundreds (maybe thousands) of opinions, posts, and papers on agentic memory, I've come to three conclusions.</p>
<p><strong>1. No one has it figured out.</strong></p>
<p>Claude has the most impressive memory system I've seen. It feels natural. It never shouts. It knows what to reveal and when.</p>
<p>But we haven't figured out how to give developers the same capability for their own agents. The tools we have are... not there.</p>
<p><strong>2. Maybe we're looking at it wrong.</strong></p>
<p>Maybe memory is the wrong framing. <strong>What agents are really doing is learning.</strong> Learning about the user, the task at hand, learning insights and patterns, learning from decisions - good and bad, the feedback received. Learning from every interaction.</p>
<p>Everyone's rushing to build memory extraction systems — pull out facts, store them in a vector (or graph 🙄) database, retrieve them using complex mechanisms. But that's only half the problem.</p>
<p>But the hard part is integration: When does the learning happen? Before the response? After? In parallel? Is it automatic or does the agent control it? And critically — how do you teach the agent to use that information properly? Integration is what makes the system work.</p>
<p>You can't just tell an agent "you know XYZ about the user". You need to teach it how to use that knowledge. How to learn from it. How to prioritize it. How to act like a partner, a colleague, a companion who genuinely knows you — not a machine reciting facts from a database.</p>
<p><strong>3. User memory is only part of the story.</strong></p>
<p>User profiles and conversation summaries are just two types of learnings. But what about patterns and insights that worked? The entities involved - companies, people, projects? The decisions made and why? The feedback received? How should the agent use all these learnings to improve itself?</p>
<p>These aren't separate systems. They're all forms of learning.</p>
<hr>
<h2>Memory is Learning</h2>
<p>This realization led me to build something different: the Learning Machine, a unified learning system that helps agents continuously integrate information from their environment.</p>
<p>Here's the difference:</p>
<pre><code>Traditional "Memory":
Message → Extract → Store → Retrieve → Dump into Prompt → Repeat

Learning Machine:
User Message ──────► Recall from Stores ◄────────┐
                            │                    │
                            ▼                    │
                      Build Context              │
                            │                    │
                            ▼                    │ LearningMachine
                Agent Responds (with tools)      │
                            │                    │
                            ▼                    │
                   Extract &amp; Process             │
                            │                    │
                            ▼                    │
              Update Stores (agent learns) ──────┴──► Periodic Curation
</code></pre>
<p>The agent isn't just <strong>fed</strong> memories. It participates in learning, curating what it learns, and integrating that knowledge back into every response.</p>
<blockquote class="not-prose relative isolate pl-6 text-ink py-3 text-lg"><span aria-hidden="true" class="absolute inset-y-1 left-0 w-0.5 rounded-full bg-accent"></span><div class="flex gap-3"><div class="space-y-1"><div class="leading-relaxed"><p><strong>The goal: an agent on interaction 1000 is fundamentally better than it was on interaction 1 — across the board, not just with the same user.</strong></p></div></div></div></blockquote>
<hr>
<h2>What It Looks Like in Action</h2>
<p>A new employee on their first day asks: "I'm starting work on the cloud migration project. What should I know?"</p>
<p>The agent responds with full context, even though it's never talked to this person before. It knows Acme is migrating from AWS to GCP. It knows Alex (CTO) is leading it. It knows Phase 2 is the most compute-heavy. It shares migration patterns from similar past projects. It knows that the pricing is changing next quarter.</p>
<p><strong>How?</strong> Three types of learning from past interactions:</p>
<pre><code>Session 1 (Alex, CTO):
"I'm Alex, CTO at Acme. We're migrating from AWS to GCP and
I need help planning the timeline."

→ User Profile captures: Alex, CTO, involved in planning discussions
→ Entity Memory captures: Acme (company), AWS→GCP migration (project)
→ Session Context: Goal is migration timeline planning
</code></pre>
<pre><code>Session 2 (next day, same user, different session):
"Just heard GCP is changing their pricing next quarter.
How does that affect our migration?"

→ Agent recalls: Acme, AWS→GCP migration, Alex is CTO, 3-phase timeline
→ Agent responds: "That could impact your timeline. Last time we mapped
   out a 3-phase approach with Phase 2 being the most compute-heavy.
   Want me to model the cost implications for each phase?"
</code></pre>
<pre><code>Session 3 (different user, same org namespace):
"I just joined to help with the Acme cloud project. What should I know?"

→ Entity Memory: "Acme is migrating AWS to GCP. Alex (CTO) is leading it."
→ Learned Knowledge: Shares migration patterns from past projects
→ Agent responds with full context — even though it never talked to this user
</code></pre>
<p>Three sessions. Three types of learning. Cross-user knowledge sharing.</p>
<p>This is possible. Today.</p>
<hr>
<h2>The Architecture: Learning Stores</h2>
<p>The key innovation behind the Learning Machine is the <strong>learning protocol</strong> and <strong>learning stores</strong>. The protocol defines how stores capture, process, and integrate knowledge. Each store is configured independently. Mix and match as needed. The Learning Machine orchestrates it all.</p>
<p>These are the stores I'm working on:</p>
<table><thead><tr><th>Store</th><th>What It Captures</th><th>Scope</th></tr></thead><tbody><tr><td><strong>User Profile</strong></td><td>Preferences, memories, personal context</td><td>Per user</td></tr><tr><td><strong>Session Context</strong></td><td>Goal, plan, progress, summary</td><td>Per session</td></tr><tr><td><strong>Entity Memory</strong></td><td>Facts, events, relationships about external things</td><td>Configurable</td></tr><tr><td><strong>Learned Knowledge</strong></td><td>Insights, patterns, best practices</td><td>Configurable</td></tr><tr><td><strong>Decision Logs</strong></td><td>Why decisions were made</td><td>Configurable</td></tr><tr><td><strong>Behavioral Feedback</strong></td><td>What worked, what didn't</td><td>Per agent</td></tr><tr><td><strong>Self-Improvement</strong></td><td>Evolved instructions</td><td>Per agent</td></tr></tbody></table>
<h3>Show Me Some Code</h3>
<p>One agent. Four learning stores. Configured independently. Orchestrated by the Learning Machine.</p>
<pre class="language-python"><code class="language-python"><span class="token keyword">from</span> agno<span class="token punctuation">.</span>agent <span class="token keyword">import</span> Agent
<span class="token keyword">from</span> agno<span class="token punctuation">.</span>db<span class="token punctuation">.</span>postgres <span class="token keyword">import</span> PostgresDb
<span class="token keyword">from</span> agno<span class="token punctuation">.</span>models<span class="token punctuation">.</span>openai <span class="token keyword">import</span> OpenAIResponses

agent <span class="token operator">=</span> Agent<span class="token punctuation">(</span>
    model<span class="token operator">=</span>OpenAIResponses<span class="token punctuation">(</span><span class="token builtin">id</span><span class="token operator">=</span><span class="token string">"gpt-5.2"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    db<span class="token operator">=</span>PostgresDb<span class="token punctuation">(</span>db_url<span class="token operator">=</span><span class="token string">"postgresql://..."</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    learning<span class="token operator">=</span>LearningMachine<span class="token punctuation">(</span>
        knowledge<span class="token operator">=</span>my_vector_store<span class="token punctuation">,</span>  <span class="token comment"># or graph if that's your thing</span>
        user_profile<span class="token operator">=</span>UserProfileConfig<span class="token punctuation">(</span>
            mode<span class="token operator">=</span>LearningMode<span class="token punctuation">.</span>BACKGROUND<span class="token punctuation">,</span>
            enable_agent_tools<span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">,</span>
        <span class="token punctuation">)</span><span class="token punctuation">,</span>
        session_context<span class="token operator">=</span>SessionContextConfig<span class="token punctuation">(</span>
            enable_planning<span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">,</span>
        <span class="token punctuation">)</span><span class="token punctuation">,</span>
        learned_knowledge<span class="token operator">=</span>LearnedKnowledgeConfig<span class="token punctuation">(</span>
            mode<span class="token operator">=</span>LearningMode<span class="token punctuation">.</span>PROPOSE<span class="token punctuation">,</span>
        <span class="token punctuation">)</span><span class="token punctuation">,</span>
        entity_memory<span class="token operator">=</span>EntityMemoryConfig<span class="token punctuation">(</span>
            mode<span class="token operator">=</span>LearningMode<span class="token punctuation">.</span>BACKGROUND<span class="token punctuation">,</span>
        <span class="token punctuation">)</span><span class="token punctuation">,</span>
    <span class="token punctuation">)</span><span class="token punctuation">,</span>
<span class="token punctuation">)</span>
</code></pre>
<blockquote class="not-prose relative isolate pl-6 text-ink py-3 text-lg"><span aria-hidden="true" class="absolute inset-y-1 left-0 w-0.5 rounded-full bg-accent"></span><div class="flex gap-3"><div class="space-y-1"><div class="leading-relaxed"><p><strong>The best part?</strong> You can build custom learning stores by extending the LearningStore protocol. Need project context? Build a <code>ProjectContextStore</code>. Need to track accounts? Build an <code>AccountStore</code>.</p></div></div></div></blockquote>
<hr>
<h2>Taking Inspiration from Claude</h2>
<p>Claude's memory feels magical. It's natural, contextual, never announces "saving to memory". It just <strong>knows</strong> you.</p>
<p>But here's the thing: <strong>you can't build with it.</strong> Claude's memory is a consumer product feature. The API gives you nothing. If you want learning for your agents, you're on your own. Enter Learning Machine.</p>
<p>Here's what Claude does well, and what Learning Machine adds:</p>
<p><strong>Claude feels natural.</strong> It never announces "saving to memory". So does Learning Machine. We inject context based on each store and control how the agent learns from it. No fact dumps.</p>
<p><strong>Claude learns about its users over time.</strong> Preferences, history, personal context. So does Learning Machine. But we also add sessions, entities, patterns, and decisions. The full picture, not just the user.</p>
<p><strong>Claude is scoped to a single user.</strong> Makes sense for a consumer product. Learning Machine adds namespace scoping: keep it private to a user, share across a team, or make it global. You control the boundaries.</p>
<p><strong>Claude has fixed memory types.</strong> You can't change how it works. Learning Machine is extensible via protocol. Build your own stores for whatever your domain needs.</p>
<p><strong>Claude is a closed system.</strong> Its memory lives inside Claude. Learning Machine is open source, fully customizable, and yours to extend.</p>
<p>I studied what makes Claude's memory feel good. Then built something you can actually use and extend.</p>
<h2>What This Unlocks</h2>
<p>Here's what's possible when agents learn across users, sessions, and time:</p>
<ul>
<li>A <strong>support agent</strong> where ticket #1,000 gets resolved better and faster — because it learned from tickets #1-999.</li>
<li>A <strong>customer success agent</strong> that remembers every account's stack, contracts, and conversations — across your entire team.</li>
<li>A <strong>healthcare agent</strong> that knows your full history — not just what's in today's chart, but every conversation (with different doctors), symptom, and concern you've ever mentioned.</li>
<li>A <strong>financial advisor</strong> that remembers your risk tolerance, goals, and every "what if" scenario you've ever explored — across years of conversations.</li>
<li>An <strong>agent that rewrites itself</strong> — analyzing its failures and proposing: "I should stop doing X."</li>
</ul>
<blockquote class="not-prose relative isolate pl-6 text-ink py-3 text-lg"><span aria-hidden="true" class="absolute inset-y-1 left-0 w-0.5 rounded-full bg-accent"></span><div class="flex gap-3"><div class="space-y-1"><div class="leading-relaxed"><p>That last one is the endgame. Agents that learn from their own mistakes and rewrite their own instructions. Human approves. Agent evolves. Continuous improvement.</p></div></div></div></blockquote>
<hr>
<h2>Current Status</h2>
<p>Learning Machine is part of <a target="_blank" rel="noopener noreferrer" class="" href="https://github.com/agno-agi/agno">Agno</a> and I'm in the final stages of testing Phase 1. Here's where things stand:</p>
<table><thead><tr><th>Phase</th><th>What's Included</th><th>Status</th></tr></thead><tbody><tr><td><strong>Phase 1</strong></td><td>User Profile, Session Context, Entity Memory, Learned Knowledge</td><td>Built, testing now</td></tr><tr><td><strong>Phase 2</strong></td><td>Decision Logs, Behavioral Feedback</td><td>Planned</td></tr><tr><td><strong>Phase 3</strong></td><td>Self-Improvement</td><td>Planned</td></tr></tbody></table>
<p>If you're eager to dig in, here's the PR: <a target="_blank" rel="noopener noreferrer" class="" href="https://github.com/agno-agi/agno/pull/5897">learning-machine-v0</a></p>
<p>Want to get involved? DM me if you're interested in learning more or helping out.</p>
<hr>
<p>Memory was never the goal. Learning was.</p>
<p>If you enjoyed reading this, <a target="_blank" rel="noopener noreferrer" class="" href="https://github.com/agno-agi/agno">checkout Agno on GitHub</a>.</p>
<p>Questions or feedback? Reach out on <a target="_blank" rel="noopener noreferrer" class="" href="https://x.com/ashpreetbedi">X</a>.</p>]]></content:encoded>
            <author>hi@ashpreetbedi.com (Ashpreet Bedi)</author>
        </item>
        <item>
            <title><![CDATA[Learning Machines: Technical Design]]></title>
            <link>https://ashpreetbedi.com/articles/lm-technical-design</link>
            <guid isPermaLink="false">https://ashpreetbedi.com/articles/lm-technical-design</guid>
            <pubDate>Thu, 08 Jan 2026 00:00:00 GMT</pubDate>
            <content:encoded><![CDATA[<p><strong>On Monday I introduced <a target="_blank" rel="noopener noreferrer" class="" href="/articles/learning-machines-v0">Learning Machines</a> and yesterday I shared that it's finally working. Today I'll show you how it works under the hood.</strong></p>
<h2>First, Let's Recap</h2>
<p>After reading hundreds of papers on agentic memory and trying out every possible tool, I came to the simple conclusion that maybe we're looking at memory wrong.</p>
<p>Memory is just... learning. Learning about the user, the task at hand, learning insights and patterns, learning from decisions - good and bad, the feedback received. Learning from every interaction. Everything else is <strong>integration</strong> (how the agent uses these learnings) and <strong>curation</strong> (decay, pruning, deduplication).</p>
<p>So I built <strong>Learning Machines</strong>: A system that helps agents continuously learn from every interaction.</p>
<p>I started working on it dec 31, and got a basic working version yesterday. Here's the PR for those interested: <a target="_blank" rel="noopener noreferrer" class="" href="https://github.com/agno-agi/agno/pull/5897">learning-machine-v0</a></p>
<p>Now let's dig into the technical details.</p>
<h2>The Learning Protocol</h2>
<p>The key behind it all is the <strong>Learning Protocol</strong>. It's a simple interface for building <strong>Learning Stores</strong> -- user profiles, session context, learned knowledge, entity memory, etc.</p>
<p>Let's take a look at the protocol:</p>
<pre class="language-python"><code class="language-python"><span class="token decorator annotation punctuation">@runtime_checkable</span>
<span class="token keyword">class</span> <span class="token class-name">LearningStore</span><span class="token punctuation">(</span>Protocol<span class="token punctuation">)</span><span class="token punctuation">:</span>
    <span class="token triple-quoted-string string">"""Protocol that all learning stores must implement."""</span>

    <span class="token decorator annotation punctuation">@property</span>
    <span class="token keyword">def</span> <span class="token function">learning_type</span><span class="token punctuation">(</span>self<span class="token punctuation">)</span> <span class="token operator">-</span><span class="token operator">&gt;</span> <span class="token builtin">str</span><span class="token punctuation">:</span>
        <span class="token triple-quoted-string string">"""Unique identifier (e.g., 'user_profile')."""</span>
        <span class="token punctuation">.</span><span class="token punctuation">.</span><span class="token punctuation">.</span>

    <span class="token keyword">def</span> <span class="token function">recall</span><span class="token punctuation">(</span>self<span class="token punctuation">,</span> <span class="token operator">**</span>context<span class="token punctuation">)</span> <span class="token operator">-</span><span class="token operator">&gt;</span> Optional<span class="token punctuation">[</span>Any<span class="token punctuation">]</span><span class="token punctuation">:</span>
        <span class="token triple-quoted-string string">"""Retrieve learnings from storage."""</span>
        <span class="token punctuation">.</span><span class="token punctuation">.</span><span class="token punctuation">.</span>

    <span class="token keyword">def</span> <span class="token function">process</span><span class="token punctuation">(</span>self<span class="token punctuation">,</span> messages<span class="token punctuation">:</span> List<span class="token punctuation">[</span>Any<span class="token punctuation">]</span><span class="token punctuation">,</span> <span class="token operator">**</span>context<span class="token punctuation">)</span> <span class="token operator">-</span><span class="token operator">&gt;</span> <span class="token boolean">None</span><span class="token punctuation">:</span>
        <span class="token triple-quoted-string string">"""Extract and save learnings from conversation."""</span>
        <span class="token punctuation">.</span><span class="token punctuation">.</span><span class="token punctuation">.</span>

    <span class="token keyword">def</span> <span class="token function">build_context</span><span class="token punctuation">(</span>self<span class="token punctuation">,</span> data<span class="token punctuation">:</span> Any<span class="token punctuation">)</span> <span class="token operator">-</span><span class="token operator">&gt;</span> <span class="token builtin">str</span><span class="token punctuation">:</span>
        <span class="token triple-quoted-string string">"""Build context string for agent's system prompt."""</span>
        <span class="token punctuation">.</span><span class="token punctuation">.</span><span class="token punctuation">.</span>

    <span class="token keyword">def</span> <span class="token function">get_tools</span><span class="token punctuation">(</span>self<span class="token punctuation">,</span> <span class="token operator">**</span>context<span class="token punctuation">)</span> <span class="token operator">-</span><span class="token operator">&gt;</span> List<span class="token punctuation">[</span>Callable<span class="token punctuation">]</span><span class="token punctuation">:</span>
        <span class="token triple-quoted-string string">"""Get tools to expose to agent."""</span>
        <span class="token punctuation">.</span><span class="token punctuation">.</span><span class="token punctuation">.</span>
</code></pre>
<p>Five functions. Everything else is optional.</p>
<p><strong>Why this matters:</strong> You can build your own learning store in ~50 lines. Most memory systems are thousands of lines of config. This is ~50. Legal docs. Medical records. Codebases. Sales pipelines. Whatever your domain needs.</p>
<p>You can even build personalized LearningStores for your writing styles, for your daily to-do's, for your emails, for  your shopping lists. The real value of this approach is its extensibility.</p>
<h2>The Learning Machine</h2>
<p>The protocol lets you build stores. But stores need to plug into the agent somehow. That's what <strong>LearningMachine</strong> does.</p>
<pre><code>User Message ──────► Recall from Stores ◄────────┐
                            │                    │
                            ▼                    │
                      Build Context              │
                            │                    │
                            ▼                    │ LearningMachine
                Agent Responds (with tools)      │
                            │                    │
                            ▼                    │
                   Extract &amp; Process             │
                            │                    │
                            ▼                    │
              Update Stores (agent learns) ──────┴──► Periodic Curation
</code></pre>
<p>Recall → Build context → Run agent → Extract → Store. That's the loop.</p>
<h2>Developer Experience</h2>
<p>Three levels of complexity:</p>
<h3>Dead Simple</h3>
<pre class="language-python"><code class="language-python">agent <span class="token operator">=</span> Agent<span class="token punctuation">(</span>
    model<span class="token operator">=</span>model<span class="token punctuation">,</span>
    db<span class="token operator">=</span>db<span class="token punctuation">,</span>
    learning<span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">,</span>  <span class="token comment"># Enables user_profile in BACKGROUND mode</span>
<span class="token punctuation">)</span>
</code></pre>
<h3>Pick What You Want</h3>
<pre class="language-python"><code class="language-python">agent <span class="token operator">=</span> Agent<span class="token punctuation">(</span>
    model<span class="token operator">=</span>model<span class="token punctuation">,</span>
    db<span class="token operator">=</span>db<span class="token punctuation">,</span>
    learning<span class="token operator">=</span>LearningMachine<span class="token punctuation">(</span>
        user_profile<span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">,</span>
        session_context<span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">,</span>
        learned_knowledge<span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">,</span>
        entity_memory<span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">,</span>
    <span class="token punctuation">)</span><span class="token punctuation">,</span>
<span class="token punctuation">)</span>
</code></pre>
<h3>Full Control</h3>
<pre class="language-python"><code class="language-python">agent <span class="token operator">=</span> Agent<span class="token punctuation">(</span>
    model<span class="token operator">=</span>model<span class="token punctuation">,</span>
    db<span class="token operator">=</span>db<span class="token punctuation">,</span>
    learning<span class="token operator">=</span>LearningMachine<span class="token punctuation">(</span>
        user_profile<span class="token operator">=</span>UserProfileConfig<span class="token punctuation">(</span>
            mode<span class="token operator">=</span>LearningMode<span class="token punctuation">.</span>AGENTIC<span class="token punctuation">,</span>
        <span class="token punctuation">)</span><span class="token punctuation">,</span>
        session_context<span class="token operator">=</span>SessionContextConfig<span class="token punctuation">(</span>
            enable_planning<span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">,</span>
        <span class="token punctuation">)</span><span class="token punctuation">,</span>
        learned_knowledge<span class="token operator">=</span>LearnedKnowledgeConfig<span class="token punctuation">(</span>
            mode<span class="token operator">=</span>LearningMode<span class="token punctuation">.</span>PROPOSE<span class="token punctuation">,</span>
            namespace<span class="token operator">=</span><span class="token string">"engineering"</span><span class="token punctuation">,</span>
        <span class="token punctuation">)</span><span class="token punctuation">,</span>
        entity_memory<span class="token operator">=</span>EntityMemoryConfig<span class="token punctuation">(</span>
            mode<span class="token operator">=</span>LearningMode<span class="token punctuation">.</span>BACKGROUND<span class="token punctuation">,</span>
        <span class="token punctuation">)</span><span class="token punctuation">,</span>
    <span class="token punctuation">)</span><span class="token punctuation">,</span>
<span class="token punctuation">)</span>
</code></pre>
<h2>Build Your Own Learning Store</h2>
<p>This is the win. Implement the protocol, plug it in:</p>
<pre class="language-python"><code class="language-python"><span class="token decorator annotation punctuation">@dataclass</span>
<span class="token keyword">class</span> <span class="token class-name">ProjectContextStore</span><span class="token punctuation">:</span>
    <span class="token triple-quoted-string string">"""Custom store for project-specific context."""</span>

    <span class="token decorator annotation punctuation">@property</span>
    <span class="token keyword">def</span> <span class="token function">learning_type</span><span class="token punctuation">(</span>self<span class="token punctuation">)</span> <span class="token operator">-</span><span class="token operator">&gt;</span> <span class="token builtin">str</span><span class="token punctuation">:</span>
        <span class="token keyword">return</span> <span class="token string">"project_context"</span>

    <span class="token keyword">def</span> <span class="token function">recall</span><span class="token punctuation">(</span>self<span class="token punctuation">,</span> project_id<span class="token punctuation">:</span> <span class="token builtin">str</span><span class="token punctuation">,</span> <span class="token operator">**</span>kwargs<span class="token punctuation">)</span> <span class="token operator">-</span><span class="token operator">&gt;</span> Optional<span class="token punctuation">[</span>ProjectContext<span class="token punctuation">]</span><span class="token punctuation">:</span>
        <span class="token comment"># Retrieve from your storage</span>
        <span class="token punctuation">.</span><span class="token punctuation">.</span><span class="token punctuation">.</span>

    <span class="token keyword">def</span> <span class="token function">process</span><span class="token punctuation">(</span>self<span class="token punctuation">,</span> messages<span class="token punctuation">:</span> List<span class="token punctuation">[</span>Any<span class="token punctuation">]</span><span class="token punctuation">,</span> project_id<span class="token punctuation">:</span> <span class="token builtin">str</span><span class="token punctuation">,</span> <span class="token operator">**</span>kwargs<span class="token punctuation">)</span> <span class="token operator">-</span><span class="token operator">&gt;</span> <span class="token boolean">None</span><span class="token punctuation">:</span>
        <span class="token comment"># Extract and save</span>
        <span class="token punctuation">.</span><span class="token punctuation">.</span><span class="token punctuation">.</span>

    <span class="token keyword">def</span> <span class="token function">build_context</span><span class="token punctuation">(</span>self<span class="token punctuation">,</span> data<span class="token punctuation">:</span> Any<span class="token punctuation">)</span> <span class="token operator">-</span><span class="token operator">&gt;</span> <span class="token builtin">str</span><span class="token punctuation">:</span>
        <span class="token keyword">if</span> <span class="token keyword">not</span> data<span class="token punctuation">:</span>
            <span class="token keyword">return</span> <span class="token string">""</span>
        <span class="token keyword">return</span> <span class="token string-interpolation"><span class="token string">f"&lt;project_context&gt;\n</span><span class="token interpolation"><span class="token punctuation">{</span>data<span class="token punctuation">.</span>summary<span class="token punctuation">}</span></span><span class="token string">\n&lt;/project_context&gt;"</span></span>

    <span class="token keyword">def</span> <span class="token function">get_tools</span><span class="token punctuation">(</span>self<span class="token punctuation">,</span> <span class="token operator">**</span>kwargs<span class="token punctuation">)</span> <span class="token operator">-</span><span class="token operator">&gt;</span> List<span class="token punctuation">[</span>Callable<span class="token punctuation">]</span><span class="token punctuation">:</span>
        <span class="token keyword">return</span> <span class="token punctuation">[</span><span class="token punctuation">]</span>  <span class="token comment"># Or return tools for agentic mode</span>

<span class="token comment"># Plug it in</span>
learning <span class="token operator">=</span> LearningMachine<span class="token punctuation">(</span>
    custom_stores<span class="token operator">=</span><span class="token punctuation">{</span>
        <span class="token string">"project"</span><span class="token punctuation">:</span> ProjectContextStore<span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    <span class="token punctuation">}</span><span class="token punctuation">,</span>
<span class="token punctuation">)</span>
</code></pre>
<blockquote class="not-prose relative isolate pl-6 text-ink py-3 text-lg"><span aria-hidden="true" class="absolute inset-y-1 left-0 w-0.5 rounded-full bg-accent"></span><div class="flex gap-3"><div class="space-y-1"><div class="leading-relaxed"><p><strong>~50 lines. 5 functions. Your domain, your rules.</strong> Build a Learning Store for legal docs, medical records, codebases, sales pipelines. This is the whole point behind the Learning Machine.</p></div></div></div></blockquote>
<h2>Built-in Stores</h2>
<p>Phase 1 includes four stores:</p>
<table><thead><tr><th>Store</th><th>What It Captures</th><th>Scope</th><th>Storage</th></tr></thead><tbody><tr><td><strong>User Profile</strong></td><td>Name, work context, preferences, communication style</td><td>Per user (<code>user_id</code>)</td><td>Database (direct lookup)</td></tr><tr><td><strong>Session Context</strong></td><td>Summary of conversation, goal, plan steps, progress</td><td>Per session (<code>session_id</code>)</td><td>Database (direct lookup)</td></tr><tr><td><strong>Learned Knowledge</strong></td><td>Insights, patterns, best practices. Things that apply across users</td><td>Configurable namespace</td><td>Knowledge base (vector search)</td></tr><tr><td><strong>Entity Memory</strong></td><td>Facts, events, and relationships about external things — companies, people, projects</td><td>Configurable namespace</td><td>Database (direct lookup + search)</td></tr></tbody></table>
<h2>Key Design Decisions</h2>
<h3>Learning Modes</h3>
<p>Different use cases need different extraction modes.</p>
<pre class="language-python"><code class="language-python"><span class="token keyword">class</span> <span class="token class-name">LearningMode</span><span class="token punctuation">(</span>Enum<span class="token punctuation">)</span><span class="token punctuation">:</span>
    BACKGROUND <span class="token operator">=</span> <span class="token string">"background"</span>   <span class="token comment"># Automatic extraction after each conversation</span>
    AGENTIC <span class="token operator">=</span> <span class="token string">"agentic"</span>         <span class="token comment"># Agent decides via tools</span>
    PROPOSE <span class="token operator">=</span> <span class="token string">"propose"</span>         <span class="token comment"># Agent proposes, user confirms</span>
    HITL <span class="token operator">=</span> <span class="token string">"hitl"</span>               <span class="token comment"># Human-in-the-loop approval (future)</span>
</code></pre>
<p><strong>BACKGROUND</strong> is invisible. The user never sees extraction happening. This is what makes Claude's memory feel natural.</p>
<p><strong>AGENTIC</strong> gives control. The agent decides what's worth remembering. You can see the tool calls. Less noise, more transparency.</p>
<p><strong>PROPOSE</strong> is for medium-stakes learning. Agent suggests, human approves. Good for shared knowledge bases where bad data spreads.</p>
<p><strong>HITL</strong> is for the highest-stakes learning. Explicit human approval required.</p>
<h3>Namespace Scoping</h3>
<p>Some learnings should be private. Some should be shared. Namespaces enable this.</p>
<pre class="language-python"><code class="language-python"><span class="token comment"># Private to this user</span>
LearnedKnowledgeConfig<span class="token punctuation">(</span>namespace<span class="token operator">=</span><span class="token string">"user"</span><span class="token punctuation">)</span>

<span class="token comment"># Shared within engineering team</span>
LearnedKnowledgeConfig<span class="token punctuation">(</span>namespace<span class="token operator">=</span><span class="token string">"engineering"</span><span class="token punctuation">)</span>

<span class="token comment"># Shared with everyone</span>
LearnedKnowledgeConfig<span class="token punctuation">(</span>namespace<span class="token operator">=</span><span class="token string">"global"</span><span class="token punctuation">)</span>
</code></pre>
<p>This is what enables cross-user learning. This is what made yesterday's experiment work — Alice's insight helped Bob because they shared a namespace.</p>
<h3>Entity Memory: Three-Tier Memory System</h3>
<p>Entities (people, companies, projects) hold different types of information:</p>
<ul>
<li><strong>Facts</strong>: Semantic knowledge ("Uses PostgreSQL", "Based in London")</li>
<li><strong>Events</strong>: Episodic memories ("Launched v2 on Jan 15", "Raised Series A")</li>
<li><strong>Relationships</strong>: Graph connections ("Bob is CEO of Acme", "Acme acquired StartupX")</li>
</ul>
<p>Flat list doesn't work. You need to query "what do we know about Acme" differently than "what happened with Acme."</p>
<h2>What's Next</h2>
<table><thead><tr><th>Phase</th><th>What's Included</th><th>Status</th></tr></thead><tbody><tr><td><strong>Phase 1</strong></td><td>Learning Protocol, Learning Machine + 4 Learning Stores</td><td>Built, currently testing and fixing bugs</td></tr><tr><td><strong>Phase 2</strong></td><td>Decision Logs and Behavioral Feedback. Agents that know <em>why</em> they did what they did, and <em>what worked</em></td><td>Planned</td></tr><tr><td><strong>Phase 3</strong></td><td>Self-Improvement</td><td>Planned</td></tr></tbody></table>
<blockquote class="not-prose relative isolate pl-6 text-ink py-3 text-lg"><span aria-hidden="true" class="absolute inset-y-1 left-0 w-0.5 rounded-full bg-accent"></span><div class="flex gap-3"><div class="space-y-1"><div class="leading-relaxed"><p><strong>Phase 3 is the endgame.</strong> Agents that analyze their own failures and propose: "I should stop doing X." Human approves. Agent evolves. No retraining. No fine-tuning. Just learning.</p></div></div></div></blockquote>
<p>Want to dig in? Here's the PR: <a target="_blank" rel="noopener noreferrer" class="" href="https://github.com/agno-agi/agno/pull/5897">learning-machine-v0</a></p>
<p>Memory was step one. Learning is what comes next.</p>
<p>If you enjoyed reading this, <a target="_blank" rel="noopener noreferrer" class="" href="https://github.com/agno-agi/agno">checkout Agno on GitHub</a>.</p>
<p>Questions or feedback? Reach out on <a target="_blank" rel="noopener noreferrer" class="" href="https://x.com/ashpreetbedi">X</a>.</p>]]></content:encoded>
            <author>hi@ashpreetbedi.com (Ashpreet Bedi)</author>
        </item>
        <item>
            <title><![CDATA[Memory: How Agents Learn]]></title>
            <link>https://ashpreetbedi.com/articles/memory</link>
            <guid isPermaLink="false">https://ashpreetbedi.com/articles/memory</guid>
            <pubDate>Mon, 22 Dec 2025 00:00:00 GMT</pubDate>
            <content:encoded><![CDATA[<p>It's almost 2026. Agents can follow complex instructions, use dozens of tools, and work autonomously for hours. But ask them the same question twice and they start from scratch. They don't remember what worked, what failed, or what they figured out along the way.</p>
<p><strong>What makes ChatGPT and Claude great personal assistants? Memory.</strong></p>
<p>Here's the dirty secret: when building agents with the API, we've made them capable, but we haven't yet figured out how to make them learn.</p>
<h2>Table of Contents</h2>
<ol>
<li>What is memory</li>
<li>How memory enables learning</li>
<li>Three patterns (with code)</li>
<li>Video demo</li>
<li>What makes a good learning</li>
<li>Get started</li>
</ol>
<blockquote>
<p>Wanna jump straight to the code? <a target="_blank" rel="noopener noreferrer" class="" href="https://agno.link/getting-started">Here you go</a>. Cookbooks 2, 4 and 7 are what you're looking for.</p>
</blockquote>
<h2>1. What is memory?</h2>
<p>"Memory" gets thrown around loosely. Chat history? Context window? Vector database? Let's be precise.</p>
<p>There are three types of memory that matter for agents:</p>
<h3>Session Memory</h3>
<p>The conversation context. What was said five messages ago. This is a solved problem: store messages in a database, retrieve them before every response, add them to the context.</p>
<p>Session memory is useful but limited. It disappears when the conversation ends. It's not really memory, it's just context.</p>
<h3>User Memory</h3>
<p>Facts about a <strong>specific user</strong> that persist across sessions. Preferences, goals, constraints.</p>
<p>When a user says "I'm interested in AI stocks and have moderate risk tolerance", that's worth remembering, not just for this conversation, but for every future conversation with that user.</p>
<p>This is powerful, but it's still not learning. User memory is about <strong>recall</strong>, not <strong>improvement</strong>.</p>
<h3>Learned Memory</h3>
<p>This is where knowledge gets built. As agents interact with the world, they discover insights that apply <em>generally</em>, not just to one user, but to anyone asking similar questions.</p>
<p>When your finance agent discovers that "when comparing ETFs, check both expense ratio AND tracking error", this insight is worth saving, not just because one user asked, but because it makes the agent better at ETF comparisons for everyone.</p>
<p>Here's the beauty: <strong>knowledge compounds</strong>. The more the agent learns, the better it gets. And unlike weight updates, this knowledge is tangible: you can inspect it, edit it, delete it. No retraining required.</p>
<blockquote class="not-prose relative isolate pl-6 text-ink py-3 text-lg"><span aria-hidden="true" class="absolute inset-y-1 left-0 w-0.5 rounded-full bg-accent"></span><div class="flex gap-3"><div class="space-y-1"><div class="leading-relaxed"><p><strong>If you're building agents without learned memory, you're leaving performance on the table.</strong></p></div></div></div></blockquote>
<h2>2. How memory enables learning</h2>
<p>Here's the core insight: <strong>learning is remembering what worked</strong>.</p>
<p>Without memory, agents are stateless. Every session is day one:</p>
<table><thead><tr><th>Without Memory</th><th>With Memory</th></tr></thead><tbody><tr><td>Re-discovers the same patterns</td><td>Searches prior learnings before acting</td></tr><tr><td>Repeats the same mistakes</td><td>Applies insights from past sessions</td></tr><tr><td>Re-asks the same questions</td><td>Builds domain knowledge over time</td></tr><tr><td>Can't build on prior success</td><td>Gets better the more you use it</td></tr></tbody></table>
<p>The best part: <strong>the model doesn't need to get better for the system to improve</strong>. Learning happens in retrieval, not in weights. And as models improve, your system improves too — for free.</p>
<p>I call this <strong>GPU Poor Continuous Learning</strong>: continuous improvement without fine-tuning, retraining, or any of the infrastructure traditionally required for model updates. Just a knowledge base that grows smarter over time.</p>
<blockquote class="not-prose relative isolate pl-6 text-ink py-3 text-lg"><span aria-hidden="true" class="absolute inset-y-1 left-0 w-0.5 rounded-full bg-accent"></span><div class="flex gap-3"><div class="space-y-1"><div class="leading-relaxed"><p>The model doesn't get smarter. The system gets smarter.</p></div></div></div></blockquote>
<h2>3. Three patterns for agent memory</h2>
<p>Let me show you how to implement the three patterns, with a bonus at the end.</p>
<h3>Pattern 1: Session Memory</h3>
<p>Store messages in a database, retrieve them before every response, add them to the context. Agno gives you this out of the box — just give your agent a database.</p>
<pre class="language-python"><code class="language-python"><span class="token keyword">from</span> agno<span class="token punctuation">.</span>agent <span class="token keyword">import</span> Agent
<span class="token keyword">from</span> agno<span class="token punctuation">.</span>db<span class="token punctuation">.</span>sqlite <span class="token keyword">import</span> SqliteDb
<span class="token keyword">from</span> agno<span class="token punctuation">.</span>models<span class="token punctuation">.</span>google <span class="token keyword">import</span> Gemini
<span class="token keyword">from</span> agno<span class="token punctuation">.</span>tools<span class="token punctuation">.</span>yfinance <span class="token keyword">import</span> YFinanceTools

agent_db <span class="token operator">=</span> SqliteDb<span class="token punctuation">(</span>db_file<span class="token operator">=</span><span class="token string">"tmp/agents.db"</span><span class="token punctuation">)</span>

agent <span class="token operator">=</span> Agent<span class="token punctuation">(</span>
    model<span class="token operator">=</span>Gemini<span class="token punctuation">(</span><span class="token builtin">id</span><span class="token operator">=</span><span class="token string">"gemini-3-flash-preview"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    tools<span class="token operator">=</span><span class="token punctuation">[</span>YFinanceTools<span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">]</span><span class="token punctuation">,</span>
    db<span class="token operator">=</span>agent_db<span class="token punctuation">,</span>
    add_history_to_context<span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">,</span>
    num_history_runs<span class="token operator">=</span><span class="token number">5</span><span class="token punctuation">,</span>
<span class="token punctuation">)</span>

<span class="token keyword">if</span> __name__ <span class="token operator">==</span> <span class="token string">"__main__"</span><span class="token punctuation">:</span>
    session_id <span class="token operator">=</span> <span class="token string">"finance-session"</span>

    <span class="token comment"># Turn 1: Analyze a stock</span>
    agent<span class="token punctuation">.</span>print_response<span class="token punctuation">(</span><span class="token string">"Quick investment brief on NVIDIA"</span><span class="token punctuation">,</span> session_id<span class="token operator">=</span>session_id<span class="token punctuation">)</span>

    <span class="token comment"># Turn 2: Agent remembers NVDA from turn 1</span>
    agent<span class="token punctuation">.</span>print_response<span class="token punctuation">(</span><span class="token string">"Compare that to Tesla"</span><span class="token punctuation">,</span> session_id<span class="token operator">=</span>session_id<span class="token punctuation">)</span>

    <span class="token comment"># Turn 3: Recommendation based on full conversation</span>
    agent<span class="token punctuation">.</span>print_response<span class="token punctuation">(</span><span class="token string">"Which looks like the better investment?"</span><span class="token punctuation">,</span> session_id<span class="token operator">=</span>session_id<span class="token punctuation">)</span>
</code></pre>
<p>Use a consistent <code>session_id</code> to persist conversation across runs.</p>
<h3>Pattern 2: User Memory</h3>
<p>Remember facts about the user across sessions. The <code>MemoryManager</code> extracts preferences automatically and stores them in the database.</p>
<pre class="language-python"><code class="language-python"><span class="token keyword">from</span> agno<span class="token punctuation">.</span>agent <span class="token keyword">import</span> Agent
<span class="token keyword">from</span> agno<span class="token punctuation">.</span>memory <span class="token keyword">import</span> MemoryManager
<span class="token keyword">from</span> agno<span class="token punctuation">.</span>models<span class="token punctuation">.</span>google <span class="token keyword">import</span> Gemini
<span class="token keyword">from</span> agno<span class="token punctuation">.</span>db<span class="token punctuation">.</span>sqlite <span class="token keyword">import</span> SqliteDb

agent_db <span class="token operator">=</span> SqliteDb<span class="token punctuation">(</span>db_file<span class="token operator">=</span><span class="token string">"tmp/agents.db"</span><span class="token punctuation">)</span>

memory_manager <span class="token operator">=</span> MemoryManager<span class="token punctuation">(</span>
    model<span class="token operator">=</span>Gemini<span class="token punctuation">(</span><span class="token builtin">id</span><span class="token operator">=</span><span class="token string">"gemini-3-flash-preview"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    db<span class="token operator">=</span>agent_db<span class="token punctuation">,</span>
<span class="token punctuation">)</span>

agent <span class="token operator">=</span> Agent<span class="token punctuation">(</span>
    model<span class="token operator">=</span>Gemini<span class="token punctuation">(</span><span class="token builtin">id</span><span class="token operator">=</span><span class="token string">"gemini-3-flash-preview"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    memory_manager<span class="token operator">=</span>memory_manager<span class="token punctuation">,</span>
    enable_user_memory<span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">,</span>
<span class="token punctuation">)</span>

<span class="token comment"># First conversation — preferences extracted and stored</span>
agent<span class="token punctuation">.</span>print_response<span class="token punctuation">(</span>
    <span class="token string">"I'm interested in AI stocks. My risk tolerance is moderate."</span><span class="token punctuation">,</span>
    user_id<span class="token operator">=</span><span class="token string">"investor@example.com"</span><span class="token punctuation">,</span>
<span class="token punctuation">)</span>

<span class="token comment"># Later conversation — agent remembers</span>
agent<span class="token punctuation">.</span>print_response<span class="token punctuation">(</span>
    <span class="token string">"What stocks would you recommend for me?"</span><span class="token punctuation">,</span>
    user_id<span class="token operator">=</span><span class="token string">"investor@example.com"</span><span class="token punctuation">,</span>
<span class="token punctuation">)</span>
</code></pre>
<p><code>enable_user_memory=True</code> runs the <code>MemoryManager</code> in parallel with every run. Use <code>enable_agentic_memory=True</code> to let the agent decide when to store memories via tool calls. More efficient, doesn't run on every response.</p>
<h3>Pattern 3: Learned Memory</h3>
<p>Now let's add learned memory: insights that apply beyond just one user. The key is a custom tool that saves learnings to a knowledge base:</p>
<pre class="language-python"><code class="language-python"><span class="token keyword">import</span> json
<span class="token keyword">from</span> datetime <span class="token keyword">import</span> datetime<span class="token punctuation">,</span> timezone

<span class="token keyword">from</span> agno<span class="token punctuation">.</span>agent <span class="token keyword">import</span> Agent
<span class="token keyword">from</span> agno<span class="token punctuation">.</span>db<span class="token punctuation">.</span>sqlite <span class="token keyword">import</span> SqliteDb
<span class="token keyword">from</span> agno<span class="token punctuation">.</span>knowledge <span class="token keyword">import</span> Knowledge
<span class="token keyword">from</span> agno<span class="token punctuation">.</span>models<span class="token punctuation">.</span>google <span class="token keyword">import</span> Gemini
<span class="token keyword">from</span> agno<span class="token punctuation">.</span>vectordb<span class="token punctuation">.</span>chroma <span class="token keyword">import</span> ChromaDb

agent_db <span class="token operator">=</span> SqliteDb<span class="token punctuation">(</span>db_file<span class="token operator">=</span><span class="token string">"tmp/agents.db"</span><span class="token punctuation">)</span>

learnings_kb <span class="token operator">=</span> Knowledge<span class="token punctuation">(</span>
    name<span class="token operator">=</span><span class="token string">"Agent Learnings"</span><span class="token punctuation">,</span>
    vector_db<span class="token operator">=</span>ChromaDb<span class="token punctuation">(</span>
        name<span class="token operator">=</span><span class="token string">"learnings"</span><span class="token punctuation">,</span>
        persistent_client<span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">,</span>
        search_type<span class="token operator">=</span>SearchType<span class="token punctuation">.</span>hybrid<span class="token punctuation">,</span>
    <span class="token punctuation">)</span><span class="token punctuation">,</span>
<span class="token punctuation">)</span>

<span class="token keyword">def</span> <span class="token function">save_learning</span><span class="token punctuation">(</span>title<span class="token punctuation">:</span> <span class="token builtin">str</span><span class="token punctuation">,</span> learning<span class="token punctuation">:</span> <span class="token builtin">str</span><span class="token punctuation">)</span> <span class="token operator">-</span><span class="token operator">&gt;</span> <span class="token builtin">str</span><span class="token punctuation">:</span>
    <span class="token triple-quoted-string string">"""
    Save a reusable insight to the knowledge base.

    Args:
        title: Short descriptive title
        learning: The insight — specific and actionable
    """</span>
    payload <span class="token operator">=</span> <span class="token punctuation">{</span>
        <span class="token string">"title"</span><span class="token punctuation">:</span> title<span class="token punctuation">.</span>strip<span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
        <span class="token string">"learning"</span><span class="token punctuation">:</span> learning<span class="token punctuation">.</span>strip<span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
        <span class="token string">"saved_at"</span><span class="token punctuation">:</span> datetime<span class="token punctuation">.</span>now<span class="token punctuation">(</span>timezone<span class="token punctuation">.</span>utc<span class="token punctuation">)</span><span class="token punctuation">.</span>isoformat<span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    <span class="token punctuation">}</span>

    learnings_kb<span class="token punctuation">.</span>add_content<span class="token punctuation">(</span>
        name<span class="token operator">=</span>payload<span class="token punctuation">[</span><span class="token string">"title"</span><span class="token punctuation">]</span><span class="token punctuation">,</span>
        text_content<span class="token operator">=</span>json<span class="token punctuation">.</span>dumps<span class="token punctuation">(</span>payload<span class="token punctuation">)</span><span class="token punctuation">,</span>
    <span class="token punctuation">)</span>

    <span class="token keyword">return</span> <span class="token string-interpolation"><span class="token string">f"Saved: '</span><span class="token interpolation"><span class="token punctuation">{</span>title<span class="token punctuation">}</span></span><span class="token string">'"</span></span>

agent <span class="token operator">=</span> Agent<span class="token punctuation">(</span>
    model<span class="token operator">=</span>Gemini<span class="token punctuation">(</span><span class="token builtin">id</span><span class="token operator">=</span><span class="token string">"gemini-3-flash-preview"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    tools<span class="token operator">=</span><span class="token punctuation">[</span>save_learning<span class="token punctuation">]</span><span class="token punctuation">,</span>
    knowledge<span class="token operator">=</span>learnings_kb<span class="token punctuation">,</span>
    search_knowledge<span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">,</span>
    db<span class="token operator">=</span>agent_db<span class="token punctuation">,</span>
<span class="token punctuation">)</span>
</code></pre>
<p>The agent now has two capabilities:</p>
<ol>
<li><strong>Search first</strong> — Before answering, it searches for relevant prior learnings</li>
<li><strong>Save learnings</strong> — When it discovers something reusable, it saves it</li>
</ol>
<p>But how do you prevent the agent from saving garbage?</p>
<h3>Bonus: Human-in-the-Loop Gating</h3>
<p>The quality of your knowledge base determines the quality of learning. Garbage in, garbage out.</p>
<p>The solution: the agent proposes learnings, but only saves with explicit user approval.</p>
<pre class="language-python"><code class="language-python"><span class="token keyword">from</span> agno<span class="token punctuation">.</span>tools <span class="token keyword">import</span> tool

<span class="token decorator annotation punctuation">@tool</span><span class="token punctuation">(</span>requires_confirmation<span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">)</span>
<span class="token keyword">def</span> <span class="token function">save_learning</span><span class="token punctuation">(</span>title<span class="token punctuation">:</span> <span class="token builtin">str</span><span class="token punctuation">,</span> learning<span class="token punctuation">:</span> <span class="token builtin">str</span><span class="token punctuation">)</span> <span class="token operator">-</span><span class="token operator">&gt;</span> <span class="token builtin">str</span><span class="token punctuation">:</span>
    <span class="token triple-quoted-string string">"""Save a reusable insight. Requires user confirmation."""</span>
    <span class="token comment"># ... same implementation</span>
</code></pre>
<p>Handle the confirmation flow:</p>
<pre class="language-python"><code class="language-python">run_response <span class="token operator">=</span> agent<span class="token punctuation">.</span>run<span class="token punctuation">(</span><span class="token string">"Analyze NVDA and save any insights"</span><span class="token punctuation">)</span>

<span class="token keyword">for</span> requirement <span class="token keyword">in</span> run_response<span class="token punctuation">.</span>active_requirements<span class="token punctuation">:</span>
    <span class="token keyword">if</span> requirement<span class="token punctuation">.</span>needs_confirmation<span class="token punctuation">:</span>
        <span class="token keyword">print</span><span class="token punctuation">(</span><span class="token string-interpolation"><span class="token string">f"Tool: </span><span class="token interpolation"><span class="token punctuation">{</span>requirement<span class="token punctuation">.</span>tool_execution<span class="token punctuation">.</span>tool_name<span class="token punctuation">}</span></span><span class="token string">"</span></span><span class="token punctuation">)</span>
        <span class="token keyword">print</span><span class="token punctuation">(</span><span class="token string-interpolation"><span class="token string">f"Args: </span><span class="token interpolation"><span class="token punctuation">{</span>requirement<span class="token punctuation">.</span>tool_execution<span class="token punctuation">.</span>tool_args<span class="token punctuation">}</span></span><span class="token string">"</span></span><span class="token punctuation">)</span>

        <span class="token keyword">if</span> user_approves<span class="token punctuation">:</span>
            requirement<span class="token punctuation">.</span>confirm<span class="token punctuation">(</span><span class="token punctuation">)</span>
        <span class="token keyword">else</span><span class="token punctuation">:</span>
            requirement<span class="token punctuation">.</span>reject<span class="token punctuation">(</span><span class="token punctuation">)</span>

run_response <span class="token operator">=</span> agent<span class="token punctuation">.</span>continue_run<span class="token punctuation">(</span>
    run_id<span class="token operator">=</span>run_response<span class="token punctuation">.</span>run_id<span class="token punctuation">,</span>
    requirements<span class="token operator">=</span>run_response<span class="token punctuation">.</span>requirements<span class="token punctuation">,</span>
<span class="token punctuation">)</span>
</code></pre>
<p>The agent proposes, the human gates. High-signal knowledge only.</p>
<h2>5. Video demo</h2>
<p>Here's a video demo that starts by showcasing user memory, then learned memory with user confirmation.</p>
<video width="700" height="700" class="rounded-2xl" loop="" autoplay="" muted="" playsinline="" controls=""><source src="/videos/memory-how-agents-learn.mp4">Your browser does not support the video tag.</video>
<h2>5. What makes a good learning</h2>
<p>A learning is worth saving if it's:</p>
<ul>
<li><strong>Specific</strong>: "Tech P/E ratios typically range 20-35x" not "P/E varies"</li>
<li><strong>Actionable</strong>: Can be applied to future queries</li>
<li><strong>Generalizable</strong>: Useful beyond this one conversation</li>
</ul>
<p>Don't save: raw data, one-off facts, summaries, speculation.</p>
<blockquote class="not-prose relative isolate pl-6 text-ink py-3 text-lg"><span aria-hidden="true" class="absolute inset-y-1 left-0 w-0.5 rounded-full bg-accent"></span><div class="flex gap-3"><div class="space-y-1"><div class="leading-relaxed"><p>Most queries should NOT produce a learning, and that's OK.</p></div></div></div></blockquote>
<h3>Where to store</h3>
<table><thead><tr><th>Memory Type</th><th>Key</th><th>Agno Component</th></tr></thead><tbody><tr><td>Session</td><td><code>session_id</code></td><td><code>SqliteDb</code>, <code>PostgresDb</code>, <code>MongoDB</code></td></tr><tr><td>User</td><td><code>user_id</code></td><td><code>MemoryManager</code> + Database</td></tr><tr><td>Learned</td><td><code>learning_id</code></td><td><code>Knowledge</code> + <code>ChromaDb</code>, <code>PgVector</code>, <code>Qdrant</code>, <code>Pinecone</code></td></tr></tbody></table>
<h3>Avoiding bloat</h3>
<p>The biggest mistake is storing too much. A bloated knowledge base hurts retrieval and makes the agent worse.</p>
<p>The upside: because learnings are stored explicitly (not in weights), they're auditable and reversible. Bad learning? Delete it. System immediately improves.</p>
<h2>6. Get started</h2>
<p>This blog comes with complete working code. Here are <a target="_blank" rel="noopener noreferrer" class="" href="https://agno.link/getting-started">12 cookbooks</a> that take you from "what is an agent" to building agents with memory, knowledge, state, guardrails, and more. <a target="_blank" rel="noopener noreferrer" class="" href="https://agno.link/getting-started">Link again for reference</a>.</p>
<table><thead><tr><th style="text-align:left">#</th><th style="text-align:left">Cookbook</th><th style="text-align:left">What You'll Learn</th></tr></thead><tbody><tr><td style="text-align:left">01</td><td style="text-align:left">Tools</td><td style="text-align:left">Give agents the ability to fetch real-time data</td></tr><tr><td style="text-align:left">02</td><td style="text-align:left">Storage</td><td style="text-align:left">Persist conversations across runs</td></tr><tr><td style="text-align:left">03</td><td style="text-align:left">Knowledge</td><td style="text-align:left">Load documents and search with hybrid retrieval</td></tr><tr><td style="text-align:left">04</td><td style="text-align:left">Custom Tools</td><td style="text-align:left">Write your own tools, add self-learning</td></tr><tr><td style="text-align:left">05</td><td style="text-align:left">Structured Output</td><td style="text-align:left">Return typed Pydantic objects</td></tr><tr><td style="text-align:left">06</td><td style="text-align:left">Typed I/O</td><td style="text-align:left">Full type safety on input and output</td></tr><tr><td style="text-align:left">07</td><td style="text-align:left">Memory</td><td style="text-align:left">Remember user preferences across sessions</td></tr><tr><td style="text-align:left">08</td><td style="text-align:left">State Management</td><td style="text-align:left">Track and persist structured state</td></tr><tr><td style="text-align:left">09</td><td style="text-align:left">Multi-Agent Teams</td><td style="text-align:left">Coordinate specialized agents</td></tr><tr><td style="text-align:left">10</td><td style="text-align:left">Workflows</td><td style="text-align:left">Sequential pipelines with predictable data flow</td></tr><tr><td style="text-align:left">11</td><td style="text-align:left">Guardrails</td><td style="text-align:left">Input validation, PII detection, prompt injection defense</td></tr><tr><td style="text-align:left">12</td><td style="text-align:left">Human in the Loop</td><td style="text-align:left">Require confirmation before sensitive actions</td></tr></tbody></table>
<p>Each builds on fundamentals, but you can jump to any one.</p>
<h3>Setup</h3>
<pre class="language-bash"><code class="language-bash"><span class="token function">git</span> clone https://github.com/agno-agi/agno.git
<span class="token builtin class-name">cd</span> agno

uv venv .getting-started --python <span class="token number">3.12</span>
<span class="token builtin class-name">source</span> .getting-started/bin/activate

uv pip <span class="token function">install</span> -r cookbook/00_getting_started/requirements.txt

<span class="token builtin class-name">export</span> <span class="token assign-left variable">GOOGLE_API_KEY</span><span class="token operator">=</span>your-google-api-key
</code></pre>
<h3>Run an example</h3>
<p>Each cookbook is self-contained:</p>
<pre class="language-bash"><code class="language-bash">python cookbook/00_getting_started/agent_with_tools.py
</code></pre>
<p>Want a visual interface? Agent OS gives you a web UI for chatting with agents, exploring sessions, and monitoring traces:</p>
<pre class="language-bash"><code class="language-bash">python cookbook/00_getting_started/run.py
</code></pre>
<p>Then visit <a target="_blank" rel="noopener noreferrer" class="" href="https://os.agno.com">os.agno.com</a> and add <code>http://localhost:7777</code> as an endpoint.</p>
<h3>Swapping models</h3>
<p>These examples use Gemini 3 Flash by default — fast, reliable tool calling, cheap enough to experiment freely. But Agno is model-agnostic:</p>
<pre class="language-python"><code class="language-python"><span class="token comment"># Gemini (default)</span>
<span class="token keyword">from</span> agno<span class="token punctuation">.</span>models<span class="token punctuation">.</span>google <span class="token keyword">import</span> Gemini
model <span class="token operator">=</span> Gemini<span class="token punctuation">(</span><span class="token builtin">id</span><span class="token operator">=</span><span class="token string">"gemini-3-flash-preview"</span><span class="token punctuation">)</span>

<span class="token comment"># OpenAI</span>
<span class="token keyword">from</span> agno<span class="token punctuation">.</span>models<span class="token punctuation">.</span>openai <span class="token keyword">import</span> OpenAIChat
model <span class="token operator">=</span> OpenAIChat<span class="token punctuation">(</span><span class="token builtin">id</span><span class="token operator">=</span><span class="token string">"gpt-5.2"</span><span class="token punctuation">)</span>

<span class="token comment"># Anthropic</span>
<span class="token keyword">from</span> agno<span class="token punctuation">.</span>models<span class="token punctuation">.</span>anthropic <span class="token keyword">import</span> Claude
model <span class="token operator">=</span> Claude<span class="token punctuation">(</span><span class="token builtin">id</span><span class="token operator">=</span><span class="token string">"claude-sonnet-4-5"</span><span class="token punctuation">)</span>
</code></pre>
<p>One line change. Everything else stays the same.</p>
<hr>
<p>If you enjoyed reading this, <a target="_blank" rel="noopener noreferrer" class="" href="https://github.com/agno-agi/agno">star Agno on GitHub</a>. It helps more than you'd think. Questions or feedback? Reach out on <a target="_blank" rel="noopener noreferrer" class="" href="https://x.com/ashpreetbedi">X</a>.</p>]]></content:encoded>
            <author>hi@ashpreetbedi.com (Ashpreet Bedi)</author>
        </item>
        <item>
            <title><![CDATA[Build Your Own Multi-Agent System]]></title>
            <link>https://ashpreetbedi.com/articles/multi-agent-system-railway</link>
            <guid isPermaLink="false">https://ashpreetbedi.com/articles/multi-agent-system-railway</guid>
            <pubDate>Thu, 29 Jan 2026 00:00:00 GMT</pubDate>
            <content:encoded><![CDATA[<p>Instead of a hello world tutorial, let me show you how to build a live multi-agent system. We'll run it locally on Docker and deploy to production on <a target="_blank" rel="noopener noreferrer" class="" href="https://railway.com">Railway</a>.</p>
<p>This is a production-grade system that includes:</p>
<table><thead><tr><th>Feature</th><th>Description</th></tr></thead><tbody><tr><td><strong>Learning</strong></td><td>Agents remember and improve over time</td></tr><tr><td><strong>Persistence</strong></td><td>State, sessions, and memory backed by PostgreSQL</td></tr><tr><td><strong>Agentic RAG</strong></td><td>Knowledge retrieval that knows when and how to search</td></tr><tr><td><strong>MCP Tools</strong></td><td>Connect to external services via Model Context Protocol</td></tr><tr><td><strong>Monitoring</strong></td><td>Full visibility via the AgentOS control plane</td></tr></tbody></table>
<p>You'll also learn how to extend it with your own agents.</p>
<p>5 minute read. Running locally in 5. Deployed to production in 20.</p>
<h2>The Agents</h2>
<p>We'll build three agents, each demonstrating a different pattern:</p>
<ul>
<li><strong>Pal</strong> - AI-powered second brain. Captures notes, bookmarks, people, meetings. Researches the web. Learns over time.</li>
<li><strong>Knowledge Agent</strong> - Answers questions from a knowledge base.</li>
<li><strong>MCP Agent</strong> - Connects to external services via MCP.</li>
</ul>
<p>Each agent can be extended to fit your needs.</p>
<h2>Run Locally (5 minutes)</h2>
<h3>Prerequisites</h3>
<ul>
<li>Install <a target="_blank" rel="noopener noreferrer" class="" href="https://www.docker.com/products/docker-desktop">Docker Desktop</a></li>
<li>Get an <a target="_blank" rel="noopener noreferrer" class="" href="https://platform.openai.com/api-keys">OpenAI API key</a></li>
</ul>
<h3>Setup</h3>
<p>Clone the repo and export your OpenAI API key:</p>
<pre class="language-bash"><code class="language-bash"><span class="token function">git</span> clone <span class="token punctuation">\</span>
    https://github.com/agno-agi/agentos-railway-template.git <span class="token punctuation">\</span>
    agentos-railway

<span class="token builtin class-name">cd</span> agentos-railway

<span class="token builtin class-name">export</span> <span class="token assign-left variable">OPENAI_API_KEY</span><span class="token operator">=</span><span class="token string">"sk-***"</span>
</code></pre>
<p>Start the application (API + Database):</p>
<pre class="language-bash"><code class="language-bash"><span class="token function">docker</span> compose up -d --build
</code></pre>
<p>That's it. Your system is running. Here's how it looks:</p>
<video width="700" height="700" class="rounded-2xl" loop="" autoplay="" muted="" playsinline="" controls=""><source src="/videos/agentos-local-setup.mp4">Your browser does not support the video tag.</video>
<h3>Connect to the UI</h3>
<ol>
<li>Open <a target="_blank" rel="noopener noreferrer" class="" href="https://os.agno.com">os.agno.com</a></li>
<li>Click <strong>Add OS</strong> → <strong>Local</strong></li>
<li>Enter <code>http://localhost:8000</code> as the URL</li>
</ol>
<p>Now chat with Pal:</p>
<pre class="language-shell"><code class="language-shell"><span class="token operator">&gt;</span> Note: decided to use Postgres <span class="token keyword">for</span> the new project - better JSON support

<span class="token operator">&gt;</span> Research event sourcing patterns and save the key findings

<span class="token operator">&gt;</span> What <span class="token keyword">do</span> I know about event sourcing?
</code></pre>
<h2>Deploy to Production (10 minutes)</h2>
<p>I've made it easy to deploy to Railway - just login and run a script.</p>
<h3>Prerequisites</h3>
<ul>
<li>Install the <a target="_blank" rel="noopener noreferrer" class="" href="https://docs.railway.com/guides/cli">Railway CLI</a></li>
</ul>
<h3>Deploy</h3>
<p>Login to Railway and run the deploy script:</p>
<pre class="language-bash"><code class="language-bash">railway login

./scripts/railway_up.sh
</code></pre>
<p>The script provisions PostgreSQL, configures environment variables, and deploys your system. Give it a few minutes for the services to spin up.</p>
<h3>Connect to the UI</h3>
<ol>
<li>Open <a target="_blank" rel="noopener noreferrer" class="" href="https://os.agno.com">os.agno.com</a></li>
<li>Click <strong>Add OS</strong> → <strong>Live</strong></li>
<li>Enter your Railway domain</li>
</ol>
<p>You now have a production multi-agent system. Watch it go live in ~2 mins:</p>
<video width="700" height="700" class="rounded-2xl" loop="" autoplay="" muted="" playsinline="" controls=""><source src="/videos/agentos-railway-deploy.mp4">Your browser does not support the video tag.</video>
<h2>What's Included</h2>
<h3>Pal (Personal Agent that Learns)</h3>
<p>Your AI-powered second brain. Captures notes, bookmarks, people, meetings. Researches the web and saves findings. Learns from errors so it doesn't repeat them.</p>
<p>I wrote more about Pal here: <a target="_blank" rel="noopener noreferrer" class="" href="https://x.com/ashpreetbedi/status/2016702682925334818">Building Pal: Personal Agent that Learns</a></p>
<h3>Knowledge Agent (Agentic RAG)</h3>
<p>Store any type of docs in a vector store, chat with it using Agentic RAG.</p>
<pre class="language-python"><code class="language-python">knowledge_agent <span class="token operator">=</span> Agent<span class="token punctuation">(</span>
    model<span class="token operator">=</span>OpenAIResponses<span class="token punctuation">(</span><span class="token builtin">id</span><span class="token operator">=</span><span class="token string">"gpt-5.2"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    knowledge<span class="token operator">=</span>knowledge<span class="token punctuation">,</span>
    search_knowledge<span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">,</span>
<span class="token punctuation">)</span>
</code></pre>
<h3>MCP Agent (MCP Tools)</h3>
<p>Connects to external tools via the Model Context Protocol. Point it at any MCP server and it gets access to those tools.</p>
<pre class="language-python"><code class="language-python">mcp_agent <span class="token operator">=</span> Agent<span class="token punctuation">(</span>
    model<span class="token operator">=</span>OpenAIResponses<span class="token punctuation">(</span><span class="token builtin">id</span><span class="token operator">=</span><span class="token string">"gpt-5.2"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    tools<span class="token operator">=</span><span class="token punctuation">[</span>MCPTools<span class="token punctuation">(</span>url<span class="token operator">=</span><span class="token string">"https://docs.agno.com/mcp"</span><span class="token punctuation">)</span><span class="token punctuation">]</span><span class="token punctuation">,</span>
<span class="token punctuation">)</span>
</code></pre>
<h2>Create Your Own Agent</h2>
<p>Now let's add a custom agent to the system. We'll build a research agent that uses the <a target="_blank" rel="noopener noreferrer" class="" href="https://exa.ai">Exa</a> MCP server.</p>
<p>Create <code>agents/research_agent.py</code>:</p>
<pre class="language-python"><code class="language-python"><span class="token keyword">from</span> agno<span class="token punctuation">.</span>agent <span class="token keyword">import</span> Agent
<span class="token keyword">from</span> agno<span class="token punctuation">.</span>models<span class="token punctuation">.</span>openai <span class="token keyword">import</span> OpenAIResponses
<span class="token keyword">from</span> agno<span class="token punctuation">.</span>tools<span class="token punctuation">.</span>mcp <span class="token keyword">import</span> MCPTools

<span class="token keyword">from</span> db <span class="token keyword">import</span> get_postgres_db

<span class="token comment"># Exa MCP for research</span>
EXA_MCP_URL <span class="token operator">=</span> <span class="token punctuation">(</span>
    <span class="token string-interpolation"><span class="token string">f"https://mcp.exa.ai/mcp?tools="</span></span>
    <span class="token string">"web_search_exa,company_research_exa,people_search_exa"</span>
<span class="token punctuation">)</span>

research_agent <span class="token operator">=</span> Agent<span class="token punctuation">(</span>
    <span class="token builtin">id</span><span class="token operator">=</span><span class="token string">"research-agent"</span><span class="token punctuation">,</span>
    name<span class="token operator">=</span><span class="token string">"Research Agent"</span><span class="token punctuation">,</span>
    model<span class="token operator">=</span>OpenAIResponses<span class="token punctuation">(</span><span class="token builtin">id</span><span class="token operator">=</span><span class="token string">"gpt-5.2"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    db<span class="token operator">=</span>get_postgres_db<span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    tools<span class="token operator">=</span><span class="token punctuation">[</span>MCPTools<span class="token punctuation">(</span>url<span class="token operator">=</span>EXA_MCP_URL<span class="token punctuation">)</span><span class="token punctuation">]</span><span class="token punctuation">,</span>
    instructions<span class="token operator">=</span><span class="token triple-quoted-string string">"""\
You are a research agent. You help users find information about:
- Companies and startups
- People and their backgrounds
- Topics and trends

Be thorough but concise. Cite your sources.
"""</span><span class="token punctuation">,</span>
<span class="token punctuation">)</span>
</code></pre>
<p>Register it in <code>app/main.py</code>:</p>
<pre class="language-python"><code class="language-python"><span class="token keyword">from</span> agents<span class="token punctuation">.</span>research_agent <span class="token keyword">import</span> research_agent

agent_os <span class="token operator">=</span> AgentOS<span class="token punctuation">(</span>
    agents<span class="token operator">=</span><span class="token punctuation">[</span>pal<span class="token punctuation">,</span> knowledge_agent<span class="token punctuation">,</span> mcp_agent<span class="token punctuation">,</span> research_agent<span class="token punctuation">]</span><span class="token punctuation">,</span>
<span class="token punctuation">)</span>
</code></pre>
<p>Your agent is now part of the system. Chat with it:</p>
<video width="700" height="700" class="rounded-2xl" loop="" autoplay="" muted="" playsinline="" controls=""><source src="/videos/agentos-research-agent.mp4">Your browser does not support the video tag.</video>
<blockquote class="not-prose relative isolate pl-6 text-ink py-3 text-lg"><span aria-hidden="true" class="absolute inset-y-1 left-0 w-0.5 rounded-full"></span><div class="flex gap-3"><div class="space-y-1"><div class="leading-relaxed"><p>If the agent doesn't show up, press refresh on the UI (top right corner) or restart containers with <code>docker compose restart</code>.</p></div></div></div></blockquote>
<h2>Wrapping Up</h2>
<p>You now have a live multi-agent system with:</p>
<table><thead><tr><th>Feature</th><th>Description</th></tr></thead><tbody><tr><td><strong>Learning</strong></td><td>Agents that remember and improve over time</td></tr><tr><td><strong>Persistence</strong></td><td>PostgreSQL for storing agent sessions, state, and memory</td></tr><tr><td><strong>Research</strong></td><td>Web search, company lookup, people search via Exa</td></tr><tr><td><strong>Monitoring</strong></td><td>Full visibility via the AgentOS control plane</td></tr><tr><td><strong>Extensibility</strong></td><td>Add agents, tools, and integrations as needed</td></tr></tbody></table>
<h2>What's Next</h2>
<ul>
<li><strong>Build more agents</strong> - Add specialized <a target="_blank" rel="noopener noreferrer" class="" href="https://docs.agno.com/agents">agents</a> for your use case</li>
<li><strong>Add tools</strong> - Extend your agents with <a target="_blank" rel="noopener noreferrer" class="" href="https://docs.agno.com/tools/toolkits">100+ toolkits</a></li>
<li><strong>Go multi-agent</strong> - Create multi-agent <a target="_blank" rel="noopener noreferrer" class="" href="https://docs.agno.com/teams">teams</a> and <a target="_blank" rel="noopener noreferrer" class="" href="https://docs.agno.com/workflows">workflows</a></li>
<li><strong>Go multi-channel</strong> - Expose your agents via Slack, Discord, WhatsApp</li>
<li><strong>Build an AI product</strong> - From 2-person startups to Fortune 500 companies, AgentOS is the foundation for agentic products</li>
</ul>
<blockquote class="not-prose relative isolate pl-6 text-ink py-3 text-lg"><span aria-hidden="true" class="absolute inset-y-1 left-0 w-0.5 rounded-full bg-accent"></span><div class="flex gap-3"><div class="space-y-1"><div class="leading-relaxed"><p>The system is yours. You have a head start - make it count.</p></div></div></div></blockquote>
<hr>
<h2>Learn More</h2>
<ul>
<li>
<a target="_blank" rel="noopener noreferrer" class="" href="https://github.com/agno-agi/agentos-railway-template">GitHub repo</a>
</li>
<li>
<a target="_blank" rel="noopener noreferrer" class="" href="https://docs.agno.com">Agno documentation</a>
</li>
</ul>
<p>Built with <a target="_blank" rel="noopener noreferrer" class="" href="https://github.com/agno-agi/agno">Agno</a>. Give it a ⭐️</p>]]></content:encoded>
            <author>hi@ashpreetbedi.com (Ashpreet Bedi)</author>
        </item>
        <item>
            <title><![CDATA[Scaling Agentic Software]]></title>
            <link>https://ashpreetbedi.com/articles/scaling-agentic-software-part-1</link>
            <guid isPermaLink="false">https://ashpreetbedi.com/articles/scaling-agentic-software-part-1</guid>
            <pubDate>Thu, 16 Apr 2026 00:00:00 GMT</pubDate>
            <content:encoded><![CDATA[<span class="text-xl font-semibold"><p><strong>What is the simplest architecture for running a multi-agent system at scale?</strong></p></span>
<p>I want to deploy agents as a real service. Multi-user, RBAC, JWT-based auth. Sessions, memory, and knowledge backed by a database. Horizontally Scalable. Able to serve thousands of concurrent requests. The kind of product you'd actually ship to users.</p>
<p>Could the answer be: <strong>a FastAPI app and a Postgres database?</strong></p>
<p>So I spent some time building one to find out. 14 agents, 11 multi-agent teams, 5 workflows. Hundreds of tools, approvals, evals, schedules. All running in a single FastAPI process against a single PostgreSQL database. It's open source: <a target="_blank" rel="noopener noreferrer" class="" href="https://github.com/agno-agi/demo-os">Demo AgentOS</a>.</p>
<p>I'll walk through the architecture in this post. In the next one we'll dive into what breaks when you push it.</p>
<h2>The Bar</h2>
<p>"Scale" gets thrown around quite a bit. In this case, scale means breadth. The surface area of a real product. Every concern a CTO would actually need to address before shipping a product to users:</p>
<p><strong>Multi-user and multi-tenancy.</strong> Every user gets their own sessions, memory, and context. The system isolates every resource an agent touches, across every user, on every request.</p>
<p>Note: Context bleeding is a data breach, not a bug.</p>
<p><strong>Auth and RBAC.</strong> JWT verification, role-based access control, scoped permissions. This applies to the API layer, the agents, the tools they call, and the data they can access. Dev and production should have different security postures.</p>
<p><strong>Real persistence.</strong> Sessions, memory, and knowledge stored in a database, with regular backups and data access policies. Everything needs to comply with user-data protection laws like GDPR and CCPA.</p>
<p><strong>Serving requests at scale.</strong> The system should be able to handle thousands of concurrent requests. Streaming responses should be held open. Background work (memory extraction, summarization, learning) running alongside the primary model call. All of it competing for the same HTTP transports, connection pools, and database connections. The hard part is not serving one request. It is serving the thousandth one without stalling the ninth one.</p>
<p><strong>Observability.</strong> Tracing every agent run, every tool call, every delegation in a multi-agent team. When something goes wrong at step 7 of a 12-step workflow, you need to see exactly what happened and why.</p>
<p><strong>Governance.</strong> Layered authority over what agents can do. Some tools run freely. Some need user approval. Some need admin sign-off. Approval flows, audit trails, and the ability to pause execution mid-run.</p>
<p><strong>Reliability and evals.</strong> Agents are testable software. You need smoke tests, tool call validation, LLM-judged accuracy, performance baselines. Without evals, every change is a guess.</p>
<p>If this is the bar, the question is: what's the simplest architecture that clears it?</p>
<h2>The Architecture</h2>
<p>One FastAPI process. One Postgres database. That's it.</p>
<p>The FastAPI app serves 14 agents, 11 multi-agent teams, 5 workflows using REST endpoints. Every request is a POST, every response is a server-sent event stream.</p>
<p>The database does more than you'd think. The Postgres database stores agent sessions, user memory, knowledge contents, learnings, schedules, and eval results. Pgvector handles embeddings for knowledge bases.</p>
<h2>The Components</h2>
<p>The 30+ components in the AgentOS showcase different agentic patterns.</p>
<img alt="Demo AgentOS" loading="lazy" width="700" height="700" decoding="async" data-nimg="1" class="rounded-2xl" style="color:transparent" srcset="/_next/image?url=%2Fimages%2Fdemo-agentos-ui.png&amp;w=750&amp;q=75 1x, /_next/image?url=%2Fimages%2Fdemo-agentos-ui.png&amp;w=1920&amp;q=75 2x" src="/_next/image?url=%2Fimages%2Fdemo-agentos-ui.png&amp;w=1920&amp;q=75">
<p>Some showcase <strong>HITL patterns</strong>. The Helpdesk agent wraps three tools: one that requires operator confirmation before restarting a service, one that pauses for user input on ticket priority, one that executes outside the agent runtime. The Approvals agent uses Agno's <code>@approval</code> decorator for blocking approval gates and audit-trailed operations. Both agents pause execution mid-run and resume on approval.</p>
<p>Some showcase <strong>guardrails</strong>. The Helpdesk agent has three pre-hooks: OpenAI moderation, PII detection, prompt injection detection. It also has a post-hook that scans responses for secret patterns (API keys, connection strings, SSNs) and rewrites them before they leave the process. An audit log hook records every run for compliance.</p>
<p>Some showcase <strong>multi-agent teams</strong>. Pal is a personal knowledge agent with five specialists. Dash is a data analyst with an Analyst/Engineer split. Coda is a coding agent with five specialists including a Planner and a Triager. The Research and Investment teams each ship in four modes (coordinate, route, broadcast, tasks) so you can see how the same set of members produces different behavior under different coordination patterns.</p>
<p>Some showcase <strong>step-based workflows</strong>. Morning Brief gathers calendar, email, and news in parallel and synthesizes a briefing. AI Research runs four parallel researchers and synthesizes their findings. Content Pipeline does parallel research plus a loop that iterates until an editor approves. Support Triage classifies a ticket, routes it to a specialist, and escalates if severity is high.</p>
<p>Some showcase <strong>state management</strong>. Taskboard demonstrates session state with agentic state updates. Injector demonstrates dependency injection through <code>RunContext</code>. Compressor demonstrates tool result compression with a cheaper model.</p>
<p>Some showcase <strong>scheduling</strong>. Morning Brief runs every weekday at 8am ET. AI Research runs every day at 7am UTC. The Scheduler agent lets users create, list, disable, and delete schedules at runtime through natural language.</p>
<p>The point is not that you need all of these. The point is that a single FastAPI process can host them without the architecture getting complicated.</p>
<h2>Governance as First-Class Infrastructure</h2>
<p>Three layers of governance sit on top of every agent.</p>
<p><strong>Pre-hooks</strong> run before the model sees the input. Moderation, PII detection, injection detection. If any hook raises, the request is rejected before a single token is generated.</p>
<p><strong>Approval gates</strong> pause the run mid-execution. A tool decorated with <code>requires_confirmation=True</code> or <code>@approval</code> streams a <code>RunPaused</code> event to the client with the tool name and arguments. The client shows the user an approve/reject UI. On approval, the run resumes from where it paused. This works because the session state is durable (stored in db).</p>
<p><strong>Post-hooks</strong> run on the output. The Helpdesk agent has an output guardrail that scans responses for secret patterns and rewrites them before they leave. Every run is audit-logged through a separate hook.</p>
<h2>What's Not Here</h2>
<p>No message queue. No worker pool. No separate vector database. No Redis. No microservices. No orchestrator service standing in front of the agents. No separate auth service.</p>
<p>Could you add them? Sure. Are they necessary to clear the bar I defined? Not yet. The point of this exercise is to find out where the simple architecture breaks, so the next decision (what to add) is grounded in actual load, not in speculation.</p>
<h2>What's Next</h2>
<p>Part 2 is what breaks when you scale this.</p>
<p>I'm going to load test it. Thousands of concurrent requests. Streaming responses held open. Background memory extraction competing with primary runs. Connection pools under pressure. I expect to find a few obvious bottlenecks and a couple of surprising ones.</p>
<p>Links:</p>
<ul>
<li>
<a target="_blank" rel="noopener noreferrer" class="" href="https://github.com/agno-agi/demo-os">Demo AgentOS on GitHub</a>
</li>
<li>
<a target="_blank" rel="noopener noreferrer" class="" href="https://github.com/agno-agi/agno">Agno GitHub</a>
</li>
<li>
<a target="_blank" rel="noopener noreferrer" class="" href="https://docs.agno.com/">Agno Docs</a>
</li>
</ul>]]></content:encoded>
            <author>hi@ashpreetbedi.com (Ashpreet Bedi)</author>
        </item>
        <item>
            <title><![CDATA[Self Learning Research Agent That Tracks Consensus Over Time]]></title>
            <link>https://ashpreetbedi.com/articles/self-learning-researcher</link>
            <guid isPermaLink="false">https://ashpreetbedi.com/articles/self-learning-researcher</guid>
            <pubDate>Tue, 16 Dec 2025 00:00:00 GMT</pubDate>
            <content:encoded><![CDATA[<p>In this post, we’ll build a <strong>self-learning research agent</strong> that does something more useful than one-off web searches. It captures the <em>current consensus</em>, compares it to past runs, explains what changed and why, and stores a clean snapshot so future runs get better.</p>
<p>No fine-tuning. No retraining. Just good system design.</p>
<h2>Table of Contents</h2>
<ol>
<li>Why research agents break down in practice</li>
<li>Research is about consensus, not answers</li>
<li>What is "self-learning"</li>
<li>Snapshot-based learning architecture</li>
<li>What we store in the knowledge base (and what we don’t)</li>
<li>End-to-end agent flow</li>
<li>Production Codebase (deployable anywhere)</li>
<li>Steps to run your own Self Learning Research Agent</li>
<li>Why this pattern works</li>
</ol>
<h2>1. Why research agents break down in practice</h2>
<p>Most research agents are <strong>stateless</strong>.</p>
<p>You ask a question today and get a well-written answer. You ask the same question tomorrow and get another well-written answer, but totally disconnected from the first one.</p>
<p>What's missing:</p>
<ul>
<li>No memory of prior conclusions</li>
<li>No notion of what changed</li>
<li>No way to tell if the answer is stabilizing or shifting</li>
</ul>
<blockquote class="not-prose relative isolate pl-6 text-ink py-3 text-lg"><span aria-hidden="true" class="absolute inset-y-1 left-0 w-0.5 rounded-full bg-accent"></span><div class="flex gap-3"><div class="space-y-1"><div class="leading-relaxed"><p>Research without memory is just search with formatting.</p></div></div></div></blockquote>
<p>Humans don't work this way. We remember what we believed before and pay attention when new information contradicts it.</p>
<p>That's the missing layer.</p>
<h2>2. Research is about consensus, not answers</h2>
<p>A single answer is rarely the goal of research.</p>
<p>What we actually care about is:</p>
<ul>
<li>what most credible sources agree on</li>
<li>where there is disagreement</li>
<li>how confident we should be</li>
</ul>
<p>That's why our agent doesn't store prose. It stores <strong>structured consensus</strong>. Consensus is represented as a set of claims that are:</p>
<ul>
<li>short and explicit</li>
<li>backed by sources</li>
<li>labeled with confidence</li>
<li>stable enough to diff over time</li>
</ul>
<p>This structure is what makes comparison possible.</p>
<p>It also lays the foundation for reasoning about sources over time, including which sources tend to be reliable or volatile.</p>
<h2>3. What is "self-learning"</h2>
<p>Self-learning means the agent improves based on its own experience.</p>
<p>In this case, improvement comes from capturing <strong>snapshots of consensus over time</strong> and using those snapshots as context in future runs.</p>
<p>The agent does <strong>not</strong>:</p>
<ul>
<li>retrain models</li>
<li>update weights</li>
<li>fine-tune embeddings</li>
</ul>
<p>Instead, it learns by <strong>capturing experience as data</strong> and reusing it deliberately. This is what I refer to as <em>poor-man’s continuous learning</em>.</p>
<p>The model stays fixed. The system improves by accumulating validated snapshots of understanding.</p>
<h2>4. Snapshot-based learning architecture</h2>
<p>The system is built around a simple idea: <strong>append-only snapshots</strong>.</p>
<p>Each snapshot represents:</p>
<ul>
<li>the question that was asked</li>
<li>the internet's consensus at that moment</li>
<li>the claims that define that consensus</li>
<li>the sources used to support it</li>
<li>a short report summary for semantic retrieval</li>
</ul>
<p>Snapshots are never mutated. We only add new ones and compare.</p>
<p>Each stored snapshot includes:</p>
<ul>
<li><code>question</code></li>
<li><code>created_at</code></li>
<li><code>report_summary</code> (short, human-readable)</li>
<li><code>consensus_summary</code> (1–2 sentences)</li>
<li><code>claims</code> (structured and diffable)</li>
<li><code>sources</code></li>
<li>optional <code>notes</code></li>
</ul>
<p>This keeps the knowledge base compact, searchable, and stable over time.</p>
<h2>5. What we store in the knowledge base (and what we don’t)</h2>
<p>The biggest mistake we can make is storing too much.</p>
<p>We deliberately <strong>do not store</strong>:</p>
<ul>
<li>full markdown reports</li>
<li>raw scraped content</li>
<li>long explanations</li>
</ul>
<p>We <strong>do store</strong>:</p>
<ul>
<li>concise summaries</li>
<li>structured claims</li>
<li>deduplicated source lists</li>
</ul>
<p>Each claim looks like:</p>
<ul>
<li><code>claim_id</code> (stable slug)</li>
<li><code>claim</code> (short statement)</li>
<li><code>confidence</code> (Low | Medium | High)</li>
<li><code>source_urls</code></li>
</ul>
<blockquote class="not-prose relative isolate pl-6 text-ink py-3 text-lg"><span aria-hidden="true" class="absolute inset-y-1 left-0 w-0.5 rounded-full bg-accent"></span><div class="flex gap-3"><div class="space-y-1"><div class="leading-relaxed"><p>If you can't diff it, you shouldn't store it.</p></div></div></div></blockquote>
<p>This keeps retrieval high-signal and comparisons reliable.</p>
<h2>6. End-to-end agent flow</h2>
<p>Here's what happens on every run:</p>
<ol>
<li>
<p><strong>Parallel research</strong>
The agent uses parallel search tools to gather information across multiple source types.</p>
</li>
<li>
<p><strong>Consensus extraction</strong>
Findings are synthesized into 4–10 structured claims with confidence and citations.</p>
</li>
<li>
<p><strong>Snapshot retrieval</strong>
The agent searches the knowledge base for the most recent snapshot of a similar question.</p>
</li>
<li>
<p><strong>Diff</strong>
Current claims are compared to the previous snapshot:</p>
<ul>
<li>new or strengthened claims</li>
<li>weakened or disputed claims</li>
<li>removed claims</li>
</ul>
<p>Each change includes a brief explanation and supporting sources.</p>
</li>
<li>
<p><strong>Human-in-the-loop save</strong>
The agent asks whether to save the new snapshot. Only explicit approval persists it.</p>
</li>
</ol>
<p>This keeps learning controlled, auditable, and intentional.</p>
<h2>7. Production Codebase (deployable anywhere)</h2>
<p>I'm providing a production codebase for running our self-learning research agent, built using:</p>
<ul>
<li>A FastAPI application for running our agents.</li>
<li>A Postgres database for storing sessions, memory and knowledge.</li>
</ul>
<p>Here's the link to the <a target="_blank" rel="noopener noreferrer" class="" href="https://github.com/agno-agi/agentos-railway">repository</a> containing the production codebase.</p>
<p>Here's the structure of the repository:</p>
<pre class="language-bash"><code class="language-bash"><span class="token builtin class-name">.</span>
├── agents
│&nbsp;&nbsp; ├── self_learning_research_agent.py
│&nbsp;&nbsp; └── <span class="token punctuation">..</span>. <span class="token function">more</span> agents
├── app
│&nbsp;&nbsp; └── main.py
├── compose.yaml
├── db
├── Dockerfile
├── pyproject.toml
├── railway.json
├── README.md
├── teams
│&nbsp;&nbsp; └── finance_team.py
└── workflows
    └── research_workflow.py
</code></pre>
<h2>8. Steps to run your own Self Learning Research Agent</h2>
<h3>Clone the repo</h3>
<pre class="language-shell"><code class="language-shell"><span class="token function">git</span> clone https://github.com/agno-agi/agentos-railway.git
<span class="token builtin class-name">cd</span> agentos-railway
</code></pre>
<h3>Configure API keys</h3>
<p>We'll use OpenAI for the agent and Parallel Search for search tools. Please export the following environment variables:</p>
<pre class="language-shell"><code class="language-shell"><span class="token builtin class-name">export</span> <span class="token assign-left variable">OPENAI_API_KEY</span><span class="token operator">=</span><span class="token string">"YOUR_API_KEY_HERE"</span>
<span class="token builtin class-name">export</span> <span class="token assign-left variable">PARALLEL_API_KEY</span><span class="token operator">=</span><span class="token string">"YOUR_API_KEY_HERE"</span>
</code></pre>
<blockquote class="not-prose relative isolate pl-6 text-ink py-3 text-lg"><span aria-hidden="true" class="absolute inset-y-1 left-0 w-0.5 rounded-full bg-sky-500 dark:bg-sky-400"></span><div class="flex gap-3"><div class="space-y-1"><div class="leading-relaxed"><p>You can copy the <code>example.env</code> file and rename it to <code>.env</code> to get started.</p></div></div></div></blockquote>
<h3>Install Docker</h3>
<p>We'll use docker to run the application locally and deploy it to Railway. Please install <a target="_blank" rel="noopener noreferrer" class="" href="https://www.docker.com/products/docker-desktop">Docker Desktop</a> if needed.</p>
<h3>Run the application locally</h3>
<p>Run the application using docker compose:</p>
<pre class="language-shell"><code class="language-shell"><span class="token function">docker</span> compose up --build -d
</code></pre>
<p>This command builds the Docker image and starts the application:</p>
<ul>
<li>The <strong>FastAPI application</strong>, running on <a target="_blank" rel="noopener noreferrer" class="" href="http://localhost:8000">localhost:8000</a>.</li>
<li>The <strong>PostgreSQL database</strong> for storing agent sessions, knowledge, and memories, accessible on <code>localhost:5432</code>.</li>
</ul>
<p>Once started, you can:</p>
<ul>
<li>View the FastAPI application at <a target="_blank" rel="noopener noreferrer" class="" href="http://localhost:8000/docs">localhost:8000/docs</a>.</li>
</ul>
<h3>Connect the AgentOS UI to the FastAPI application</h3>
<ul>
<li>Open the <a target="_blank" rel="noopener noreferrer" class="" href="https://os.agno.com/">AgentOS UI</a></li>
<li>Login and add <code>http://localhost:8000</code> as a new AgentOS. You can call it <code>Local AgentOS</code> (or any name you prefer).</li>
</ul>
<h3>Demo</h3>
<p>Here's a demo of the Self Learning Research Agent in action.</p>
<video width="700" height="700" class="rounded-2xl" loop="" autoplay="" muted="" playsinline="" controls=""><source src="/videos/self-learning-research-agent.mp4">Your browser does not support the video tag.</video>
<h3>Stop the application</h3>
<p>When you're done, stop the application using:</p>
<pre class="language-shell"><code class="language-shell"><span class="token function">docker</span> compose down
</code></pre>
<h3>Deploy the application to Railway</h3>
<p>To deploy the application to Railway, run the following commands:</p>
<ol>
<li>Install Railway CLI:</li>
</ol>
<pre class="language-shell"><code class="language-shell">brew <span class="token function">install</span> railway
</code></pre>
<ol start="2">
<li>Login to Railway:</li>
</ol>
<pre class="language-shell"><code class="language-shell">railway login
</code></pre>
<ol start="3">
<li>Deploy the application:</li>
</ol>
<pre class="language-shell"><code class="language-shell">./scripts/railway_up.sh
</code></pre>
<p>This command will:</p>
<ul>
<li>Create a new Railway project.</li>
<li>Deploy a PgVector database service to your Railway project.</li>
<li>Build and deploy the docker image to your Railway project.</li>
<li>Set environment variables in your AgentOS service.</li>
<li>Create a new domain for your AgentOS service.</li>
</ul>
<h2>9. Why this pattern works</h2>
<p>This approach generalizes far beyond traditional research, you can use it for:</p>
<ul>
<li>market analysis</li>
<li>policy tracking</li>
<li>competitive intelligence</li>
<li>technical standards</li>
<li>internal decision logs</li>
</ul>
<p>Anywhere beliefs evolve, <strong>snapshots beat stateless answers</strong>. By separating:</p>
<ul>
<li>online reasoning</li>
<li>from offline learning</li>
<li>and storing only what matters</li>
</ul>
<p>we get agents that feel more trustworthy, more explainable, and more useful over time.</p>
<hr>
<p>Thank you for reading! I hope you found this useful. Feel free to reach out to me on <a target="_blank" rel="noopener noreferrer" class="" href="https://x.com/ashpreetbedi">X</a> if you have any questions or feedback</p>]]></content:encoded>
            <author>hi@ashpreetbedi.com (Ashpreet Bedi)</author>
        </item>
        <item>
            <title><![CDATA[Self Improving Text2Sql Agent with Dynamic Context and Continuous Learning]]></title>
            <link>https://ashpreetbedi.com/articles/sql-agent</link>
            <guid isPermaLink="false">https://ashpreetbedi.com/articles/sql-agent</guid>
            <pubDate>Mon, 15 Dec 2025 00:00:00 GMT</pubDate>
            <content:encoded><![CDATA[<p>This post shows how to build a self-improving Text-to-SQL agent using dynamic context and "poor-man's continuous learning". We'll break the problem into two parts:</p>
<ul>
<li><strong>Text-to-SQL Agent (Online Path):</strong> answers questions by retrieving schema + query patterns from a knowledge base (dynamic context).</li>
<li><strong>Continuous Learning (Offline Path):</strong> learns from successful runs and adds new entries to the knowledge base.</li>
</ul>
<p>When the Agent finds a successful result, it stores it in its knowledge base for future use. This gives the text-to-sql agent a self-improving feedback loop, but keeps the online path stable.</p>
<h2>Table of Contents</h2>
<ol>
<li>Why Text-to-SQL fails in practice</li>
<li>What is "dynamic context"</li>
<li>What is "poor man's continuous learning" (and why it works)</li>
<li>Unified Agent Architecture</li>
<li>Knowledge Base Design (keep it structured)</li>
<li>Production Harness (deployable anywhere)</li>
<li>Steps to run your own Text-to-SQL Agent</li>
</ol>
<h2>1. Why Text-to-SQL fails in practice</h2>
<p>Most Test-to-SQL agents fail in practice because they start from scratch every time, describing tables, columns, finding join keys. Repeating every mistake, every time.</p>
<p>Now compare this with how senior analysts or data engineers operate: do they start from scratch every time? No, they use tribal knowledge and experience and dig through past queries to find the right one. Once they find a useful query, they capture it in their knowledge base for future reference. Our text-to-sql agent works the same way.</p>
<p>I've found that most Text-to-SQL failures are not "model is dumb", they're "model is missing context and tribal knowledge" issues. Let's break down the common mistakes:</p>
<ul>
<li>The model starts from scratch every time, describing tables, columns, finding join keys. Repeating every mistake, every time.</li>
<li>The model guesses column names, usage patterns, or doesn't know the right join keys.</li>
<li>The model misses domain definitions (active user, churn, ARR, etc.) or doesn't know the right business rules (eg: "status lives in orders.state, not orders.status").</li>
<li>The model is missing common gotchas (date in the wrong format, nulls in the wrong place, etc.).</li>
<li>The model re-invents queries that already exist in your organization's knowledge base.</li>
</ul>
<p><strong>The biggest improvement you can make to your text-to-sql agent is to provide it with the same tribal knowledge that human engineers have. This enables them to re-use queries that we know work and let the model search established usage patterns at runtime.</strong> Call it RAG, Agentic RAG, or Dynamic Context, it's the same thing: the model, at runtime, has access to the right context to generate the right SQL.</p>
<p>Our goal is straightforward:</p>
<ol>
<li>Give our agent the tools to retrieve the <em>right</em> context at runtime (schemas, joins, past queries, metric definitions, gotchas).</li>
<li>Generate SQL grounded in well established usage patterns (no guessing and no re-inventing the wheel).</li>
<li>Validate the SQL (query is parseable, schema checks, etc.).</li>
<li>Run the SQL and "analyze" the results. Don't just give me the data, give me the insights.</li>
<li>Capture learnings so the next run is better (new join path, corrected column mapping, query template, metric definition).</li>
<li>Repeat.</li>
</ol>
<h2>2. What is "dynamic context"</h2>
<p>Dynamic context is simply: <strong>the agent retrieves the relevant knowledge at query time, which enables it to generate SQL grounded in well established usage patterns</strong>. The context is dynamic because it changes based on the query, the data, and the user's intent.</p>
<p>Examples of what the agent can retrieve:</p>
<ul>
<li>Table schemas and relationships</li>
<li>Common join keys and relationships</li>
<li>Known queries for common use cases</li>
<li>Metric definitions and business rules</li>
<li>Known gotchas ("status lives in orders.state, not orders.status")</li>
</ul>
<blockquote class="not-prose relative isolate pl-6 text-ink py-3 text-lg"><span aria-hidden="true" class="absolute inset-y-1 left-0 w-0.5 rounded-full bg-accent"></span><div class="flex gap-3"><div class="space-y-1"><div class="leading-relaxed"><p>If your KB contains a query for "weekly active users", your agent should retrieve it, not re-invent it.</p></div></div></div></blockquote>
<h2>3. What is "poor man's continuous learning" (and why it works)</h2>
<p>By "poor man's continuous learning", I mean:</p>
<ul>
<li>We do <strong>not</strong> update model weights.</li>
<li>We do <strong>update retrieval knowledge</strong> when we find a successful result.</li>
<li>The system improves by capturing experience as reusable artifacts.</li>
</ul>
<blockquote>
<p>Every good query becomes future context.
Every mistake becomes a rule.
Every clarification becomes shared knowledge.</p>
</blockquote>
<p>Poor man's continuous learning works because it provides a pragmatic learning loop: stable online behavior, controlled improvements. The best part is that you can always explore the knowledge base manually and fix issues or mistakes, imaging updating model weights by hand.</p>
<h2>4. Unified Agent Architecture</h2>
<p>The systems is broken into 2 parts:</p>
<ol>
<li><strong>Text-to-SQL Agent:</strong> answers questions by retrieving schema + query patterns from a knowledge base (dynamic context).</li>
<li><strong>Continuous Learning:</strong> learns from successful runs and adds new entries to the knowledge base.</li>
</ol>
<h3>Query Flow</h3>
<ol>
<li><strong>User asks a question</strong></li>
<li>Agent <strong>retrieves context</strong> from KB (hybrid search) using:<!-- -->
<ul>
<li>question text</li>
<li>detected entities (tables, columns, metrics)</li>
<li>optional database introspection results</li>
</ul>
</li>
<li>This knowledge <strong>augments the input</strong> with dynamic context:<!-- -->
<ul>
<li>retrieved knowledge snippets</li>
<li>rules and constraints (read-only, limit, etc.)</li>
</ul>
</li>
<li>This knowledge <strong>guides the generation of SQL</strong>.</li>
<li>Agent <strong>executes the query</strong> in a safe environment.</li>
<li>Agent analyzes the results and <strong>returns the answer</strong>.</li>
<li>If the result is successful, the agent asks the user if they want to save the query to the knowledge base.</li>
<li>If the user agrees, the agent stores the query in the knowledge base.</li>
<li>If the user disagrees, the agent revists the query, update it and try again.</li>
</ol>
<p>There are 2 improvments you can make to the learning path:</p>
<ol>
<li>Run the continuous learning separately after every run of the text-to-sql agent. This way, the continuous learning is always up to date with the latest queries and results.</li>
<li>Add a regression harness to the continuous learning. This way, you can test the knowledge base before and after updates to ensure it's still working.</li>
</ol>
<h2>5. Knowledge Base Design (keep it structured)</h2>
<p>We want our knowledge base to store 3 kinds of information:</p>
<ol>
<li>Table information: this includes the table schema, column metadata, query rules , common gotchas (eg: date column contains a rule: "Use the <code>TO_DATE</code> function when filtering by date").</li>
<li>Sample queries: this include common query patterns and best practices. Along with how to retrieve common metrics and KPIs. There's no need to re-invent the wheel.</li>
<li>Business semantics and relationships: the layer that maps how your organization talks about data to how the database is structured.</li>
</ol>
<p>The sample codebase I'm providing contains the following files (table information and common queries):</p>
<pre class="language-shell"><code class="language-shell">agents/sql/knowledge/
├── constructors_championship.json
├── drivers_championship.json
├── fastest_laps.json
├── race_results.json
├── race_wins.json
└── common_queries.sql
</code></pre>
<h2>6. Production Harness</h2>
<p>I'm providing a production-ready harness for our system, built using:</p>
<ul>
<li>A FastAPI application for running our agents.</li>
<li>A Postgres database for storing sessions, memory and knowledge.</li>
</ul>
<p>Here's the link to the <a target="_blank" rel="noopener noreferrer" class="" href="https://github.com/agno-agi/agentos-railway">repository</a> containing the production codebase.</p>
<p>Here's the structure of the repository:</p>
<pre class="language-bash"><code class="language-bash"><span class="token builtin class-name">.</span>
├── agents
│&nbsp;&nbsp; ├── __init__.py
│&nbsp;&nbsp; ├── sql
│&nbsp;&nbsp; │&nbsp;&nbsp; ├── __init__.py
│&nbsp;&nbsp; │&nbsp;&nbsp; ├── knowledge
│&nbsp;&nbsp; │&nbsp;&nbsp; ├── load_f1_data.py
│&nbsp;&nbsp; │&nbsp;&nbsp; ├── load_sql_knowledge.py
│&nbsp;&nbsp; │&nbsp;&nbsp; ├── sql_agent.py
│&nbsp;&nbsp; │&nbsp;&nbsp; └── test_questions.txt
│&nbsp;&nbsp; └── <span class="token punctuation">..</span>. <span class="token function">more</span> agents
├── app
│&nbsp;&nbsp; ├── __init__.py
│&nbsp;&nbsp; └── main.py
├── compose.yaml
├── db
│&nbsp;&nbsp; └── <span class="token punctuation">..</span>. database configuration
├── Dockerfile
├── pyproject.toml
├── railway.json
├── README.md
├── requirements.txt
├── scripts
│&nbsp;&nbsp; ├── dev_setup.sh
│&nbsp;&nbsp; ├── entrypoint.sh
│&nbsp;&nbsp; ├── railway_up.sh
│&nbsp;&nbsp; ├── format.sh
│&nbsp;&nbsp; └── validate.sh
├── teams
│&nbsp;&nbsp; └── finance_team.py
└── workflows
    └── research_workflow.py
</code></pre>
<h2>7. Steps to run your own Text-to-SQL Agent</h2>
<h3>Clone the repo</h3>
<pre class="language-shell"><code class="language-shell"><span class="token function">git</span> clone https://github.com/agno-agi/agentos-railway.git
<span class="token builtin class-name">cd</span> agentos-railway
</code></pre>
<h3>Configure API keys</h3>
<p>We'll use OpenAI for the text-to-sql agent, (we also use Anthropic and Parallel Search for other agents in the service). Please export the following environment variables:</p>
<pre class="language-shell"><code class="language-shell"><span class="token comment"># Required</span>
<span class="token builtin class-name">export</span> <span class="token assign-left variable">OPENAI_API_KEY</span><span class="token operator">=</span><span class="token string">"YOUR_API_KEY_HERE"</span>

<span class="token comment"># Optional</span>
<span class="token builtin class-name">export</span> <span class="token assign-left variable">ANTHROPIC_API_KEY</span><span class="token operator">=</span><span class="token string">"YOUR_API_KEY_HERE"</span>
<span class="token builtin class-name">export</span> <span class="token assign-left variable">PARALLEL_API_KEY</span><span class="token operator">=</span><span class="token string">"YOUR_API_KEY_HERE"</span>
</code></pre>
<blockquote class="not-prose relative isolate pl-6 text-ink py-3 text-lg"><span aria-hidden="true" class="absolute inset-y-1 left-0 w-0.5 rounded-full bg-sky-500 dark:bg-sky-400"></span><div class="flex gap-3"><div class="space-y-1"><div class="leading-relaxed"><p>You can copy the <code>example.env</code> file and rename it to <code>.env</code> to get started.</p></div></div></div></blockquote>
<h3>Install Docker</h3>
<p>We'll use docker to run the application locally and deploy it to Railway. Please install <a target="_blank" rel="noopener noreferrer" class="" href="https://www.docker.com/products/docker-desktop">Docker Desktop</a> if needed.</p>
<h3>Run the application locally</h3>
<p>Run the application using docker compose:</p>
<pre class="language-shell"><code class="language-shell"><span class="token function">docker</span> compose up --build -d
</code></pre>
<p>This command builds the Docker image and starts the application:</p>
<ul>
<li>The <strong>FastAPI application</strong>, running on <a target="_blank" rel="noopener noreferrer" class="" href="http://localhost:8000">localhost:8000</a>.</li>
<li>The <strong>PostgreSQL database</strong> for storing agent sessions, knowledge, and memories, accessible on <code>localhost:5432</code>.</li>
</ul>
<p>Once started, you can:</p>
<ul>
<li>View the FastAPI application at <a target="_blank" rel="noopener noreferrer" class="" href="http://localhost:8000/docs">localhost:8000/docs</a>.</li>
</ul>
<h3>Load data for the SQL Agent</h3>
<p>To load the data for the SQL Agent, run:</p>
<pre class="language-shell"><code class="language-shell"><span class="token function">docker</span> <span class="token builtin class-name">exec</span> -it agentos-railway-agent-os-1 python -m agents.sql.load_f1_data
</code></pre>
<p>To populate the knowledge base, run:</p>
<pre class="language-shell"><code class="language-shell"><span class="token function">docker</span> <span class="token builtin class-name">exec</span> -it agentos-railway-agent-os-1 python -m agents.sql.load_sql_knowledge
</code></pre>
<h3>Connect the AgentOS UI to the FastAPI application</h3>
<ul>
<li>Open the <a target="_blank" rel="noopener noreferrer" class="" href="https://os.agno.com/">AgentOS UI</a></li>
<li>Login and add <code>http://localhost:8000</code> as a new AgentOS. You can call it <code>Local AgentOS</code> (or any name you prefer).</li>
</ul>
<h3>Demo</h3>
<p>Here's a demo of the Text-to-SQL Agent in action. Notice how I add a query to the knowledge base and the agent uses it to generate the SQL when i ask the same question again.</p>
<video width="700" height="700" class="rounded-2xl" loop="" autoplay="" muted="" playsinline="" controls=""><source src="/videos/sql-agent-demo.mp4">Your browser does not support the video tag.</video>
<h3>Stop the application</h3>
<p>When you're done, stop the application using:</p>
<pre class="language-shell"><code class="language-shell"><span class="token function">docker</span> compose down
</code></pre>
<h3>Deploy the application to Railway</h3>
<p>To deploy the application to Railway, run the following commands:</p>
<ol>
<li>Install Railway CLI:</li>
</ol>
<pre class="language-shell"><code class="language-shell">brew <span class="token function">install</span> railway
</code></pre>
<ol start="2">
<li>Login to Railway:</li>
</ol>
<pre class="language-shell"><code class="language-shell">railway login
</code></pre>
<ol start="3">
<li>Deploy the application:</li>
</ol>
<pre class="language-shell"><code class="language-shell">./scripts/railway_up.sh
</code></pre>
<p>This command will:</p>
<ul>
<li>Create a new Railway project.</li>
<li>Deploy a PgVector database service to your Railway project.</li>
<li>Build and deploy the docker image to your Railway project.</li>
<li>Set environment variables in your AgentOS service.</li>
<li>Create a new domain for your AgentOS service.</li>
</ul>
<hr>
<p>Thank you for reading! I hope you found this useful. Feel free to reach out to me on <a target="_blank" rel="noopener noreferrer" class="" href="https://x.com/ashpreetbedi">X</a> if you have any questions or feedback</p>]]></content:encoded>
            <author>hi@ashpreetbedi.com (Ashpreet Bedi)</author>
        </item>
        <item>
            <title><![CDATA[Systems Engineering]]></title>
            <link>https://ashpreetbedi.com/articles/systems-engineering</link>
            <guid isPermaLink="false">https://ashpreetbedi.com/articles/systems-engineering</guid>
            <pubDate>Tue, 14 Apr 2026 00:00:00 GMT</pubDate>
            <content:encoded><![CDATA[<span class="text-xl font-semibold"><p><strong>The Key To Building Agentic Software That Works</strong></p></span>
<p>In the early 1940s, Bell Labs was building the national telephone network, the most complex technical system in the world at the time. Millions of switches, cables, relays, and operators had to work together. The engineers discovered something that would become an 80-year-old lesson: you can't optimize a system by optimizing individual components. The behavior of the whole (call routing, reliability, capacity, cost) emerged from how the parts interacted. They needed a discipline focused on the interactions between components.</p>
<p><strong>They called it systems engineering.</strong></p>
<h2>Agentic Software Is a Systems Engineering Problem</h2>
<p>Coding agents have lowered the barrier to writing code, <strong>but they haven't lowered the requirements of production software</strong>.</p>
<p>Software engineering is, and has always been, systems engineering and agentic software is no different. If you're building agentic software, your system needs to bridge five layers:</p>
<p><strong>1. Agent Engineering.</strong> Your agent or multi-agent logic and execution flow. Model, system instructions, tool configurations, handoffs, context management, observability. This is where you define what your agent does, how it runs, and how it responds. Your agent's behavior should be deterministic where possible and observable where it isn't.</p>
<p><strong>2. Data Engineering.</strong> Your agent is only as good as the context it has access to, and context is just data under the hood.</p>
<p>Call it memory, storage, knowledge. Your Agent's data should be managed with data engineering principles. Well designed schemas, structured querying, databases for fast read/writes, object storage for long-term storage, and workflows that keep your knowledge and memory up to date. The patterns are decades old. Use them.</p>
<p><strong>3. Security Engineering.</strong> Auth, RBAC, governance, data isolation, audit trails. Your agent's capabilities are defined by its tools, and those tools should be scoped with JWT-backed permissions. Read-only access IS NOT a prompt instruction, it's a tool configuration.</p>
<p>Actions should have approval tiers: reads run freely, writes need user approval, sensitive operations need admin sign-off. Most actions should be logged and queryable for the life of the product.</p>
<p>And please, isolate requests. One user's context bleeding into another's is a data breach, not a "bug". It has serious consequences and there are laws protecting user data.</p>
<p><strong>4. Interface Engineering.</strong> How users and other agents reach your agent.</p>
<p>REST API, Slack, MCP server, terminal, Chat UI. In the old world, you had one API and one client. Now you have multiple surfaces, each with its own identity system. A Slack user ID is not your product's user ID. An MCP client authenticating as another agent is not a human user. Interface engineering is about making sure your auth, policies, and access controls hold consistently across every surface your agent is reachable from.</p>
<p><strong>5. Infrastructure Engineering.</strong> How you run and scale your software. Containers, cloud deployment, horizontal scaling. Generally called DevOps.</p>
<p>The good news: 95% of this is identical to running any other service. Re-use existing patterns, they'll serve you well. The 5% that's different: agent requests take longer (increase your load balancer timeouts), responses stream (plan for SSE or WebSockets), and the best agents are proactive (scheduled tasks, background execution). None of this is new.</p>
<hr>
<p>The key unlock for AI engineers is realizing that agentic software is just regular software, with the business logic replaced by agents, and interfaces going from request/response to streaming across multiple surfaces.</p>
<p><strong>Systems engineering is the discipline of making these parts work together, and is the key to building agentic software that works.</strong></p>
<p>When you look at your software from a systems perspective, the right decisions become obvious. You give your agent well-scoped tools, not unfettered bash access. You store sessions, memory, and knowledge in a database, not in files, so you can utilize decades of multi-tenant patterns.</p>
<p>When you design one layer in isolation, you inherit constraints that cascade through the rest of the system. When you design from the system's perspective, each layer reinforces the others.</p>
<h2>Systems Engineering in Practice</h2>
<p>I can't make a claim like this and not give you working code.</p>
<p><a target="_blank" rel="noopener noreferrer" class="" href="https://github.com/agno-agi/dash">Dash</a> is an open-source, self-learning data agent. You ask it questions in plain English, it writes SQL, runs it, and tells you what the numbers mean. Simple enough to clone and adapt. Real enough to demonstrate all five layers. Here's how it looks (2x speedup)</p>
<video width="700" height="700" class="rounded-2xl" loop="" autoplay="" muted="" playsinline="" controls=""><source src="/videos/dash-agentos-ui.mp4">Your browser does not support the video tag.</video>
<p>Dash is live in many companies and works incredibly well. The difference is the system behind it. Here's how each layer works.</p>
<h3>Agent Engineering</h3>
<p>Dash is a team of three agents. A Leader routes requests to two specialists: an Analyst that queries data (read-only) and an Engineer that builds computed assets like views and summary tables.</p>
<p>Each specialist gets similar tools, but wired up for different purposes. The Analyst's SQL tools connect to a read-only database engine. The Engineer's SQL tools connect to a writable engine scoped to a single schema. Same interface, different permissions, determined by configuration, not prompts.</p>
<p>Instructions are assembled at runtime from table metadata and business rules stored as structured files.</p>
<h3>Interface Engineering</h3>
<p>One system, multiple surfaces.</p>
<p>Dash serves a REST API, a Slack bot, a web UI, and a CLI. Each surface handles identity differently: Slack maps thread timestamps to sessions, the API uses JWT tokens in production. But all four hit the same agents, same tools, same knowledge. Adding a new interface does not require rebuilding the agent logic.</p>
<p>Your auth and access controls need to hold across every surface, because the agent doesn't know which one it's being called from. Here's dash being used in slack.</p>
<video width="700" height="700" class="rounded-2xl" loop="" autoplay="" muted="" playsinline="" controls=""><source src="/videos/dash-in-slack.mp4">Your browser does not support the video tag.</video>
<h3>Data Engineering</h3>
<p>Six layers of context, and tools for learning.</p>
<p>Raw LLMs writing SQL hit a wall fast: schemas lack meaning, types are misleading, tribal knowledge is missing, there's no way to learn from mistakes. Dash solves this with six layers of grounded context:</p>
<ol>
<li>Table metadata (schema, columns, relationships)</li>
<li>Human annotations (metrics, definitions, business rules)</li>
<li>Query patterns (SQL that is known to work)</li>
<li>Institutional knowledge (docs, wikis)</li>
<li>Learnings (error patterns and discovered fixes)</li>
<li>Runtime context (live schema inspection)</li>
</ol>
<p>These layers feed two systems.</p>
<ul>
<li>The first is curated knowledge: table schemas, validated queries, and business rules loaded into PostgreSQL.</li>
<li>The second is discovered learnings: error patterns and fixes that the agent saves automatically when it hits problems and recalls on future queries.</li>
</ul>
<p>The learning loop is simple: the agent runs a query, gets a type error, diagnoses the fix, saves it. Next time it sees a similar column, it gets it right the first time. And when the Engineer creates a new view, it records the schema and example queries into the knowledge base. The Analyst discovers it on the next search and starts using it.</p>
<p>Query 100 is better than query 1, not because the model improved, but because the data layer got better.</p>
<h3>Security Engineering</h3>
<p>Enforced by the system, not the prompt.</p>
<p>Production auth uses RBAC with JWT verification. Every query is scoped to <code>user_id</code>. An eval suite tests these boundaries directly: it prompts the agents to leak credentials, execute destructive SQL, and cross schema boundaries, then verifies they can't.</p>
<p>Security is a system property tested across layers.</p>
<p>The Analyst's read-only access is a PostgreSQL connection parameter. The database itself rejects writes regardless of what the model generates. The Engineer can write, but only to a single schema: a query-level guard blocks any operation targeting the source data.</p>
<h3>Infrastructure Engineering</h3>
<p>Boring on purpose.</p>
<p>Standard Python container. Docker Compose for local development. One-command cloud deployment. Streaming via SSE through a standard ASGI server. The 95% that's identical to any other service is identical. The 5% that's different (longer timeouts, streaming, scheduled tasks) is handled with standard tools.</p>
<p>You can clone it, run <code>docker compose up</code>, and have the entire system running in minutes. One command, five layers, a working product.</p>
<pre class="language-bash"><code class="language-bash"><span class="token comment"># Clone the repo</span>
<span class="token function">git</span> clone https://github.com/agno-agi/dash.git

<span class="token builtin class-name">cd</span> dash

<span class="token comment"># Set your keys</span>
<span class="token function">cp</span> example.env .env
<span class="token comment"># Edit .env and add your model provider key</span>

<span class="token comment"># Start the system</span>
<span class="token function">docker</span> compose up -d --build
</code></pre>
<h2>TLDR</h2>
<p>Agentic software is just software. The agent replaces business logic. Everything else is systems engineering. Five layers: agent, data, security, interface, infrastructure. Each layer affects the others. Design them together and the system compounds. Design them in isolation and you spend your time patching around constraints that shouldn't exist. We walk through all five with <a target="_blank" rel="noopener noreferrer" class="" href="https://github.com/agno-agi/dash">Dash</a>, a real open-source data agent you can run yourself.</p>
<p>Links:</p>
<ul>
<li>
<a target="_blank" rel="noopener noreferrer" class="" href="https://github.com/agno-agi/dash">Dash on Github</a>
</li>
<li>
<a target="_blank" rel="noopener noreferrer" class="" href="https://docs.agno.com/">Agno Docs</a>
</li>
<li>
<a target="_blank" rel="noopener noreferrer" class="" href="https://github.com/agno-agi/agno">Agno Github</a>
</li>
<li>
<a target="_blank" rel="noopener noreferrer" class="" href="https://docs.agno.com/deploy/introduction">AgentOS Templates</a>
</li>
</ul>]]></content:encoded>
            <author>hi@ashpreetbedi.com (Ashpreet Bedi)</author>
        </item>
        <item>
            <title><![CDATA[WTF are Agents?]]></title>
            <link>https://ashpreetbedi.com/articles/wtf-is-an-agent</link>
            <guid isPermaLink="false">https://ashpreetbedi.com/articles/wtf-is-an-agent</guid>
            <pubDate>Fri, 24 Oct 2025 00:00:00 GMT</pubDate>
            <content:encoded><![CDATA[<h2>Most people overcomplicate Agents.</h2>
<p>Are they workflows? are they graphs? are they LLMs in a loop or just expensive while-loops? Are they deterministic, autonomous, or confused? Some say if you whisper "agent" three times, a VC appears with a term sheet.</p>
<p>How about we cut through the noise and understand what an Agent is by mapping out how they work. Let's demystify it — without the hype.</p>
<h2>What is an Agent?</h2>
<p>Regular programs execute a fixed set of instructions, written as code, in a predetermined order. If you write a program to add two numbers, that's exactly what it will do, every time. It won't add three, or four, or decide to do something else. The outcome is always the same because the logic is hardcoded.</p>
<p>Agents, on the other hand, are <span class="text-teal-400">AI programs where a language model decides the flow of execution</span>. You give it instructions, a set of tools, and the model decides what to do. If you give an Agent tools to add numbers, it can add two, three, or ten. If you also give it tools to subtract, multiply, and divide, it can perform any combination of operations — without you writing that logic explicitly.</p>
<p>If that explanation sounded abstract, that's because it is. Let's make sense of it by walking through what happens when you run an Agent:</p>
<ol>
<li>The Agent first builds the <strong>context</strong> for the model: system messages, user messages, adds chat history, memory, knowledge, state.</li>
<li>It sends that context to the model (the <strong>execution loop begins</strong>).</li>
<li>The model replies with a message, a <strong>tool call</strong>, or both.</li>
<li>If a tool is called, the Agent executes it and returns the results to the model. This is what I think makes a program "agentic".</li>
<li>The loop continues until the model produces a final message.</li>
<li>The Agent returns that response to the caller.</li>
</ol>
<p>That's it, this is an Agent. What'll be different is the context, the tools, and the model's reasoning, but the core remains the same.</p>
<blockquote>
<p>We're moving from deterministic execution to reasoning-based execution — from code that follows instructions to software that decides what to do. Will it do it well? We'll find out.</p>
</blockquote>
<h2>Minimal Example</h2>
<p>Let's build a simple agent to demo how it works, we'll add a few capabilities to make it more interesting:</p>
<ul>
<li>A database to store and maintain conversation history</li>
<li>Tools via MCP that it can call to answer questions</li>
<li>Respond in markdown so it looks pretty</li>
</ul>
<p>We'll also turn it into a FastAPI app so we can deploy it as a service. You can read the full instructions <a target="_blank" rel="noopener noreferrer" class="" href="https://docs.agno.com/introduction/quickstart">here</a>.</p>
<pre class="language-javascript"><code class="language-javascript"><span class="token keyword module">from</span> agno<span class="token punctuation">.</span><span class="token property-access">agent</span> <span class="token keyword module">import</span> <span class="token imports"><span class="token maybe-class-name">Agent</span></span>
<span class="token keyword module">from</span> agno<span class="token punctuation">.</span><span class="token property-access">db</span><span class="token punctuation">.</span><span class="token property-access">sqlite</span> <span class="token keyword module">import</span> <span class="token imports"><span class="token maybe-class-name">SqliteDb</span></span>
<span class="token keyword module">from</span> agno<span class="token punctuation">.</span><span class="token property-access">models</span><span class="token punctuation">.</span><span class="token property-access">anthropic</span> <span class="token keyword module">import</span> <span class="token imports"><span class="token maybe-class-name">Claude</span></span>
<span class="token keyword module">from</span> agno<span class="token punctuation">.</span><span class="token property-access">os</span> <span class="token keyword module">import</span> <span class="token imports"><span class="token maybe-class-name">AgentOS</span></span>
<span class="token keyword module">from</span> agno<span class="token punctuation">.</span><span class="token property-access">tools</span><span class="token punctuation">.</span><span class="token property-access">mcp</span> <span class="token keyword module">import</span> <span class="token maybe-class-name">MCPTools</span>

# <span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">*</span> <span class="token maybe-class-name">Create</span> <span class="token maybe-class-name">Agent</span> <span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">*</span>
agno_agent <span class="token operator">=</span> <span class="token function"><span class="token maybe-class-name">Agent</span></span><span class="token punctuation">(</span>
    name<span class="token operator">=</span><span class="token string">"Agno Agent"</span><span class="token punctuation">,</span>
    model<span class="token operator">=</span><span class="token function"><span class="token maybe-class-name">Claude</span></span><span class="token punctuation">(</span>id<span class="token operator">=</span><span class="token string">"claude-sonnet-4-5"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    db<span class="token operator">=</span><span class="token function"><span class="token maybe-class-name">SqliteDb</span></span><span class="token punctuation">(</span>db_file<span class="token operator">=</span><span class="token string">"agno.db"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    tools<span class="token operator">=</span><span class="token punctuation">[</span><span class="token function"><span class="token maybe-class-name">MCPTools</span></span><span class="token punctuation">(</span>url<span class="token operator">=</span><span class="token string">"https://docs.agno.com/mcp"</span><span class="token punctuation">,</span> transport<span class="token operator">=</span><span class="token string">"streamable-http"</span><span class="token punctuation">)</span><span class="token punctuation">]</span><span class="token punctuation">,</span>
    add_history_to_context<span class="token operator">=</span><span class="token maybe-class-name">True</span><span class="token punctuation">,</span>
    markdown<span class="token operator">=</span><span class="token maybe-class-name">True</span><span class="token punctuation">,</span>
<span class="token punctuation">)</span>

# <span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">*</span> <span class="token maybe-class-name">Create</span> <span class="token maybe-class-name">AgentOS</span> <span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">**</span><span class="token operator">*</span>
agent_os <span class="token operator">=</span> <span class="token function"><span class="token maybe-class-name">AgentOS</span></span><span class="token punctuation">(</span>agents<span class="token operator">=</span><span class="token punctuation">[</span>agno_agent<span class="token punctuation">]</span><span class="token punctuation">)</span>
app <span class="token operator">=</span> agent_os<span class="token punctuation">.</span><span class="token method function property-access">get_app</span><span class="token punctuation">(</span><span class="token punctuation">)</span>
</code></pre>
<p>You can run this Agent using <code>fastapi dev agno_agent.py</code> and chat with it on the <a target="_blank" rel="noopener noreferrer" class="" href="https://os.agno.com">AgentOS UI</a>. Here's how it looks:</p>
<video width="700" height="700" class="rounded-2xl" loop="" autoplay="" muted="" playsinline="" controls=""><source src="/videos/agentos-chat.mp4">Your browser does not support the video tag.</video>
<p>Deploy your FastAPI app to your cloud of choice, and you're live!</p>
<h2>Are we done?</h2>
<p>Not even close. The hard part isn't building the Agent, its building the system that runs these Agents in production, and building a product around it with a great UX (or rather, <span class="text-teal-400">AX — Agent Experience</span>).</p>
<p>Ensuring reliability, durability, and a smooth experience across thousands of concurrent sessions is where the real engineering happens. These are long-running processes that demand isolated state management, persistent storage, and strong fault tolerance.</p>
<p>Here's what you'll need to consider when building Agents:</p>
<ol>
<li><strong>Runtime architecture</strong>: how agents are orchestrated, manage state, and handle execution loops.</li>
<li><strong>Memory systems</strong>: how agents retain and manage context, session history, memory, knowledge and culture.</li>
<li><strong>Tooling integration</strong>: how agents connect to APIs, databases, or internal functions (MCPs are popular here).</li>
<li><strong>Safety &amp; Security</strong>: how to ensure data, application and user-level security.</li>
<li><strong>Evaluation &amp; performance</strong>: measuring usefulness, latency, cost, and reliability of the agentic system.</li>
</ol>
<p>Each of these is a discipline of its own, with entire startups (sometimes dozens) dedicated to solving. But stitching it all together into a single, cohesive system is still a massive pain.</p>
<p>That's where Agno comes in.</p>
<h2>What is Agno?</h2>
<p><strong>Agno is a multi-agent framework, runtime, and control plane.</strong> It solves the 5 problems mentioned above via 3 tightly coupled components:</p>
<ol>
<li><strong>Framework for building Agents, Multi-Agent Teams and Workflows.</strong> It comes with an incredibly rich set of features like persistent storage, memory management, knowledge retrieval, 100+ toolkits, guardrails, dependency injection, dynamic context management, human in the loop, and much, much more.</li>
<li><strong>Pre-built FastAPI Runtime for deploying multi-agent systems.</strong> This runtime, called AgentOS, exposes pre-built endpoints you can build your product on top of. It handles concurrency, state management, and error recovery out of the box — plus extras like initializing MCP connections via lifecycle hooks and securing every request with a security-key.</li>
<li><strong>Control Plane for testing, monitoring, debugging and evaluating multi-agent systems.</strong> This is a web interface that allows you to manage your multi-agent systems in real-time. It's a powerful tool that helps you understand what your agents are doing, and why.</li>
</ol>
<p>If you're building Agents, give Agno a try:</p>
<ul>
<li><strong>GitHub:</strong> <a target="_blank" rel="noopener noreferrer" class="" href="https://agno.link/gh">agno.link/gh</a></li>
<li><strong>Documentation:</strong> <a target="_blank" rel="noopener noreferrer" class="" href="https://agno.link/docs">agno.link/docs</a></li>
<li><strong>Website:</strong> <a target="_blank" rel="noopener noreferrer" class="" href="https://www.agno.com">agno.com</a></li>
</ul>
<hr>
<p>Agents aren't magic. They're just a new kind of software. Once you understand that, everything else falls into place.</p>
<span class="text-teal-400">Agent Engineering is just Software Engineering</span>]]></content:encoded>
            <author>hi@ashpreetbedi.com (Ashpreet Bedi)</author>
        </item>
    </channel>
</rss>