Memory: How Agents Learn

It's almost 2026. Agents can follow complex instructions, use dozens of tools, and work autonomously for hours. But ask them the same question twice and they start from scratch. They don't remember what worked, what failed, or what they figured out along the way.

What makes ChatGPT and Claude great personal assistants? Memory.

Here's the dirty secret: when building agents with the API, we've made them capable, but we haven't yet figured out how to make them learn.

Table of Contents

  1. What is memory
  2. How memory enables learning
  3. Three patterns (with code)
  4. Video demo
  5. What makes a good learning
  6. Get started

Wanna jump straight to the code? Here you go. Cookbooks 2, 4 and 7 are what you're looking for.

1. What is memory?

"Memory" gets thrown around loosely. Chat history? Context window? Vector database? Let's be precise.

There are three types of memory that matter for agents:

Session Memory

The conversation context. What was said five messages ago. This is a solved problem: store messages in a database, retrieve them before every response, add them to the context.

Session memory is useful but limited. It disappears when the conversation ends. It's not really memory, it's just context.
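Mechanically, "store, retrieve, add to context" is just a table and a query. A framework-free sketch to make that concrete (the SQLite schema and helper names here are mine, not Agno's — Pattern 1 below shows the real thing):

```python
import sqlite3

# One table holds every message, keyed by session
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (session_id TEXT, role TEXT, content TEXT)")

def remember(session_id: str, role: str, content: str) -> None:
    # Store: append each message as it happens
    conn.execute("INSERT INTO messages VALUES (?, ?, ?)", (session_id, role, content))

def build_context(session_id: str, num_history_runs: int = 5) -> list[dict]:
    # Retrieve: replay recent messages into the model context before responding
    rows = conn.execute(
        "SELECT role, content FROM messages WHERE session_id = ?", (session_id,)
    ).fetchall()
    return [{"role": r, "content": c} for r, c in rows[-num_history_runs * 2 :]]

remember("s1", "user", "Quick investment brief on NVIDIA")
remember("s1", "assistant", "NVDA: ...")
context = build_context("s1")
```

Every framework's session memory is a variation on this loop; the differences are in storage backends and truncation strategy.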

User Memory

Facts about a specific user that persist across sessions. Preferences, goals, constraints.

When a user says "I'm interested in AI stocks and have moderate risk tolerance", that's worth remembering, not just for this conversation, but for every future conversation with that user.

This is powerful, but it's still not learning. User memory is about recall, not improvement.

Learned Memory

This is where knowledge gets built. As agents interact with the world, they discover insights that apply generally, not just to one user, but to anyone asking similar questions.

When your finance agent discovers that "when comparing ETFs, check both expense ratio AND tracking error", this insight is worth saving, not just because one user asked, but because it makes the agent better at ETF comparisons for everyone.

Here's the beauty: knowledge compounds. The more the agent learns, the better it gets. And unlike weight updates, this knowledge is tangible: you can inspect it, edit it, delete it. No retraining required.

If you're building agents without learned memory, you're leaving performance on the table.

2. How memory enables learning

Here's the core insight: learning is remembering what worked.

Without memory, agents are stateless. Every session is day one:

Without Memory                 | With Memory
-------------------------------|----------------------------------------
Re-discovers the same patterns | Searches prior learnings before acting
Repeats the same mistakes      | Applies insights from past sessions
Re-asks the same questions     | Builds domain knowledge over time
Can't build on prior success   | Gets better the more you use it

The best part: the model doesn't need to get better for the system to improve. Learning happens in retrieval, not in weights. And as models improve, your system improves too — for free.

I call this GPU Poor Continuous Learning: continuous improvement without fine-tuning, retraining, or any of the infrastructure traditionally required for model updates. Just a knowledge base that grows smarter over time.

The model doesn't get smarter. The system gets smarter.
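You can see the whole idea in a toy loop. The "model" below is a fixed function, yet answers improve as the store grows — all the improvement lives in retrieval. This is illustrative Python only (naive keyword matching stands in for vector search):

```python
# A growing store of learnings; the "model" never changes
learnings: list[str] = []

def answer(query: str) -> str:
    # Retrieval step: naive keyword match stands in for vector search
    relevant = [l for l in learnings if any(w in l.lower() for w in query.lower().split())]
    return f"Answer using {len(relevant)} prior learning(s)"

def learn(insight: str) -> None:
    learnings.append(insight)

print(answer("compare ETFs"))  # day one: no prior learnings to draw on
learn("When comparing ETFs, check both expense ratio AND tracking error")
print(answer("compare ETFs"))  # same model, better-informed answer
```

Swap the list for a vector database and the keyword match for hybrid search, and you have the learned-memory pattern shown below.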

3. Three patterns for agent memory

Let me show you how to implement the three patterns, with a bonus at the end.

Pattern 1: Session Memory

Store messages in a database, retrieve them before every response, add them to the context. Agno gives you this out of the box — just give your agent a database.

from agno.agent import Agent
from agno.db.sqlite import SqliteDb
from agno.models.google import Gemini
from agno.tools.yfinance import YFinanceTools

agent_db = SqliteDb(db_file="tmp/agents.db")

agent = Agent(
    model=Gemini(id="gemini-3-flash-preview"),
    tools=[YFinanceTools()],
    db=agent_db,
    add_history_to_context=True,
    num_history_runs=5,
)

if __name__ == "__main__":
    session_id = "finance-session"

    # Turn 1: Analyze a stock
    agent.print_response("Quick investment brief on NVIDIA", session_id=session_id)

    # Turn 2: Agent remembers NVDA from turn 1
    agent.print_response("Compare that to Tesla", session_id=session_id)

    # Turn 3: Recommendation based on full conversation
    agent.print_response("Which looks like the better investment?", session_id=session_id)

Use a consistent session_id to persist conversation across runs.

Pattern 2: User Memory

Remember facts about the user across sessions. The MemoryManager extracts preferences automatically and stores them in the database.

from agno.agent import Agent
from agno.memory import MemoryManager
from agno.models.google import Gemini
from agno.db.sqlite import SqliteDb

agent_db = SqliteDb(db_file="tmp/agents.db")

memory_manager = MemoryManager(
    model=Gemini(id="gemini-3-flash-preview"),
    db=agent_db,
)

agent = Agent(
    model=Gemini(id="gemini-3-flash-preview"),
    memory_manager=memory_manager,
    enable_user_memory=True,
)

# First conversation — preferences extracted and stored
agent.print_response(
    "I'm interested in AI stocks. My risk tolerance is moderate.",
    user_id="investor@example.com",
)

# Later conversation — agent remembers
agent.print_response(
    "What stocks would you recommend for me?",
    user_id="investor@example.com",
)

enable_user_memory=True runs the MemoryManager in parallel with every run. Use enable_agentic_memory=True to let the agent decide when to store memories via tool calls: more efficient, since it doesn't run on every response.
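For reference, the agentic variant is a one-flag change. A minimal sketch reusing the same database (flag behavior as described above):

```python
from agno.agent import Agent
from agno.db.sqlite import SqliteDb
from agno.models.google import Gemini

agent_db = SqliteDb(db_file="tmp/agents.db")

# Agentic memory: the model decides when a detail is worth keeping and
# stores it via a tool call, instead of MemoryManager running every turn
agent = Agent(
    model=Gemini(id="gemini-3-flash-preview"),
    db=agent_db,
    enable_agentic_memory=True,
)
```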

Pattern 3: Learned Memory

Now let's add learned memory: insights that apply beyond just one user. The key is a custom tool that saves learnings to a knowledge base:

import json
from datetime import datetime, timezone

from agno.agent import Agent
from agno.db.sqlite import SqliteDb
from agno.knowledge import Knowledge
from agno.models.google import Gemini
from agno.vectordb.chroma import ChromaDb
from agno.vectordb.search import SearchType

agent_db = SqliteDb(db_file="tmp/agents.db")

learnings_kb = Knowledge(
    name="Agent Learnings",
    vector_db=ChromaDb(
        name="learnings",
        persistent_client=True,
        search_type=SearchType.hybrid,
    ),
)

def save_learning(title: str, learning: str) -> str:
    """
    Save a reusable insight to the knowledge base.

    Args:
        title: Short descriptive title
        learning: The insight — specific and actionable
    """
    payload = {
        "title": title.strip(),
        "learning": learning.strip(),
        "saved_at": datetime.now(timezone.utc).isoformat(),
    }

    learnings_kb.add_content(
        name=payload["title"],
        text_content=json.dumps(payload),
    )

    return f"Saved: '{title}'"

agent = Agent(
    model=Gemini(id="gemini-3-flash-preview"),
    tools=[save_learning],
    knowledge=learnings_kb,
    search_knowledge=True,
    db=agent_db,
)

The agent now has two capabilities:

  1. Search first — Before answering, it searches for relevant prior learnings
  2. Save learnings — When it discovers something reusable, it saves it
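To make those two behaviors reliable, spell them out in the agent's instructions. A sketch — the wording here is mine, tune it for your domain:

```python
from agno.agent import Agent
from agno.models.google import Gemini

# Illustrative instructions encoding the search-first / save-rarely policy
agent = Agent(
    model=Gemini(id="gemini-3-flash-preview"),
    instructions=[
        "Before answering, search the knowledge base for relevant prior learnings.",
        "After answering, save a learning only if you found a reusable, generalizable insight.",
        "Most queries should not produce a learning.",
    ],
)
```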

But how do you prevent the agent from saving garbage?

Bonus: Human-in-the-Loop Gating

The quality of your knowledge base determines the quality of learning. Garbage in, garbage out.

The solution: the agent proposes learnings, but only saves with explicit user approval.

from agno.tools import tool

@tool(requires_confirmation=True)
def save_learning(title: str, learning: str) -> str:
    """Save a reusable insight. Requires user confirmation."""
    # ... same implementation

Handle the confirmation flow:

run_response = agent.run("Analyze NVDA and save any insights")

for requirement in run_response.active_requirements:
    if requirement.needs_confirmation:
        print(f"Tool: {requirement.tool_execution.tool_name}")
        print(f"Args: {requirement.tool_execution.tool_args}")

        if user_approves:
            requirement.confirm()
        else:
            requirement.reject()

run_response = agent.continue_run(
    run_id=run_response.run_id,
    requirements=run_response.requirements,
)

The agent proposes, the human gates. High-signal knowledge only.

4. Video demo

Here's a video demo: it starts with user memory, then moves on to learned memory with user confirmation.

5. What makes a good learning

A learning is worth saving if it's:

  • Specific: "Tech P/E ratios typically range 20-35x" not "P/E varies"
  • Actionable: Can be applied to future queries
  • Generalizable: Useful beyond this one conversation

Don't save: raw data, one-off facts, summaries, speculation.

Most queries should NOT produce a learning, and that's OK.
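If you want a coarse automated pre-filter before a learning is even proposed, the criteria above can be approximated with simple heuristics. The thresholds and keywords below are entirely illustrative — the real gate should stay with the human:

```python
def looks_like_a_learning(text: str) -> bool:
    """Coarse pre-filter: reject obviously vague or underspecified insights."""
    generic = ("varies", "depends", "might", "maybe")   # speculation markers
    has_substance = len(text.split()) >= 6              # too short to be actionable
    not_vague = not any(word in text.lower() for word in generic)
    return has_substance and not_vague

looks_like_a_learning("P/E varies")  # rejected: short and vague
looks_like_a_learning("Tech P/E ratios typically range 20-35x for mature companies")  # passes
```

A filter like this catches the worst candidates cheaply; human-in-the-loop confirmation handles the judgment calls.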

Where to store

Memory Type | Key         | Agno Component
------------|-------------|-------------------------------------------------
Session     | session_id  | SqliteDb, PostgresDb, MongoDB
User        | user_id     | MemoryManager + Database
Learned     | learning_id | Knowledge + ChromaDb, PgVector, Qdrant, Pinecone

Avoiding bloat

The biggest mistake is storing too much. A bloated knowledge base hurts retrieval and makes the agent worse.

The upside: because learnings are stored explicitly (not in weights), they're auditable and reversible. Bad learning? Delete it. System immediately improves.

6. Get started

This blog comes with complete working code. Here are 12 cookbooks that take you from "what is an agent" to building agents with memory, knowledge, state, guardrails, and more. Link again for reference.

#  | Cookbook          | What You'll Learn
---|-------------------|----------------------------------------------------------
01 | Tools             | Give agents the ability to fetch real-time data
02 | Storage           | Persist conversations across runs
03 | Knowledge         | Load documents and search with hybrid retrieval
04 | Custom Tools      | Write your own tools, add self-learning
05 | Structured Output | Return typed Pydantic objects
06 | Typed I/O         | Full type safety on input and output
07 | Memory            | Remember user preferences across sessions
08 | State Management  | Track and persist structured state
09 | Multi-Agent Teams | Coordinate specialized agents
10 | Workflows         | Sequential pipelines with predictable data flow
11 | Guardrails        | Input validation, PII detection, prompt injection defense
12 | Human in the Loop | Require confirmation before sensitive actions

Each builds on fundamentals, but you can jump to any one.

Setup

git clone https://github.com/agno-agi/agno.git
cd agno

uv venv .getting-started --python 3.12
source .getting-started/bin/activate

uv pip install -r cookbook/00_getting_started/requirements.txt

export GOOGLE_API_KEY=your-google-api-key

Run an example

Each cookbook is self-contained:

python cookbook/00_getting_started/agent_with_tools.py

Want a visual interface? Agent OS gives you a web UI for chatting with agents, exploring sessions, and monitoring traces:

python cookbook/00_getting_started/run.py

Then visit os.agno.com and add http://localhost:7777 as an endpoint.

Swapping models

These examples use Gemini 3 Flash by default — fast, reliable tool calling, cheap enough to experiment freely. But Agno is model-agnostic:

# Gemini (default)
from agno.models.google import Gemini
model = Gemini(id="gemini-3-flash-preview")

# OpenAI
from agno.models.openai import OpenAIChat
model = OpenAIChat(id="gpt-5.2")

# Anthropic
from agno.models.anthropic import Claude
model = Claude(id="claude-sonnet-4-5")

One line change. Everything else stays the same.


If you enjoyed reading this, star Agno on GitHub. It helps more than you'd think. Questions or feedback? Reach out on X.