Memory: How Agents Learn
It's almost 2026. Agents can follow complex instructions, use dozens of tools, and work autonomously for hours. But ask them the same question twice and they start from scratch. They don't remember what worked, what failed, or what they figured out along the way.
What makes ChatGPT and Claude great personal assistants? Memory.
Here's the dirty secret: when building agents with the API, we've made them capable, but we haven't yet figured out how to make them learn.
Table of Contents
- What is memory
- How memory enables learning
- Three patterns (with code)
- Video demo
- What makes a good learning
- Get started
Wanna jump straight to the code? Here you go. Cookbooks 2, 4 and 7 are what you're looking for.
1. What is memory?
"Memory" gets thrown around loosely. Chat history? Context window? Vector database? Let's be precise.
There are three types of memory that matter for agents:
Session Memory
The conversation context. What was said five messages ago. This is a solved problem: store messages in a database, retrieve them before every response, add them to the context.
Session memory is useful but limited. It disappears when the conversation ends. It's not really memory, it's just context.
User Memory
Facts about a specific user that persist across sessions. Preferences, goals, constraints.
When a user says "I'm interested in AI stocks and have moderate risk tolerance", that's worth remembering, not just for this conversation, but for every future conversation with that user.
This is powerful, but it's still not learning. User memory is about recall, not improvement.
Learned Memory
This is where knowledge gets built. As agents interact with the world, they discover insights that apply generally, not just to one user, but to anyone asking similar questions.
When your finance agent discovers that "when comparing ETFs, check both expense ratio AND tracking error", this insight is worth saving, not just because one user asked, but because it makes the agent better at ETF comparisons for everyone.
Here's the beauty: knowledge compounds. The more the agent learns, the better it gets. And unlike weight updates, this knowledge is tangible: you can inspect it, edit it, delete it. No retraining required.
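To make "tangible" concrete, here's a minimal sketch in plain Python (illustrative only, not Agno's API) of why explicit learned memory is auditable in a way weights never are: the knowledge is just data you can list, edit, or delete.

```python
# Hypothetical in-memory learned-memory store; names are illustrative,
# not part of any library API.
learnings = {
    "etf-comparison": "When comparing ETFs, check both expense ratio AND tracking error.",
}

# Inspect: the knowledge is plain data, not opaque weights.
for title, insight in learnings.items():
    print(f"{title}: {insight}")

# Edit: refine a learning in place.
learnings["etf-comparison"] += " Tracking error matters most for index funds."

# Delete: remove a bad learning and the system "unlearns" it instantly.
del learnings["etf-comparison"]
```

No retraining, no fine-tuning run: mutating the store is the whole update procedure.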
If you're building agents without learned memory, you're leaving performance on the table.
2. How memory enables learning
Here's the core insight: learning is remembering what worked.
Without memory, agents are stateless. Every session is day one:
| Without Memory | With Memory |
|---|---|
| Re-discovers the same patterns | Searches prior learnings before acting |
| Repeats the same mistakes | Applies insights from past sessions |
| Re-asks the same questions | Builds domain knowledge over time |
| Can't build on prior success | Gets better the more you use it |
The best part: the model doesn't need to get better for the system to improve. Learning happens in retrieval, not in weights. And as models improve, your system improves too — for free.
I call this GPU Poor Continuous Learning: continuous improvement without fine-tuning, retraining, or any of the infrastructure traditionally required for model updates. Just a knowledge base that grows smarter over time.
The model doesn't get smarter. The system gets smarter.
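The loop behind this can be sketched in a few lines of plain Python (a toy, with illustrative names; a real system would use an LLM and a vector store): the "model" never changes, but a second call to the same function benefits from what the first call stored.

```python
# Toy sketch of retrieval-based learning. Everything here is illustrative;
# a real agent would search a vector store and call an LLM.
knowledge_base: list[str] = []

def answer(question: str) -> str:
    # 1. Search prior learnings before acting.
    relevant = [
        k for k in knowledge_base
        if any(word in k.lower() for word in question.lower().split())
    ]
    if relevant:
        return f"Answer informed by {len(relevant)} prior learning(s)."
    # 2. No prior knowledge: do the expensive work, then save the insight.
    knowledge_base.append("When comparing ETFs, check expense ratio AND tracking error.")
    return "Answer from scratch (insight saved for next time)."

print(answer("How do I compare ETFs?"))  # answered from scratch, learning saved
print(answer("How do I compare ETFs?"))  # now informed by the stored learning
```

Same function, same "model", different behavior on the second call: the improvement lives entirely in the knowledge base.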
3. Three patterns for agent memory
Let me show you how to implement the three patterns, with a bonus at the end.
Pattern 1: Session Memory
Store messages in a database, retrieve them before every response, add them to the context. Agno gives you this out of the box — just give your agent a database.
```python
from agno.agent import Agent
from agno.db.sqlite import SqliteDb
from agno.models.google import Gemini
from agno.tools.yfinance import YFinanceTools

agent_db = SqliteDb(db_file="tmp/agents.db")

agent = Agent(
    model=Gemini(id="gemini-3-flash-preview"),
    tools=[YFinanceTools()],
    db=agent_db,
    add_history_to_context=True,
    num_history_runs=5,
)

if __name__ == "__main__":
    session_id = "finance-session"

    # Turn 1: Analyze a stock
    agent.print_response("Quick investment brief on NVIDIA", session_id=session_id)

    # Turn 2: Agent remembers NVDA from turn 1
    agent.print_response("Compare that to Tesla", session_id=session_id)

    # Turn 3: Recommendation based on full conversation
    agent.print_response("Which looks like the better investment?", session_id=session_id)
```
Use a consistent `session_id` to persist the conversation across runs.
Pattern 2: User Memory
Remember facts about the user across sessions. The MemoryManager extracts preferences automatically and stores them in the database.
```python
from agno.agent import Agent
from agno.db.sqlite import SqliteDb
from agno.memory import MemoryManager
from agno.models.google import Gemini

agent_db = SqliteDb(db_file="tmp/agents.db")

memory_manager = MemoryManager(
    model=Gemini(id="gemini-3-flash-preview"),
    db=agent_db,
)

agent = Agent(
    model=Gemini(id="gemini-3-flash-preview"),
    memory_manager=memory_manager,
    enable_user_memory=True,
)

# First conversation — preferences extracted and stored
agent.print_response(
    "I'm interested in AI stocks. My risk tolerance is moderate.",
    user_id="investor@example.com",
)

# Later conversation — agent remembers
agent.print_response(
    "What stocks would you recommend for me?",
    user_id="investor@example.com",
)
```
`enable_user_memory=True` runs the MemoryManager in parallel with every run. Alternatively, use `enable_agentic_memory=True` to let the agent decide when to store memories via tool calls; this is more efficient because it doesn't run on every response.
Pattern 3: Learned Memory
Now let's add learned memory: insights that apply beyond just one user. The key is a custom tool that saves learnings to a knowledge base:
```python
import json
from datetime import datetime, timezone

from agno.agent import Agent
from agno.db.sqlite import SqliteDb
from agno.knowledge import Knowledge
from agno.models.google import Gemini
from agno.vectordb.chroma import ChromaDb
from agno.vectordb.search import SearchType  # required for hybrid search

agent_db = SqliteDb(db_file="tmp/agents.db")

learnings_kb = Knowledge(
    name="Agent Learnings",
    vector_db=ChromaDb(
        name="learnings",
        persistent_client=True,
        search_type=SearchType.hybrid,
    ),
)

def save_learning(title: str, learning: str) -> str:
    """
    Save a reusable insight to the knowledge base.

    Args:
        title: Short descriptive title
        learning: The insight — specific and actionable
    """
    payload = {
        "title": title.strip(),
        "learning": learning.strip(),
        "saved_at": datetime.now(timezone.utc).isoformat(),
    }
    learnings_kb.add_content(
        name=payload["title"],
        text_content=json.dumps(payload),
    )
    return f"Saved: '{title}'"

agent = Agent(
    model=Gemini(id="gemini-3-flash-preview"),
    tools=[save_learning],
    knowledge=learnings_kb,
    search_knowledge=True,
    db=agent_db,
)
```
The agent now has two capabilities:
- Search first — Before answering, it searches for relevant prior learnings
- Save learnings — When it discovers something reusable, it saves it
But how do you prevent the agent from saving garbage?
Bonus: Human-in-the-Loop Gating
The quality of your knowledge base determines the quality of learning. Garbage in, garbage out.
The solution: the agent proposes learnings, but only saves with explicit user approval.
```python
from agno.tools import tool

@tool(requires_confirmation=True)
def save_learning(title: str, learning: str) -> str:
    """Save a reusable insight. Requires user confirmation."""
    # ... same implementation
```
Handle the confirmation flow:
```python
run_response = agent.run("Analyze NVDA and save any insights")

for requirement in run_response.active_requirements:
    if requirement.needs_confirmation:
        print(f"Tool: {requirement.tool_execution.tool_name}")
        print(f"Args: {requirement.tool_execution.tool_args}")
        if user_approves:  # your approval check, e.g. a CLI prompt or UI button
            requirement.confirm()
        else:
            requirement.reject()

run_response = agent.continue_run(
    run_id=run_response.run_id,
    requirements=run_response.requirements,
)
```
The agent proposes, the human gates. High-signal knowledge only.
4. Video demo
Here's a video demo: it starts by showcasing user memory, then moves on to learned memory with user confirmation.
5. What makes a good learning
A learning is worth saving if it's:
- Specific: "Tech P/E ratios typically range 20-35x" not "P/E varies"
- Actionable: Can be applied to future queries
- Generalizable: Useful beyond this one conversation
Don't save: raw data, one-off facts, summaries, speculation.
Most queries should NOT produce a learning, and that's OK.
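These criteria can be partially enforced in code before anything reaches a human reviewer. Here's a rough heuristic pre-filter (hypothetical; the function and marker lists are mine, not Agno's) that rejects the most obvious garbage:

```python
# Hypothetical pre-filter for proposed learnings. Crude heuristics only;
# a human (or a grading model) still makes the final call.
VAGUE_MARKERS = {"varies", "it depends", "maybe", "sometimes"}
ONE_OFF_MARKERS = {"today", "yesterday", "this user", "in this conversation"}

def is_worth_reviewing(learning: str) -> bool:
    text = learning.lower()
    if len(text.split()) < 5:                    # too short to be actionable
        return False
    if any(m in text for m in VAGUE_MARKERS):    # vague, not specific
        return False
    if any(m in text for m in ONE_OFF_MARKERS):  # tied to one conversation
        return False
    return True

print(is_worth_reviewing("P/E varies"))  # False: short and vague
print(is_worth_reviewing(
    "Tech P/E ratios typically range 20-35x; compare within sector."
))  # True: specific and generalizable
```

A filter like this doesn't replace the human gate from the bonus pattern; it just reduces how often the human gets asked.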
Where to store
| Memory Type | Key | Agno Component |
|---|---|---|
| Session | session_id | SqliteDb, PostgresDb, MongoDB |
| User | user_id | MemoryManager + Database |
| Learned | learning_id | Knowledge + ChromaDb, PgVector, Qdrant, Pinecone |
Avoiding bloat
The biggest mistake is storing too much. A bloated knowledge base hurts retrieval and makes the agent worse.
The upside: because learnings are stored explicitly (not in weights), they're auditable and reversible. Bad learning? Delete it. System immediately improves.
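One concrete guard against bloat is deduplicating before insert. A sketch using stdlib `difflib` (the helpers are hypothetical; a real implementation would use the vector store's own similarity search instead of string matching):

```python
from difflib import SequenceMatcher

# Hypothetical dedup guard; in production, query the vector DB for
# near-neighbors of the candidate instead of fuzzy string matching.
existing_learnings = [
    "When comparing ETFs, check both expense ratio AND tracking error.",
]

def is_duplicate(candidate: str, threshold: float = 0.85) -> bool:
    return any(
        SequenceMatcher(None, candidate.lower(), known.lower()).ratio() >= threshold
        for known in existing_learnings
    )

def save_if_novel(candidate: str) -> bool:
    if is_duplicate(candidate):
        return False  # skip: a near-duplicate would bloat retrieval
    existing_learnings.append(candidate)
    return True

print(save_if_novel("When comparing ETFs check both expense ratio and tracking error"))
print(save_if_novel("Bond ladders reduce reinvestment risk in fixed-income portfolios."))
```

The first call is rejected as a near-duplicate; the second is genuinely new and gets stored.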
6. Get started
This blog comes with complete working code. Here are 12 cookbooks that take you from "what is an agent" to building agents with memory, knowledge, state, guardrails, and more. Link again for reference.
| # | Cookbook | What You'll Learn |
|---|---|---|
| 01 | Tools | Give agents the ability to fetch real-time data |
| 02 | Storage | Persist conversations across runs |
| 03 | Knowledge | Load documents and search with hybrid retrieval |
| 04 | Custom Tools | Write your own tools, add self-learning |
| 05 | Structured Output | Return typed Pydantic objects |
| 06 | Typed I/O | Full type safety on input and output |
| 07 | Memory | Remember user preferences across sessions |
| 08 | State Management | Track and persist structured state |
| 09 | Multi-Agent Teams | Coordinate specialized agents |
| 10 | Workflows | Sequential pipelines with predictable data flow |
| 11 | Guardrails | Input validation, PII detection, prompt injection defense |
| 12 | Human in the Loop | Require confirmation before sensitive actions |
Each builds on fundamentals, but you can jump to any one.
Setup
```shell
git clone https://github.com/agno-agi/agno.git
cd agno
uv venv .getting-started --python 3.12
source .getting-started/bin/activate
uv pip install -r cookbook/00_getting_started/requirements.txt
export GOOGLE_API_KEY=your-google-api-key
```
Run an example
Each cookbook is self-contained:
```shell
python cookbook/00_getting_started/agent_with_tools.py
```
Want a visual interface? Agent OS gives you a web UI for chatting with agents, exploring sessions, and monitoring traces:
```shell
python cookbook/00_getting_started/run.py
```
Then visit os.agno.com and add http://localhost:7777 as an endpoint.
Swapping models
These examples use Gemini 3 Flash by default — fast, reliable tool calling, cheap enough to experiment freely. But Agno is model-agnostic:
```python
# Gemini (default)
from agno.models.google import Gemini
model = Gemini(id="gemini-3-flash-preview")

# OpenAI
from agno.models.openai import OpenAIChat
model = OpenAIChat(id="gpt-5.2")

# Anthropic
from agno.models.anthropic import Claude
model = Claude(id="claude-sonnet-4-5")
```
One line change. Everything else stays the same.
If you enjoyed reading this, star Agno on GitHub. It helps more than you'd think. Questions or feedback? Reach out on X.