Self Learning Research Agent That Tracks Consensus Over Time
In this post, we’ll build a self-learning research agent that does something more useful than one-off web searches. It captures the current consensus, compares it to past runs, explains what changed and why, and stores a clean snapshot so future runs get better.
No fine-tuning. No retraining. Just good system design.
Table of Contents
- Why research agents break down in practice
- Research is about consensus, not answers
- What is "self-learning"?
- Snapshot-based learning architecture
- What we store in the knowledge base (and what we don’t)
- End-to-end agent flow
- Production Codebase (deployable anywhere)
- Steps to run your own Self Learning Research Agent
- Why this pattern works
1. Why research agents break down in practice
Most research agents are stateless.
You ask a question today and get a well-written answer. You ask the same question tomorrow and get another well-written answer, one that is totally disconnected from the first.
What's missing:
- No memory of prior conclusions
- No notion of what changed
- No way to tell if the answer is stabilizing or shifting
Research without memory is just search with formatting.
Humans don't work this way. We remember what we believed before and pay attention when new information contradicts it.
That's the missing layer.
2. Research is about consensus, not answers
A single answer is rarely the goal of research.
What we actually care about is:
- what most credible sources agree on
- where there is disagreement
- how confident we should be
That's why our agent doesn't store prose. It stores structured consensus. Consensus is represented as a set of claims that are:
- short and explicit
- backed by sources
- labeled with confidence
- stable enough to diff over time
This structure is what makes comparison possible.
It also lays the foundation for reasoning about sources over time, including which sources tend to be reliable or volatile.
3. What is "self-learning"?
Self-learning means the agent improves based on its own experience.
In this case, improvement comes from capturing snapshots of consensus over time and using those snapshots as context in future runs.
The agent does not:
- retrain models
- update weights
- fine-tune embeddings
Instead, it learns by capturing experience as data and reusing it deliberately. This is what I refer to as poor-man’s continuous learning.
The model stays fixed. The system improves by accumulating validated snapshots of understanding.
4. Snapshot-based learning architecture
The system is built around a simple idea: append-only snapshots.
Each snapshot represents:
- the question that was asked
- the internet's consensus at that moment
- the claims that define that consensus
- the sources used to support it
- a short report summary for semantic retrieval
Snapshots are never mutated. We only add new ones and compare.
Each stored snapshot includes:
- question
- created_at
- report_summary (short, human-readable)
- consensus_summary (1–2 sentences)
- claims (structured and diffable)
- sources
- notes (optional)
This keeps the knowledge base compact, searchable, and stable over time.
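As a sketch, a snapshot could be modeled as a pair of frozen dataclasses. The field names follow the list above; the Claim type, the example values, and the JSON serialization are my assumptions, not taken from the repo:

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
import json

@dataclass(frozen=True)  # frozen: snapshots are append-only, never mutated
class Claim:
    claim_id: str                # stable slug, used as the diff key
    claim: str                   # short, explicit statement
    confidence: str              # "Low" | "Medium" | "High"
    source_urls: list[str] = field(default_factory=list)

@dataclass(frozen=True)
class Snapshot:
    question: str
    created_at: str              # ISO timestamp
    report_summary: str          # short, human-readable; used for semantic retrieval
    consensus_summary: str       # 1-2 sentences
    claims: tuple[Claim, ...] = ()
    sources: tuple[str, ...] = ()
    notes: str = ""

snap = Snapshot(
    question="Is RAG still the dominant pattern for grounding LLMs?",
    created_at=datetime.now(timezone.utc).isoformat(),
    report_summary="Most sources still treat RAG as the default grounding pattern.",
    consensus_summary="RAG remains the dominant grounding approach.",
    claims=(Claim("rag-dominant",
                  "RAG is the most widely used grounding pattern.",
                  "High",
                  ["https://example.com/a"]),),
)
print(json.dumps(asdict(snap), indent=2))  # the shape that gets persisted
```

Making the dataclasses frozen enforces the append-only rule at the type level: a run can create and compare snapshots, but never edit one in place.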
5. What we store in the knowledge base (and what we don’t)
The biggest mistake we can make is storing too much.
We deliberately do not store:
- full markdown reports
- raw scraped content
- long explanations
We do store:
- concise summaries
- structured claims
- deduplicated source lists
Each claim looks like:
- claim_id (stable slug)
- claim (short statement)
- confidence (Low | Medium | High)
- source_urls
If you can't diff it, you shouldn't store it.
This keeps retrieval high-signal and comparisons reliable.
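Two details worth getting right here are the stable slug behind claim_id and the source deduplication. A minimal sketch of both, with the caveat that these helpers and their exact normalization rules are my assumptions:

```python
import re

def slugify(text: str, max_len: int = 40) -> str:
    """Derive a stable claim_id from the claim text: lowercase,
    alphanumerics and hyphens only, truncated to a fixed length."""
    slug = re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")
    return slug[:max_len].rstrip("-")

def dedupe_sources(urls: list[str]) -> list[str]:
    """Deduplicate URLs while preserving order; a trailing slash
    is ignored for comparison but the first spelling is kept."""
    seen, out = set(), []
    for url in urls:
        key = url.rstrip("/")
        if key not in seen:
            seen.add(key)
            out.append(url)
    return out

print(slugify("RAG is the most widely used grounding pattern!"))
# -> rag-is-the-most-widely-used-grounding-pa
```

The point of the slug is that the same claim, reworded slightly between runs, should still land on the same claim_id so the diff can match it; in practice you may want the extraction step to reuse prior claim_ids rather than rely on slugging alone.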
6. End-to-end agent flow
Here's what happens on every run:
1. Parallel research: the agent uses parallel search tools to gather information across multiple source types.
2. Consensus extraction: findings are synthesized into 4–10 structured claims with confidence and citations.
3. Snapshot retrieval: the agent searches the knowledge base for the most recent snapshot of a similar question.
4. Diff: current claims are compared to the previous snapshot:
   - new or strengthened claims
   - weakened or disputed claims
   - removed claims
   Each change includes a brief explanation and supporting sources.
5. Human-in-the-loop save: the agent asks whether to save the new snapshot. Only explicit approval persists it.
This keeps learning controlled, auditable, and intentional.
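The diff step above reduces to a comparison keyed on claim_id. Here is a simplified illustration, assuming claims are dicts with claim_id and confidence fields; the real agent would also attach an LLM-written explanation and sources to each change:

```python
# Map confidence labels to an ordering so we can detect movement.
LEVELS = {"Low": 0, "Medium": 1, "High": 2}

def diff_claims(previous: list[dict], current: list[dict]) -> dict:
    """Classify changes between two claim sets keyed on claim_id."""
    prev = {c["claim_id"]: c for c in previous}
    curr = {c["claim_id"]: c for c in current}
    changes = {"new": [], "strengthened": [], "weakened": [], "removed": []}
    for cid, claim in curr.items():
        if cid not in prev:
            changes["new"].append(cid)
        else:
            delta = LEVELS[claim["confidence"]] - LEVELS[prev[cid]["confidence"]]
            if delta > 0:
                changes["strengthened"].append(cid)
            elif delta < 0:
                changes["weakened"].append(cid)
    changes["removed"] = [cid for cid in prev if cid not in curr]
    return changes

old = [{"claim_id": "rag-dominant", "confidence": "Medium"}]
new = [{"claim_id": "rag-dominant", "confidence": "High"},
       {"claim_id": "agents-rising", "confidence": "Low"}]
print(diff_claims(old, new))
# -> {'new': ['agents-rising'], 'strengthened': ['rag-dominant'], 'weakened': [], 'removed': []}
```

Because claims are short, keyed, and labeled with confidence, this comparison is mechanical; the model's job is only to explain why each change happened.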
7. Production Codebase (deployable anywhere)
I'm providing a production codebase for running our self-learning research agent, built using:
- A FastAPI application for running our agents.
- A Postgres database for storing sessions, memory, and knowledge.
The production codebase lives in the agentos-railway repository: https://github.com/agno-agi/agentos-railway
Here's the structure of the repository:
.
├── agents
│ ├── self_learning_research_agent.py
│ └── ... more agents
├── app
│ └── main.py
├── compose.yaml
├── db
├── Dockerfile
├── pyproject.toml
├── railway.json
├── README.md
├── teams
│ └── finance_team.py
└── workflows
└── research_workflow.py
8. Steps to run your own Self Learning Research Agent
Clone the repo
git clone https://github.com/agno-agi/agentos-railway.git
cd agentos-railway
Configure API keys
We'll use OpenAI for the agent and Parallel Search for search tools. Please export the following environment variables:
export OPENAI_API_KEY="YOUR_API_KEY_HERE"
export PARALLEL_API_KEY="YOUR_API_KEY_HERE"
You can copy the example.env file and rename it to .env to get started.
Install Docker
We'll use docker to run the application locally and deploy it to Railway. Please install Docker Desktop if needed.
Run the application locally
Run the application using docker compose:
docker compose up --build -d
This command builds the Docker image and starts the application:
- The FastAPI application, running on localhost:8000.
- The PostgreSQL database for storing agent sessions, knowledge, and memories, accessible on localhost:5432.
Once started, you can:
- View the FastAPI application at localhost:8000/docs.
Connect the AgentOS UI to the FastAPI application
- Open the AgentOS UI
- Login and add http://localhost:8000 as a new AgentOS. You can call it Local AgentOS (or any name you prefer).
Demo
Here's a demo of the Self Learning Research Agent in action.
Stop the application
When you're done, stop the application using:
docker compose down
Deploy the application to Railway
To deploy the application to Railway, run the following commands:
- Install Railway CLI:
brew install railway
- Login to Railway:
railway login
- Deploy the application:
./scripts/railway_up.sh
This command will:
- Create a new Railway project.
- Deploy a PgVector database service to your Railway project.
- Build and deploy the docker image to your Railway project.
- Set environment variables in your AgentOS service.
- Create a new domain for your AgentOS service.
9. Why this pattern works
This approach generalizes far beyond traditional research. You can use it for:
- market analysis
- policy tracking
- competitive intelligence
- technical standards
- internal decision logs
Anywhere beliefs evolve, snapshots beat stateless answers. By separating online reasoning from offline learning, and storing only what matters, we get agents that feel more trustworthy, more explainable, and more useful over time.
Thank you for reading! I hope you found this useful. Feel free to reach out to me on X if you have any questions or feedback.