Self Learning Research Agent That Tracks Consensus Over Time
In this post, we’ll build a self-learning research agent that does something more useful than one-off web searches. It captures the current consensus, compares it to past runs, explains what changed and why, and stores a clean snapshot so future runs get better.
No fine-tuning. No retraining. Just good system design.
Table of Contents
- Why research agents break down in practice
- Research is about consensus, not answers
- What is "self-learning"?
- Snapshot-based learning architecture
- What we store in the knowledge base (and what we don’t)
- End-to-end agent flow
- Production Codebase (deployable anywhere)
- Steps to run your own Self Learning Research Agent
- Why this pattern works
1. Why research agents break down in practice
Most research agents are stateless.
You ask a question today and get a well-written answer. You ask the same question tomorrow and get another well-written answer, one that is totally disconnected from the first.
What's missing:
- No memory of prior conclusions
- No notion of what changed
- No way to tell if the answer is stabilizing or shifting
Research without memory is just search with formatting.
Humans don't work this way. We remember what we believed before and pay attention when new information contradicts it.
That's the missing layer.
2. Research is about consensus, not answers
A single answer is rarely the goal of research.
What we actually care about is:
- what most credible sources agree on
- where there is disagreement
- how confident we should be
That's why our agent doesn't store prose. It stores structured consensus. Consensus is represented as a set of claims that are:
- short and explicit
- backed by sources
- labeled with confidence
- stable enough to diff over time
This structure is what makes comparison possible.
It also lays the foundation for reasoning about sources over time, including which sources tend to be reliable or volatile.
3. What is "self-learning"?
Self-learning means the agent improves based on its own experience.
In this case, improvement comes from capturing snapshots of consensus over time and using those snapshots as context in future runs.
The agent does not:
- retrain models
- update weights
- fine-tune embeddings
Instead, it learns by capturing experience as data and reusing it deliberately. This is what I refer to as poor-man’s continuous learning.
The model stays fixed. The system improves by accumulating validated snapshots of understanding.
4. Snapshot-based learning architecture
The system is built around a simple idea: append-only snapshots.
Each snapshot represents:
- the question that was asked
- the internet's consensus at that moment
- the claims that define that consensus
- the sources used to support it
- a short report summary for semantic retrieval
Snapshots are never mutated. We only add new ones and compare.
Each stored snapshot includes:
- question
- created_at
- report_summary (short, human-readable)
- consensus_summary (1–2 sentences)
- claims (structured and diffable)
- sources
- notes (optional)
This keeps the knowledge base compact, searchable, and stable over time.
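As a sketch, a snapshot could be modeled as a pair of frozen dataclasses. The field names follow the list above; the Claim type, the example values, and the JSON serialization are my assumptions, not taken from the repo:

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
import json

@dataclass(frozen=True)  # frozen: snapshots are append-only, never mutated
class Claim:
    claim_id: str                # stable slug, used as the diff key
    claim: str                   # short, explicit statement
    confidence: str              # "Low" | "Medium" | "High"
    source_urls: list[str] = field(default_factory=list)

@dataclass(frozen=True)
class Snapshot:
    question: str
    created_at: str              # ISO timestamp
    report_summary: str          # short, human-readable; used for semantic retrieval
    consensus_summary: str       # 1-2 sentences
    claims: tuple[Claim, ...] = ()
    sources: tuple[str, ...] = ()
    notes: str = ""

snap = Snapshot(
    question="Is RAG still the dominant pattern for grounding LLMs?",
    created_at=datetime.now(timezone.utc).isoformat(),
    report_summary="Most sources still treat RAG as the default grounding pattern.",
    consensus_summary="RAG remains the dominant grounding approach.",
    claims=(Claim("rag-dominant",
                  "RAG is the most widely used grounding pattern.",
                  "High",
                  ["https://example.com/a"]),),
)
print(json.dumps(asdict(snap), indent=2))  # the shape that gets persisted
```

Making the dataclasses frozen enforces the append-only rule at the type level: a run can create and compare snapshots, but never edit one in place.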
5. What we store in the knowledge base (and what we don’t)
The biggest mistake we can make is storing too much.
We deliberately do not store:
- full markdown reports
- raw scraped content
- long explanations
We do store:
- concise summaries
- structured claims
- deduplicated source lists
Each claim looks like:
- claim_id (stable slug)
- claim (short statement)
- confidence (Low | Medium | High)
- source_urls
If you can't diff it, you shouldn't store it.
This keeps retrieval high-signal and comparisons reliable.
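Two details worth getting right here are the stable slug behind claim_id and the source deduplication. A minimal sketch of both, with the caveat that these helpers and their exact normalization rules are my assumptions:

```python
import re

def slugify(text: str, max_len: int = 40) -> str:
    """Derive a stable claim_id from the claim text: lowercase,
    alphanumerics and hyphens only, truncated to a fixed length."""
    slug = re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")
    return slug[:max_len].rstrip("-")

def dedupe_sources(urls: list[str]) -> list[str]:
    """Deduplicate URLs while preserving order; a trailing slash
    is ignored for comparison but the first spelling is kept."""
    seen, out = set(), []
    for url in urls:
        key = url.rstrip("/")
        if key not in seen:
            seen.add(key)
            out.append(url)
    return out

print(slugify("RAG is the most widely used grounding pattern!"))
# -> rag-is-the-most-widely-used-grounding-pa
```

The point of the slug is that the same claim, reworded slightly between runs, should still land on the same claim_id so the diff can match it; in practice you may want the extraction step to reuse prior claim_ids rather than rely on slugging alone.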
6. End-to-end agent flow
Here's what happens on every run:
1. Parallel research: the agent uses parallel search tools to gather information across multiple source types.
2. Consensus extraction: findings are synthesized into 4–10 structured claims with confidence and citations.
3. Snapshot retrieval: the agent searches the knowledge base for the most recent snapshot of a similar question.
4. Diff: current claims are compared to the previous snapshot:
   - new or strengthened claims
   - weakened or disputed claims
   - removed claims
   Each change includes a brief explanation and supporting sources.
5. Human-in-the-loop save: the agent asks whether to save the new snapshot. Only explicit approval persists it.
This keeps learning controlled, auditable, and intentional.
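The diff step above reduces to a comparison keyed on claim_id. Here is a simplified illustration, assuming claims are dicts with claim_id and confidence fields; the real agent would also attach an LLM-written explanation and sources to each change:

```python
# Map confidence labels to an ordering so we can detect movement.
LEVELS = {"Low": 0, "Medium": 1, "High": 2}

def diff_claims(previous: list[dict], current: list[dict]) -> dict:
    """Classify changes between two claim sets keyed on claim_id."""
    prev = {c["claim_id"]: c for c in previous}
    curr = {c["claim_id"]: c for c in current}
    changes = {"new": [], "strengthened": [], "weakened": [], "removed": []}
    for cid, claim in curr.items():
        if cid not in prev:
            changes["new"].append(cid)
        else:
            delta = LEVELS[claim["confidence"]] - LEVELS[prev[cid]["confidence"]]
            if delta > 0:
                changes["strengthened"].append(cid)
            elif delta < 0:
                changes["weakened"].append(cid)
    changes["removed"] = [cid for cid in prev if cid not in curr]
    return changes

old = [{"claim_id": "rag-dominant", "confidence": "Medium"}]
new = [{"claim_id": "rag-dominant", "confidence": "High"},
       {"claim_id": "agents-rising", "confidence": "Low"}]
print(diff_claims(old, new))
# -> {'new': ['agents-rising'], 'strengthened': ['rag-dominant'], 'weakened': [], 'removed': []}
```

Because claims are short, keyed, and labeled with confidence, this comparison is mechanical; the model's job is only to explain why each change happened.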
7. Production Codebase (deployable anywhere)
I'm providing a production codebase for running our self-learning research agent, built using:
- A FastAPI application for running our agents.
- A Postgres database for storing sessions, memory, and knowledge.
The production codebase lives in the agentos-railway repository: https://github.com/agno-agi/agentos-railway
Here's the structure of the repository:
.
├── agents
│ ├── self_learning_research_agent.py
│ └── ... more agents
├── app
│ └── main.py
├── compose.yaml
├── db
├── Dockerfile
├── pyproject.toml
├── railway.json
├── README.md
├── teams
│ └── finance_team.py
└── workflows
└── research_workflow.py
8. Steps to run your own Self Learning Research Agent
Clone the repo
git clone https://github.com/agno-agi/agentos-railway.git
cd agentos-railway
Configure API keys
We'll use OpenAI for the agent and Parallel Search for search tools. Please export the following environment variables:
export OPENAI_API_KEY="YOUR_API_KEY_HERE"
export PARALLEL_API_KEY="YOUR_API_KEY_HERE"
You can copy the example.env file and rename it to .env to get started.
Install Docker
We'll use docker to run the application locally and deploy it to Railway. Please install Docker Desktop if needed.
Run the application locally
Run the application using docker compose:
docker compose up --build -d
This command builds the Docker image and starts the application:
- The FastAPI application, running on localhost:8000.
- The PostgreSQL database for storing agent sessions, knowledge, and memories, accessible on localhost:5432.
Once started, you can:
- View the FastAPI application at localhost:8000/docs.
Connect the AgentOS UI to the FastAPI application
- Open the AgentOS UI
- Login and add http://localhost:8000 as a new AgentOS. You can call it Local AgentOS (or any name you prefer).
Demo
Here's a demo of the Self Learning Research Agent in action.
Stop the application
When you're done, stop the application using:
docker compose down
Deploy the application to Railway
To deploy the application to Railway, run the following commands:
- Install Railway CLI:
brew install railway
- Login to Railway:
railway login
- Deploy the application:
./scripts/railway_up.sh
This command will:
- Create a new Railway project.
- Deploy a PgVector database service to your Railway project.
- Build and deploy the docker image to your Railway project.
- Set environment variables in your AgentOS service.
- Create a new domain for your AgentOS service.
9. Why this pattern works
This approach generalizes far beyond traditional research. You can use it for:
- market analysis
- policy tracking
- competitive intelligence
- technical standards
- internal decision logs
Anywhere beliefs evolve, snapshots beat stateless answers. By separating online reasoning from offline learning, and storing only what matters, we get agents that feel more trustworthy, more explainable, and more useful over time.
Thank you for reading! I hope you found this useful. Feel free to reach out to me on X if you have any questions or feedback.