Ashpreet Bedi

Context Providers

In 1973, Doug McIlroy got pipes into Unix. The idea was small. Each program reads stdin, writes stdout, and the shell composes them. The shell didn't know what grep or awk did. It just wired them together. Fifty years later we still type grep | awk | sort without thinking. That's how good the abstraction was.

In 2026 we are building agents the other way around. If you believe we're at the cusp of a new operating system, then the agent is the shell and every tool is a program. The question is: why do we keep stuffing the shell's prompt with the man pages of every program it might call?

The most powerful technology of this decade is bottlenecked by the number of tools it can hold in its "RAM", and every agent with multiple toolkits hits the same three problems:

  1. Context pollution from too many tools
  2. Degrading performance from blurry scopes (e.g. search operations in multiple toolkits)
  3. The main agent getting confused because its context is all tool instructions

I've been testing a protocol that fixes all three.

The three walls

Context pollution. Every tool takes up precious context. Schemas, descriptions, example usage, all of it lands in the system prompt. A Slack toolkit is 8 to 12 tools. Gmail is 6 to 10. Calendar another 6. Drive, GitHub, your CRM, the web. You're at 50 tools before adding anything custom. From what I've seen, somewhere past 20 tools models start hallucinating tools that don't exist, calling tools with the wrong shape, or skipping the right tool because its description got buried.

Blurry scopes don't compose. Two tools both take a workspace argument: one is Slack's, one is Google's. search in one MCP collides with search in another. send_message could be Slack, email, or your CRM. The agent picks wrong half the time, and no naming convention fixes it because the same word legitimately means different things in different sources. The minute you compose tools from sources you don't control (MCP servers, third-party SDKs), you get overlap, and the model has no reliable way to disambiguate.
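The collision is easy to see in miniature. This is a toy illustration, not any real toolkit's registry: flatten two tool registries you don't control into one namespace, and one `search` silently clobbers the other.

```python
# Two sources you don't control both export a tool named "search".
slack_tools = {"search": "Search Slack messages"}
drive_tools = {"search": "Search Drive files"}

# A flat merge keeps whichever came last; Slack's search is silently gone.
merged = {**slack_tools, **drive_tools}
print(merged)  # {'search': 'Search Drive files'}
```

Prefixing the names avoids the clobber, but it doesn't solve the semantic half of the problem: the model still has to guess which `search` the user meant.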

Tool-use logic lives with the main agent. This is the deepest wall.

For an agent to use Slack well, the system prompt has to explain Slack: look up the user ID before you DM them, resolve a channel name to an ID before you post, prefer conversations.history for channels and conversations.replies for threads, paginate by cursor instead of offset. That's hundreds of tokens of Slack-specific guidance. Now do that for Gmail. For Calendar. For Drive. For your database. For GitHub.

The system prompt becomes the union of every API's quirks. Every turn carries every rule, even when the user just asked about Slack. The main agent is stuck reasoning about both the user's question and the mechanics of every API. Adding a source means editing the prompt and praying nothing breaks.

The missing layer

Today the canonical agent shape is some variant of:

Agent → Tools                             # raw
Agent → MCP server → Tools                # MCP
Agent + Skill instructions → Tools        # Skills

In all three cases the agent sees the raw tool surface of every source. Every Slack tool, every Drive tool, every CRM tool. The agent's prompt has to contain how to use every one of them.

The shape I've been testing puts a thin layer in between:

Agent ↔ ContextProvider ↔ Tools

A ContextProvider wraps one source: Slack, GitHub, Drive, Filesystem, your DB. To the calling agent, it exposes exactly two tools:

  • query_<source>(question) for natural-language reads
  • update_<source>(instruction) for natural-language writes (or a clean read-only error)

The main agent doesn't see Slack's twelve tools. It sees query_slack and update_slack. It doesn't see Drive's quirks. It sees query_drive. Add ten more sources and the agent's tool surface stays linear at 2N.

Behind each tool is a sub-agent scoped to that one source. The sub-agent owns the source's tools, the source's quirks, the lookup-before-write patterns, the pagination weirdness. It runs in its own context, returns an answer, and the main agent gets a clean result.

from agno.agent import Agent
from agno.context.slack import SlackContextProvider
from agno.context.gdrive import GDriveContextProvider
from agno.context.database import DatabaseContextProvider
from agno.models.openai import OpenAIResponses

# Sub-agents do source-specific tool work — a cheaper model is plenty.
provider_model = OpenAIResponses(id="gpt-5.4-mini")

slack = SlackContextProvider(id="slack", token=..., model=provider_model)
drive = GDriveContextProvider(id="drive", service_account_file=..., model=provider_model)
crm   = DatabaseContextProvider(id="crm", sql_engine=engine, model=provider_model)  # engine: your SQLAlchemy engine

agent = Agent(
    model=OpenAIResponses(id="gpt-5.4"),
    tools=[*slack.get_tools(), *drive.get_tools(), *crm.get_tools()],
    instructions="\n".join([slack.instructions(), drive.instructions(), crm.instructions()]),
)

The agent sees four tools: query_slack, query_drive, query_crm, update_crm. Three sources, two of them read-only. Five years of API quirks for those three sources, summarized into four tool descriptions.

Btw, the quirks didn't vanish. They just moved into the sub-agent's scope, where they belong, and they only load on turns that actually touch that source.

The biggest advantage is that the main agent never sees the gunk of intermediate tool calls, or the fact that Slack returned 1,000 channel names on the way to the one correct channel ID.
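Stripped of Agno specifics, the shape fits in a few lines. Everything below is an illustrative sketch, not the library's API; `run_subagent` stands in for whatever executes the source-scoped sub-agent.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class ContextProvider:
    """Wraps one source behind query_<id> (and update_<id> when writable)."""
    id: str
    run_subagent: Callable[[str], str]  # executes the source-scoped sub-agent
    read_only: bool = True

    def get_tools(self) -> dict[str, Callable[[str], str]]:
        # The calling agent only ever sees these names, never the raw tools.
        tools = {f"query_{self.id}": self.run_subagent}
        if not self.read_only:
            tools[f"update_{self.id}"] = self.run_subagent
        return tools

# Stub sub-agents so the wiring is runnable.
slack = ContextProvider(id="slack", run_subagent=lambda q: f"[slack sub-agent ran: {q}]")
crm = ContextProvider(id="crm", run_subagent=lambda q: f"[crm sub-agent ran: {q}]",
                      read_only=False)

surface = {**slack.get_tools(), **crm.get_tools()}
print(sorted(surface))  # ['query_crm', 'query_slack', 'update_crm']
```

Two sources, three tools on the surface, and everything behind `run_subagent` stays out of the main agent's context.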

A walkthrough

Here's an example from Scout, the agent I've been testing this with. The user asks:

What's the verdict on the v3 migration, and is the spec doc up to date?

Scout has the following tools: query_slack, query_drive, query_filesystem, query_web.

It picks two.

query_slack("recent engineering channel discussion about v3 migration")

The Slack sub-agent runs search_messages, walks the thread with conversations_replies, and returns:

On Tuesday, engineering decided to phase the v3 migration over two weeks, starting with the read path. James raised concerns about the rollback procedure that haven't been resolved in-thread.

query_drive("v3 migration spec document, last updated when, current contents")

The Drive sub-agent runs files.list, fetches the doc, returns:

The v3 migration spec was last edited Friday by Mike. The current version describes a single-phase migration. No mention of phased rollout or rollback procedure.

The main agent writes the answer:

Engineering agreed Tuesday on a phased migration starting with the read path, but the spec was last edited Friday and still describes a single-phase approach. James's rollback concerns aren't reflected in the doc either. The spec needs an update.

The main agent never touched a Slack API tool. It never paginated a Drive listing. It composed two natural-language calls and wrote one paragraph. The Slack and Drive specialists handled their own messes in their own contexts.

What about Skills?

Skills are another serious attempt at problem #3. A Skill packages task-specific instructions ("here's how to use Slack") into a module the model loads when relevant, instead of carrying it in the system prompt full-time. They move task knowledge out of the always-on prompt and into something more conditional. If problem #3 were the only wall, Skills alone might be enough.

But Skills don't fix walls 1 and 2. The strongest version — Skills that bring scoped tools when loaded — still leaves you with N tool surfaces in the agent's context once N skills are active, and search still collides with search. There's also a higher chance of conflicting Skills quietly degrading the agent without you noticing.

ContextProvider and Skills compose. A Slack ContextProvider's sub-agent can itself load a Slack Skill, and that's where the Skill does its best work: in the context of the thing actually executing against Slack, not in the main agent that just wanted an answer.

Roughly: Skills compress how to do a task. ContextProvider hides that there's a task until the main agent decides to delegate one. Different layers, both useful.

Examples

As always, I come bearing code. The full set of examples lives in cookbook/12_context. A few worth pointing at:

Sources covered out of the box. Filesystem (00), database (04), Slack (05), Google Drive (07), GitHub (12), and web via Exa or Parallel, direct SDK or MCP endpoint (01, 02, 03, 11). Every provider follows the same query_<id> / update_<id> shape.

Read/write split with real security. 04_database_read_write.py spins up a SQLite DB and has the agent insert a contact, read it back, and verify with direct SQL. Read and write go through separate sub-agents with separate engines, same shape I used in Dash, same reason: the database itself rejects writes the model isn't allowed to make, regardless of what the prompt says. 12_github.py does the same shape over a real repo: reads through a read-only sub-agent on a clone, writes through a sub-agent that operates on a per-session worktree on a <prefix>/<task> branch and ends in a PR. The agent cannot push to the default branch.
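The engine-level guarantee is easy to demonstrate with stdlib sqlite3. The cookbook uses SQLAlchemy engines; this is just a dependency-free sketch of the same idea: the read path's connection physically cannot write.

```python
import os
import sqlite3
import tempfile

db = os.path.join(tempfile.mkdtemp(), "crm.db")

# Write path: a normal connection, the kind the write sub-agent would hold.
rw = sqlite3.connect(db)
rw.execute("CREATE TABLE contacts (name TEXT)")
rw.execute("INSERT INTO contacts VALUES ('Ada')")
rw.commit()

# Read path: the same file, opened read-only at the database level.
ro = sqlite3.connect(f"file:{db}?mode=ro", uri=True)
rows = ro.execute("SELECT name FROM contacts").fetchall()
print(rows)  # [('Ada',)]

# A write through the read-only connection fails in the database itself,
# regardless of what the model's prompt says.
try:
    ro.execute("INSERT INTO contacts VALUES ('Eve')")
    blocked = False
except sqlite3.OperationalError:
    blocked = True
print(blocked)  # True
```

The security property lives in the connection, not the prompt, which is the whole point of the split.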

Compositional multi-source. 09_web_plus_slack.py is the shape flat tool layouts can't do without orchestration code. The agent pulls topics from a Slack channel, runs a per-topic web search, and returns a briefing tying each internal thread to an external reference. Two providers, one prompt, and the main agent stitches the synthesis itself. 12_engineering_briefing.py takes it one step further: Slack topics → codebase workspace → web fallback, all in one prompt.

MCP wrapper. 06_mcp_server.py wraps any MCP server (stdio or HTTP) as a single query_<id> tool. The sub-agent's instructions are built from the server's list_tools() response at connect time, so the calling agent never sees stale tool docs. Staleness is bounded by the sub-agent's lifetime rather than eliminated, but for any reasonable session that amounts to the same thing. This is the move that collapses a 50-tool MCP server to 1 tool from the main agent's view.
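The instruction-building move can be sketched without a live server. `build_instructions` is a hypothetical helper, and `tools` below just mimics the name/description shape of a `list_tools()` response; the real version would come from the server at connect time.

```python
def build_instructions(server_name: str, tools: list[dict]) -> str:
    """Render a live tool listing into the sub-agent's system instructions."""
    lines = [
        f"You answer questions using the '{server_name}' MCP server.",
        "Available tools:",
    ]
    for t in tools:
        lines.append(f"- {t['name']}: {t['description']}")
    return "\n".join(lines)

# Stand-in for the server's list_tools() response at connect time.
tools = [
    {"name": "search_issues", "description": "Full-text search over issues."},
    {"name": "get_issue", "description": "Fetch one issue by number."},
]
print(build_instructions("github", tools))
```

Because the instructions are rendered per connection, a server that adds or renames tools is picked up on the next sub-agent spawn with no prompt edits upstream.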

Surprises and open questions

Sub-agents are cheaper than expected. I assumed the extra hops would dominate. They don't. The main agent's context is so much smaller that its calls are faster, and the sub-agent only fires on turns that touch its source. Anecdotal on Scout's workload: total tokens are roughly flat at low source counts and improve as the source count grows; wall-clock latency drops at every source count I've measured. I haven't generalized this off Scout yet.

The main agent's prompt got smaller. I expected to add orchestration logic. I removed it. With a uniform surface, the routing rules collapse to "pick the right query_<source>". gpt-5.4 just works out of the box, with zero guidance on how to use a source.

A few things I'm still working through:

  • How thin can the main agent's prompt get? I've been hill-climbing this with evals.
  • Caching across calls in a session. The same query_<source>("who's on the X channel") shouldn't re-do the work two turns later.
  • Per-user authentication that survives the hop. Partially solved. Scout passes user_id, session_id, metadata, and dependencies through to the sub-agent. More to do for OAuth flows.
  • When to expose underlying tools instead. Some sources benefit from the agent driving the tool calls directly, usually when the source is small enough that the schema cost is low. The protocol has a mode for this. I'm still figuring out where the line sits.
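The caching item above can be sketched as a thin memo layer keyed on the question. Illustrative only: a real version needs TTLs, write invalidation, and probably semantic rather than exact-match keys.

```python
from functools import lru_cache

class CachedProvider:
    """Memoizes query_<source> calls for the life of a session."""

    def __init__(self, source, run_query):
        self.source = source
        self.run_query = run_query
        self.calls = 0  # counts real sub-agent runs, not cache hits
        self._cached = lru_cache(maxsize=256)(self._uncached)

    def _uncached(self, question: str) -> str:
        self.calls += 1
        return self.run_query(question)

    def query(self, question: str) -> str:
        return self._cached(question)

# Stub sub-agent so the sketch is runnable.
slack = CachedProvider("slack", lambda q: f"[answer to: {q}]")
slack.query("who's on the X channel")
slack.query("who's on the X channel")  # served from cache, no sub-agent run
print(slack.calls)  # 1
```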

TL;DR

The agent's tool surface should be its job description, not the union of every API it might touch. Context Providers move the API mess to where it belongs and leave the main agent free to reason about what the user actually asked.