
Quanta Bits: Context Management Is the New Prompt Engineering

AI agents fail 65% of multi-turn tasks because they lose context. In 2026, context management, not prompt engineering, is the skill that separates winners from experimenters.

January 30, 2026

Salesforce found that AI agents fail 65% of multi-turn customer tasks. Not because the models are bad, but because they lose context.

Multi-turn means any conversation longer than one question. "What's my order status?" works fine. But add "Can you change the address?" and then "Actually, cancel that item instead," and the AI forgets what you were talking about.

We spent 2024 obsessing over prompts. In 2026, context management is the skill that matters.

Context Engineering Is Becoming a Discipline

In August, Cognizant announced they're deploying 1,000 context engineers. Their CEO put it plainly: "In the microprocessor era, the lever was code. In the cloud era, it was workload migration. In the LLM era, the lever is context."

This isn't a quirk of one company. File storage companies are solving context at the storage layer, creating vector databases that make it faster for AI to access organizational knowledge. Foundation Capital calls context graphs "AI's trillion-dollar opportunity." The pattern is clear: context management is where the industry is moving.

The Technical Reality

Anyone building with AI tools today has already learned this lesson. Stuff everything into one massive context window and performance degrades well before you hit the limit. The instinct to give AI "everything it needs" is actually the problem.

MIT recently proved this at scale. They found GPT-5 scores near zero on complex reasoning when context gets too large. But using sub-agents, each with its own focused context window? They handled 10 million tokens (40x beyond the native limit) with 58% accuracy. The approach: break work into smaller tasks, give each one a clean context, and coordinate at the top. The detailed work happens in isolated windows that don't pollute each other.
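The coordination pattern described above can be sketched in a few lines. This is a hedged illustration, not MIT's actual implementation: `call_model` is a hypothetical placeholder for any LLM API call, and the chunk size is arbitrary. The point it shows is structural: each sub-agent sees only its own slice plus the question, and the coordinator sees only the distilled partial answers, never the raw text.

```python
# Minimal sketch of the sub-agent pattern: isolated contexts, coordinated at the top.
# `call_model` is a hypothetical stand-in for a real LLM API call.

def chunk(text: str, size: int = 8000) -> list[str]:
    """Split a long document into fixed-size pieces."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def call_model(prompt: str) -> str:
    """Placeholder for a real model call (e.g. an API client)."""
    return f"summary of {len(prompt)} chars"

def answer_with_subagents(document: str, question: str) -> str:
    # Each sub-agent gets a clean context: one chunk plus the question.
    # No other chunk pollutes its window.
    partials = [
        call_model(f"Context:\n{c}\n\nQuestion: {question}")
        for c in chunk(document)
    ]
    # The coordinator reasons only over the partial answers.
    combined = "\n".join(partials)
    return call_model(f"Combine these partial answers:\n{combined}\n\nQuestion: {question}")
```

The design choice doing the work here is the second call: the top level never touches the raw document, so its context stays small no matter how large the input grows.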

The lesson: managing context isn't optional overhead. It's how you make AI actually work.

The Harder Problem: Organizational Context

But here's where it gets interesting. Technical context management (token windows and session isolation) is only half the story.

Consider a Deal Desk. Your AI can process a pricing request. It knows the rules and policies. But does it know that last quarter, the CFO approved a 25% discount for a manufacturing company because "their procurement cycles are brutal"? Does it know that decision set a precedent?

That's organizational context. The "why" behind decisions. The tribal knowledge that lives in Slack threads, hallway conversations, and people's heads. Your systems of record capture what happened (discount approved). They don't capture why (strategic relationship, executive judgment, precedent set).

Without this layer, your AI can execute but can't reason like your best people do. It can't incorporate prior organizational decisions into its thinking. It can't act like a human CFO would.

The Curation Challenge

So how do you capture organizational context as it grows? You can't feed everything to a model. The answer: curation.

Not every decision needs to be captured. Start with the most representative and critical ones, the decisions that describe a pattern of thought, that show how your organization reasons. A pricing exception that became a policy. An escalation that revealed a gap. The edge cases that taught your team something.

This requires a new kind of role: someone who understands both the business logic and the technical constraints. Not every company will hire "context engineers." But every company will need someone asking: what context does our AI actually need to make good decisions?

So What?

The technical problem is increasingly solved. The harder question: what does your AI actually need to know about how your organization thinks?

Start with one use case. Map the decisions that matter. Ask: if our AI had to make this call, what context would it need? Not what data. What "why."

The companies getting ROI from AI aren't the ones with the biggest context windows. They're the ones treating organizational context as an asset, not an afterthought.


Keys to a Ferrari, Stuck in Reverse

Sarah Friar (OpenAI's CFO) and Vinod Khosla discussed AI adoption in a recent podcast. Khosla's stat: a single-digit percentage of users leverage even 30% of AI's capabilities. Friar's framing: "We've handed them the keys to a Ferrari. They're still learning to back out of the driveway."

Their best examples of AI ROI? Contract review (AI flags non-standard terms, suggests rev-rec treatment). Accounting automation (a company doing $150M ARR with one accountant). SDR workflows (10 SDRs replaced by 1 supervising AI).

The tension in this conversation is revealing. They predict a 10-year adoption journey, but also claim 2026 is when multi-agent systems mature. I'm not sure both can be true, especially when their best examples are simple automation. These aren't moonshots. They're systematic, boring wins.

What I take from this: simple automation is the entry point. It's where you build the muscle, where people "refind their jobs," as Friar puts it. Getting to agents requires sophistication most organizations haven't developed: the ability to handle decisions, context management, and governance. Only 14% of enterprises use agentic features today. Not because the tech isn't ready. Because they haven't earned the right to use it.

Earn your complexity before you buy it.

Source: State of the AI Industry, OpenAI Podcast Ep. 12

Architecture Beats Scale

The standard approach to giving AI more information: feed everything into the context window and hope for the best. MIT tried something different. They treated documents as environment data, let the model write code to explore them, and spawned smaller sub-LLMs to process relevant chunks before aggregating results. The outcome: 10M+ tokens processed (40x beyond the native window), 58% accuracy vs near-zero for the base model, at comparable or cheaper cost.
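What distinguishes this from plain chunking is the exploration step: the model filters the data programmatically before any sub-call is made. A rough sketch of that shape, with `sub_llm` as a hypothetical placeholder for a real model call and a simple keyword filter standing in for whatever exploration code the model writes:

```python
# Hedged sketch of the "documents as environment data" pattern.
# The document lives as ordinary data; only relevant pieces ever reach a model.
# `sub_llm` is a hypothetical placeholder for a real API call.

def sub_llm(prompt: str) -> str:
    """Placeholder for a real sub-model call."""
    return f"[answer from {len(prompt)}-char prompt]"

def query_environment(paragraphs: list[str], question: str, keyword: str) -> str:
    # Explore the data with code first, instead of stuffing it all into one window.
    relevant = [p for p in paragraphs if keyword.lower() in p.lower()]
    # Spawn one sub-call per relevant chunk, each with a clean context.
    partials = [sub_llm(f"{p}\n\nQ: {question}") for p in relevant]
    # Aggregate only the distilled results at the top level.
    return sub_llm("Aggregate these partial answers:\n" + "\n".join(partials))
```

Most of the 10M tokens never enter a context window at all; they sit in memory as data until the exploration step decides they matter.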

Here's what most coverage of this paper misses. The approach isn't just a research trick. It mirrors how the best human organizations already work: you don't put every employee in one room with all the information. You create teams with specific responsibilities and clear handoff points.

For enterprises evaluating AI investments, the question isn't "how big is the context window?" It's "how well have we designed the information flow?" The companies getting value from AI aren't buying bigger models. They're thinking more carefully about what each part of their system actually needs to know.

Source: Recursive Language Models, MIT CSAIL


What I'm Learning

I finally watched Fargo (1996). I know, three decades late. It earned its reputation.

Frances McDormand's Marge Gunderson is the quiet center of a violent film. She's pregnant, patient, speaks with a thick Minnesota accent, and radiates empathy. She's also the most competent person in the movie. The Coen Brothers built a character that breaks every stereotype about what competence looks like. No sharp edges, no Type-A intensity, no "serious professional" posturing. Just methodical, observant work wrapped in genuine warmth. It left me thinking about how often we mistake intensity for capability, in movies and in business.

The other thread that stuck with me: greed as self-destruction. Every character who wants "just a little more" ends up building schemes they can't control. The final scene isn't just horror. It's the logical endpoint of a plan that started with "this one simple thing."


News in Brief

Apple picks Google Gemini to power Siri — The model is becoming the commodity. The interface is becoming the moat.

Gartner: 40% of enterprise apps will embed AI agents by end of 2026 — Up from 5% in 2025. Agent sprawl is the new shadow IT.

Manufacturing survey: 98% exploring AI, only 20% prepared — The readiness gap is the story of 2026.

McKinsey: 39% experimenting with agents, 23% actually scaling — Pilot purgatory continues.

Anthropic launches Claude in Excel and new Tasks system for Claude Code — AI meets users where they work. Citizen development in spreadsheets is here.

Clawdbot goes viral as 24/7 personal AI agent — Memory is the feature. Proactive outreach, preference learning, always on. Open-source, but buyer beware.

New iOS apps grew 60% in 2025 after years flat — Vibe coding is lowering barriers. More people can ship.

Want More Like This?

Quanta Bits delivers curated automation insights to your inbox.