The Attention Bottleneck: Managing 10 AI Coding Agents at Once
Developers now run multiple AI coding agents in parallel, but the real constraint is not the agents. It is the human trying to keep track of which one needs attention right now.
You Are the Bottleneck
Picture an air traffic controller staring at a radar screen. Ten planes are in the air. Most are cruising smoothly, but two are circling, waiting for clearance to land. The controller's job is not to fly the planes. It is to figure out which one needs a decision right now. That is roughly what it feels like to run seven to ten AI coding agents at the same time.
Since late 2025, developers have started running multiple instances of tools like Claude Code, Cursor, Aider, and Amazon Q Developer in parallel. One agent refactors a module. Another writes tests. A third fixes a bug. The agents work fast, but they all independently hit moments where they need a human: a permission prompt, a clarifying question, a finished task waiting for review, an error that needs a judgment call. The developer becomes the dispatcher, scanning terminal windows and trying to remember which agent is stuck and which one is still thinking.
What emerged is a new category of software: terminal session managers designed specifically for this problem. Our research found over fifteen tools and frameworks that have appeared in just the last few months, each attempting to answer a deceptively simple question: which agent needs me right now?
What the Research Uncovered
- Every major orchestration tool converged on the same three-state model. Agent status is tracked as Idle (ready for input), Busy (processing), or Waiting (blocked on human action). CCManager, Agent of Empires, and OpenAI's Codex app all independently arrived at this pattern.
- Git worktrees became the standard isolation mechanism. At least five tools (ccswarm, Codex, Orcha, Cursor, and Claude Code) use Git worktrees to give each agent its own copy of the codebase, preventing file conflicts without duplicating entire repositories.
- Claude Code's hooks API is the richest signal source available. Its 14 lifecycle events, including permission_prompt and idle_prompt notifications, give external tools detailed real-time data about agent state. No other coding tool exposes comparable programmatic signals.
- Humans can actively focus on only 3-4 things at once. While Miller's classic research suggested 7 plus or minus 2 items in working memory, later work by Cowan narrowed the true attention limit to about 3-4 items, a critical constraint for anyone trying to monitor 10 agents.
- Task switching costs up to 40% of productive time. Jumping between agent terminals is not free. Research on cognitive load shows each context switch imposes a measurable penalty, and it takes over 23 minutes to fully refocus after an interruption.
- Alert fatigue is a proven risk at scale. 28% of operations teams report missing critical alerts because of notification overload. Agent orchestrators must avoid simply adding more notifications and instead route attention intelligently.
- The field evolved from grassroots hacks to vendor features in under six months. What started as developers manually juggling tmux panes became official product features at Anthropic, OpenAI, and Cursor between October 2025 and February 2026.
- Detecting terminal input-readiness is a genuinely hard technical problem. PTY architecture allows writes to buffers regardless of process state, so no single technique reliably detects when a CLI agent is waiting. Robust detection requires combining process state inspection, output pattern matching, and tool-specific hooks.
- Risk-based approval routing is emerging as a solution. Frameworks like AURA score each agent action on a risk scale and auto-approve low-risk decisions while escalating high-risk ones to humans, reducing the total number of interruptions without sacrificing oversight.
- A C compiler written in Rust was built by 16 coordinated AI agents. The agents produced a 100,000-line compiler capable of building Linux 6.9, demonstrating that multi-agent coordination at scale is not theoretical. It is already happening.
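The worktree isolation noted in the findings above can be sketched in a few lines. This is an illustration only, not how any of the listed tools implement it: the helper names, the sibling "<repo>-agents" directory layout, and the "agent/<name>" branch naming are all assumptions. The only real interface it relies on is the standard `git worktree add -b <branch> <path>` command.

```python
import subprocess
from pathlib import Path


def worktree_add_command(repo: Path, agent: str) -> list[str]:
    """Build the git command that gives one agent its own working tree."""
    # Hypothetical layout: a sibling directory "<repo>-agents/<agent>".
    target = repo.parent / f"{repo.name}-agents" / agent
    # Hypothetical branch naming: one branch per agent.
    branch = f"agent/{agent}"
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(target)]


def create_worktree(repo: Path, agent: str) -> None:
    """Run the command; raises CalledProcessError if git reports a failure."""
    subprocess.run(worktree_add_command(repo, agent), check=True)
```

Because every worktree shares the same object store, each agent gets an independent checkout and branch without a second clone of the repository.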
The Cambrian Explosion of Agent Orchestrators
Between November 2025 and February 2026, the developer tooling ecosystem went through something remarkable. More than fifteen distinct tools appeared to solve the same problem: how do you manage multiple AI coding agents running at the same time?
The list reads like a startup directory. claude-flow scaled up to 60+ agent swarms with 170+ tool integrations. ccswarm, written in Rust, introduced Git worktree isolation with native PTY sessions. AgentBase built a command center where pending approval requests from all agents appear in a single queue. CCManager took the broadest approach, supporting eight different AI coding assistants, from Claude Code and Gemini CLI to Cursor Agent and Copilot CLI, with per-tool state detection strategies.
Then the major vendors joined in. Claude Code's Agent Teams, built on the previously hidden TeammateTool with its 13 distinct operations, became an official feature. OpenAI launched the Codex app on February 2, 2026, branding it a "command center for agents." Cursor's background agent mode let developers run parallel refactoring, testing, and UI polish tasks within a single IDE.
The grassroots energy was evident everywhere. A developer named mikekelly created claude-sneakpeek to bypass TeammateTool's feature flags before it was officially released. Orcha built a visual workflow builder for parallel Claude Code agents; its creator reported that work that previously took two hours now finishes in twenty minutes. Tools like Uzi, AI-fleet, SplitMind, and Claude Squad each brought their own take on the orchestration problem.
As one RedMonk analyst observed, developers became "managers of agents but had none of the tooling managers have: no dashboard, no status board, just spatial memory of which terminal window had which session." This gap between capability and usability is what drove the entire ecosystem.
The Three-State Model and the Detection Problem
The most striking pattern across these tools is convergence. Without coordinating with each other, multiple teams arrived at the same fundamental abstraction: every agent is in one of three states. It is either Idle (finished, ready for a new task), Busy (actively generating code or processing), or Waiting (blocked, needing human input). CCManager and Agent of Empires both implemented this exact model independently. The Codex app tracks "running, paused, completed." Claude Code's hooks expose permission_prompt and idle_prompt events that map to the same states.
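A minimal sketch of the three-state model might look like the following. The state names come from the description above; the `SIGNAL_TO_STATE` table, the "output" signal name, and both helper functions are assumptions made for illustration, not any tool's actual API.

```python
from enum import Enum


class AgentState(Enum):
    IDLE = "idle"        # finished, ready for a new task
    BUSY = "busy"        # generating code or processing
    WAITING = "waiting"  # blocked on a human decision


# Assumed mapping from signal names to states. The two *_prompt names appear
# in the text; "output" is a hypothetical marker for "agent is emitting text".
SIGNAL_TO_STATE = {
    "permission_prompt": AgentState.WAITING,
    "idle_prompt": AgentState.IDLE,
    "output": AgentState.BUSY,
}


def next_state(current: AgentState, signal: str) -> AgentState:
    """Return the new state, keeping the current one for unknown signals."""
    return SIGNAL_TO_STATE.get(signal, current)


def needs_human(states: dict[str, AgentState]) -> list[str]:
    """Names of agents blocked on a human, in sorted order."""
    return sorted(name for name, s in states.items() if s is AgentState.WAITING)
```

Keeping the model this small is the point: a dashboard only has to answer one question per agent, and `needs_human` is that question asked across all of them.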
But detecting which state an agent is in turns out to be surprisingly difficult at the technical level. The fundamental issue is that pseudo-terminal (PTY) architecture allows characters to be written into a buffer regardless of whether the process is actually reading from it. You cannot simply check "is the terminal waiting for input?" by looking at the PTY alone.
Practitioners have developed several workarounds. Linux's /proc filesystem exposes a process's wait channel, revealing what kernel function it is blocked on. System call tracing with strace can detect when a process enters a read() call on file descriptor 0 (stdin). pexpect's waitnoecho() method catches when a child process turns off terminal echo mode, a common signal for password prompts. tmux's control mode provides a line-based message protocol with %output notifications that include millisecond-precision lag indicators.
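The /proc approach can be sketched as follows, assuming a Linux host. Reading `/proc/<pid>/wchan` is the real mechanism; the list of kernel symbol names treated as "blocked on a terminal read" is an assumption, since those names vary by kernel version and the same symbol can appear for non-terminal waits. Treat the result as a hint to combine with other signals, not as proof.

```python
from pathlib import Path
from typing import Optional


def read_wchan(pid: int) -> Optional[str]:
    """Return the kernel wait channel for pid, or None if it cannot be read."""
    try:
        text = Path(f"/proc/{pid}/wchan").read_text().strip()
    except OSError:
        return None  # not Linux, process gone, or access denied
    return text or None


# Assumption: these substrings suggest a blocking terminal read. The exact
# symbol names differ across kernel versions.
TTY_READ_HINTS = ("n_tty_read", "wait_woken")


def looks_blocked_on_tty(wchan: Optional[str]) -> bool:
    """Heuristic: does the wait channel look like a terminal read?"""
    if not wchan or wchan == "0":
        return False  # "0" means the process is not sleeping in the kernel
    return any(hint in wchan for hint in TTY_READ_HINTS)
```

A caller would pass the agent's process ID, for example `looks_blocked_on_tty(read_wchan(pid))`, and fall back to other strategies whenever `read_wchan` returns None.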
The most pragmatic solutions skip low-level detection entirely. Claude Code's hooks system fires structured events at 14 lifecycle points, letting an orchestrator know exactly when the agent needs attention. But hooks are tool-specific, which is why CCManager maintains a separate detection strategy for each of the eight AI assistants it supports: each tool exposes different signals, and there is no universal protocol.
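One way to picture per-tool detection is a registry of detector functions with a structured hook signal taking priority when one exists. Everything named here is hypothetical: the tool keys, the prompt regexes, and the `detect_state` function are invented for illustration and do not reflect CCManager's actual code or any real tool's output.

```python
import re
from typing import Callable, Optional

# A detector inspects recent terminal output and returns a state name,
# or None when it cannot tell.
Detector = Callable[[str], Optional[str]]


def pattern_detector(waiting: str, idle: str) -> Detector:
    """Build a detector from two regexes (both hypothetical per tool)."""
    waiting_re = re.compile(waiting, re.MULTILINE)
    idle_re = re.compile(idle, re.MULTILINE)

    def detect(output: str) -> Optional[str]:
        tail = output[-500:]  # only the most recent output matters
        if waiting_re.search(tail):
            return "waiting"
        if idle_re.search(tail):
            return "idle"
        return None

    return detect


# Hypothetical registry: tool keys and prompt patterns are made up.
DETECTORS: dict[str, Detector] = {
    "tool-a": pattern_detector(r"Allow .*\? \[y/n\]", r"^> $"),
    "tool-b": pattern_detector(r"Approve\?", r"Ready\.$"),
}


def detect_state(tool: str, output: str, hook_state: Optional[str] = None) -> str:
    """Prefer a structured hook signal; fall back to patterns, then busy."""
    if hook_state is not None:
        return hook_state
    detector = DETECTORS.get(tool)
    guessed = detector(output) if detector else None
    return guessed or "busy"
```

The design choice worth noting is the ordering: an explicit signal from the tool always wins, and pattern matching on output is only a fallback, because output formats can change between releases without notice.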
The lack of a standardized "agent state" protocol across tools is the single biggest barrier to building a universal terminal session manager. Every orchestrator must reverse-engineer each coding tool's output patterns independently.
Why Your Brain Cannot Keep Up
The cognitive science is unambiguous. George Miller's famous 1956 finding that working memory holds about seven items (plus or minus two) is often cited, but later research by Nelson Cowan paints a harsher picture. When you need to actively attend to multiple things simultaneously, as opposed to passively holding them in memory, the limit drops to about three or four. Running ten agents means you are roughly three times over your cognitive bandwidth.
The cost of switching between agents compounds the problem. Research by Rubinstein, Meyer, and Evans found that task switching can consume up to 40% of productive time. A study from the University of California, Irvine found it takes an average of 23 minutes and 15 seconds to fully refocus after an interruption. If you are bouncing between ten terminal windows, you may never achieve deep focus at all.
This is not an abstract concern. The parallel to operations teams is instructive. A 2023 Datadog survey found that 63% of organizations deal with over 1,000 cloud infrastructure alerts every day. Twenty-two percent handle more than 10,000. And PagerDuty reports that 28% of teams admit to missing critical alerts because of sheer alert fatigue. If operations teams with dedicated tooling struggle with notification overload, a solo developer monitoring ten AI agents through raw terminal windows has no chance.
The solution is not more notifications. It is smarter routing. The AURA framework, published in a 2024 paper, scores agent actions on a 0-100 risk scale and auto-approves anything under 30 while escalating scores above 60 to a human. Microsoft's multistage approvals combine AI and human reviews, automatically processing routine requests while flagging complex decisions. These patterns map directly to the agent orchestration problem: let low-risk file reads happen automatically, but interrupt the human for destructive operations.
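The thresholds described above translate into a small routing function. This is a sketch under stated assumptions rather than the AURA framework's own implementation: the text specifies only what happens below 30 and above 60, so the `QUEUE` outcome for the middle band, along with the function and enum names, is an assumption.

```python
from enum import Enum


class Route(Enum):
    AUTO_APPROVE = "auto_approve"
    QUEUE = "queue"        # assumed handling for the unspecified middle band
    ESCALATE = "escalate"


def route_action(risk: float, low: float = 30, high: float = 60) -> Route:
    """Route an agent action by its risk score on a 0-100 scale."""
    if not 0 <= risk <= 100:
        raise ValueError(f"risk must be within 0-100, got {risk}")
    if risk < low:
        return Route.AUTO_APPROVE  # e.g. a low-risk file read
    if risk > high:
        return Route.ESCALATE      # e.g. a destructive operation
    return Route.QUEUE
```

With these defaults, `route_action(10)` auto-approves, `route_action(45)` lands in the assumed review queue, and `route_action(85)` interrupts the human.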
Lessons from Queues We Already Know
The "route human attention to items needing action" problem is not new. Support ticket systems, CI/CD dashboards, incident management platforms, and PR review tools have all solved versions of it.
PagerDuty's escalation policies notify one person at a time, escalating to the next responder only if the first does not acknowledge within a configurable timeout (defaulting to 30 minutes). Zendesk's omnichannel routing orders tickets by priority and time-to-SLA-breach, with 100 priority levels available. Graphite's PR Inbox organizes pull requests into sections like "Needs your review" and "Waiting for review," and even lets developers approve and merge small PRs directly from Slack. Grafana links alert rules to visualization panels with color-coded health indicators.
The consistent patterns across these systems are telling: priority-based ordering, visual state indicators (color, badges, countdowns), escalation timeouts, notification grouping to reduce redundancy, and the ability to take action in place without switching contexts. An agent terminal manager that applied these patterns (showing a ranked list of agents needing attention, with timers for how long each has been waiting, and inline approval buttons) would be a massive improvement over scanning raw terminal windows.
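The ranked list described above could be sketched as a simple sort. The `AgentStatus` fields, the `URGENCY` ordering (blocked agents ahead of finished ones), and the function name are assumptions chosen for illustration; a real manager would likely add risk scores and grouping on top.

```python
from dataclasses import dataclass


@dataclass
class AgentStatus:
    name: str
    state: str    # "waiting", "idle", or "busy"
    since: float  # time the agent entered this state, in seconds


# Assumed urgency order: blocked agents first, then finished ones.
URGENCY = {"waiting": 0, "idle": 1, "busy": 2}


def attention_queue(agents: list[AgentStatus], now: float) -> list[tuple[str, int]]:
    """Rank agents needing attention; return (name, seconds waited) pairs."""
    pending = [a for a in agents if a.state != "busy"]
    # Most urgent state first; within a state, longest wait first.
    pending.sort(key=lambda a: (URGENCY[a.state], a.since))
    return [(a.name, int(now - a.since)) for a in pending]
```

Busy agents are left out of the queue entirely, which mirrors the inbox pattern: the list shows only what needs a decision, and the wait time makes stale requests visible.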
Named Tmux Manager (ntm) is one of the closest implementations. It supports 3-10+ agents in a tiled layout, adds real-time token velocity badges showing tokens-per-minute for each agent, and uses visual hierarchy with color themes and structured layouts. The project explicitly lists the pain points it addresses: window chaos, context switching burden, manual orchestration, session fragility, and poor visibility.
The best analogy may be Graphite's PR Inbox: an opinionated view that surfaces exactly what needs your attention, organized by urgency, with inline action capabilities. Apply that pattern to agent terminals, and you get a fundamentally different experience from switching between tmux panes.
How We Got Here
- 1956: George Miller publishes research establishing the 7 plus/minus 2 limit for working memory capacity
- October 2025: Cursor 2.0 releases with an agent-centered interface and background agents with OS notifications
- November 2025: Agent of Empires releases TUI dashboard for multi-agent terminal management with three-state detection
- November 2025: CCManager released supporting 8 AI coding assistants with real-time state monitoring
- December 2025: TeammateTool discovered hidden inside Claude Code's binary, sparking community interest in multi-agent orchestration
- December 2025: AgentBase and ccswarm released, adding centralized approval queues and Git worktree isolation
- January 2026: Claude Code Agent Teams officially released alongside Sonnet 5, moving from community hack to vendor feature
- January 2026: claude-flow v3 ships with support for 60+ agent swarms and 170+ MCP tools
- January 2026: GitHub Copilot CLI enhanced with 4 built-in autonomous agents and parallel execution
- February 2026: OpenAI launches Codex app as a command center for multi-agent orchestration with diff visualization and approval workflows
References
- Claude Code Agent Teams Documentation Anthropic
- Claude Code Hooks System Anthropic
- Claude Code Headless Mode Anthropic
- Windsurf Cascade Documentation Windsurf
- Amazon Q Developer CLI Agent Orchestrator AWS
- OpenHands Agent Delegation OpenHands
- Aider Scripting Documentation Aider
- PagerDuty Escalation Policies PagerDuty
- Zendesk Omnichannel Routing Zendesk
- Grafana Alerting Fundamentals Grafana
- tmux Control Mode tmux
- /proc/pid/wchan Man Page Linux
- pexpect API Documentation pexpect
- Microsoft Multistage Approvals Microsoft
- claude-flow Multi-agent orchestration, 60+ agents
- ccswarm Rust-based, Git worktree isolation
- AgentBase Centralized approval command center
- CCManager 8 AI assistants, three-state monitoring
- Named Tmux Manager (ntm) 3-10+ agents, token velocity badges
- MultipleCursor Concurrent Cursor IDE instances
- Zellij WASM plugin terminal multiplexer
- cursor-agent-notifier macOS completion notifications
- fswatch Cross-platform file change monitor
- Miller's Law: The Magical Number Seven 1956
- Cowan's Working Memory Research Laws of UX
- AURA: Agent Autonomy Risk Assessment Framework arXiv
- Alert Fatigue Best Practices Datadog
- Alert Fatigue Metrics PagerDuty
- Task Switching Cost Research Monitask
- LangGraph interrupt() for HITL Agents LangChain
- The TTY Demystified Linus Akesson, 2008
- 10 Things Developers Want from Agentic IDEs RedMonk
- OpenAI Codex App Analysis Intuition Labs
- Testing Claude Code Swarm Mode Medium
- Claude Code's Hidden Swarm Feature Paddo.dev
- How to Parallelize AI Coding Agents Tessl
- Git Worktrees and AI Agents Medium
- How to Get PR Feedback Faster Graphite
- Orcha Discussion Hacker News
- Claude Code Multi-Agent tmux Setup Dariusz Parys
- Agent of Empires Guide Agent of Empires