Reference · April 2026

The Multi-Agent Amplification Problem

Multi-agent architectures do not simply add PSF gaps — they multiply them. A gap that is manageable in a single-agent deployment becomes a systemic risk when that agent is embedded in a multi-agent system. This reference documents why, and what safe multi-agent architecture requires.

Read time: 14 min · PSF version: v1.1 · Licence: CC BY 4.0 · Citable

The amplification principle

In our PSF assessment of CrewAI, we noted that multi-agent architectures amplify every safety gap. This finding warrants a dedicated reference, because the amplification is not intuitive, and the mechanisms are distinct from simple single-agent risks.

Consider a single agent with a D1 Gap (no prompt injection defence). In a single-agent deployment, the attack surface is: any input the user can provide. In a multi-agent deployment, the attack surface is: any input any agent can provide to any other agent. This includes inputs derived from external tool calls, retrieval results, web content, and the outputs of other agents — all of which can be adversarially crafted if any upstream agent is compromised.

The amplification ratio scales with the number of agents and the richness of the communication topology. A four-agent crew with a shared memory store does not have four times the attack surface of a single agent. In the worst case, it has a combinatorially larger surface, because each agent-to-agent communication channel is a potential injection vector.
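
To see the scaling concretely, here is a back-of-envelope counting model (our illustration, not a PSF formula): treat every directed agent-to-agent channel as a potential injection vector, plus one read vector per agent when a shared store is present.

```python
# Back-of-envelope count of injection vectors in a crew. The counting
# model is illustrative: every ordered agent pair is a channel, and a
# shared memory store adds one read vector per agent.

def injection_vectors(n_agents: int, shared_store: bool = True) -> int:
    channels = n_agents * (n_agents - 1)         # directed A -> B channels
    store_reads = n_agents if shared_store else 0
    return channels + store_reads

print(injection_vectors(1, shared_store=False))  # 0: only user input remains as a vector
print(injection_vectors(4))                      # 16: 12 channels + 4 store reads
```

Under this model, the four-agent crew exposes sixteen internal vectors on top of its user-facing inputs, where the single agent exposed none.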

The five amplification mechanisms

1. Blast radius amplification

A single agent with access to a tool can cause harm proportional to that tool's capabilities. An orchestrator agent can direct multiple specialist agents, each with their own tool access, to act simultaneously. A compromised orchestrator can trigger a cascade of downstream harm across all the tools accessible to all the agents it controls.

The practical mitigation is least-privilege decomposition: agents should have access only to the tools they need for their specific role, and the orchestrator should not aggregate all tool permissions. CrewAI's role-based agent model is architecturally sound on this point — the implementation risk is that practitioners assign over-broad tool sets to each role.
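
As a framework-neutral sketch of that discipline, permissions can be declared per role before any tool is granted. The AgentSpec class and tool names below are hypothetical, not CrewAI API.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical sketch: each role declares its permission boundary up
# front, and grants outside that boundary fail loudly. The orchestrator
# routes work but holds no tools at all.

@dataclass
class AgentSpec:
    role: str
    allowed_tools: frozenset          # declared before any assignment
    tools: dict = field(default_factory=dict)

    def grant(self, name: str, fn: Callable) -> None:
        if name not in self.allowed_tools:
            raise PermissionError(f"{self.role} may not hold tool {name!r}")
        self.tools[name] = fn

researcher   = AgentSpec("researcher",   frozenset({"web_search"}))
sender       = AgentSpec("sender",       frozenset({"send_email"}))
orchestrator = AgentSpec("orchestrator", frozenset())   # no aggregated permissions

researcher.grant("web_search", lambda q: f"results for {q}")
# orchestrator.grant("send_email", ...)  would raise PermissionError
```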

2. Agent-to-agent trust escalation

When Agent A calls Agent B, what trust level should Agent B assign to the message from Agent A? The naive answer is high trust, since both agents belong to the same system. This is wrong: it sets up a confused-deputy problem, with Agent B lending its privileges to whoever controls Agent A.

If Agent A can be compromised via prompt injection (D1 Gap), then any message from Agent A to Agent B may contain adversarial content. If Agent B treats A's messages as trusted system instructions rather than untrusted external input, a D1 compromise of Agent A cascades into a D1 compromise of Agent B, and then into every agent B communicates with.

The correct model: treat inter-agent messages with the same validation as user inputs, unless there is a cryptographically enforced trust channel. No current framework implements cryptographic inter-agent message signing. This is a structural gap in the entire multi-agent ecosystem.
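
For illustration only, a cryptographically enforced channel could be as simple as HMAC-signed envelopes over a per-deployment key. Since no framework ships this today, the envelope format and key handling below are our assumptions, built on Python's standard hmac module.

```python
import hashlib
import hmac
import json

# Hypothetical signed-envelope scheme for inter-agent messages.
SECRET = b"per-deployment key"   # in production, load from a secrets manager

def sign(sender: str, payload: dict) -> dict:
    body = json.dumps({"sender": sender, "payload": payload}, sort_keys=True)
    tag = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    return {"body": body, "tag": tag}

def receive(envelope: dict) -> dict:
    expected = hmac.new(SECRET, envelope["body"].encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, envelope["tag"]):
        raise ValueError("unsigned or tampered message: treat as untrusted input")
    return json.loads(envelope["body"])["payload"]

msg = sign("agent_a", {"task": "summarise Q3 figures"})
print(receive(msg))
```

Note that signing only authenticates origin: a verified message from a compromised agent is still adversarial, so content validation remains necessary regardless.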

3. Shared context contamination

Multi-agent systems frequently share a working context — a shared memory, a task queue, or a conversation history that multiple agents read and write. A single compromised agent can write adversarial content to this shared context, where it will be processed as trusted input by all other agents reading from it.

This is particularly dangerous in systems where the shared context includes retrieved documents, tool outputs, or external data. Indirect prompt injection in one agent's retrieval can propagate through the shared context to every agent in the system.
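
One way to keep retrieved content as data rather than instructions is to tag every shared-context entry with provenance and render untrusted entries inside explicit data fences before any agent consumes them. The structure below is an illustrative sketch; the field names and fence format are assumptions.

```python
from dataclasses import dataclass

# Hypothetical shared-context entry carrying provenance. Only
# operator-authored instructions are ever marked trusted.

@dataclass(frozen=True)
class ContextEntry:
    source: str      # e.g. "retrieval:https://example.com" or "agent:researcher"
    trusted: bool
    content: str

shared_context: list = []

def write_retrieval(url: str, text: str) -> None:
    # External data always enters the shared context as untrusted.
    shared_context.append(ContextEntry(f"retrieval:{url}", False, text))

def render_for_prompt(entry: ContextEntry) -> str:
    # Downstream agents see untrusted content fenced as data, not as
    # instructions spliced into their prompt.
    if entry.trusted:
        return entry.content
    return (f"<untrusted_data source={entry.source!r}>\n"
            f"{entry.content}\n</untrusted_data>")
```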

4. Oversight gap multiplication

In a single-agent system with L1 oversight (human in the loop for high-risk actions), the human reviewer sees all the high-risk actions the agent takes. In a multi-agent system, high-risk actions may be composed from multiple lower-risk sub-actions — each individually below the review threshold, but together constituting a high-risk outcome.

Consider an orchestrator that delegates: Agent A fetches customer data, Agent B analyses it, Agent C drafts an email, Agent D sends it. None of these actions individually crosses a review threshold, yet the composite action — sending a data-informed email to a customer — should have required review. Oversight design in multi-agent systems must reason about composed actions, not just individual agent actions.
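
A sketch of threshold logic that reasons over the composed plan rather than step by step; the risk scores and threshold are invented for illustration.

```python
# Illustrative risk scores per sub-action and a review threshold.
SUB_ACTION_RISK = {
    "fetch_customer_data": 2,
    "analyse": 1,
    "draft_email": 1,
    "send_email": 3,
}
REVIEW_THRESHOLD = 4

def composite_requires_review(plan: list) -> bool:
    # A per-step check alone would miss this plan entirely.
    any_step = any(SUB_ACTION_RISK[a] > REVIEW_THRESHOLD for a in plan)
    composed = sum(SUB_ACTION_RISK[a] for a in plan) > REVIEW_THRESHOLD
    return any_step or composed

plan = ["fetch_customer_data", "analyse", "draft_email", "send_email"]
print(any(SUB_ACTION_RISK[a] > REVIEW_THRESHOLD for a in plan))  # False: no single step trips review
print(composite_requires_review(plan))                           # True: the composed action does
```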

5. Observability collapse

Single-agent systems produce a linear trace: input → reasoning → output → action. Multi-agent systems produce a distributed trace: concurrent agent executions, message passing, shared state mutations, parallel tool calls. Without distributed tracing that correlates all agent activities in a single transaction view, it is impossible to understand what happened when something goes wrong.

LangSmith and Langfuse both support multi-agent distributed tracing with parent-child span relationships. Arize Phoenix provides similar capability. This is not optional in multi-agent deployments — without it, D4 is not satisfiable.
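
A minimal sketch of parent-child span correlation using the OpenTelemetry Python SDK with a console exporter (Arize Phoenix is built on OpenTelemetry; whether another backend ingests these traces directly is something to verify). Span names are illustrative.

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor

# Configure tracing before any agent runs, not after an incident.
trace.set_tracer_provider(TracerProvider())
trace.get_tracer_provider().add_span_processor(
    SimpleSpanProcessor(ConsoleSpanExporter()))
tracer = trace.get_tracer("multi_agent_demo")

# One root span per transaction; every agent and tool call nests under it.
with tracer.start_as_current_span("transaction"):
    with tracer.start_as_current_span("agent:researcher"):
        with tracer.start_as_current_span("tool:web_search"):
            pass                                  # tool call traced as grandchild
    with tracer.start_as_current_span("agent:writer"):
        pass                                      # sibling agent, same parent
```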

Framework safety posture for multi-agent deployments

AutoGen / AG2: Most structurally sound

UserProxyAgent creates genuine human checkpoints in the conversation flow. Docker container isolation for code execution limits blast radius. The research-oriented deployment model means you must add production infrastructure, but the safety architecture is correct.

LangGraph: Good with explicit design

Interrupt nodes provide composable oversight points. Graph structure makes the blast radius explicit at design time. The risk is implicit trust between nodes — inter-node messages are not validated. Add guardrails at every node boundary, not just at the graph entry.

Semantic Kernel: Strong isolation via plugin model

Plugin boundaries are natural permission boundaries. Each plugin (agent) can have its own auth scope. Azure RBAC provides real isolation rather than honour-system isolation. Best enterprise option for controlled blast radius.

CrewAI: Highest risk — most attention required

Role-based architecture is intuitive but encourages over-broad tool assignment. Shared crew context is a contamination surface. Agent-to-agent messages are not validated. Requires the most companion tooling for safe multi-agent deployment. Use AutoGen or LangGraph for high-risk multi-agent systems.

Safe multi-agent architecture principles

01
Least-privilege decomposition

Each agent holds only the permissions it needs for its specific role. The orchestrator does not aggregate all permissions. Permission boundaries are defined before tool assignments are made.

02
Validate at every boundary

Inter-agent messages are validated with the same rigour as user inputs. There is no trusted internal channel unless it is cryptographically enforced. Assume any message can be adversarially crafted.

03
Isolated working contexts

Agents that handle external data (web, email, documents) operate in isolated contexts before sharing results with the wider crew. Retrieved content is passed as data, not instructions.

04
Composed action oversight

Human oversight thresholds apply to composed actions, not just individual agent actions. The system reasons about what a sequence of sub-actions collectively achieves before deciding whether human review is required.

05
Distributed tracing from day one

Multi-agent systems must have distributed tracing configured before production deployment. Post-hoc addition is extremely difficult. Choose an observability platform that supports multi-agent parent-child span correlation.

06
Circuit breakers between agents

If Agent A fails or behaves unexpectedly, Agent B should not cascade. Circuit breakers at agent boundaries prevent failure propagation, as in the sketch below. Define what 'unexpected' means in terms of output schema, volume, and timing.
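
A minimal circuit-breaker sketch at an agent boundary, assuming a failure-count trip condition and an output-schema check as the definition of 'unexpected'; the thresholds are illustrative.

```python
import time
from typing import Callable

class AgentCircuitBreaker:
    """Wraps calls from one agent to another and trips after repeated failures."""

    def __init__(self, max_failures: int = 3, cooldown_s: float = 60.0):
        self.max_failures = max_failures
        self.cooldown_s = cooldown_s
        self.failures = 0
        self.opened_at = None

    def call(self, agent_fn: Callable, payload: dict) -> dict:
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.cooldown_s:
                raise RuntimeError("circuit open: downstream agent isolated")
            self.opened_at = None     # half-open: allow a single probe call
            self.failures = 0
        try:
            result = agent_fn(payload)
            if not isinstance(result, dict) or "output" not in result:
                raise ValueError("output schema violation")   # 'unexpected' by schema
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()   # trip: stop the cascade
            raise
        self.failures = 0
        return result
```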

Related assessments

CrewAI PSF Assessment
AutoGen PSF Assessment
Framework comparison matrix
D1 Input Governance guide
D6 Human Oversight guide
Observability tools comparison