Architectures that route signals about agent output quality back as inputs to improve the next cycle.
Feedback loops are what transform a static agent deployment into a system that improves with use. Rather than treating each agent interaction as independent, feedback loops capture signals about output quality and route them back to inform future agent behaviour.
A feedback loop has four stages: signal collection (capturing indicators of output quality — user ratings, human corrections, downstream outcomes, or automated quality checks), signal validation (filtering noise from genuine quality signal), integration (translating validated signals into changes to agent behaviour — prompt updates, retrieval index changes, routing adjustments, or training data additions), and review (human approval of proposed changes before they are deployed). The human approval gate is not optional for production systems: an automated system that modifies its own behaviour without human review is a security risk and a governance failure, regardless of how beneficial individual changes appear.
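The four stages can be sketched as a minimal pipeline. This is an illustrative sketch, not a reference implementation: the class names, the `weight` field, and the validation threshold are all assumptions. The key structural point it shows is that `review` is the only path to an approved change.

```python
from dataclasses import dataclass

@dataclass
class Signal:
    source: str        # e.g. "user_rating", "reviewer_override"
    value: float       # normalised quality indicator in [0, 1]
    weight: float = 1.0  # confidence in this signal (assumed field)

@dataclass
class ProposedChange:
    description: str
    approved: bool = False

class FeedbackLoop:
    """Illustrative four-stage loop; every name here is hypothetical."""

    def collect(self, raw_events):
        # Stage 1: signal collection from ratings, overrides, outcomes
        return [Signal(e["source"], e["value"]) for e in raw_events]

    def validate(self, signals, min_weight=0.5):
        # Stage 2: filter noise from genuine quality signal
        return [s for s in signals if s.weight >= min_weight]

    def integrate(self, signals):
        # Stage 3: translate validated signals into a proposed change
        if not signals:
            return None
        avg = sum(s.value for s in signals) / len(signals)
        return ProposedChange(f"adjust prompt; mean quality signal {avg:.2f}")

    def review(self, change, approver):
        # Stage 4: human approval gate; nothing deploys without it
        change.approved = approver(change)
        return change.approved
```

Note that `integrate` only ever produces a `ProposedChange` with `approved=False`; deployment logic downstream would check that flag, which is how the gate stays mandatory rather than advisory.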
A document review platform captures feedback from reviewers: when a reviewer overrides an agent's recommendation, the override is logged with the reviewer's reasoning. Every two weeks, the operations team reviews a sample of overrides to identify patterns. When a pattern is identified (e.g. the agent consistently over-flags a specific contract clause type as high risk when human reviewers consider it standard), the pattern is used to update the agent's evaluation criteria — with legal team sign-off before deployment. The update is then tested on a held-out evaluation set before it goes live.
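The override-logging and biweekly sampling workflow might look roughly like the following. The log structure, sample size, and pattern threshold are assumptions for illustration; the real platform's schema is not described in this section.

```python
import random
from collections import Counter

# Hypothetical in-memory override log; a real system would persist this
override_log = []

def log_override(clause_type, reasoning):
    """Record a reviewer override together with the reviewer's reasoning."""
    override_log.append({"clause_type": clause_type, "reasoning": reasoning})

def sample_overrides(k, seed=0):
    """Biweekly review samples overrides rather than reading all of them."""
    rng = random.Random(seed)
    return rng.sample(override_log, min(k, len(override_log)))

def find_patterns(sample, threshold=0.3):
    """Clause types appearing in >= threshold of the sample are candidate
    patterns; these go to the legal team for sign-off, they are never
    applied to the agent's evaluation criteria automatically."""
    counts = Counter(o["clause_type"] for o in sample)
    n = len(sample)
    return [c for c, cnt in counts.items() if cnt / n >= threshold]
```

The threshold-based pattern check is deliberately crude; its job is to surface candidates for human review, not to decide what changes.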
Agents that cannot improve from feedback are agents that repeat the same mistakes indefinitely. Feedback loops are how the organisation's AI systems become more accurate over time, more aligned with how the organisation actually works, and more resistant to the distribution shift that causes deployed model performance to degrade. They are also how organisations maintain meaningful control over AI behaviour as the environment changes.
How this pattern fails in practice — and what to watch for.
The feedback signal used is a proxy for the actual business outcome — for example, user engagement rather than decision quality. The agent optimises for the proxy, producing engaging but poor-quality outputs. The feedback loop is making the agent worse at what it is actually for, while metrics improve.
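One mitigation is to monitor how well the proxy tracks the real outcome and alert when they decouple. A dependency-free sketch of a Pearson-correlation check follows; the function name and the idea of paired proxy/outcome scores are assumptions, since the section does not prescribe a specific monitoring method.

```python
def proxy_divergence(proxy_scores, outcome_scores):
    """Pearson correlation between the proxy metric and the true outcome.

    If the loop optimises the proxy, this value should be tracked over
    time: a falling correlation signals the proxy is decoupling from
    actual decision quality even while the proxy itself improves.
    """
    n = len(proxy_scores)
    mp = sum(proxy_scores) / n
    mo = sum(outcome_scores) / n
    cov = sum((p - mp) * (o - mo)
              for p, o in zip(proxy_scores, outcome_scores))
    sp = sum((p - mp) ** 2 for p in proxy_scores) ** 0.5
    so = sum((o - mo) ** 2 for o in outcome_scores) ** 0.5
    return cov / (sp * so) if sp and so else 0.0
```

A correlation near 1.0 means the proxy still tracks the outcome; values drifting toward zero (or negative) are the failure mode described above becoming measurable.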
Users or downstream systems discover that providing certain feedback signals causes the agent to behave in ways that benefit them. They provide systematically biased feedback. The feedback loop incorporates this bias, and the agent drifts toward serving those users at the expense of others.
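A crude screen for this kind of gaming is to compare each feedback source against the population: a source whose mean rating sits several standard errors from the overall mean warrants manual inspection before its signals are integrated. The data shape and z-score threshold below are illustrative assumptions.

```python
from collections import defaultdict

def flag_skewed_sources(feedback, z_threshold=2.0):
    """Flag sources whose mean rating deviates sharply from the population.

    `feedback` is a list of (source_id, rating) pairs; all names are
    illustrative. This catches sustained skew, not a one-off outlier,
    because the standard error shrinks as a source submits more ratings.
    """
    by_source = defaultdict(list)
    for source, rating in feedback:
        by_source[source].append(rating)
    all_ratings = [r for _, r in feedback]
    mean = sum(all_ratings) / len(all_ratings)
    var = sum((r - mean) ** 2 for r in all_ratings) / len(all_ratings)
    std = var ** 0.5 or 1.0  # avoid division by zero on uniform data
    flagged = []
    for source, ratings in by_source.items():
        src_mean = sum(ratings) / len(ratings)
        z = abs(src_mean - mean) / (std / len(ratings) ** 0.5)
        if z >= z_threshold:
            flagged.append(source)
    return flagged
```

Flagging is only the detection half; what happens to a flagged source's signals (down-weighting, exclusion, escalation) is itself a change that should pass through the human approval gate.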
The feedback loop is calibrated to integrate signals rapidly and apply changes frequently. In a period of unusual inputs, a feedback cycle produces a significant behavioural shift in 48 hours. The shift is noticed only after it has propagated into production.
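A guard on integration speed and change magnitude is one way to keep a 48-hour shift from reaching production unreviewed: changes that are too large or too frequent are held for human review instead of applied. The thresholds and function signature below are illustrative, not prescribed by the pattern.

```python
def guard_change(proposed_delta, last_change_ts, now,
                 max_delta=0.1, min_interval_hours=168):
    """Decide whether a proposed behavioural change may auto-apply.

    proposed_delta:    magnitude of the behavioural shift (assumed to be
                       a normalised scalar for illustration)
    last_change_ts/now: Unix timestamps in seconds
    Changes exceeding max_delta, or arriving sooner than
    min_interval_hours after the previous change, are routed to review.
    """
    too_large = abs(proposed_delta) > max_delta
    too_soon = (now - last_change_ts) < min_interval_hours * 3600
    if too_large or too_soon:
        return "hold_for_review"
    return "apply"
```

The point of the interval check is exactly the scenario above: during a burst of unusual inputs, the loop cannot compress a large shift into a short window without a human seeing it first.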
Seven things to verify before deploying this pattern in production.
Feedback loops appear in CAIG as a central governance topic: the exam specifically tests candidates on the human approval gate requirement and the risks of automated self-modification. AIDA covers feedback loops under D4. CAIAUD auditors are assessed on their ability to identify feedback loop architectures that lack adequate human oversight or that rely on unvalidated proxy metrics.