Production AI Institute — vendor-neutral certification for AI practitioners

Retrieval-Augmented Generation

Connecting agents to external knowledge so they can retrieve facts rather than hallucinate them.

Retrieval-Augmented Generation (RAG) addresses a fundamental limitation of language models: their knowledge is frozen at training time. RAG connects agents to live, organisation-specific knowledge stores so they can retrieve accurate, current, authorised information before generating responses.

A RAG system has three core components:

- The knowledge store: a collection of documents, records, or data points that the agent can query.
- The retrieval mechanism: typically a vector similarity search that finds the documents most relevant to the current query.
- The generation step: a language model that uses the retrieved documents as context to generate an accurate, grounded response.

The critical architecture decisions are: how is the knowledge store maintained and updated? How is access to documents controlled? How does the system handle retrieval failure, when a query returns no relevant documents? And how does the model signal when its response is grounded in retrieved context versus falling back on parametric knowledge?
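The three components can be sketched as a minimal pipeline. Everything here is illustrative: an in-memory list stands in for a real document index, keyword overlap stands in for vector similarity search, and the string returned by `answer` stands in for a language-model call. Note the explicit retrieval-failure branch.

```python
# Minimal RAG pipeline sketch. The knowledge store is an in-memory list;
# retrieval uses naive keyword overlap in place of real vector search.

KNOWLEDGE_STORE = [
    {"id": "doc-1", "text": "Standard supplier lead time is 14 days."},
    {"id": "doc-2", "text": "Returns must be processed within 30 days."},
]

def retrieve(query: str, k: int = 1, min_overlap: int = 1):
    """Score documents by shared keywords; a stand-in for vector search."""
    q_terms = set(query.lower().split())
    scored = []
    for doc in KNOWLEDGE_STORE:
        overlap = len(q_terms & set(doc["text"].lower().split()))
        if overlap >= min_overlap:
            scored.append((overlap, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]

def answer(query: str) -> str:
    docs = retrieve(query)
    if not docs:
        # Retrieval failure: refuse rather than fall back on
        # the model's parametric knowledge.
        return "No relevant documents found; unable to answer."
    context = "\n".join(d["text"] for d in docs)
    # In a real system this context is passed to a language model.
    return f"Grounded answer based on {docs[0]['id']}: {context}"
```

The design choice worth noting is that the no-result branch returns a refusal, never a generated guess.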

In practice

A large retailer deploys a RAG-based agent for its store operations team. The knowledge base contains product specifications, supplier contracts, store procedures, and regulatory compliance guides — updated nightly. When a store manager asks about the lead time for a specific product, the agent retrieves the current supplier contract and answers from it. When asked about a procedure that was updated last week, the agent retrieves the new version. Access controls ensure that agents serving store staff cannot retrieve documents marked as executive-only. Retrieved source documents are cited in every response, enabling the user to verify the answer.
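The retailer scenario can be sketched as follows. Document names, audience tags, and field names are illustrative: each document carries an audience tag and a version, retrieval filters on the caller's role at query time, and every answer cites its sources.

```python
# Sketch of role-filtered retrieval with cited sources.
# All identifiers and tags below are hypothetical examples.

DOCS = [
    {"id": "contract-77", "version": "2024-06-01", "audience": "store",
     "text": "Supplier lead time for SKU 4411 is 10 business days."},
    {"id": "strategy-3", "version": "2024-05-12", "audience": "executive",
     "text": "Planned margin targets for next quarter."},
]

def retrieve_for(role: str, query: str):
    """Keyword-overlap retrieval restricted to the caller's audience."""
    q = set(query.lower().split())
    allowed = [d for d in DOCS if d["audience"] == role]
    return [d for d in allowed if q & set(d["text"].lower().split())]

def answer_with_citations(role: str, query: str) -> str:
    docs = retrieve_for(role, query)
    if not docs:
        return "No accessible documents answer this question."
    body = " ".join(d["text"] for d in docs)
    # Cite every source with its version so the user can verify.
    cites = ", ".join(f"{d['id']}@{d['version']}" for d in docs)
    return f"{body} [sources: {cites}]"
```

A store-role query about lead times is answered with a citation; the same role asking about margin targets gets nothing, because the executive-only document is filtered before generation.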

Why it matters

Without RAG, agents answer questions about your organisation's current policies, products, and procedures from training data that may be months or years out of date — or from hallucinated content that sounds plausible but is wrong. RAG is the pattern that makes agents reliable for knowledge work: it grounds responses in authoritative, current, organisation-owned information.

Framework alignment

PSF Domains
- D3 · Data Protection (View PSF domain →)
- D1 · Input Governance (View PSF domain →)

PAI-8 Controls
- C3 · Data Governance (View PAI-8 standard →)
- C2 · Technical AI Controls (View PAI-8 standard →)

Production failure modes

How this pattern fails in practice — and what to watch for.

Retrieval-grounded hallucination

The agent retrieves a document and then hallucinates a paraphrase of it that says something the document does not say. The response looks grounded because it cites a source, but the cited content doesn't support the claim. This is more insidious than pure hallucination because the citation creates false confidence.
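One cheap guard against this failure mode is checking that the response's content words actually appear in the cited document before the response is released. This is a naive overlap heuristic, not a real claim-verification step (a dedicated NLI or claim-checking model would be stronger); the threshold and stopword list are illustrative.

```python
# Naive groundedness check: flag responses whose content words are
# mostly absent from the cited source document.

STOPWORDS = {"the", "a", "an", "is", "are", "of", "to", "in", "for"}

def support_ratio(response: str, source_text: str) -> float:
    """Fraction of the response's content words found in the source."""
    resp = {w for w in response.lower().split() if w not in STOPWORDS}
    src = set(source_text.lower().split())
    if not resp:
        return 1.0
    return len(resp & src) / len(resp)

def looks_grounded(response: str, source_text: str,
                   threshold: float = 0.6) -> bool:
    return support_ratio(response, source_text) >= threshold
```

A faithful paraphrase scores high; a response that invents numbers or conditions the source never mentions scores low and can be held for review.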

Knowledge base staleness

The retrieval system returns a document that has been superseded by an update. The indexing pipeline failed to process the updated version. The agent confidently cites the outdated policy. In a regulated environment, this can constitute a compliance failure.
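A freshness guard can turn a silent indexing failure into a hard error at retrieval time instead of a confidently cited stale answer. The `MAX_STALENESS` limit and document fields below are illustrative.

```python
# Freshness guard sketch: reject retrieved documents whose indexed
# timestamp exceeds the maximum allowed staleness.
from datetime import datetime, timedelta, timezone

MAX_STALENESS = timedelta(days=2)  # illustrative policy value

def check_freshness(doc, now=None):
    """Raise if the document's index timestamp is too old to serve."""
    now = now or datetime.now(timezone.utc)
    age = now - doc["indexed_at"]
    if age > MAX_STALENESS:
        raise ValueError(f"{doc['id']} indexed {age.days} days ago; "
                         f"exceeds freshness limit")
    return doc
```

Failing loudly here is deliberate: in a regulated environment, refusing to answer is preferable to citing a superseded policy.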

Retrieval permission bypass

The retrieval system does not enforce document-level access controls at query time. An agent serving a customer retrieves an internal pricing strategy document because it is semantically similar to the customer's query. The agent incorporates internal information into its customer-facing response.
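The fix is to enforce grants at query time, after similarity scoring: a document the calling principal holds no grant for is dropped no matter how well it matches the query. Principal names and grant sets below are illustrative.

```python
# Query-time ACL enforcement sketch: similarity ranking happens first,
# then every retrieved document is checked against the caller's grants.

GRANTS = {
    "customer-agent": {"faq-1", "faq-2"},
    "sales-agent": {"faq-1", "faq-2", "pricing-internal"},
}

def enforce_acl(principal, retrieved_ids):
    """Drop any retrieved document the principal has no grant for."""
    allowed = GRANTS.get(principal, set())
    return [doc_id for doc_id in retrieved_ids if doc_id in allowed]
```

A customer-facing agent retrieving the semantically similar internal pricing document has it silently removed before generation ever sees it.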

Implementation checklist

Seven things to verify before deploying this pattern in production.

1. Implement document-level access controls in the retrieval layer, tied to the querying agent's identity.
2. Define and enforce document freshness requirements: how stale can a retrieved document be?
3. Test retrieval precision and recall monthly on a representative query set.
4. Validate that retrieved context is actually incorporated into the response, not ignored.
5. Log all retrievals with document identifiers, versions, and relevance scores.
6. Test with adversarial queries designed to retrieve wrong, sensitive, or confidential documents.
7. Define the agent's behaviour when no relevant documents are retrieved: it must not hallucinate an answer.
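The logging requirement in the checklist can be sketched as a structured, JSON-lines-style retrieval log; field names are illustrative.

```python
# Structured retrieval log sketch: every retrieval records document
# identifiers, versions, and relevance scores so an answer can later
# be traced back to the exact documents behind it.
import json

def log_retrieval(query_id: str, results: list) -> str:
    record = {
        "query_id": query_id,
        "retrieved": [
            {"doc_id": r["id"], "version": r["version"], "score": r["score"]}
            for r in results
        ],
    }
    line = json.dumps(record, sort_keys=True)
    # In production this line would go to an append-only audit log.
    return line
```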

Certification relevance

RAG is covered extensively in AIDA under D3 and D1; the exam tests retrieval access control and knowledge base staleness scenarios in depth. CAIG covers governance of the knowledge base: who can add, update, and remove documents, and what approval process governs those changes. CAIAUD auditors are expected to assess the adequacy of a RAG system's access controls and freshness guarantees.

AIDA — Take the exam →
CAIG — Take the exam →
CAIAUD — Take the exam →

Related patterns

Part 2 · Production Patterns
Memory Management
How agents store and retrieve information across sessions, tools, and agent boundaries.
Part 2 · Production Patterns
Context Window Management
Strategies for fitting the right information into the finite context an agent can process.
Part 1 · Core Patterns
Tool Calling
The pattern that turns a language model from a text generator into an actor.