What Is Retrieval-Augmented Generation?

Definition

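Retrieval-augmented generation (RAG) connects a generative model to external sources such as search indexes, databases, documents, or knowledge bases. Instead of relying only on what the model memorized during training, the system retrieves relevant context at query time and asks the model to answer from that context.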

How it works

The user's query is either embedded as a vector or run as a keyword search, the best-matching source chunks are retrieved, and those chunks are inserted into the prompt. The model then generates an answer grounded in the retrieved material.
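
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the `embed` function is a toy word-count vector standing in for a learned embedding model, the sample chunks are made up, and the final call to a generative model is left out because it depends on whichever model API is in use. All function names here are assumptions chosen for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text):
    """Toy embedding: a sparse word-count vector (stand-in for a real model)."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, chunks, top_k=2):
    """Return up to top_k chunks most similar to the query, best first."""
    q = embed(query)
    scored = [(cosine(q, embed(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop chunks that share no words with the query at all.
    return [c for score, c in scored[:top_k] if score > 0]


def build_prompt(query, retrieved):
    """Insert the retrieved chunks into the prompt as numbered context."""
    context = "\n".join(f"[{i}] {c}" for i, c in enumerate(retrieved, start=1))
    return (
        "Answer using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )


# Made-up sample chunks, for illustration only.
CHUNKS = [
    "Employees accrue 1.5 vacation days per month of service.",
    "Expense reports must be submitted within 30 days of purchase.",
    "Remote work requires written approval from a manager.",
]

query = "How many vacation days do employees accrue?"
top_chunks = retrieve(query, CHUNKS, top_k=1)
prompt = build_prompt(query, top_chunks)  # this string would be sent to the model
```

Numbering the chunks in the prompt is a design choice that makes it possible for the model to cite which source it used.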

Why it matters at work

RAG is one of the most practical patterns for enterprise AI because policies, product docs, contracts, and knowledge bases change faster than models can be retrained.

Workplace example

An HR team uses RAG to answer employee policy questions from approved handbook sections instead of letting a chatbot invent policy.
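
One way to keep such a system limited to approved material is to filter the corpus before retrieval, so unapproved drafts can never reach the prompt. The sketch below assumes each handbook section carries a hypothetical `approved` flag; the `Section` type, the `context_for` helper, and the sample handbook entries are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Section:
    title: str
    text: str
    approved: bool  # hypothetical flag set by whoever signs off on policy text


def context_for(topic, sections):
    """Return the text of approved sections whose title mentions the topic."""
    topic = topic.lower()
    return [s.text for s in sections if s.approved and topic in s.title.lower()]


# Made-up handbook entries, for illustration only.
HANDBOOK = [
    Section("Parental leave", "Eligible employees receive 16 weeks of paid leave.", True),
    Section("Parental leave (draft)", "Proposed: 20 weeks of paid leave.", False),
]

leave_context = context_for("parental leave", HANDBOOK)
dress_context = context_for("dress code", HANDBOOK)
```

When the returned list is empty, as it is for `dress_context` here, the application can decline to answer rather than pass an empty context to the model.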

Frequently Asked Questions

Does RAG eliminate hallucinations?

No. RAG reduces hallucination risk by grounding answers in sources, but retrieval can miss the relevant passage or return the wrong one, and the model can still misread the context it is given. Important answers need citations and review.
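
A simple automated check can support that review step. The sketch below assumes a convention in which the model cites retrieved chunks with bracketed numbers such as `[1]`; that convention and the helper names are assumptions for this example. It only checks that citations exist and point at a chunk that was actually retrieved. It does not verify that the cited text supports the claim, which still needs a reviewer.

```python
import re


def cited_ids(answer):
    """Return the set of bracketed source numbers, such as [2], in the answer."""
    return {int(n) for n in re.findall(r"\[(\d+)\]", answer)}


def needs_review(answer, num_sources):
    """Flag answers that cite nothing or cite a source that was not retrieved."""
    ids = cited_ids(answer)
    if not ids:
        return True
    return any(i < 1 or i > num_sources for i in ids)
```

For example, with two retrieved chunks, an answer ending in `[1]` passes this check, while an answer with no citation or one citing `[3]` is flagged for review.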