What is Retrieval-Augmented Generation (RAG)?

Retrieval-Augmented Generation (RAG) is an approach in which a system first retrieves relevant documents or passages and then has a language model generate an answer from the retrieved content. It is used to improve factual grounding and to give the model access to information more recent than its training data.

Quick definition

RAG means an AI model looks up information first, then writes an answer using that information.

How Retrieval-Augmented Generation (RAG) works

  • The system converts the user's query into a retrieval request.
  • It retrieves matching documents or passages using keyword or semantic search.
  • It passes the retrieved text into the model's context alongside the query.
  • Because the answer is built from known passages, the system can attribute sources and cite them in the answer.
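
The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the corpus `DOCS`, the helper names `tokenize`, `retrieve`, and `build_prompt`, and the prompt wording are all hypothetical, and keyword overlap stands in for whatever search backend a real system would use. The final call to a language model is left out; the sketch stops at the prompt that would be sent.

```python
import re
from collections import Counter

# Hypothetical in-memory corpus; a real system would query a search index
# or a vector store instead.
DOCS = {
    "returns": "Items can be returned within 30 days of delivery for a full refund.",
    "shipping": "Standard shipping takes 3 to 5 business days.",
    "warranty": "The warranty covers manufacturing defects for one year.",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(query, docs, k=2):
    """Keyword search: score each document by occurrences of query tokens."""
    query_tokens = set(tokenize(query))
    scored = []
    for doc_id, text in docs.items():
        counts = Counter(tokenize(text))
        score = sum(counts[token] for token in query_tokens)
        if score > 0:
            scored.append((score, doc_id))
    # Highest score first; ties are broken alphabetically by document id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:k]]


def build_prompt(query, docs, doc_ids):
    """Place the retrieved passages, numbered for citation, ahead of the question."""
    sources = "\n".join(
        f"[{n}] ({doc_id}) {docs[doc_id]}" for n, doc_id in enumerate(doc_ids, start=1)
    )
    return (
        "Answer using only the sources below and cite them by number.\n\n"
        f"Sources:\n{sources}\n\n"
        f"Question: {query}\nAnswer:"
    )


query = "How many days does shipping take?"
top = retrieve(query, DOCS)
prompt = build_prompt(query, DOCS, top)
print(prompt)
```

Numbering the passages in the prompt is one simple way to make source attribution possible: the model can refer to a passage by its number, and the number maps back to a document id.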

Why Retrieval-Augmented Generation (RAG) matters

RAG matters because grounding an answer in retrieved text can reduce hallucinations, that is, fluent statements that no source supports.

RAG also affects which sources an LLM answer draws on and how its citations are produced.

Example use cases

  • Answering a question using a curated knowledge base.
  • Summarizing product documentation by retrieving relevant sections.
  • Producing an answer with citations from retrieved sources.
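
For the last use case, one possible approach is to have the model emit numbered markers such as `[1]` and then map them back to the retrieved sources. The sketch below assumes that marker convention; the function name `resolve_citations`, the file paths, and the sample answer are hypothetical and only serve as an illustration.

```python
import re


def resolve_citations(answer, sources):
    """Map [n] markers in an answer to retrieved sources, in first-cited order.

    Markers that point outside the source list are ignored, since a model
    may emit a number that matches no retrieved passage.
    """
    cited = []
    for marker in re.findall(r"\[(\d+)\]", answer):
        index = int(marker)
        if 1 <= index <= len(sources) and sources[index - 1] not in cited:
            cited.append(sources[index - 1])
    return cited


# Hypothetical retrieved sources and model answer.
sources = ["docs/shipping.txt", "docs/returns.txt"]
answer = (
    "Standard shipping takes 3 to 5 business days [1]. "
    "Returns are accepted for 30 days [2][1]. See also [7]."
)
cited = resolve_citations(answer, sources)
print(cited)
```

Checking markers against the list of retrieved sources keeps a citation from pointing at a passage that was never retrieved.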

Related terms