Essential RAG Reading List
To build production-ready Retrieval-Augmented Generation systems, every developer must understand the core mechanics of retrieval, context routing, and structural indexing.
Here are five foundational papers that form a core curriculum for modern RAG engineering.
1. Dense Passage Retrieval (DPR)
- Paper: Dense Passage Retrieval for Open-Domain Question Answering (Karpukhin et al., 2020)
- Link: arXiv:2004.04906
What you will learn:
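- Vector Embeddings: How a dual-encoder architecture uses two separate BERT encoders to map questions and passages into a shared dense vector space (768 dimensions in the paper).
- Semantic Search: Why dense representations beat sparse term-matching baselines such as BM25 on most of the paper's benchmarks, by matching paraphrases and synonyms that share few keywords with the question.
- MIPS Mechanics: How retrieval reduces to Maximum Inner Product Search over a precomputed passage index (the paper uses FAISS), the same operation that vector databases serve today.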
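To make the retrieval step concrete, here is a minimal sketch of brute-force inner-product search with NumPy. The random vectors are stand-ins for encoder outputs, and the function name `top_k_inner_product` is hypothetical. It illustrates the scoring rule only, not the paper's training setup or indexing code.

```python
import numpy as np

def top_k_inner_product(query_vec, passage_matrix, k=3):
    """Return indices and scores of the k passages with the largest inner product."""
    scores = passage_matrix @ query_vec      # one dot product per passage
    top = np.argsort(-scores)[:k]            # highest scores first
    return top, scores[top]

# Stand-in embeddings: in DPR these would come from the question and passage encoders.
rng = np.random.default_rng(0)
passages = rng.normal(size=(1000, 768)).astype(np.float32)
# A "question" vector built to lie close to passage 42.
question = passages[42] + 0.1 * rng.normal(size=768).astype(np.float32)

indices, scores = top_k_inner_product(question, passages, k=3)
print(indices, scores)
```

An exhaustive scan like this is linear in the number of passages. At larger scale, the same scoring rule is usually served by an approximate nearest-neighbor index.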
2. The Original RAG Paper
- Paper: Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks (Lewis et al., 2020)
- Link: arXiv:2005.11401
What you will learn:
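- Parametric vs. Non-Parametric Memory: How to combine a pre-trained seq2seq generator (BART in the paper) with an external dense index of Wikipedia passages that can be swapped or updated without retraining. The generator and query encoder are fine-tuned jointly; only the passage encoder and its index stay fixed.
- Prompt Conditioning: How retrieved passages are concatenated with the input and passed to the generator to ground its outputs.
- Hallucination Mitigation: How the model marginalizes output probabilities over the top-k retrieved documents (per sequence in RAG-Sequence, per token in RAG-Token), which the authors report yields more factual and specific generations than a BART-only baseline.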
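The idea of weighting the generator's predictions by retrieval scores can be sketched numerically. The example below mixes per-document next-token distributions in the style of a single RAG-Token decoding step. All numbers are made up for illustration, and `marginal_next_token_probs` is a hypothetical helper, not code from the paper.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def marginal_next_token_probs(retrieval_scores, per_doc_token_probs):
    """Mix per-document next-token distributions using retrieval weights.

    retrieval_scores: shape (num_docs,), e.g. query-document inner products.
    per_doc_token_probs: shape (num_docs, vocab_size), each row sums to 1.
    """
    doc_weights = softmax(np.asarray(retrieval_scores, dtype=float))   # p(z | x)
    # sum over documents z of p(z | x) * p(token | x, z)
    return doc_weights @ np.asarray(per_doc_token_probs, dtype=float)

# Toy setup: 2 retrieved documents, vocabulary of 3 tokens.
scores = [2.0, 0.0]
token_probs = [[0.7, 0.2, 0.1],
               [0.1, 0.1, 0.8]]
mixed = marginal_next_token_probs(scores, token_probs)
print(mixed, mixed.sum())
```

The higher-scoring document dominates the mixture, so the token it favors stays the most likely one. A weakly matching document can only shift the distribution a little.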
3. Fusion-in-Decoder (FiD)
- Paper: Leveraging Passage Retrieval with Generative Models for Open-Domain Question Answering (Izacard & Grave, 2020)
- Link: arXiv:2007.01282
What you will learn:
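- Context Stuffing Solutions: How to scale to many retrieved passages (up to 100 in the paper's experiments) by encoding each one separately, so encoder cost grows linearly with the number of passages rather than quadratically.
- Late Fusion: The practice of encoding each question-passage pair independently and combining the evidence in the decoder, which cross-attends over the concatenated encoder outputs.
- Multi-Passage Attention: How the decoder aggregates evidence when many passages compete for relevance. The "lost in the middle" effect was named in later work and is not addressed in this paper.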
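The data flow of Fusion-in-Decoder can be sketched in terms of array shapes. In the sketch below, `encode_passage` is a hypothetical stand-in that returns one random vector per whitespace token; a real system would run a T5-style encoder there. Only the "encode separately, concatenate for the decoder" structure is the point.

```python
import numpy as np

def encode_passage(question, passage, hidden_size=8):
    """Hypothetical stand-in encoder: one vector per whitespace token."""
    tokens = f"question: {question} context: {passage}".split()
    rng = np.random.default_rng(len(tokens))
    return rng.normal(size=(len(tokens), hidden_size))

def fuse_for_decoder(question, passages):
    # Each (question, passage) pair is encoded on its own, so encoder
    # self-attention never spans two passages.
    encoded = [encode_passage(question, p) for p in passages]
    # The decoder cross-attends over the concatenation of all encoder outputs.
    return np.concatenate(encoded, axis=0)

passages = [
    "Paris is the capital of France.",
    "France is in Europe.",
    "Berlin is in Germany.",
]
memory = fuse_for_decoder("What is the capital of France?", passages)
print(memory.shape)
```

Because each passage is encoded in isolation, adding a passage adds one more fixed-size encoder call instead of lengthening a single long input.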
4. The RAG Survey
- Paper: Retrieval-Augmented Generation for Large Language Models: A Survey (Gao et al., 2023)
- Link: arXiv:2312.10997
What you will learn:
- RAG Evolution: The structural differences between Naive RAG, Advanced RAG, and Modular RAG.
- Pre/Post-Retrieval Pipelines: Where techniques such as query rewriting, reranking, and context compression fit around the retrieval step, and what each one is meant to fix.
- Evaluation Frameworks: Standard patterns for benchmarking retrieval quality and generation faithfulness.
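As a small illustration of the retrieval side of evaluation, here are two commonly used ranking metrics, recall@k and reciprocal rank, in plain Python. The document IDs are invented for the example. Generation faithfulness needs a separate judgment step that this sketch does not cover.

```python
def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of relevant documents that appear in the top-k results."""
    if not relevant_ids:
        return 0.0
    hits = len(set(retrieved_ids[:k]) & set(relevant_ids))
    return hits / len(set(relevant_ids))

def reciprocal_rank(retrieved_ids, relevant_ids):
    """1 / rank of the first relevant result, or 0.0 if none is retrieved."""
    relevant = set(relevant_ids)
    for rank, doc_id in enumerate(retrieved_ids, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0

retrieved = ["d7", "d2", "d9", "d4"]   # ranked retriever output
relevant = ["d2", "d4"]                # gold labels for this query

print(recall_at_k(retrieved, relevant, k=3))   # 0.5
print(reciprocal_rank(retrieved, relevant))    # 0.5
```

Averaging the reciprocal rank over a set of queries gives Mean Reciprocal Rank (MRR).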
5. GraphRAG
- Paper: From Local to Global: A Graph RAG Approach to Query-Focused Summarization (Edge et al., 2024)
- Link: arXiv:2404.16130
What you will learn:
- Global Querying: How to answer corpus-wide questions (e.g., "What are the overarching themes?") where top-k vector retrieval over individual chunks falls short, because no single chunk contains the answer.
- Knowledge Graph Indexing: Using LLMs to extract entities, relationships, and claims from unstructured text chunks.
- Hierarchical Summarization: How to pre-summarize community clusters within a graph to answer complex, high-level questions efficiently.
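The indexing idea can be sketched with a toy entity graph. The example below groups entities with plain connected components, which is a deliberate simplification: the paper uses hierarchical Leiden community detection. It then collects the facts that an LLM would summarize for each group. The triples and all function names are hypothetical.

```python
from collections import defaultdict

def build_graph(triples):
    """Undirected adjacency map from (subject, relation, object) triples."""
    graph = defaultdict(set)
    for subj, _rel, obj in triples:
        graph[subj].add(obj)
        graph[obj].add(subj)
    return graph

def connected_components(graph):
    """Stand-in for community detection: groups of mutually reachable entities."""
    seen, communities = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, members = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            members.add(node)
            stack.extend(graph[node] - seen)
        communities.append(members)
    return communities

def community_reports(triples):
    """Collect the facts for each community; an LLM would summarize each list."""
    communities = connected_components(build_graph(triples))
    return [
        sorted(f"{s} {r} {o}" for s, r, o in triples if s in members)
        for members in communities
    ]

# Hypothetical triples, as an LLM extraction step might produce them.
triples = [
    ("Ada", "works_at", "Acme"),
    ("Acme", "based_in", "Berlin"),
    ("Bo", "works_at", "Globex"),
]
reports = community_reports(triples)
print(reports)
```

At query time, a global question would be answered from the per-community summaries rather than from raw chunks, with partial answers combined in a final step.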