ather-techie/rag-interview-system

RAG Interview System

A collection of 200 RAG interview questions and answers, plus system design scenarios, architecture patterns, and production-ready concepts.

📚 Sections

Overview & Concepts · Interview Question Banks · Failure Modes & Production Issues · Coming Soon

📖 Overview & Concepts

| # | Topic | Purpose |
|---|---|---|
| 00a | Roadmap | RAG maturity model, skill progression, and interview prep pathway |
| 00b | RAG Taxonomy | Classification framework for all 12 architectures |
| 00c | Learning Path | Structured curriculum and study plans |
| 00d | System Design Principles | Production-grade architecture patterns |
| 01a | Embeddings | Embedding models, similarity metrics, and fine-tuning |
| 01b | Chunking Strategies | Document splitting and chunk optimization |
| 01c | Vector Databases | Storage, indexing, and hybrid search |
| 01d | Retrieval Strategies | Dense, sparse, hybrid, and advanced retrieval |
| 01e | Reranking | Cross-encoders and precision filtering |
| 01f | Evaluation Metrics | RAGAS, NDCG, and production monitoring |
| 01g | Prompt Injection Risks | Security and defense strategies |

❓ Interview Question Banks

| # | Topic | Questions |
|---|---|---|
| 02.01 | Naive / Basic RAG | 12 |
| 02.02 | Advanced RAG | 12 |
| 02.03 | Modular RAG | 12 |
| 02.04 | Agentic RAG | 12 |
| 02.05 | Graph RAG | 12 |
| 02.06 | Corrective RAG (CRAG) | 12 |
| 02.07 | Self-RAG | 12 |
| 02.08 | Speculative RAG | 12 |
| 02.09 | Multi-modal RAG | 12 |
| 02.10 | Long-context RAG | 12 |
| 02.11 | Adaptive RAG | 10 |
| 02.12 | Structured / SQL RAG | 10 |

RAG Architectures Total: 140 questions

⚠️ Failure Modes & Production Issues

| # | Topic | Questions |
|---|---|---|
| 03.01 | Hallucination Despite Context | 10 |
| 03.02 | Retrieval Failure | 10 |
| 03.03 | Embedding Mismatch | 10 |
| 03.04 | Stale Index Problem | 10 |
| 03.05 | Context Window Overflow | 10 |
| 03.06 | Reranker Failure | 10 |

Failure Modes Total: 60 questions

Grand Total: 200 questions

Difficulty distribution: 13 Basic, 58 Intermediate, 129 Advanced

🔄 Coming Soon

| # | Section | Status |
|---|---|---|
| 04 | Patterns | Planned |
| 05 | Graphs | Planned |
| 06 | Labs | Planned |
| 07 | Simulator | Planned |
| 08 | Evaluation | Planned |
| 09 | Tools | Planned |
| 10 | Decision System | Planned |

🗺️ RAG Landscape Overview

RAG Architectures (12 types):

Naive RAG
  └── Chunk → Embed → Store → Retrieve → Generate

Advanced RAG
  └── Query rewriting + Hybrid search + Re-ranking

Modular RAG
  └── Plug-and-play pipeline components

Agentic RAG
  └── LLM decides when/how to retrieve (ReAct, FLARE)

Graph RAG
  └── Knowledge graph for entity-aware retrieval

Corrective RAG (CRAG)
  └── Evaluates retrieval quality, falls back to web search

Self-RAG
  └── Model trained to reflect, retrieve, and critique itself

Speculative RAG
  └── Small model drafts → Large model selects best

Multi-modal RAG
  └── Retrieve across text, images, tables, audio

Long-context RAG
  └── Stuff entire docs into large context windows

Adaptive RAG
  └── Query classifier routes to no-retrieval / single-hop / multi-hop

Structured / SQL RAG
  └── Text-to-SQL generation for relational database retrieval
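
To make the Naive RAG flow listed above (Chunk → Embed → Store → Retrieve → Generate) concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not code from this repository: the function names (`chunk`, `embed`, `build_index`, `retrieve`, `generate`) are hypothetical, the bag-of-words "embedding" stands in for a real embedding model, a plain list stands in for a vector database, and `generate` only builds the prompt instead of calling an LLM.

```python
import math
import re
from collections import Counter


def chunk(text, size=12):
    """Chunk: split text into fixed-size word windows (no overlap, for brevity)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Embed: toy bag-of-words counts (assumption: stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(docs, size=12):
    """Store: keep (chunk_text, vector) pairs in a plain list."""
    return [(c, embed(c)) for d in docs for c in chunk(d, size)]


def retrieve(index, query, k=2):
    """Retrieve: return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def generate(query, contexts):
    """Generate: stub that returns the prompt an LLM would receive."""
    joined = "\n".join(f"- {c}" for c in contexts)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


docs = [
    "Rerankers use cross-encoders to rescore retrieved chunks.",
    "Vector databases store embeddings and support nearest-neighbour search.",
]
index = build_index(docs)
question = "How do vector databases store embeddings?"
print(generate(question, retrieve(index, question, k=1)))
```

Each stage is a separate function, so swapping one (for example, replacing `embed` with a call to a real embedding model) leaves the others unchanged. That separation is also roughly what the Modular RAG entry describes.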

Production Failure Modes (6 critical issues):

Hallucination Despite Context
  └── LLM ignores retrieved docs, generates false claims

Retrieval Failure
  └── Relevant chunks never surface due to semantic gap

Embedding Mismatch
  └── Query-doc embeddings in different semantic spaces

Stale Index Problem
  └── Index contains outdated information, answers are wrong

Context Window Overflow
  └── Too many/large chunks exceed context, forcing truncation

Reranker Failure
  └── Cross-encoder mis-ranks results, buries correct answers
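
As one possible mitigation for Context Window Overflow, the sketch below packs whole chunks into a fixed token budget in score order, so low-scoring chunks are dropped instead of the prompt being truncated mid-chunk. This is an assumption-laden illustration, not the repository's method: `count_tokens` and `pack_context` are hypothetical names, and whitespace word counts stand in for a real tokenizer.

```python
def count_tokens(text):
    """Crude token estimate: whitespace-separated words (assumption, not a real tokenizer)."""
    return len(text.split())


def pack_context(scored_chunks, budget):
    """Greedily keep the highest-scoring chunks that fit within the token budget.

    scored_chunks: list of (score, text) pairs from a retriever or reranker.
    Returns (kept, dropped); chunks are kept or dropped whole, never cut.
    """
    kept, dropped, used = [], [], 0
    for _score, text in sorted(scored_chunks, key=lambda pair: pair[0], reverse=True):
        cost = count_tokens(text)
        if used + cost <= budget:
            kept.append(text)
            used += cost
        else:
            dropped.append(text)
    return kept, dropped


chunks = [
    (0.91, "alpha beta gamma"),
    (0.40, "delta epsilon zeta eta theta"),
    (0.75, "iota kappa"),
]
kept, dropped = pack_context(chunks, budget=6)
print(kept)     # ['alpha beta gamma', 'iota kappa']
print(dropped)  # ['delta epsilon zeta eta theta']
```

Returning the dropped chunks makes the overflow visible to logging or monitoring, which is harder to do when truncation happens silently inside the prompt builder.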

💡 How to Use

Four content types:

  1. Overview & Concepts (00_overview/, 01_concepts/) — Reference material, not Q&A

    • Read these first to build foundational understanding
    • Comparison tables, ASCII diagrams, code examples, and system design patterns
    • Use to answer conceptual questions and understand mechanisms deeply
  2. Interview Questions (02_interview_bank/) — 10–12 questions per architecture

    • Each section contains interview-style Q&A with detailed answers
    • Sections 01–10: 12 questions each (original 10 + Q11 on cost optimization + Q12 on security)
    • Sections 11–12: 10 questions each (newer RAG types)
    • Questions are tagged with difficulty: [Basic] [Intermediate] [Advanced]
  3. Failure Modes (03_failure_modes/) — 10 questions per failure pattern

    • Six critical production failure scenarios with diagnostic Q&A
    • Use for system design rounds and production-readiness discussions
  4. CHEATSHEET (cheatsheets/CHEATSHEET.md) — Quick reference

    • All 12 RAG types compared in one table
    • Use during phone screens or quick prep

Study path:

  • 1-week prep: Start with 00_overview/learning_path.md → pick a track → follow the schedule
  • Phone screen: cheatsheets/CHEATSHEET.md + Q1–Q5 from relevant architectures
  • System design round: 00_overview/system_design_principles.md + Q9–Q12 from each architecture file (Q9–Q10 in sections 11–12, which have only 10 questions) + 03_failure_modes/ for production readiness
  • Deep prep: Read 01_concepts/ files + all 02_interview_bank/ Q&A

Contributing

This repo grows best with real-world signal. If you were asked a RAG question in an interview, open a PR — real questions are prioritized over synthetically generated ones.

See CONTRIBUTING.md for how to submit a question.


Support

For issues, questions, or general feedback:


License

MIT

