RAG Systems
Retrieval foundations, chunking strategies, hybrid search, reranking, and practical RAG workflows.
Introduction to the Complete RAG Course
Course goals, why RAG matters for AI engineering, and what you will build.
What is RAG, Tokens, Embeddings & Vector Databases
Context windows, chunking, embedding models, and how the injection and retrieval pipelines fit together.
Coding the Injection Pipeline
Chunk → embed → store in a vector DB, implemented from scratch.
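The injection stage can be sketched in plain Python. Here the trigram-hash `embed` function is a toy stand-in for a real embedding model, and the `vector_db` list stands in for an actual vector database; all names are illustrative:

```python
import hashlib
import math

def embed(text, dim=8):
    # Toy deterministic "embedding": hash character trigrams into buckets.
    # A real injection pipeline would call an embedding model here instead.
    vec = [0.0] * dim
    for i in range(len(text) - 2):
        bucket = int(hashlib.md5(text[i:i + 3].encode()).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def chunk(text, size=40):
    # Simplest possible chunking: fixed-size character windows.
    return [text[i:i + size] for i in range(0, len(text), size)]

vector_db = []  # in-memory stand-in for a real vector database
document = ("RAG grounds LLM answers in retrieved context. "
            "Documents are chunked, embedded, and stored before any query arrives.")
for piece in chunk(document):
    vector_db.append({"text": piece, "vector": embed(piece)})
```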
Coding the Retrieval Pipeline
Query → embed → similarity search → top-k chunks → LLM prompt → answer.
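That flow can be sketched end to end with the same kind of toy trigram-hash embedding standing in for a real model (the `embed`, `retrieve`, and `db` names are illustrative):

```python
import hashlib
import math

def embed(text, dim=8):
    # Toy trigram-hash embedding; a real pipeline would call a model API.
    vec = [0.0] * dim
    for i in range(len(text) - 2):
        vec[int(hashlib.md5(text[i:i + 3].encode()).hexdigest(), 16) % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def cosine(a, b):
    return sum(x * y for x, y in zip(a, b))  # both vectors are unit-normalised

# Tiny in-memory "vector DB" of pre-embedded chunks (the injection step).
chunks = [
    "Paris is the capital of France.",
    "Embeddings map text to vectors.",
    "Cosine similarity compares vector directions.",
]
db = [(c, embed(c)) for c in chunks]

def retrieve(query, k=2):
    # Query -> embed -> similarity search -> top-k chunks.
    qv = embed(query)
    ranked = sorted(db, key=lambda item: cosine(qv, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

question = "What is the capital of France?"
context = retrieve(question)
# The final step would send this prompt to an LLM for a grounded answer.
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```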
Cosine Similarity Explained
How vector similarity is measured — the angle between embeddings.
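The measure itself is compact enough to write out; a minimal sketch in plain Python:

```python
import math

def cosine_similarity(a, b):
    # cos(theta) = (a . b) / (|a| * |b|): 1.0 means same direction,
    # 0.0 means orthogonal, -1.0 means opposite.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

same = cosine_similarity([1.0, 2.0], [2.0, 4.0])       # parallel vectors -> 1.0
unrelated = cosine_similarity([1.0, 0.0], [0.0, 1.0])  # orthogonal vectors -> 0.0
```

Note that the result depends only on direction, not magnitude, which is why `[1, 2]` and `[2, 4]` score a perfect 1.0.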
Answer Generation with LLM
From retrieved chunks and user question to a grounded, accurate final answer.
History-Aware Conversational RAG
Multi-turn context and query reformulation — making RAG work in chatbots.
Chunking Strategies Overview
Why chunking is the most impactful RAG decision — fixed vs semantic vs agentic.
Character & Recursive Text Splitter
The simplest chunking methods — when to use each and their trade-offs.
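The recursive idea can be sketched without any library: try coarse separators first and fall back to finer ones only when a piece is still too long (the separator list and `max_len` here are illustrative defaults):

```python
def recursive_split(text, max_len=50, separators=("\n\n", "\n", ". ", " ")):
    # Keep pieces that already fit; otherwise split on the coarsest
    # separator present and recurse with the finer ones.
    if len(text) <= max_len:
        return [text] if text else []
    for i, sep in enumerate(separators):
        if sep in text:
            pieces = []
            for part in text.split(sep):
                pieces.extend(recursive_split(part, max_len, separators[i + 1:]))
            return pieces
    # No separators left: hard-cut by characters as a last resort.
    return [text[j:j + max_len] for j in range(0, len(text), max_len)]

chunks = recursive_split(
    "First paragraph here.\n\nSecond one. It has two sentences.", max_len=25)
```

A production splitter (e.g. LangChain's `RecursiveCharacterTextSplitter`) also re-attaches separators and adds chunk overlap; this sketch drops both for brevity.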
Semantic Chunking
Meaning-preserving chunks using embedding similarity between adjacent sentences.
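A minimal sketch of the idea, using a toy trigram-hash embedding as a stand-in for a real sentence-embedding model (the `threshold` value and helper names are illustrative):

```python
import hashlib
import math

def embed(text, dim=16):
    # Toy trigram-hash embedding; swap in a real model for meaningful similarity.
    vec = [0.0] * dim
    for i in range(len(text) - 2):
        vec[int(hashlib.md5(text[i:i + 3].encode()).hexdigest(), 16) % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def semantic_chunk(sentences, threshold=0.3):
    # Walk adjacent sentence pairs; a similarity drop below the threshold
    # marks a topic boundary, so a new chunk starts there.
    chunks, current = [], [sentences[0]]
    for prev, sent in zip(sentences, sentences[1:]):
        similarity = sum(x * y for x, y in zip(embed(prev), embed(sent)))
        if similarity < threshold:
            chunks.append(" ".join(current))
            current = []
        current.append(sent)
    chunks.append(" ".join(current))
    return chunks

sentences = [
    "Cats are small domesticated felines.",
    "Cats are popular household pets.",
    "The stock market closed higher today.",
]
chunks = semantic_chunk(sentences)
```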
Agentic Chunking
LLM-driven chunking with dynamic metadata — the highest-quality approach.
Multi-Modal RAG with Images and Documents
Embedding and retrieving images alongside text using unified vector spaces.
Advanced Document Retrieval Techniques
Three retrieval methods: similarity, MMR, and score threshold — when to use each.
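Of the three, MMR is the least obvious, so here is a sketch of its greedy selection loop (the `lam` weight and the toy 2-D vectors are illustrative):

```python
import math

def cos(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def mmr(query_vec, doc_vecs, k=2, lam=0.5):
    # Maximal Marginal Relevance: greedily pick the doc that is most
    # relevant to the query AND least similar to docs already picked.
    selected, candidates = [], list(range(len(doc_vecs)))
    while candidates and len(selected) < k:
        def mmr_score(i):
            relevance = cos(query_vec, doc_vecs[i])
            redundancy = max((cos(doc_vecs[i], doc_vecs[j]) for j in selected),
                             default=0.0)
            return lam * relevance - (1 - lam) * redundancy
        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected

docs = [[1.0, 0.0],    # near-duplicate of the next doc
        [0.98, 0.2],   # most relevant to the query
        [0.0, 1.0]]    # different topic
picks = mmr([0.9, 0.44], docs, k=2)  # picks doc 1 first, then the diverse doc 2
```

Plain similarity search would have returned the two near-duplicates; MMR trades a little relevance for diversity.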
Multi-Query RAG for Better Search Results
One user query → multiple LLM-generated reformulations → merged and reranked.
Reciprocal Rank Fusion for Enhanced RAG Performance
Fusing multiple ranked retrieval lists into one robust ranking.
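RRF needs only rank positions, no raw scores, which makes it easy to sketch (k = 60 is the commonly used default; the document names are illustrative):

```python
def reciprocal_rank_fusion(rankings, k=60):
    # Each doc scores the sum over lists of 1 / (k + rank); docs that sit
    # near the top of several lists accumulate the highest totals.
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits  = ["doc_a", "doc_b", "doc_c"]   # from dense retrieval
keyword_hits = ["doc_b", "doc_d", "doc_a"]   # from keyword retrieval
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
# doc_b ranks first: it sits high on both lists.
```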
Hybrid Search combining Vector and Keyword Search
Combining dense semantic and sparse lexical retrieval in one pipeline.
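A minimal sketch of the blend, with a term-overlap score standing in for BM25 and hard-coded dense scores standing in for an embedding model (the `alpha` weight and all names are illustrative):

```python
def keyword_score(query, doc):
    # Sparse side: fraction of query terms that appear verbatim in the doc.
    q_terms = set(query.lower().split())
    return len(q_terms & set(doc.lower().split())) / len(q_terms)

def hybrid_search(query, docs, dense_scores, alpha=0.5):
    # Weighted blend of dense (semantic) and sparse (lexical) signals.
    scored = [(doc, alpha * dense + (1 - alpha) * keyword_score(query, doc))
              for doc, dense in zip(docs, dense_scores)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

docs = ["error code 0x80070057 on install",
        "the installer failed with an unknown problem"]
dense = [0.30, 0.85]  # pretend embedding scores: the paraphrase wins semantically
ranked = hybrid_search("error 0x80070057", docs, dense)
# The exact error-code match wins overall: lexical search rescues rare tokens
# that embedding models tend to miss.
```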
RAG Reranking and Next Steps!
Reranking as the final precision layer, plus a roadmap of next steps toward production.