Hybrid search combines dense semantic retrieval (vector similarity) with sparse lexical retrieval (BM25/keyword). This pairing handles both concept-level meaning and exact-term matching.
Why hybrid beats single-mode retrieval:
- Vector search finds semantically related text even with wording mismatch.
- Keyword/BM25 catches exact entities, IDs, API names, product codes, and legal phrases.
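To make the exact-match behavior concrete, here is a minimal sketch of Okapi BM25 scoring. The whitespace tokenizer, the parameter defaults (`k1=1.5`, `b=0.75`), and the sample documents are illustrative assumptions, not part of any particular library or of the source material.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25 (toy whitespace tokenizer)."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue  # a term missing from the document contributes nothing
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Hypothetical corpus: only the first document contains the exact code.
docs = [
    "error code E1234 raised by the billing service",
    "the payment system failed with an unexpected problem",
    "how to reset your password",
]
scores = bm25_scores("E1234", docs)
best = max(range(len(docs)), key=scores.__getitem__)
print(best, scores)
```

Only the document containing the literal token `E1234` gets a nonzero score. The second document describes a related failure in different words and scores zero, which is the gap the vector branch is meant to cover.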
Typical pipeline: generate query variants (optional) → run dense + sparse retrieval → fuse ranks (often RRF) → apply threshold/reranker → send final evidence to generation.
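The rank-fusion step can be sketched with Reciprocal Rank Fusion (RRF), which sums `1 / (k + rank)` for each document across the branch rankings. The function name `rrf_fuse`, the document ids, and the constant `k=60` (a commonly used default) are assumptions made for illustration.

```python
from collections import defaultdict

def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: score(d) = sum over rankings of 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by doc id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical ranked ids returned by each branch for one query.
dense_hits = ["doc_a", "doc_b", "doc_c"]
sparse_hits = ["doc_c", "doc_a", "doc_d"]

fused = rrf_fuse([dense_hits, sparse_hits])
print(fused)  # ['doc_a', 'doc_c', 'doc_b', 'doc_d']
```

Because RRF uses only rank positions, it needs no score normalization between the dense and sparse branches. Documents found by both branches (`doc_a`, `doc_c`) rise above documents found by only one.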
Production design questions:
- How many candidates from each branch (dense/sparse)?
- How to fuse rankings (RRF vs weighted sum)?
- Where to apply metadata filters (before branch retrieval vs after fusion)?
- How to monitor branch contribution over time?
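For the fusion question, the weighted-sum alternative to RRF can be sketched as below. It assumes min-max normalization so that cosine similarities and BM25 scores become comparable. The names `min_max`, `weighted_fuse`, and `alpha`, along with the sample scores, are hypothetical.

```python
def min_max(scores):
    """Rescale a {doc_id: score} map to [0, 1]; a constant map becomes all 1.0."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {d: 1.0 for d in scores}
    return {d: (s - lo) / (hi - lo) for d, s in scores.items()}

def weighted_fuse(dense, sparse, alpha=0.5, top_k=3):
    """Blend normalized scores: alpha * dense + (1 - alpha) * sparse."""
    dense_n, sparse_n = min_max(dense), min_max(sparse)
    fused = {
        # A document missing from one branch contributes 0.0 for that branch.
        d: alpha * dense_n.get(d, 0.0) + (1 - alpha) * sparse_n.get(d, 0.0)
        for d in set(dense_n) | set(sparse_n)
    }
    return sorted(fused, key=lambda d: (-fused[d], d))[:top_k]

# Hypothetical raw scores on different scales.
dense = {"doc_a": 0.82, "doc_b": 0.79, "doc_c": 0.40}   # cosine similarities
sparse = {"doc_c": 12.1, "doc_d": 7.3, "doc_a": 2.5}    # BM25 scores

print(weighted_fuse(dense, sparse, alpha=0.7))  # ['doc_a', 'doc_b', 'doc_c']
print(weighted_fuse(dense, sparse, alpha=0.2))  # ['doc_c', 'doc_d', 'doc_a']
```

Unlike RRF, the weighted sum exposes an explicit knob (`alpha`) for favoring one branch, at the cost of depending on how scores are normalized. The two calls show how shifting `alpha` changes which branch dominates the final list.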
First-time learner checklist: if queries include IDs, codes, or legal terms, make sure the keyword branch is strong; if users ask conceptual questions in varied language, make sure the vector branch is strong. Hybrid succeeds when both branches are tuned and measured, not merely enabled.
Failure modes: branch imbalance (one retriever dominates), a stale lexical index (updates lag behind the corpus), and over-reliance on vector retrieval for exact-match queries. Strong systems track per-branch hit rates and periodically recalibrate the fusion strategy.
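One way to watch for branch imbalance is to count, per query, which branch supplied each document in the final result list. The sketch below assumes a hypothetical query log that stores the final ids alongside each branch's candidate ids. The field layout and the function name `branch_contribution` are illustrative.

```python
from collections import Counter

def branch_contribution(final_ids, dense_ids, sparse_ids):
    """Count which branch supplied each document in the final result list."""
    dense_set, sparse_set = set(dense_ids), set(sparse_ids)
    counts = Counter()
    for doc_id in final_ids:
        in_dense, in_sparse = doc_id in dense_set, doc_id in sparse_set
        if in_dense and in_sparse:
            counts["both"] += 1
        elif in_dense:
            counts["dense_only"] += 1
        elif in_sparse:
            counts["sparse_only"] += 1
    return counts

# Hypothetical query log: (final top-k, dense candidates, sparse candidates).
query_log = [
    (["a", "b", "c"], ["a", "b", "x"], ["c", "a", "y"]),
    (["d", "e"], ["d", "e", "f"], ["g", "h", "i"]),
]

totals = Counter()
for final_ids, dense_ids, sparse_ids in query_log:
    totals += branch_contribution(final_ids, dense_ids, sparse_ids)

total_docs = sum(totals.values())
shares = {name: count / total_docs for name, count in totals.items()}
print(shares)
```

If the `dense_only` share climbs steadily while `sparse_only` falls toward zero, that could point to a stale lexical index or a fusion weight that needs recalibrating.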
Interview-Ready Deepening
Source-backed reinforcement: the points below add detail beyond the brief UI hints and emphasize production tradeoffs.
- Hybrid search runs dense semantic retrieval (vector similarity) and sparse lexical retrieval (BM25/keyword) in one pipeline.
- Typical pipeline: generate query variants (optional) → run dense + sparse retrieval → fuse ranks (often RRF) → apply threshold/reranker → send final evidence to generation.
- Vector search relies on embeddings that capture the semantic meaning of a sentence or a short phrase, so it catches semantic relationships and context.
- Keyword search, the older classical strategy, catches exact matches and specific terminology.
- An ensemble retriever combines the vector and keyword retrievers into a single hybrid retriever.
- Failure modes: branch imbalance (one retriever dominates), a stale lexical index, and over-reliance on vector retrieval for exact-match queries.
Tradeoffs You Should Be Able to Explain
- Higher recall often increases context noise; reranking and filtering are required to keep precision high.
- Smaller chunks improve semantic precision but can break cross-sentence context needed for accurate answers.
- Aggressive grounding reduces hallucinations but can increase abstentions when retrieval coverage is weak.
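The precision and abstention tradeoffs above can be illustrated with a simple score threshold applied after reranking. This is a minimal sketch that assumes reranker scores in the range 0 to 1. The function name `select_evidence` and the threshold values are hypothetical.

```python
def select_evidence(scored_hits, min_score=0.5, max_chunks=3):
    """Keep the top-scoring chunks above a threshold; return None to signal abstention."""
    kept = [(doc_id, s) for doc_id, s in scored_hits if s >= min_score]
    kept.sort(key=lambda pair: -pair[1])
    # An empty list means no evidence cleared the bar, so the caller should abstain.
    return kept[:max_chunks] or None

# Hypothetical (doc_id, reranker_score) pairs for one query.
hits = [("doc_a", 0.91), ("doc_b", 0.48), ("doc_c", 0.62)]

print(select_evidence(hits, min_score=0.5))   # [('doc_a', 0.91), ('doc_c', 0.62)]
print(select_evidence(hits, min_score=0.95))  # None
```

Raising `min_score` removes noisy context but makes abstention more likely when retrieval coverage is weak, which is the tradeoff described in the list.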
First-time learner note: Master one stage at a time: ingestion, retrieval, then grounded generation. Validate each stage with small test questions before tuning everything together.
Production note: Treat quality as measurable system behavior. Track retrieval relevance, groundedness, and abstention quality with repeatable eval sets.
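A repeatable eval set for retrieval relevance can start as small as the sketch below, which computes recall@k and mean reciprocal rank (MRR). The eval-set structure and its document ids are assumptions for illustration. Groundedness and abstention quality would need separate checks that are not shown here.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of relevant ids that appear in the top-k retrieved ids."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & set(relevant)) / len(relevant)

def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant id, or 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0

# Hypothetical eval set: retrieved ids per query plus hand-labeled relevant ids.
eval_set = [
    {"retrieved": ["a", "b", "c"], "relevant": {"b", "z"}},
    {"retrieved": ["d", "e", "f"], "relevant": {"d"}},
]

mean_recall = sum(recall_at_k(q["retrieved"], q["relevant"], 3) for q in eval_set) / len(eval_set)
mrr = sum(reciprocal_rank(q["retrieved"], q["relevant"]) for q in eval_set) / len(eval_set)
print(mean_recall, mrr)  # 0.75 0.75
```

Running the same eval set before and after a change to candidate counts or fusion weights gives a like-for-like comparison instead of an impression from spot checks.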