RAG

Vector RAG: Inside the Dense-Vector Retrieval Stack

Embedding models, vector indexes, and chunking, treated as three layers of one stack. Part 1 walks the embedding model landscape (OpenAI, Cohere, BGE, E5, MTEB) and the six-step selection-and-fine-tuning framework. Part 2 is the layered descent through HNSW, IVF, and product quantization. Part 3 covers chunking strategies and the lost-in-the-middle effect. Part 4 is the net-new section: how a decision in any one layer cascades into the other two. Builds on the lexical floor established in Classic Search.

60 min read 2026
RAG

Measuring Retrieval

You cannot improve what you cannot measure. Retrieval calls for two distinct measurement disciplines, depending on the retriever: a closed BM25 loop you run inside your own engineering (five steps, a k1 and b sweep, stratified metrics), and MTEB, the external coordinate system the field uses to compare embedding models that nobody owns end-to-end. Covers an extended preface contrasting the two disciplines, the BM25 evaluation loop step by step, the MTEB user manual (datasets, extensions, the maintainers' reproducibility paper, the BRIGHT drop from 59.0 to 18.3 nDCG), and three rules for reading the leaderboard without being misled.

35 min read 2026