Retrieval
X @Avi Chawla
Avi Chawla· 2026-02-01 06:30
Here's a common misconception about RAG!
When we talk about RAG, it's usually thought: index the doc → retrieve the same doc.
But indexing ≠ retrieval
So the data you index doesn't have to be the data you feed the LLM during generation.
Here are 4 smart ways to index data:
1) Chunk Indexing
- The most common approach.
- Split the doc into chunks, embed, and store them in a vector DB.
- At query time, the closest chunks are retrieved directly.
This is simple and effective, but large or noisy chunks can reduce precisi ...
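The chunk indexing steps described in the post (split, embed, store, retrieve the closest chunks) can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: the bag-of-words `embed` function is a toy stand-in for a real embedding model, the in-memory list stands in for a vector DB, and all function names and the sample text are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=8):
    """Split text into consecutive chunks of at most chunk_size words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)]


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(doc, chunk_size=8):
    """Index step: store each chunk next to its vector (in-memory 'vector DB')."""
    return [(chunk, embed(chunk)) for chunk in chunk_text(doc, chunk_size)]


def retrieve(index, query, k=1):
    """Query step: return the k chunks whose vectors are closest to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


# Hypothetical sample document, for illustration only.
doc = ("Vector databases store embeddings for similarity search. "
       "Rerankers reorder retrieved candidates by relevance. "
       "Chunk size affects retrieval precision in RAG systems.")
index = build_index(doc, chunk_size=7)
top = retrieve(index, "how do rerankers reorder candidates", k=1)
print(top[0])
```

Note that the second chunk here spills across a sentence boundary, which is a small-scale version of the noisy-chunk problem the post mentions.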
The State of AI Powered Search and Retrieval — Frank Liu, MongoDB (prev Voyage AI)
AI Engineer· 2025-06-27 09:57
Voyage AI & MongoDB Partnership
- Voyage AI was acquired by MongoDB approximately 3-4 months ago [1]
- The partnership aims to create a single data platform for embedding, re-ranking, query augmentation, and query decomposition [29][30][31]

AI-Powered Search & Retrieval
- AI-powered search finds related concepts beyond identical wording and understands user intent [7][8][9]
- Embedding quality is a core component, with 95-99% of systems using embeddings [12]
- Real-world applications include chatting with codebases, where evaluation is crucial to determine the best embedding model and LLM for the specific application [14][15]
- Structured data, beyond embeddings, is often necessary for building powerful search and retrieval systems, such as filtering by state or document type in legal documents [16][17][18]
- Agentic retrieval involves feedback loops where the AI search system is no longer just input-output, but can expand or decompose queries [19][20]

Future Trends
- The future of AI-powered search is multimodal, involving understanding images, text, and audio together [23][24][25]
- Instruction tuning will allow steering vectors based on instructions, enabling more specific document retrieval [27][28]
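
The point about pairing structured data with embeddings (for example, filtering legal documents by state or document type) might look something like the sketch below. It is an illustrative assumption rather than anything shown in the talk: the `embed` function is a toy bag-of-words stand-in for a real embedding model, and the field names `state` and `doc_type`, the `search` helper, and the sample records are all hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical legal documents with structured metadata fields.
DOCS = [
    {"text": "Tenant eviction notice requirements",
     "state": "CA", "doc_type": "statute"},
    {"text": "Eviction notice requirements for landlords",
     "state": "NY", "doc_type": "statute"},
    {"text": "Court opinion on tenant eviction dispute",
     "state": "CA", "doc_type": "opinion"},
]


def search(docs, query, k=3, **filters):
    """Apply exact-match metadata filters first, then rank by similarity."""
    candidates = [d for d in docs
                  if all(d.get(field) == value
                         for field, value in filters.items())]
    q = embed(query)
    ranked = sorted(candidates,
                    key=lambda d: cosine(q, embed(d["text"])),
                    reverse=True)
    return ranked[:k]


results = search(DOCS, "eviction notice", state="CA", doc_type="statute")
for r in results:
    print(r["state"], r["doc_type"], r["text"])
```

Filtering before ranking keeps the similarity step from surfacing a close textual match from the wrong jurisdiction, which embeddings alone cannot rule out.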