RAG pipeline - filings, earnings calls, financial reports, news

RAG pipeline

Avi Chawla· 2026-02-02 11:47

If you found it insightful, reshare it with your network. Find me → @_avichawla. Every day, I share tutorials and insights on DS, ML, LLMs, and RAGs. https://t.co/lNSHKvmczq Avi Chawla (@_avichawla): Your embedding stack forces a 100% re-index just to change models. And most teams treat that as unavoidable. Imagine you built a RAG pipeline with a large embedding model for high retrieval quality, and it ships to production. Six months later, your application traffic and https://t.co/EtZ05xrK81 ...

Avi Chawla· 2026-01-24 12:26

If you found it insightful, reshare it with your network. Find me → @_avichawla. Every day, I share tutorials and insights on DS, ML, LLMs, and RAGs. Avi Chawla (@_avichawla): 100%. That's how much data you re-index when you change embedding models. And most teams treat that as unavoidable. Imagine you built a RAG pipeline using an embedding model with high retrieval quality, and it ships to production. Six months later, a better embedding model is https://t.co/NIgrqffgyo ...

Avi Chawla· 2026-01-24 06:45

100%. That's how much data you re-index when you change embedding models. And most teams treat that as unavoidable. Imagine you built a RAG pipeline using an embedding model with high retrieval quality, and it ships to production. Six months later, a better embedding model is released that delivers similar quality at a lower cost. But your existing embeddings live in one vector space, while the new model produces embeddings in a different one, which makes them incompatible. Switching models now means rebuilding th ...
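The incompatibility between vector spaces described in the post can be illustrated with a toy sketch. Everything here is an assumption made for illustration: `embed_old` and `embed_new` are hypothetical stand-ins for two embedding models, and the "new model" is modeled as nothing more than a signed permutation of the old coordinates. Real models differ in far more complicated ways (and often in dimensionality), so this only shows why similarity scores stop being meaningful when a query and an index come from different spaces.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def embed_old(features):
    """Hypothetical 'old model': uses the feature vector as-is."""
    return list(features)


def embed_new(features):
    """Hypothetical 'new model': same information, different coordinate system.

    A signed permutation is an orthogonal transform, so similarities
    *within* the new space match those within the old space.
    """
    return [-features[3], features[0], -features[1], features[2]]


# Toy feature vectors standing in for a stored document and a closely related query.
doc = [1.0, 0.0, 0.0, 0.0]
query = [1.0, 0.05, 0.0, 0.0]

same_space_old = cosine(embed_old(query), embed_old(doc))  # old query vs old index
same_space_new = cosine(embed_new(query), embed_new(doc))  # new query vs re-indexed docs
mixed_spaces = cosine(embed_new(query), embed_old(doc))    # new query vs old index

print(f"old/old: {same_space_old:.3f}")
print(f"new/new: {same_space_new:.3f}")
print(f"new/old: {mixed_spaces:.3f}")
```

In this toy setup the query matches the document strongly as long as both sides use the same model, while the mixed comparison carries no signal, which is the situation that forces a re-index of stored vectors.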

Embedding Models

Vector Space

RAG pipeline

Mixture of Experts architecture

voyage-4-large

voyage-4-nano

Prompt Engineering is Dead — Nir Gazit, Traceloop

AI Engineer· 2025-06-27 09:34

Core Argument
- The presentation challenges the notion of "prompt engineering" as a true engineering discipline, suggesting that iterative prompt improvement can be automated [1][2]
- The speaker advocates for an alternative approach to prompt optimization, emphasizing the use of evaluators and automated agents [23]

Methodology & Implementation
- The company developed a chatbot for its website documentation using a Retrieval-Augmented Generation (RAG) pipeline [2]
- The RAG pipeline consists of a Chroma database, OpenAI, and prompts to answer questions about the documentation [7]
- An evaluator was built to assess the RAG pipeline's responses, using a dataset of questions and expected answers [5][7]
- The evaluator uses a ground-truth-based LLM-as-a-judge, checking whether the generated answers contain specific facts [10][13]
- An agent was created to automatically improve prompts by researching online guides, running evaluations, and regenerating prompts based on failure reasons [5][18][19]
- The agent uses CrewAI to think, call the evaluator, and regenerate prompts based on best practices [20]

Results & Future Considerations
- The initial score of the prompt was 0.4 (40%), and after two iterations with the agent, the score improved to 0.9 (90%) [21][22]
- The company acknowledges the risk of overfitting to the training data (20 examples) and suggests splitting the data into train/test sets for better generalization [24][25]
- Future work may involve applying the same automated optimization techniques to the evaluator and agent prompts [27]
- The demo is available in the traceloop/autoprompting demo repository [27]
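The evaluator and train/test split described in the summary can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the talk: `contains_fact` is a plain substring check standing in for the LLM judge, `fake_pipeline` replaces the real RAG pipeline with canned answers, and the questions and facts in `EXAMPLES` are made-up toy data.

```python
import random


def contains_fact(answer, fact):
    """Stand-in for the LLM judge: a case-insensitive substring check (assumption)."""
    return fact.lower() in answer.lower()


def evaluate(answer_fn, examples):
    """Score a pipeline as the fraction of examples whose answer contains every expected fact.

    Returns (score, failures), where failures lists (question, missing_facts)
    so that a prompt-improving step could use the failure reasons.
    """
    passed = 0
    failures = []
    for example in examples:
        answer = answer_fn(example["question"])
        missing = [f for f in example["facts"] if not contains_fact(answer, f)]
        if missing:
            failures.append((example["question"], missing))
        else:
            passed += 1
    return passed / len(examples), failures


def split_examples(examples, test_fraction=0.25, seed=0):
    """Shuffle and split into (train, test) so prompts are not tuned on the data used to score them."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Made-up toy dataset (hypothetical product, not from the talk).
EXAMPLES = [
    {"question": "How do I install the SDK?", "facts": ["pip install"]},
    {"question": "Which languages are supported?", "facts": ["Python", "TypeScript"]},
    {"question": "Where are traces sent?", "facts": ["OTLP"]},
    {"question": "Is there a free tier?", "facts": ["free tier"]},
]

# Canned answers standing in for the real RAG pipeline's output.
CANNED_ANSWERS = {
    "How do I install the SDK?": "Run pip install acme-sdk.",
    "Which languages are supported?": "Python is supported.",
    "Where are traces sent?": "Traces are exported over OTLP.",
    "Is there a free tier?": "Please contact sales.",
}


def fake_pipeline(question):
    return CANNED_ANSWERS[question]


score, failures = evaluate(fake_pipeline, EXAMPLES)
train, test = split_examples(EXAMPLES)
print(f"score={score:.2f}, failures={len(failures)}, train={len(train)}, test={len(test)}")
```

In a setup like the one summarized, an agent would iterate on prompts using the failures from the train portion and report the final score on the held-out test portion, which is one way to address the overfitting risk noted for the 20-example dataset.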