RAG
X @Avi Chawla
Avi Chawla· 2025-11-13 19:16
RAG Challenges & HyDE Solution
- Traditional RAG faces challenges due to semantic dissimilarity between questions and answers, leading to irrelevant context retrieval [1]
- HyDE addresses this by generating a hypothetical answer with an LLM, embedding it, and using that embedding to retrieve relevant context [2]
- HyDE leverages contriever models trained with contrastive learning to filter out hallucinated details in the hypothetical answer [3]
HyDE Performance & Trade-offs
- Studies indicate HyDE improves retrieval performance compared to traditional embedding models [4]
- HyDE implementation results in increased latency and higher LLM usage [4]
HyDE Implementation
- An LLM generates a hypothetical answer (H) for the query (Q) [2]
- The hypothetical answer is embedded with a contriever model to obtain embedding (E) [2]
- Embedding (E) is used to query the vector database and retrieve relevant context (C) [2]
- The hypothetical answer (H), retrieved context (C), and query (Q) are passed to the LLM to produce the final answer [3]
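The four implementation steps above can be sketched end-to-end. Everything in this sketch is a toy stand-in: in a real system `generate_hypothetical_answer` would call an LLM and `embed` would be a contriever-style encoder, and the corpus/vocabulary here are invented for illustration.

```python
import numpy as np

# Toy HyDE pipeline: all components are illustrative stand-ins, not a real
# LLM or contriever model.
def generate_hypothetical_answer(query: str) -> str:
    # Step 1 (Q -> H): an LLM drafts a plausible, possibly imperfect answer.
    return "Machine learning builds models that learn patterns from data."

def embed(text: str) -> np.ndarray:
    # Stand-in embedding: bag-of-words counts over a tiny fixed vocabulary.
    vocab = ["machine", "learning", "models", "data", "intelligence", "agents"]
    tokens = text.lower().replace("?", "").replace(".", "").split()
    v = np.array([tokens.count(w) for w in vocab], dtype=float)
    n = np.linalg.norm(v)
    return v / n if n else v

def retrieve(vec: np.ndarray, corpus: list[str], k: int = 1) -> list[str]:
    # Step 3 (E -> C): cosine-similarity search, mimicking a vector database.
    sims = [float(vec @ embed(doc)) for doc in corpus]
    return [corpus[i] for i in sorted(range(len(corpus)), key=lambda i: -sims[i])[:k]]

corpus = [
    "Machine learning studies models that learn patterns from data.",
    "Artificial intelligence studies intelligent agents.",
]
query = "What is ML?"
hypothetical = generate_hypothetical_answer(query)  # Q -> H
h_vec = embed(hypothetical)                         # Step 2: H -> E
context = retrieve(h_vec, corpus)                   # E -> C
# Step 4 would pass (H, C, Q) to the LLM for the final answer.
```

Note that embedding the raw query "What is ML?" yields the zero vector in this toy setup (none of its words appear in the vocabulary), while the hypothetical answer retrieves the right document; that is the gap HyDE exploits.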
X @Avi Chawla
Avi Chawla· 2025-11-13 13:03
Avi Chawla (@_avichawla): Traditional RAG vs. HyDE, visually explained! RAG is great, but it has a major problem: questions are not semantically similar to their answers. Consider an example where you want to find context similar to "What is ML?" It is likely that "What is AI?" will appear more ...
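The failure mode quoted above, where a question retrieves another question instead of its answer, is easy to reproduce even with a crude bag-of-words similarity; learned dense embeddings show a softer version of the same bias. All strings and the vocabulary below are illustrative.

```python
import math

# Minimal bag-of-words cosine similarity illustrating why "What is ML?" can
# sit closer to "What is AI?" than to an actual answer about ML.
VOCAB = ["what", "is", "ml", "ai", "machine", "learning", "stands", "for"]

def bow(text: str) -> list[int]:
    tokens = text.lower().replace("?", "").split()
    return [tokens.count(w) for w in VOCAB]

def cosine(a: list[int], b: list[int]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

query = "What is ML?"
other_question = "What is AI?"
actual_answer = "ML stands for machine learning"

sim_question = cosine(bow(query), bow(other_question))  # 2/3, shares "what is"
sim_answer = cosine(bow(query), bow(actual_answer))     # ~0.26, shares only "ml"
# The unrelated question outscores the true answer.
```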
X @Avi Chawla
Avi Chawla· 2025-11-13 06:31
RAG Challenges & HyDE Solution
- Traditional RAG faces challenges due to semantic dissimilarity between questions and answers, leading to irrelevant context retrieval [1]
- HyDE addresses this by generating a hypothetical answer to the query and embedding it to retrieve relevant context [2]
- HyDE leverages contriever models trained with contrastive learning to filter out hallucinated details in the hypothetical answer [3]
HyDE Performance & Trade-offs
- Studies indicate HyDE improves retrieval performance compared to traditional embedding models [4]
- The improvement in retrieval performance comes at the cost of increased latency and higher LLM usage [4]
HyDE Implementation
- HyDE involves using an LLM to generate a hypothetical answer, embedding the answer with a contriever model, querying the vector database, and passing the hypothetical answer, retrieved context, and query to the LLM for the final answer [2]
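On filtering hallucinated details: the original HyDE paper encodes several sampled hypothetical documents and averages their embeddings together with the query embedding, so idiosyncratic details of any single sample wash out before the vector search. A minimal sketch with a stand-in encoder (the encoder and sample strings are invented for illustration):

```python
import hashlib
import numpy as np

def embed(text: str) -> np.ndarray:
    # Stand-in for a contriever-style encoder: a deterministic pseudo-random
    # unit vector derived from the text (illustrative only).
    seed = int.from_bytes(hashlib.sha256(text.encode()).digest()[:4], "big")
    v = np.random.default_rng(seed).normal(size=8)
    return v / np.linalg.norm(v)

def hyde_query_vector(query: str, hypothetical_docs: list[str]) -> np.ndarray:
    # Average the sampled hypothetical-answer embeddings with the query
    # embedding before searching, as in the HyDE paper.
    vectors = [embed(d) for d in hypothetical_docs] + [embed(query)]
    return np.mean(np.stack(vectors), axis=0)

samples = [
    "ML is the study of algorithms that learn from data.",
    "Machine learning lets computers improve with experience.",
    "ML trains statistical models on labeled examples.",
]
q_vec = hyde_query_vector("What is ML?", samples)
```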
Accelerating RAG Pipelines with Infinia
DDN· 2025-11-11 18:32
Performance Comparison
- DDN Infinia writes chunks at 0.0041 seconds (~4 milliseconds) per chunk, significantly faster than AWS [6]
- AWS object store writes each chunk at 0.1169 seconds (~117 milliseconds) per chunk [7]
- DDN Infinia uploads a 628-chunk document in approximately 2.5 seconds, while AWS takes around 74 seconds [7]
- DDN Infinia is approximately 28.5 times faster than AWS at document upload [7]
- DDN Infinia retrieves chunks in 0.1600 seconds (160 milliseconds) total, averaging 32 milliseconds per chunk [13]
- AWS retrieves the same chunks in 1.65 seconds, with each chunk taking 331 milliseconds [14]
- DDN Infinia is 10.3 times faster than AWS in total query retrieval time [14]
AI Pipeline Impact
- With DDN Infinia, an analyst can upload and query an annual report in just 2 seconds [8]
- A roughly 30x performance advantage transforms the entire AI pipeline, making documents readily available for AI consumption [9]
- Reduced latency with DDN Infinia can save significant time, potentially turning a 5-minute research task into 3 seconds [15]
- Latency compounds across multiple users and sessions, impacting GPU economics and overall productivity [15]
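A quick arithmetic check ties the per-chunk figures to the headline totals. The decimal interpretations used here (0.0041 s and 0.1169 s per write; 32 ms and 331 ms per read) are assumptions chosen to be consistent with the totals reported.

```python
# Sanity-check the reported benchmark totals from the per-chunk timings.
# Assumed per-chunk values: Infinia write 0.0041 s, AWS write 0.1169 s;
# Infinia read 0.032 s, AWS read 0.331 s.
CHUNKS = 628

infinia_upload = CHUNKS * 0.0041   # ~2.6 s total upload
aws_upload = CHUNKS * 0.1169       # ~73.4 s total upload (reported as ~74 s)
upload_speedup = 0.1169 / 0.0041   # ~28.5x faster writes

read_speedup = 0.331 / 0.032       # ~10.3x faster reads
```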
X @Demis Hassabis
Demis Hassabis· 2025-11-09 23:10
Product Announcement
- The Gemini API introduces a File Search tool, a managed RAG solution offering free storage and free query-time embeddings [1]
- The approach aims to significantly simplify the path to context-aware AI systems [1]
X @Avi Chawla
Avi Chawla· 2025-11-08 18:58
AI Tools & Technologies
- Six no-code LLM/RAG/Agent builder tools are available for AI engineers [1]
- The tools are production-grade and 100% open-source [1]
X @Avi Chawla
Avi Chawla· 2025-11-08 12:21
Avi Chawla (@_avichawla): 6 no-code LLM/RAG/Agent builder tools for AI engineers. Production-grade and 100% open-source! (find the GitHub repos in the replies) ...
X @Avi Chawla
Avi Chawla· 2025-11-08 06:31
AI Agent Workflow Platforms
- Sim AI is a user-friendly, open-source platform for building AI agent workflows, supporting major LLMs, MCP servers, and vector DBs [1]
- Transformer Lab is an open-source workspace for experimenting with and training LLMs [2]
- Anything LLM is an all-in-one AI app for chatting with documents and using AI Agents, designed for multi-user environments and local operation [6]
Open-Source LLM Tools
- Llama Factory enables training and fine-tuning of open-source LLMs and VLMs without coding, supporting over 100 models [6]
- RAGFlow is a RAG engine for building enterprise-grade RAG workflows over complex documents with citations, supporting multimodal data [2][4]
- AutoAgent is a zero-code framework for building and deploying Agents using natural language, with universal LLM support and a native vector DB [2][5]
Key Features & Technologies
- Sim AI's Finance Agent uses Firecrawl for web searches and Alpha Vantage's API for stock data via MCP servers [1]
- RAGFlow supports multimodal data and deep research capabilities [2]
- AutoAgent features function-calling and ReAct interaction modes [5]
Community & Popularity
- Sim AI is 100% open-source with 18 thousand stars [1]
- Transformer Lab is 100% open-source with over 68 thousand stars [2]
- LLaMA-Factory is 100% open-source with 62 thousand stars [6]
- Anything LLM is 100% open-source with 48 thousand stars [6]
- One project is 100% open-source with 8 thousand stars [3]
X @Avi Chawla
Avi Chawla· 2025-11-08 06:31
6 no-code LLM/RAG/Agent builder tools for AI engineers. Production-grade and 100% open-source! (find the GitHub repos in the replies) ...
One Paper to Understand the Past and Present of Context Engineering
36Kr· 2025-11-07 07:11
Core Concept
- The article discusses the emerging field of "context engineering," defined as the art and science of providing the right information to prepare for subsequent reasoning, as proposed by Shopify CEO Tobi Lütke and AI expert Andrej Karpathy [1][3]
Summary by Sections
What is Context Engineering?
- Context engineering addresses the cognitive gap between humans and machines: human communication is high-entropy and often ambiguous, while machines require low-entropy, clear instructions [3][14]
- The essence of context engineering is to reduce entropy through richer and more effective context, enabling machines to better understand human intent [3][4]
Evolution of Context Engineering
- Context engineering has evolved from a focus on translation (the 1.0 era, 1990s-2020) to a focus on instruction (the 2.0 era, 2020-present), with large language models enabling more natural interactions [5][11]
- The transition from 1.0 to 2.0 reflects a shift in how users interact with machines, from structured programming languages to natural-language prompts [12][13]
AI Communication Gaps
- The article identifies four main deficiencies in AI that contribute to the communication gap: limited sensory perception, restricted understanding capabilities, lack of memory, and scattered attention [14][15]
- These deficiencies necessitate context engineering to facilitate better communication and understanding between humans and AI [15][16]
Framework of Context Engineering
- A comprehensive context engineering framework consists of three components: context collection, context management, and context usage [16][24]
- Context collection involves multi-modal and distributed methods to gather information beyond simple text inputs, addressing AI's sensory and memory limitations [18][20]
- Context management focuses on abstracting and structuring high-entropy information into low-entropy formats that AI can understand, enhancing its learning capabilities [23][24]
- Context usage aims to improve AI's attention mechanisms, ensuring relevant information is prioritized during interactions [25][26]
Future of Context Engineering
- The article anticipates context engineering evolving into 3.0 and 4.0 stages, in which AI reaches human-level and eventually superhuman intelligence, leading to seamless communication without explicit context [30][34]
- Ultimately, the goal of context engineering is to become an invisible infrastructure that enhances AI usability without being a focal point of discussion [35]
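The three-part framework (collection, management, usage) can be sketched as a toy pipeline. Every name and heuristic below is illustrative, not taken from the paper: collection gathers multi-source signals, management normalizes them toward a lower-entropy form, and usage scores and keeps only what is relevant to the current query.

```python
from dataclasses import dataclass

@dataclass
class ContextItem:
    source: str       # e.g. "chat_history", "user_profile" (illustrative)
    content: str
    relevance: float = 0.0

def collect(signals: dict[str, str]) -> list[ContextItem]:
    # Collection: gather multi-source inputs beyond the raw prompt.
    return [ContextItem(source=k, content=v) for k, v in signals.items()]

def manage(items: list[ContextItem]) -> list[ContextItem]:
    # Management: compress toward a low-entropy form (toy: normalize spacing).
    for item in items:
        item.content = " ".join(item.content.split())
    return items

def use(items: list[ContextItem], query: str, budget: int = 2) -> list[ContextItem]:
    # Usage: score relevance to the query and keep only the top items,
    # a stand-in for steering the model's attention.
    q_words = set(query.lower().split())
    for item in items:
        words = set(item.content.lower().split())
        item.relevance = len(q_words & words) / max(len(q_words), 1)
    return sorted(items, key=lambda i: -i.relevance)[:budget]

signals = {
    "chat_history": "user asked about   fine-tuning  llama models",
    "user_profile": "prefers concise python examples",
    "calendar": "dentist appointment friday",
}
top = use(manage(collect(signals)), query="how to fine-tune llama")
# The chat-history item wins; the calendar noise falls outside the budget.
```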