Retrieval-Augmented Generation (RAG)

Making RAG Truly Read Between the Lines: New Framework Introduces Lexical Diversity and Sets New SOTA on Multiple Benchmarks
量子位· 2025-09-27 07:00
Core Insights
- The article introduces the Lexical Diversity-aware RAG (DRAG) framework, which improves the accuracy of Retrieval-Augmented Generation (RAG) models by up to 10.6% and sets new state-of-the-art (SOTA) results on multiple benchmarks [1][2][16]

Group 1: Framework and Innovations
- The DRAG framework systematically incorporates lexical diversity into the retrieval and generation stages of RAG, providing a lightweight, general, and easily extensible solution [1][5]
- The research team from Beihang University, Peking University, and Zhongguancun Laboratory highlights the importance of lexical diversity, which has been largely overlooked by existing RAG methods [4][5]
- Two key innovations are introduced: 1. the Diversity-sensitive Relevance Analyzer (DRA), which dissects query semantics and applies differentiated strategies to its components, yielding more fine-grained relevance scoring [9]; 2. Risk-guided Sparse Calibration (RSC), which monitors the "misleading risk" of each generated token and calibrates decoding when necessary, so the generation stage is not disturbed by irrelevant information [11][14]

Group 2: Performance and Results
- The DRAG framework shows significant gains across open-domain question-answering benchmarks, with accuracy increases of 4.9% on PopQA and 4.4% on TriviaQA, and a 10.6% improvement on HotpotQA and 2WikiMultiHopQA [16]
- The method also outperforms existing models on long-answer generation metrics such as str-em and QA-F1, demonstrating strong generalization across model sizes, including Llama2-7B and Llama2-13B [18][16]

Group 3: Lexical Diversity Challenges
- The article identifies lexical diversity as a critical yet often neglected issue in RAG: different phrasings of the same question can confuse retrieval models and lead to incorrect answers [5][8]
- The framework addresses this by allowing semantic flexibility for variable components of a query while enforcing strict matching for invariant components, improving the relevance of retrieved documents [12] (a toy sketch of this idea follows after this summary)

Group 4: Future Directions
- The research team plans to extend the DRAG framework to more specialized scenarios, aiming to improve large models' understanding of complex human language expressions [5]
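As a toy illustration of the strict-vs-flexible matching idea described in this summary (not DRAG's published algorithm), the sketch below splits a query into assumed invariant and variable parts and scores candidate documents accordingly; the component split, the similarity stand-in, and the 0.6/0.4 weights are all assumptions made for this example.

```python
# Illustrative sketch only: a toy version of "strict matching for invariant
# components, semantic flexibility for variable components". The component
# split, the similarity function, and the 0.6/0.4 weights are assumptions
# for this example, not DRAG's published algorithm.
from difflib import SequenceMatcher

def split_query(query: str, invariant_terms: set[str]) -> tuple[list[str], list[str]]:
    """Split query tokens into invariant (must-match) and variable parts."""
    tokens = query.lower().split()
    invariant = [t for t in tokens if t in invariant_terms]
    variable = [t for t in tokens if t not in invariant_terms]
    return invariant, variable

def soft_similarity(a: str, b: str) -> float:
    """Loose, character-level stand-in for a semantic similarity model."""
    return SequenceMatcher(None, a, b).ratio()

def relevance_score(query: str, document: str, invariant_terms: set[str]) -> float:
    invariant, variable = split_query(query, invariant_terms)
    doc_tokens = document.lower().split()
    # Strict matching: every invariant term must literally appear in the document.
    strict = all(term in doc_tokens for term in invariant) if invariant else True
    # Flexible matching: best soft match for each variable token, averaged.
    flexible = (
        sum(max(soft_similarity(v, d) for d in doc_tokens) for v in variable) / len(variable)
        if variable and doc_tokens
        else 0.0
    )
    return (0.6 if strict else 0.0) + 0.4 * flexible

docs = [
    "The Eiffel Tower is located in Paris, France.",
    "The Leaning Tower of Pisa is located in Italy.",
]
query = "where is the eiffel tower situated"
scores = [relevance_score(query, d, invariant_terms={"eiffel", "tower"}) for d in docs]
print(scores)  # the Paris document should score higher
```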
Progress Software Unveils Breakthrough SaaS RAG Platform Designed to Make Trustworthy and Verifiable Generative AI Accessible Across Organizations of all Sizes
Globenewswire· 2025-09-10 13:00
Core Insights
- Progress Software has launched Progress® Agentic RAG, a SaaS platform designed to make generative AI accessible for organizations of all sizes, enabling them to transform unstructured data into actionable intelligence [1][3]
- The platform aims to address the challenges posed by the exponential growth of unstructured and structured data, providing a user-friendly and affordable solution for businesses [2][5]

Company Overview
- Progress Software is a provider of AI-powered digital experience and infrastructure software, focusing on enabling organizations to manage and utilize their data effectively [1][7]
- The company has over 4 million developers and technologists relying on its products, indicating a strong market presence [7]

Product Features
- Progress Agentic RAG offers a no-code RAG pipeline for streamlined ingestion, indexing, and retrieval of data across various formats, including multilingual text, audio, and video [9] (a generic pipeline sketch follows after this summary)
- The platform provides intelligent search capabilities, delivering AI-generated answers based on unstructured data and ensuring trusted responses in multiple languages [9]
- It supports seamless deployment of AI agents and integrates with leading enterprise-ready Large Language Models (LLMs), allowing users to choose their preferred model [9]
- The underlying database, NucliaDB, enhances the platform's capabilities with built-in semantic search, keyword search, and knowledge graph traversal [9]

Market Impact
- The introduction of Progress Agentic RAG is expected to accelerate productivity and decision-making across various industries by providing fast and accurate insights from unstructured data [4][5]
- The platform is positioned as a cost-effective solution that can help businesses unlock productivity and innovation, regardless of their size [5]

Pricing and Availability
- Progress Agentic RAG is available as a self-service offering on AWS Marketplace, with pricing starting at $700 per month, making it accessible for small businesses and departmental teams [6]
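The no-code pipeline described above automates the usual ingest, index, retrieve, and generate stages; a minimal, generic sketch of those stages is shown below. This is not the Progress Agentic RAG or NucliaDB API, and every class and function name, as well as the echo stub standing in for an LLM, is a placeholder assumption.

```python
# Generic sketch of the ingest -> index -> retrieve -> answer stages that a
# managed RAG pipeline automates. This is NOT the Progress Agentic RAG or
# NucliaDB API; every name here is a placeholder for illustration.
from dataclasses import dataclass, field

@dataclass
class Document:
    doc_id: str
    text: str

@dataclass
class TinyIndex:
    docs: list[Document] = field(default_factory=list)

    def ingest(self, doc: Document) -> None:
        # A real platform would also chunk, embed, and extract metadata here.
        self.docs.append(doc)

    def retrieve(self, query: str, k: int = 3) -> list[Document]:
        # Toy keyword-overlap ranking as a stand-in for semantic + keyword search.
        q = set(query.lower().split())
        ranked = sorted(
            self.docs,
            key=lambda d: len(q & set(d.text.lower().split())),
            reverse=True,
        )
        return ranked[:k]

def answer(query: str, index: TinyIndex, llm) -> str:
    context = "\n".join(d.text for d in index.retrieve(query))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    return llm(prompt)  # `llm` is any callable wrapping your chosen model

index = TinyIndex()
index.ingest(Document("1", "Progress Agentic RAG is priced from $700 per month on AWS Marketplace."))
print(answer("How much does it cost?", index, llm=lambda p: p[:120]))  # echo stub in place of an LLM
```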
Cisco Systems Inc. (CSCO) Expands Secure AI Factory with the Nvidia Platform
Yahoo Finance· 2025-09-10 11:35
Core Insights
- Cisco Systems, Inc. is recognized as a leading cybersecurity stock, particularly following its recent expansion of the Secure AI Factory in collaboration with Nvidia [1][2]
- The new solution aims to enhance data extraction and retrieval for agentic AI workloads, integrating VAST Data's InsightEngine with Cisco AI PODs [2]
- Cisco's advancements are positioned to meet the increasing demand for AI application performance, significantly reducing RAG pipeline latency and enabling real-time AI responses [3]

Company Overview
- Cisco provides a wide range of cybersecurity solutions through its Cisco Security Cloud platform, focusing on network, cloud, endpoint, and email security [4]
Cisco Secure AI Factory with NVIDIA Unlocks Enterprise Data for Agentic AI
Prnewswire· 2025-09-04 13:00
Core Insights
- Cisco has launched the Secure AI Factory in collaboration with NVIDIA to provide a comprehensive solution for enterprises to utilize their data securely for agentic AI at scale [2][5]
- The solution includes Cisco AI PODs integrated with VAST InsightEngine, designed to accelerate retrieval-augmented generation (RAG) pipelines and ensure AI agents have immediate access to necessary data [3][7]
- The architecture supports low-latency model interaction and high-performance networking, enabling near-real-time business insights while maintaining security and governance [4][11]

Company Collaboration
- The partnership between Cisco, NVIDIA, and VAST Data aims to create a validated architecture that enhances enterprise AI adoption and secures data through AI Defense [5][7]
- VAST Data is the first vendor to integrate with Cisco AI PODs, providing an NVIDIA AI Data Platform reference design for enterprise customers [8]

Technical Capabilities
- The new capabilities allow for faster data extraction and retrieval, reducing RAG pipeline latency from minutes to seconds and facilitating near-real-time AI responses [11] (a generic ingest-on-write sketch follows after this summary)
- The architecture is designed to support multiple agents and workloads simultaneously, enabling continuous operation and dynamic learning for AI agents [11]

Market Impact
- The integration of these technologies represents a significant milestone in the evolution of enterprise AI, allowing intelligent agents to operate securely and collaboratively at an unprecedented scale [5][6]
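One common way to cut RAG pipeline latency from minutes to seconds is to embed and index data on the write path instead of in periodic batch jobs. The sketch below shows that ingest-on-write pattern in generic terms; it is an assumption about the general technique, not the Cisco, NVIDIA, or VAST implementation, and the hashing "embedding" and class names are placeholders.

```python
# Generic ingest-on-write sketch: each new document is embedded and indexed
# immediately, so retrieval never waits on a batch job. Illustrative assumption
# only, not the Cisco/NVIDIA/VAST implementation; all names are placeholders.
import hashlib
import math

def toy_embed(text: str, dim: int = 64) -> list[float]:
    """Deterministic hashing 'embedding' used as a stand-in for a real model."""
    vec = [0.0] * dim
    for token in text.lower().split():
        h = int(hashlib.md5(token.encode()).hexdigest(), 16)
        vec[h % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

class LiveVectorIndex:
    def __init__(self) -> None:
        self.entries: list[tuple[str, list[float]]] = []

    def on_write(self, text: str) -> None:
        # Called from the storage system's write path: embed and index right away.
        self.entries.append((text, toy_embed(text)))

    def search(self, query: str, k: int = 3) -> list[str]:
        q = toy_embed(query)
        scored = sorted(
            self.entries,
            key=lambda e: sum(a * b for a, b in zip(q, e[1])),
            reverse=True,
        )
        return [text for text, _ in scored[:k]]

index = LiveVectorIndex()
index.on_write("Quarterly revenue grew 8% year over year.")
index.on_write("The new data center opened in Austin.")
print(index.search("How fast did revenue grow?"))  # freshly written data is retrievable immediately
```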
Progress Software Announces General Availability of MarkLogic Server 12 and Breakthrough Results with Semantic RAG
Globenewswire· 2025-08-12 13:00
Core Insights
- Progress Software has announced the general availability of MarkLogic Server 12, which features advanced semantic search and graph Retrieval-Augmented Generation (RAG) capabilities, resulting in a 33% increase in LLM accuracy and faster information discovery for customers [1][3][9]

Product Features
- MarkLogic Server 12 enables scalable, rapid, and cost-effective retrieval both on-premises and in the cloud, with key features including native vector search, Virtual Views for ad-hoc analysis, BM25 relevance ranking, and advanced semantic algorithms [2][9] (a generic hybrid-retrieval sketch follows after this summary)
- The new capabilities are designed to maximize retrieval accuracy and efficiency for enterprise generative AI and analytics, with global research institutions and public sector organizations already utilizing these features [2][3]

Customer Impact
- Customers participating in Proofs-of-Concept reported an average 33% increase in LLM response accuracy and a significant reduction in information discovery time for subject matter experts [3]
- Specific case studies include a financial services firm improving LLM response accuracy from 70% to 95%, a pharmaceutical company accelerating document discovery to seconds with a 73% increase in correct answers, and an agribusiness raising correct answer rates from 50% to 90% [9]

Industry Positioning
- Progress Software emphasizes that as AI matures, the key differentiator for enterprises will be the effective grounding of AI models in trusted, context-rich data, positioning its RAG-based technology at the forefront of this shift [5]
- The company is showcasing its capabilities at Ai4, North America's largest AI industry conference, highlighting the potential for organizations to unlock their data's full value through proven RAG methodologies [5][6]
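Hybrid retrieval of the kind listed above, BM25 keyword ranking fused with vector similarity, can be sketched as follows. This is a generic illustration rather than MarkLogic's query API; the choice of embedding model and the equal fusion weight are assumptions made for this example.

```python
# Generic hybrid-retrieval sketch: fuse BM25 keyword scores with dense vector
# similarity. Not MarkLogic's API; the embedding model and equal fusion weights
# are assumptions for this example.
import numpy as np
from rank_bm25 import BM25Okapi                         # pip install rank_bm25
from sentence_transformers import SentenceTransformer   # pip install sentence-transformers

corpus = [
    "MarkLogic Server 12 adds native vector search and BM25 relevance ranking.",
    "Graph RAG traverses a knowledge graph to ground LLM answers in trusted data.",
    "Unrelated note about quarterly earnings and travel expenses.",
]

bm25 = BM25Okapi([doc.lower().split() for doc in corpus])
model = SentenceTransformer("all-MiniLM-L6-v2")  # any small embedding model works here
doc_vecs = model.encode(corpus, normalize_embeddings=True)

def hybrid_search(query: str, k: int = 2, alpha: float = 0.5):
    kw = bm25.get_scores(query.lower().split())
    kw = kw / (kw.max() + 1e-9)      # scale keyword scores to roughly [0, 1]
    q_vec = model.encode([query], normalize_embeddings=True)[0]
    dense = doc_vecs @ q_vec         # cosine similarity (vectors are normalized)
    fused = alpha * kw + (1 - alpha) * dense
    top = np.argsort(fused)[::-1][:k]
    return [(float(fused[i]), corpus[i]) for i in top]

for score, doc in hybrid_search("vector search combined with bm25 ranking"):
    print(f"{score:.2f}  {doc}")
```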
Search-Agent RAG Falling Short in Practice? UIUC Open-Sources s3: Just 2.4k Samples, Fast Training, Strong Results
机器之心· 2025-06-17 00:10
Core Insights
- The article discusses the emergence of Agentic RAG (Retrieval-Augmented Generation) as a key way for large language models to access external knowledge, and highlights the difficulty current reinforcement learning (RL) training methods have in achieving stable performance [1][8]

Group 1: Development of RAG Systems
- The evolution of RAG systems is described in three stages: Classic RAG, Pre-RL-Zero Active RAG, and the RL-Zero stage, with each stage introducing new methodologies to enhance retrieval and generation capabilities [7][8]
- RL-based methods, while promising, face challenges such as optimization goals that are misaligned with actual downstream tasks and the coupling of retrieval and generation, which complicates performance evaluation [9][12]

Group 2: Limitations of Current RL Methods
- Current RL methods such as Search-R1 and DeepRetrieval rely on Exact Match (EM) as the reward metric; because EM is overly strict and gives no credit to semantically equivalent answers, it can lead to suboptimal training outcomes [9][10]
- Coupling retrieval and generation during training can obscure the source of performance improvements, making it difficult to tell whether gains come from better search or from stronger language generation [11][12]
- Existing evaluation metrics fail to measure how much search quality contributes to overall performance, creating bottlenecks in assessment, training, and generalization [14]

Group 3: Introduction of the s3 Framework
- The s3 framework, proposed by UIUC and Amazon, improves training efficiency and effectiveness by decoupling search from generation and optimizing only the searcher with a new reward function called Gain Beyond RAG (GBR) [1][17] (a hedged sketch of this reward idea follows after this summary)
- s3 is highly efficient, requiring only 2.4k training samples and a total training time of just 114 minutes while outperforming larger baseline models [21][22][25]

Group 4: Experimental Results
- On general QA tasks, s3 outperformed both Search-R1 and DeepRetrieval across multiple datasets, showcasing strong generalization capabilities [23][25]
- On medical QA tasks, s3 exhibited remarkable cross-domain performance, indicating robustness and adaptability to different datasets and contexts [26][27]

Group 5: Design and Optimization Insights
- The design of s3 emphasizes starting retrieval from the original query, which helps maintain focus and improves search outcomes [31]
- The document selection mechanism within s3 significantly reduces token consumption, enhancing efficiency and minimizing noise in the generation process [31][30]
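Reading the summary above, the Gain Beyond RAG reward can be understood as how much better a frozen generator answers with the searcher's documents than with a naive RAG baseline's documents. The sketch below encodes that reading; the soft scoring function, the stub generator, and every name are illustrative assumptions rather than the s3 paper's implementation.

```python
# Hedged sketch of a "Gain Beyond RAG"-style reward: score the same frozen
# generator with the trained searcher's documents vs. a naive RAG baseline's
# documents, and reward the difference. The soft scoring function and all
# names here are assumptions for illustration, not the s3 implementation.
from typing import Callable

def soft_answer_score(prediction: str, gold: str) -> float:
    """Token-overlap stand-in for the answer-quality judge."""
    pred, ref = set(prediction.lower().split()), set(gold.lower().split())
    return len(pred & ref) / len(ref) if ref else 0.0

def gain_beyond_rag(
    question: str,
    gold_answer: str,
    searcher_docs: list[str],
    naive_rag_docs: list[str],
    generate: Callable[[str, list[str]], str],  # frozen generator: (question, docs) -> answer
) -> float:
    """Reward = generation quality with searcher docs minus quality with naive RAG docs."""
    with_searcher = soft_answer_score(generate(question, searcher_docs), gold_answer)
    with_baseline = soft_answer_score(generate(question, naive_rag_docs), gold_answer)
    return with_searcher - with_baseline

# Toy usage with a stub generator that just echoes its context.
stub_generate = lambda q, docs: " ".join(docs)
reward = gain_beyond_rag(
    "Who wrote Hamlet?",
    "William Shakespeare",
    searcher_docs=["Hamlet was written by William Shakespeare"],
    naive_rag_docs=["Hamlet is a tragedy set in Denmark"],
    generate=stub_generate,
)
print(reward)  # positive when the searcher's documents help the generator more
```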