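Change just 2 lines of code and RAG efficiency jumps 30%! Works across many tasks and scales to applications with tens of billions of data points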
量子位 (QbitAI) · 2025-06-21 06:07
Core Viewpoint
- The article discusses PSP (Proximity graph with Spherical Pathway), a new open-source method developed by a team from Zhejiang University. By changing just two lines of code, it reportedly improves the efficiency of RAG vector retrieval by 30%. The method applies to text-to-text, image-to-image, and text-to-image retrieval as well as recommendation-system recall, and it scales to large applications involving billions of data points [1].

Summary by Sections

Vector Retrieval and Its Importance
- Vector retrieval is a core technology behind prominent AI products. It expands the boundaries of traditional semantic retrieval and integrates seamlessly with large models [6].

Challenges in Existing Methods
- Traditional vector retrieval methods are primarily based on Euclidean distance and answer the question "who is closest," while AI applications often need to rank by "semantic relevance," that is, by maximum inner product [2].
- The inner product does not satisfy the triangle inequality, so previous inner-product retrieval methods could not rely on the metric properties that Euclidean indexes exploit, which led to inefficiencies [3].

PSP Methodology
- The PSP method makes minor modifications to existing graph structures so that they can find optimal solutions for maximum inner product retrieval [4].
- It incorporates an early stopping strategy that decides when to end the search, conserving computational resources and speeding up the process [5].

Key Findings and Innovations
- The research identifies two paradigms in maximum inner product retrieval: (1) converting maximum inner product search into minimum Euclidean distance search, which can lead to information loss, and (2) searching directly in inner product space, which lacks effective pruning methods [8].
- The PSP team demonstrated that a greedy algorithm running on a graph designed for Euclidean distance can find the global maximum inner product solution [10][11].
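The idea under Key Findings, greedy search for the maximum inner product over a graph built for Euclidean distance, can be illustrated with a small sketch. This is not the PSP implementation: the brute-force graph builder, the function names `build_knn_graph` and `greedy_ip_search`, the parameters `ef` and `entry`, and the stopping rule (a standard beam-search termination, not PSP's early-stopping criterion) are all assumptions made for illustration.

```python
import heapq
import numpy as np


def build_knn_graph(data, k):
    """Brute-force k-nearest-neighbour graph under Euclidean distance (toy sizes only)."""
    sq = (data ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (data @ data.T)  # squared distances
    np.fill_diagonal(d2, np.inf)  # a node is not its own neighbour
    return np.argsort(d2, axis=1)[:, :k]


def greedy_ip_search(data, graph, query, top_k=1, ef=32, entry=0):
    """Best-first search over `graph`, ranking nodes by inner product with `query`."""
    def score(ids):
        # The graph was built with Euclidean distance; only this scoring
        # step uses the inner product.
        return data[ids] @ query

    s0 = float(score([entry])[0])
    visited = {entry}
    candidates = [(-s0, entry)]  # max-heap on score (scores are negated)
    results = [(s0, entry)]      # min-heap holding the best `ef` nodes so far
    while candidates:
        neg_s, node = heapq.heappop(candidates)
        # Assumed stopping rule: quit once the best unexpanded candidate
        # scores below the worst node in a full result set.
        if len(results) >= ef and -neg_s < results[0][0]:
            break
        nbrs = [int(n) for n in graph[node] if int(n) not in visited]
        if not nbrs:
            continue
        visited.update(nbrs)
        for n, s in zip(nbrs, score(nbrs)):
            s = float(s)
            if len(results) < ef or s > results[0][0]:
                heapq.heappush(candidates, (-s, n))
                heapq.heappush(results, (s, n))
                if len(results) > ef:
                    heapq.heappop(results)
    best = sorted(results, reverse=True)[:top_k]
    return [i for _, i in best], [s for s, _ in best]


# Usage on synthetic data: compare the approximate answer with the exact best.
rng = np.random.default_rng(0)
data = rng.normal(size=(500, 16))
graph = build_knn_graph(data, k=10)
query = rng.normal(size=16)
ids, scores = greedy_ip_search(data, graph, query, top_k=5)
print(ids, int(np.argmax(data @ query)))
```

On a plain k-NN graph like this one the search is only approximate; according to the article, PSP's contribution is the graph modification and stopping strategy that make such a greedy search reach the global maximum inner product.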
Performance Testing
- The PSP algorithm was tested on eight large-scale, high-dimensional datasets and showed significant improvements in queries per second (QPS) over existing state-of-the-art methods, with stable performance across datasets [21][23].
- The algorithm scales well: search time grows at a log(N) rate for both Top-1 and Top-K retrieval, indicating its potential for efficient retrieval on datasets of billions to hundreds of billions of items [25][26].
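For readers unfamiliar with the QPS metric used above, the following is a minimal sketch of how throughput for an inner-product search function might be measured. It is not the benchmark harness from the article: the helper names `exact_top_k`, `recall_at_k`, and `measure_qps` are hypothetical, and pairing QPS with recall against a brute-force ground truth is an assumption based on common practice for approximate search, not something the summary states.

```python
import time
import numpy as np


def exact_top_k(data, queries, k):
    """Brute-force ground truth: indices of the k largest inner products per query."""
    scores = queries @ data.T
    return np.argsort(-scores, axis=1)[:, :k]


def recall_at_k(found, truth):
    """Fraction of ground-truth neighbours that the search actually returned."""
    hits = sum(len(set(map(int, f)) & set(map(int, t))) for f, t in zip(found, truth))
    return hits / (len(truth) * len(truth[0]))


def measure_qps(search_fn, queries):
    """Run `search_fn` on every query and return (queries per second, results)."""
    start = time.perf_counter()
    results = [search_fn(q) for q in queries]
    elapsed = max(time.perf_counter() - start, 1e-12)  # guard against a zero interval
    return len(queries) / elapsed, results


# Usage on synthetic data, with brute force standing in for the index under test.
rng = np.random.default_rng(1)
data = rng.normal(size=(1000, 16))
queries = rng.normal(size=(50, 16))
truth = exact_top_k(data, queries, k=10)
qps, found = measure_qps(lambda q: np.argsort(-(data @ q))[:10], queries)
print(f"QPS: {qps:.0f}, recall@10: {recall_at_k(found, truth):.2f}")
```

Reporting QPS at a fixed recall level keeps comparisons fair, since an index can trade accuracy for speed.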