RAG (Retrieval-Augmented Generation)
Ten-Billion-Scale Vectors, Millisecond Responses: Tsinghua Research Team Debuts Vector Database VexDB, Taking On the Model Hallucination Problem
AI前线· 2025-09-25 08:04
Author | 棱镜

Swept along by the AI wave, enterprise technology leaders have been itching to fold these disruptive technologies into the core of their businesses and seize the high ground of the intelligent era, only to be slapped hard by reality. The brilliance of the PoC stage is still fresh in memory: automated report generation, intelligent customer service, AI-assisted coding, everything looked perfect. Yet when they tried to embed these capabilities into core business systems, a medical team found its AI assistant calmly fabricating drug regimens that simply do not exist; a financial institution realized its risk-control model could make million-yuan misjudgments based on outdated terms; and even in the simplest customer-service scenario, the AI could steer users toward a product that had long been pulled from the shelves. This is not some isolated technical accident but the long-standing hallucination problem of generative AI.

A professor in Tsinghua University's Department of Computer Science points out that large models are limited in vertical-domain knowledge and real-time updating, and that hallucination in particular has become the bottleneck for large models moving deeper into enterprise applications. The industry therefore urgently needs an approach that preserves the generative power of large models while placing deterministic constraints on their output.

On September 25, the 数智引航 team, with Professor 李国良 (Li Guoliang) as technical advisor, officially released the vector database VexDB, which supports millisecond-level queries over ten-billion-scale, thousand-dimensional vector data with recall accuracy above 99%, building a trustworthy knowledge foundation for AI applications at the data-infrastructure layer. Recently, on the internationally authoritative DABSTEP unstructured-data-analysis benchmark ...
Expert with 18 Years of SEO Growth Experience: Stop Bookmarking AEO "Best Practice" Guides; Hands-On Experimentation Is the Key to Getting It Right
Founder Park· 2025-09-23 14:19
Core Insights
- The article emphasizes the importance of verifying information about Answer Engine Optimization (AEO) through personal experimentation rather than relying on potentially inaccurate online best practices [2][3]
- AEO is closely related to traditional SEO but requires a focus on citation optimization and long-tail questions to be effective [5][8]
- The rise of AEO is attributed to the increasing adoption of AI models like ChatGPT, which have changed how users seek information [10][52]

Group 1
- AEO is fundamentally about optimizing content to appear as answers in large language models [9][10]
- High-quality, authentic comments on platforms like Reddit are more effective than numerous low-quality comments for AEO [3][24]
- The distinction between AEO and SEO lies in the need for citation optimization and addressing long-tail questions [5][14]

Group 2
- AEO strategies should include both on-site optimization (like improving help center content) and off-site optimization (like increasing mentions across various platforms) [22][58]
- The average length of user queries in chat scenarios is significantly longer than traditional search queries, indicating a shift in user behavior [19][20]
- Companies can quickly gain visibility in AEO by being mentioned in relevant discussions or content, unlike the longer timeline required for SEO [19][45]

Group 3
- The effectiveness of AEO can be measured through experiments that compare the impact of different strategies on visibility and traffic [36][44]
- AEO is not a replacement for Google but rather a new channel that complements existing search methods [50][51]
- The quality of leads generated through AEO is significantly higher than those from traditional search, with conversion rates being six times greater [16][47]

Group 4
- Companies should focus on creating original, high-quality content that provides unique insights to stand out in AEO [32][33]
- The optimization of help center content is crucial, as many user queries are related to specific product functionalities and support [58][60]
- AEO requires continuous adaptation and validation of strategies to ensure effectiveness in a rapidly changing digital landscape [36][46]
@CEO: Why Must Your Next Personal Assistant Be Human?
量子位· 2025-09-17 03:43
鱼羊 and 闻乐, from 凹非寺 | 量子位 (WeChat official account QbitAI)

The job of CEO's personal assistant is now being eyed by Agents too. Every day it can independently compile a company-wide daily-briefing edition of "今日头条", and it is fully locally deployed and works out of the box; the unit itself can even be carried around by the CEO. Indeed, the whole chassis is only about A4-sized, which is how it compares to an iPhone 15 Pro Max.

No suspense: this new player is called the 智跃 Agent all-in-one machine. Interestingly, it is the first integrated software-and-hardware private-deployment Agent on the market built specifically for CEOs, with a very clearly defined target user. Fitting for the "inaugural year of Agent applications": even new AI hardware is starting to show "personality". To see what it is all about, colleagues in the 量子位 editorial office were among the first to get a taste of being CEO, running hands-on tests while looking at what forms new AI hardware has evolved into by 2025.

An out-of-the-box "information management assistant": everyone is fairly familiar with traditional all-in-one machines, which broadly follow a compute-plus-model supply model; after buying one you basically still have to assign it a dedicated development team. By contrast, the 智跃 Agent all-in-one is really a brand-new concept with a different positioning. At the hardware level it uses a compact 12L chassis with a single RTX 4090 card, making it an ultra-compact Agent solution. All data processing and storage can be completed locally, without relying on external ...
AI Agents vs. Agentic AI: A Battle of Paradigms?
自动驾驶之心· 2025-09-05 16:03
Core Viewpoint
- The article discusses the evolution and differentiation between AI Agents and Agentic AI, highlighting their respective roles in automating tasks and collaborating on complex objectives, with a focus on the advancements since the introduction of ChatGPT in November 2022 [2][10][57]

Group 1: Evolution of AI Technology
- The emergence of ChatGPT in November 2022 marked a pivotal moment in AI development, leading to increased interest in AI Agents and Agentic AI [2][4]
- The historical context of AI Agents dates back to the 1970s with systems like MYCIN and DENDRAL, which were limited to rule-based operations without learning capabilities [10][11]
- The transition to AI Agents occurred with the introduction of frameworks like AutoGPT and BabyAGI in 2023, enabling these agents to autonomously complete multi-step tasks by integrating LLMs with external tools [12][13]

Group 2: Definition and Characteristics of AI Agents
- AI Agents are defined as modular systems driven by LLMs and LIMs for task automation, addressing the limitations of traditional automation scripts [13][16]
- Three core features distinguish AI Agents: autonomy, task specificity, and reactivity [16][17]
- The dual-engine capability of LLMs and LIMs is essential for AI Agents, allowing them to operate independently and adapt to dynamic environments [17][21]

Group 3: Transition to Agentic AI
- Agentic AI represents a shift from individual AI Agents to collaborative systems that can tackle complex tasks through multi-agent cooperation [24][27]
- The key difference between AI Agents and Agentic AI lies in the introduction of system-level intelligence, enabling broader autonomy and the management of multi-step tasks [27][29]
- Agentic AI systems utilize a coordination layer and shared memory to enhance collaboration and task management among multiple agents, as illustrated in the sketch after this summary [33][36]

Group 4: Applications and Use Cases
- The article outlines various applications of Agentic AI, including automated fund application writing, collaborative agricultural harvesting, and clinical decision support in healthcare [37][43]
- In these scenarios, Agentic AI systems demonstrate their ability to manage complex tasks efficiently through specialized agents working in unison [38][43]

Group 5: Challenges and Future Directions
- The article identifies key challenges facing AI Agents and Agentic AI, including causal reasoning deficits, coordination bottlenecks, and the need for improved interpretability [48][50]
- Proposed solutions include enhancing retrieval-augmented generation (RAG), implementing causal modeling, and establishing governance frameworks to address ethical concerns [52][53]
- Future development paths for AI Agents and Agentic AI focus on scaling multi-agent collaboration, domain customization, and evolving into human collaborative partners [56][59]
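As an illustration of Group 3's coordination layer and shared memory, here is a minimal Python sketch assuming a simple sequential orchestrator; real Agentic AI systems would add planning, tool calls, and LLM-backed agents:

```python
from typing import Callable

# Shared memory: a blackboard that all agents read from and write to.
Agent = Callable[[dict], None]

def researcher(memory: dict) -> None:
    # Stand-in for an LLM-backed agent that gathers source material.
    memory["notes"] = f"key facts about {memory['task']}"

def writer(memory: dict) -> None:
    # Consumes the researcher's output via the shared memory.
    memory["draft"] = f"Report draft based on: {memory['notes']}"

def reviewer(memory: dict) -> None:
    memory["final"] = memory["draft"] + " [reviewed]"

def coordinator(task: str, agents: list[Agent]) -> str:
    """Coordination layer: routes the task through specialized agents,
    which communicate only through the shared memory."""
    memory: dict = {"task": task}
    for agent in agents:
        agent(memory)
    return memory["final"]

print(coordinator("grant application", [researcher, writer, reviewer]))
```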
What Is an Inverted Index?
Sou Hu Cai Jing· 2025-09-04 04:14
StarRocks, a new-generation real-time analytical database, shows notable strengths in inverted-index technology. The system natively supports full-text search, using an optimized inverted-index structure to query text data efficiently. In vector-retrieval scenarios, StarRocks can seamlessly combine traditional inverted indexes with vector similarity search, providing a unified data foundation for RAG applications.

An inverted index is an index structure that maps each term to the list of documents containing it, the exact opposite of a traditional forward index. A forward index looks up a document's content by document ID, whereas an inverted index uses a keyword to quickly locate every document containing it. The design stems from the practical need to find records by attribute value, and it is especially well suited to full-text search, search engines, and large-scale data analysis.

Building an inverted index involves three core steps: text preprocessing, dictionary generation, and posting-list creation. Take three documents as an example: Doc1 contains "quick brown fox", Doc2 contains "lazy dog", and Doc3 contains "quick brown dog". After tokenization, the system builds a document list for each term, e.g. "quick" maps to [Doc1, Doc3] and "dog" maps to [Doc2, Doc3], enabling fast retrieval.

Inverted-index technology is widely used across data-processing domains and has proven highly practical. In full-text ...
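To make the three construction steps concrete, here is a minimal Python sketch that builds the inverted index for the three example documents (whitespace tokenization is assumed for simplicity; a real system would add normalization, stemming, and stop-word handling):

```python
from collections import defaultdict

# The three example documents from the text above.
docs = {
    "Doc1": "quick brown fox",
    "Doc2": "lazy dog",
    "Doc3": "quick brown dog",
}

# Tokenize each document (preprocessing), collect terms (dictionary),
# and append document IDs to each term's posting list.
inverted_index = defaultdict(list)
for doc_id, text in docs.items():
    for term in text.split():  # naive whitespace tokenization
        if doc_id not in inverted_index[term]:
            inverted_index[term].append(doc_id)

print(inverted_index["quick"])  # ['Doc1', 'Doc3']
print(inverted_index["dog"])    # ['Doc2', 'Doc3']

# Keyword lookup: intersect posting lists to answer "quick AND dog".
print(sorted(set(inverted_index["quick"]) & set(inverted_index["dog"])))  # ['Doc3']
```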
Xiaohua Technology's Wu Hao: Large Models Carry "Hallucination" and Other Risks and Should Avoid Outputting Non-Compliant or Incorrect Information
Bei Jing Shang Bao· 2025-08-01 10:25
Beijing Business Today (reporter Hu Yongxin): On July 31, the salon "AI in Finance as a Double-Edged Sword: Finding Transformation Opportunities from the Safety Baseline," hosted by Beijing Business Today and the Deep Blue Media Think Tank (深蓝媒体智库), was successfully held in Shanghai.

In Wu Hao's view, large models carry risks including inherent stability risk, "hallucination" risk, and stability risk when a new model goes live. To counter the hallucination risk, a large model should not be left to improvise freely; its powerful language ability should instead be confined to a controllable, trustworthy knowledge scope. The core strategy: use RAG (Retrieval-Augmented Generation) and restrict answer-finding to the business knowledge base; use carefully refined Prompts to specify the role and instructions and to supply counter-examples; fine-tune the model with proven service scripts so it adapts to the style and behavioral patterns of the business scenario; and finally, run quality checks on the output to avoid releasing non-compliant or incorrect information.

"Traditional bots are not very intelligent and can no longer meet business and customer demands. Since last year the company has been tracking the development of large-model technologies such as DeepSeek and 文心一言 (ERNIE Bot), and this year it decided to build its own large-model-based customer service system," said Wu Hao, CTO of Xiaohua (Shanghai) Internet Technology Co., Ltd., at the salon. To address hallucination, the company adopts a hybrid "large model + small model" architecture: the small model handles routine questions quickly, while the large model focuses on complex scenarios. In the concrete pipeline, after a user asks a question the system makes an intelligent routing decision; for questions it can handle, it rewrites the query and performs hybrid retrieval, generates candidate answers through a re-ranking algorithm, and finally pushes the ...
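The pipeline Wu Hao describes (query rewriting, hybrid retrieval over the business knowledge base, re-ranking, constrained generation, and output quality checks) can be sketched in a few lines. This is a toy, self-contained illustration under stated assumptions: the knowledge base, the overlap-based retriever, and the stubbed model call are all placeholders, not Xiaohua's actual implementation:

```python
# Toy knowledge base standing in for the real business KB.
KB = [
    "Refunds are processed within 7 business days of approval.",
    "Premium accounts include 24/7 phone support.",
    "Password resets require the registered email address.",
]

def rewrite_query(question: str) -> str:
    # Placeholder rewrite; a real system would use an LLM-based rewriter.
    return question.lower().replace("please", "").strip()

def hybrid_search(query: str, top_k: int = 2) -> list[str]:
    # Stand-in for keyword + vector hybrid retrieval: rank passages
    # by word overlap with the query.
    def overlap(doc: str) -> int:
        return len(set(query.split()) & set(doc.lower().split()))
    return sorted(KB, key=overlap, reverse=True)[:top_k]

def llm_answer(prompt: str) -> str:
    # Stub for the large-model call: echoes the top retrieved passage
    # so the example runs without a model.
    return prompt.split("\n")[1]

def passes_quality_check(draft: str, passages: list[str]) -> bool:
    # Minimal QC: the answer must be grounded in a retrieved passage.
    return any(draft in p or p in draft for p in passages)

def answer(question: str) -> str:
    query = rewrite_query(question)
    passages = hybrid_search(query)  # restricted to the business KB
    prompt = (
        "You are a customer-service agent. Answer ONLY from these passages; "
        "if they do not contain the answer, say you do not know.\n"
        + "\n".join(passages)
        + f"\nQuestion: {question}"
    )
    draft = llm_answer(prompt)
    if not passes_quality_check(draft, passages):
        return "Escalating to a human agent."  # fail closed, per the QC step
    return draft

print(answer("How long do refunds take, please?"))
```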
Data Governance Is Critical to the Success of Artificial Intelligence
36Ke· 2025-07-21 03:09
Group 1
- The emergence of large language models (LLMs) has prompted various industries to explore their potential for business transformation, leading to the development of numerous AI-enhancing technologies [1]
- AI systems require access to company data, which has led to the creation of Retrieval-Augmented Generation (RAG) architecture, essential for enhancing AI capabilities in specific use cases [2][5]
- A well-structured knowledge base is crucial for effective AI responses, as poor quality or irrelevant documents can significantly hinder performance [5][6]

Group 2
- Data governance roles are evolving to support AI system governance and the management of unstructured data, ensuring the protection and accuracy of company data [6]
- Traditional data governance has focused on structured data, but the rise of Generative AI (GenAI) is expanding this focus to include unstructured data, which is vital for building scalable AI systems [6]
- Collaboration between business leaders, AI technology teams, and data teams is essential for creating secure and effective AI systems that can transform business operations [6]
Cats Save Research! AI Fears a "Moral Crisis," So Netizens Use "Cat Hostages" to Curb AI-Fabricated References
量子位· 2025-07-01 03:51
Core Viewpoint
- The article discusses how a method involving a "cat" has been used to improve the accuracy of AI-generated references, particularly in the context of scientific research, highlighting the ongoing challenges of AI hallucinations in generating fictitious literature [1][25][26]

Group 1
- A post on Xiaohongshu claims that using the "cat" as a safety threat has successfully corrected AI's tendency to fabricate references [1][5]
- The AI model Gemini reportedly found real literature while ensuring the safety of the "cat" [2][20]
- The post resonated with many researchers, garnering over 4,000 likes and 700 comments [5]

Group 2
- Testing the method on DeepSeek revealed that without the "cat" prompt, the AI produced incorrect references, including links to non-existent articles [8][12][14]
- Even when the "cat" prompt was applied, the results were mixed, with some genuine references but still many unverifiable titles [22][24]
- The phenomenon of AI fabricating literature is described as a "hallucination," where the AI generates plausible-sounding but false information [25][26]

Group 3
- The article emphasizes that the core issue of AI generating false references stems from its statistical learning over vast datasets, rather than true understanding of language [27][28]
- Current industry practices to mitigate hallucinations include Retrieval-Augmented Generation (RAG), which enhances model outputs by integrating accurate content (a citation-verification sketch follows this summary) [31]
- The integration of AI with search functionalities is becoming standard across major platforms, improving the quality of collected data [32][34]
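Group 3's point about retrieval grounding suggests a lightweight defense against fabricated citations: look each generated title up in a bibliographic index before trusting it. A minimal sketch using Crossref's public works endpoint (the containment match is a crude assumption; real pipelines would fuzzy-match and also compare authors, year, and venue):

```python
import requests

def reference_exists(title: str) -> bool:
    """Check a generated reference title against Crossref; no plausible
    match is a strong hint the citation was hallucinated."""
    resp = requests.get(
        "https://api.crossref.org/works",
        params={"query.bibliographic": title, "rows": 3},
        timeout=10,
    )
    resp.raise_for_status()
    for item in resp.json()["message"]["items"]:
        found = " ".join(item.get("title", [])).lower()
        # Crude containment match between generated and indexed titles.
        if title.lower() in found or found in title.lower():
            return True
    return False

print(reference_exists("Attention Is All You Need"))  # expected: True
```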
Gemini 2.5 Pro Lead: The Strongest Million-Token Context, Done Well, Can Unlock Many Application Scenarios
Founder Park· 2025-06-30 11:47
Core Insights
- The article discusses the advancements and implications of long-context models, particularly focusing on Google's Gemini series, which offers a significant advantage with its million-token context capability [1][3][35]
- It emphasizes the importance of understanding the differences between in-weights memory and in-context memory, highlighting that in-context memory is easier to modify and update [5][6]
- The article predicts that while the current million-token context models are not yet perfect, the pursuit of larger contexts without achieving quality improvements is not meaningful [5][34]

Group 1: Long-Context Models
- The Gemini 2.5 Pro model allows for comprehensive project traversal and reading, providing a unique experience compared to other models [1]
- The future of long-context models is expected to see a shift towards million-token contexts becoming standard, which will revolutionize applications in coding and other areas [3][35]
- Current limitations include the need for real-time interaction, which necessitates shorter contexts, while longer contexts are better for tasks that allow for longer wait times [5][11]

Group 2: Memory Types
- Understanding the distinction between in-weights memory and in-context memory is crucial, as the latter allows for more dynamic updates [6][7]
- In-context memory is essential for incorporating personal and rare knowledge that may not be present in the model's pre-trained weights [7][8]
- The competition for model attention among different information sources can limit the effectiveness of short-context models [5][8]

Group 3: RAG and Long Context
- RAG (Retrieval-Augmented Generation) will not be obsolete; instead, it will work in conjunction with long-context models to enhance information retrieval from vast knowledge bases [10][11]
- RAG is necessary for applications with extensive knowledge bases, as it helps retrieve relevant context before processing by the model [10][11]
- The collaboration between RAG and long-context models is expected to improve recall rates and allow for more comprehensive information processing [11][12]

Group 4: Implications for Developers
- Developers are encouraged to utilize context caching to reduce processing time and costs when interacting with long-context models (a sketch of the pattern follows this summary) [20][21]
- It is advised to avoid including irrelevant information in the context, as it can negatively impact the model's performance in multi-key information retrieval tasks [23][24]
- Developers should strategically place questions at the end of the context to maximize caching benefits [22][24]

Group 5: Future Directions
- The article predicts that achieving near-perfect quality in million-token contexts will unlock new application scenarios that are currently unimaginable [34][35]
- The cost of implementing longer contexts is a significant barrier, but advancements in technology are expected to lower these costs over time [30][31]
- The potential for achieving ten-million-token contexts is acknowledged, but it will require substantial breakthroughs in deep learning [35][36]
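To make Group 4's caching advice concrete, here is a provider-agnostic sketch of the pattern: process the long shared prefix once, then append each new question at the end so the cached prefix stays byte-identical and is reused. create_cache and generate_with_cache are hypothetical stand-ins for a vendor's caching API, not actual Gemini calls:

```python
def create_cache(prefix: str) -> str:
    """Hypothetical: upload/tokenize the long shared prefix once and
    return a cache handle."""
    return f"cache-{hash(prefix) & 0xFFFF:04x}"

def generate_with_cache(cache_id: str, question: str) -> str:
    """Hypothetical: answer a question against the cached prefix; only
    the newly appended question is processed (and billed) from scratch."""
    return f"[{cache_id}] answer to: {question}"

long_document = "..." * 100_000  # stand-in for a near-million-token document

# Pay the prefix-processing cost once...
cache_id = create_cache(long_document)

# ...then reuse it across many questions. Placing each question at the
# END of the context (after the cached prefix) maximizes cache hits.
for q in ["Summarize chapter 3.", "List all API changes.", "Who is the author?"]:
    print(generate_with_cache(cache_id, q))
```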
After Fully Embracing AI, OceanBase Launches an Out-of-the-Box RAG Service
Nan Fang Du Shi Bao· 2025-05-17 09:32
Core Insights
- OceanBase is evolving from an integrated database to an integrated data foundation, focusing on Data×AI capabilities to address new data challenges in the AI era [1][2][4]
- The company launched PowerRAG, an AI-driven application product that provides ready-to-use RAG (Retrieval-Augmented Generation) application development capabilities [1][5][7]
- OceanBase introduced a new "shared storage" product that integrates object storage with transactional databases, significantly reducing storage costs by up to 50% for TP loads [9][10]

AI Strategy and Product Development
- OceanBase aims to support mixed workloads (TP/AP/AI) through a unified engine, enhancing SQL and AI hybrid retrieval capabilities (a rank-fusion sketch follows this summary) [2][4]
- The PowerRAG service streamlines the application development process by connecting data, platform, interface, and application layers, facilitating rapid development of various AI applications [5][7]
- The company is committed to continuous breakthroughs in application and platform layers to solidify its position as an integrated data foundation in the AI era [5][7]

Performance and Infrastructure
- OceanBase has achieved leading performance in vector capabilities, essential for supporting AI applications, and is continuously optimizing vector retrieval algorithms [8][9]
- The latest version of OceanBase enhances mixed retrieval performance through advanced execution strategies and self-developed vector algorithm libraries [9]

Shared Storage Innovation
- The "shared storage" product represents a significant architectural upgrade, allowing for deep integration of object storage with transactional databases, thus improving cloud data storage elasticity [9][10]
- This innovation positions OceanBase's cloud database, OB Cloud, as the first multi-cloud native database to support object storage in TP scenarios, catering to various business applications [10]
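Where the summary mentions SQL and AI hybrid retrieval, a common pattern is to run a keyword/SQL filter and a vector-similarity search separately and then fuse the two rankings. A minimal, database-agnostic sketch of reciprocal rank fusion (the constant k=60 is a conventional default, not OceanBase's documented implementation):

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked result lists (e.g. one from an inverted-index/SQL
    filter, one from vector search) into a single ranking."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["doc7", "doc2", "doc9"]  # from a keyword/SQL filter
vector_hits = ["doc2", "doc5", "doc7"]   # from approximate nearest neighbors

print(reciprocal_rank_fusion([keyword_hits, vector_hits]))
# ['doc2', 'doc7', 'doc5', 'doc9']: documents both retrievers agree on rise to the top
```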