Hallucination
Tsinghua Pinpoints the Culprit Behind "Hallucinations": the 0.1% of Neurons Produced in Pretraining
36Kr· 2026-01-06 08:31
No matter how well large language models climb the leaderboards, the specter of "hallucination" keeps hovering overhead, leaving domains that demand factual accuracy (such as finance, education, and healthcare) wary of folding AI into their business.

Maosong Sun's team at Tsinghua University studied the micro-level mechanism of hallucination from the neuron perspective and found that a tiny minority of neurons (H-Neurons) can predict hallucinations and are associated with over-compliant behavior, with their roots in the pretraining stage. This offers a new angle on the hallucination problem and should help in building more reliable large models.

Hallucination refers to a model generating output that looks plausible but is factually inaccurate or unsupported by evidence. GPT-3.5, for instance, shows a hallucination rate of roughly 40% on citation-based factuality evaluations; GPT-4 lowers the rate to 28.6%, which is still high. Reasoning-centric systems such as DeepSeek-R1 excel at complex tasks yet also exhibit clear hallucination patterns. In other words, hallucination persists regardless of model architecture and remains the main bottleneck for large-model reliability.

Recently, the team took the neuron perspective further and systematically studied hallucination-related neurons (H-Neurons) from three angles: identification, behavioral impact, and origins (a minimal sketch of neuron-level identification follows this summary).

Paper: https://arxiv.org/abs/2512.01797v2

Existing ...
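As a rough illustration of what "identifying hallucination-related neurons" can mean in practice, the sketch below ranks neurons by how strongly their mean activation differs between hallucinated and factual generations and keeps the top 0.1%. This is only a hedged approximation: the function name, the effect-size scoring, and the toy data are assumptions for illustration, not the procedure used in the Tsinghua paper.

```python
# Minimal sketch (assumption: per-neuron activation matrices have already been collected
# from a language model on generations labeled "hallucinated" vs. "factual"; the paper's
# actual H-Neuron identification procedure may differ).
import numpy as np

def find_h_neuron_candidates(acts_halluc: np.ndarray,
                             acts_factual: np.ndarray,
                             top_fraction: float = 0.001):
    """Rank neurons by how well their mean activation separates hallucinated from
    factual generations, and return the indices of the top fraction (e.g. 0.1%)
    as candidate hallucination-related neurons.

    acts_halluc  : (n_halluc_samples, n_neurons) activations on hallucinated outputs
    acts_factual : (n_factual_samples, n_neurons) activations on factual outputs
    """
    mean_h = acts_halluc.mean(axis=0)
    mean_f = acts_factual.mean(axis=0)
    # Pooled standard deviation per neuron (epsilon avoids division by zero).
    pooled_std = np.sqrt((acts_halluc.var(axis=0) + acts_factual.var(axis=0)) / 2) + 1e-8
    # Effect size of the activation difference (Cohen's-d-style score).
    score = np.abs(mean_h - mean_f) / pooled_std
    k = max(1, int(top_fraction * score.size))
    return np.argsort(score)[::-1][:k]

# Toy usage: 200 samples per class, 4096 "neurons", with 4 neurons that fire
# more strongly during hallucinations planted by hand.
rng = np.random.default_rng(0)
halluc = rng.normal(0.0, 1.0, size=(200, 4096))
factual = rng.normal(0.0, 1.0, size=(200, 4096))
halluc[:, :4] += 2.0
print(find_h_neuron_candidates(halluc, factual))  # should recover indices 0-3
```

A real pipeline would collect activations from the model's hidden layers and then validate the selected neurons by intervening on them, in the spirit of the paper's behavioral-impact analysis.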
"Hallucination" was coined by Karpathy a decade ago? How many concepts has this master namer of the AI world popularized?
机器之心· 2025-07-28 10:45
Core Viewpoint
- The article discusses the influential contributions of Andrej Karpathy in the AI field, particularly his role in coining significant terms and concepts that have shaped the industry, such as "hallucinations," "Software 2.0," "Software 3.0," "vibe coding," and "bacterial coding" [1][6][9].

Group 1: Naming and Concepts
- Karpathy coined the term "hallucinations" to describe the limitations of neural networks, which generate meaningless content when faced with unfamiliar concepts [1][3].
- He is recognized as a master of naming in the AI community, having introduced terms like "Software 2.0" and "Software 3.0," which have gained traction over the years [6][9].
- The act of naming is emphasized as a foundational behavior in knowledge creation, serving as a stable target for global scientific focus [7].

Group 2: Software Evolution (illustrated in the sketch after this summary)
- "Software 1.0" refers to traditional programming where explicit instructions are written in languages like Python and C++ [12][14].
- "Software 2.0" represents a shift to neural networks, where developers train models on datasets instead of writing explicit rules [15].
- "Software 3.0" allows users to generate code through simple English prompts, making programming accessible to non-developers [16][17].

Group 3: Innovative Programming Approaches
- "Vibe coding" encourages developers to immerse themselves in the development atmosphere, relying on LLMs to generate code from verbal requests [22][24].
- "Bacterial coding" promotes writing modular, self-contained code that can be easily shared and reused, inspired by the adaptability of bacterial genomes [30][35].
- Karpathy suggests balancing the flexibility of bacterial coding with the structured approach of eukaryotic coding to support complex system development [38].

Group 4: Context Engineering
- Context engineering has gained attention as a more comprehensive approach than prompt engineering, focusing on providing structured context for AI applications [43][44].
- The article highlights a shift toward optimizing documentation for AI readability, indicating a trend where 99.9% of content may be processed by AI in the future [45].
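As a concrete (and purely illustrative, not taken from Karpathy's own material) contrast of the Software 1.0/2.0/3.0 paradigms summarized above, the sketch below solves the same toy sentiment-classification task three ways: hand-written rules, a model trained on data, and an English prompt handed to an LLM stand-in.

```python
# Illustrative contrast of the three paradigms; the LLM in the Software 3.0
# example is a stand-in lambda, not a specific vendor API.

# Software 1.0: explicit, hand-written rules.
def sentiment_v1(text: str) -> str:
    return "positive" if any(w in text.lower() for w in ("great", "love", "excellent")) else "negative"

# Software 2.0: behavior learned from data instead of written by hand.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_texts = ["great movie", "I love it", "terrible plot", "boring and slow"]
train_labels = ["positive", "positive", "negative", "negative"]
sentiment_v2 = make_pipeline(CountVectorizer(), LogisticRegression()).fit(train_texts, train_labels)

# Software 3.0: the "program" is an English prompt sent to an LLM.
def sentiment_v3(text: str, llm=lambda prompt: "positive") -> str:  # llm is a stand-in
    return llm(f"Classify the sentiment of the following review as positive or negative:\n{text}")

print(sentiment_v1("What a great film"),
      sentiment_v2.predict(["What a great film"])[0],
      sentiment_v3("What a great film"))
```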
Why Do We Dream? A Wondrous Journey from Neuroscience to the World of the Mind
Hu Xiu· 2025-07-08 03:12
Group 1
- The exploration of dreams has evolved from ancient beliefs to modern neuroscience, indicating that dreams may have significant connections to human thought, memory, and creativity [1][3][17].
- REM sleep, discovered in 1953, is characterized by high brain activity similar to wakefulness, and it is when most vivid dreams occur [3][4][6].
- Dreams are not random; they are closely linked to daily experiences, emotions, and memories, facilitated by the brain's complex neural networks, particularly the Default Mode Network (DMN) [11][12][13].

Group 2
- During REM sleep, the brain processes and reorganizes memories, often amplifying emotional experiences, which explains the intense feelings associated with dreams [12][13].
- Brain activity during REM sleep involves various regions, such as the visual cortex for imagery and the limbic system for emotions, while activity in the prefrontal cortex is suppressed, leading to illogical dream narratives [9][10][11].
- Dreams may serve as a means of emotional regulation, helping individuals cope with stress and anxiety by reprocessing emotional memories [12][13].

Group 3
- The similarities between dreams and hallucinations suggest a shared neurobiological basis, particularly in conditions like schizophrenia, where individuals may struggle to distinguish between reality and their internal perceptions [14][15].
- Lucid dreaming, in which individuals maintain self-awareness, may offer therapeutic potential for people experiencing hallucinations by allowing them to better control their experiences [16].
- Ongoing research into dreams seeks not only to unravel their mysteries but also to address fundamental questions about consciousness and reality [17].
The More Large Models Reflect, the More They Err: Long-Chain Reasoning Worsens Hallucinations Through Self-Persuasion | BUPT
量子位· 2025-07-03 04:26
Core Viewpoint
- The article discusses the phenomenon of "hallucination" in long-chain reasoning models, showing that as the reasoning chain extends, the hallucination rate increases significantly, pointing to a critical flaw in the models' ability to self-correct and maintain accuracy [1][3][13].

Group 1: Research Findings
- A research team from Beijing University of Posts and Telecommunications quantitatively demonstrated the "more thinking, more errors" phenomenon through a "thinking-chain audit experiment" [2][3].
- The study found that in long-chain reasoning, reflection does not serve as a correction mechanism but instead legitimizes hallucinations, allowing the model to alter definitions so as to stay semantically consistent with the user's prompt [2][3][13].
- Errors in long-chain reasoning are not isolated incidents; they tend to amplify along the reasoning chain, producing a "snowball effect" of inaccuracies [3][4].

Group 2: Methodology (a minimal sketch of this kind of audit follows this summary)
- The research team constructed a controlled knowledge domain based on RFC protocol documents, generating long reasoning chains of 30-60 steps and inserting reflection operations to track confidence changes in real time [7][10].
- The controlled knowledge domain was designed to capture two types of hallucination cases, ensuring hallucinations could be reliably reproduced in a controlled environment [9][11].
- The study employed a modeling system that tracks how knowledge is introduced, feedback is provided, and knowledge is refined across multiple reasoning steps, addressing the challenge of studying hallucination evolution in complex reasoning trajectories [10][12].

Group 3: Experimental Results
- The experiments revealed that when models encounter embedded errors, 55.9% of cases trigger internal knowledge-fabrication processes [20].
- Reflection in long-chain reasoning devolves into a self-persuasion tool: models reinforce incorrect answers rather than moving closer to the truth [21][25].
- An evaluation of seven mainstream detection methods showed that existing interventions cannot fundamentally eliminate hallucinations, with the best method achieving only 79% accuracy [27][30].
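The sketch below gives a minimal picture of the kind of "thinking-chain audit" described above: it walks a recorded reasoning chain, compares each working answer against a reference, and flags reflection steps that raise confidence in a wrong answer, i.e. reflection acting as self-persuasion rather than correction. The data structure, field names, and flagging rule are assumptions for illustration; the BUPT team's actual instrumentation is more elaborate.

```python
# Minimal audit sketch; ReasoningStep and audit_chain are hypothetical names.
from dataclasses import dataclass
from typing import List

@dataclass
class ReasoningStep:
    index: int
    is_reflection: bool   # was this step an inserted reflection operation?
    answer: str           # model's current working answer
    confidence: float     # model's self-reported confidence in that answer (0-1)

def audit_chain(steps: List[ReasoningStep], reference_answer: str) -> List[int]:
    """Return indices of reflection steps that increased confidence in a wrong answer."""
    flagged = []
    prev_conf = None
    for step in steps:
        wrong = step.answer.strip().lower() != reference_answer.strip().lower()
        if step.is_reflection and wrong and prev_conf is not None and step.confidence > prev_conf:
            flagged.append(step.index)   # reflection reinforced an incorrect answer
        prev_conf = step.confidence
    return flagged

# Toy usage: a short chain where step 3 is a reflection that boosts a wrong answer.
chain = [
    ReasoningStep(1, False, "RFC 793 defines TCP", 0.6),
    ReasoningStep(2, False, "RFC 793 defines UDP", 0.5),
    ReasoningStep(3, True,  "RFC 793 defines UDP", 0.8),
]
print(audit_chain(chain, "RFC 793 defines TCP"))  # -> [3]
```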
Exclusive Insight | How RAG Improves AI Accuracy
慧甚FactSet· 2025-06-10 05:12
Core Viewpoint
- The accuracy of data is crucial for financial services companies utilizing Generative AI (GenAI) and Large Language Models (LLMs), as inaccurate or low-quality data can adversely affect company strategy, operations, risk management, and compliance [1][3].

Group 1: Causes of Data Inaccuracy
- Data inaccuracy in the financial services sector often arises from multiple factors, including the increasing volume and variety of data sourced from multiple vendors, patents, and third-party sources [4].
- "Hallucination" is a significant Generative AI challenge in the financial sector: models generate coherent but factually incorrect or misleading information because they rely on patterns learned from training data without factual verification [4].

Group 2: Importance of Retrieval-Augmented Generation (RAG)
- RAG is a critical technology for improving the accuracy of Generative AI and significantly reducing hallucinations by grounding generated responses in real data [6].
- RAG combines the generative capabilities of LLMs with effective data-retrieval systems, allowing for more accurate and contextually relevant answers, especially in financial risk assessments [6].
- RAG improves the use of varied data formats, enabling efficient processing of both structured and unstructured data, and connects to existing legacy systems without costly migrations or retraining of LLMs [7].

Group 3: Benefits of RAG
- RAG helps address the main causes of data inaccuracy discussed above, providing more accurate answers based on proprietary data and reducing hallucinations [8].
- It allows for integration of the latest knowledge and user-permission management, ensuring that responses are based on up-to-date information [8].
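To make the RAG pattern described above concrete, here is a minimal sketch: retrieve the passages most similar to the query, then ask the model to answer using only that retrieved context. The TF-IDF retriever and the `call_llm` placeholder are illustrative assumptions, not FactSet's implementation; a production system would also add the user-permission management and up-to-date indexing the article mentions.

```python
# Minimal RAG sketch; documents, retrieve, call_llm, and rag_answer are hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = [
    "Q3 credit risk exposure increased 4% quarter over quarter.",
    "The firm's liquidity coverage ratio remained above regulatory minimums.",
    "New ESG disclosure rules take effect next fiscal year.",
]

def retrieve(query: str, docs, k: int = 2):
    """Return the k documents most similar to the query (simple TF-IDF retriever)."""
    vec = TfidfVectorizer().fit(docs + [query])
    doc_m, q_m = vec.transform(docs), vec.transform([query])
    scores = cosine_similarity(q_m, doc_m)[0]
    return [docs[i] for i in scores.argsort()[::-1][:k]]

def call_llm(prompt: str) -> str:
    # Placeholder for whatever LLM API is actually used.
    return f"[model answer grounded in a prompt of {len(prompt)} characters]"

def rag_answer(query: str) -> str:
    context = "\n".join(retrieve(query, documents))
    prompt = (
        "Answer using only the context below. If the context is insufficient, say so.\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )
    return call_llm(prompt)

print(rag_answer("How did credit risk change last quarter?"))
```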