Reasoning LLMs (RLLMs)
The more large models reflect, the more they err: long-chain reasoning amplifies hallucinations through self-persuasion | 北邮 (BUPT)
量子位· 2025-07-03 04:26
Core Viewpoint
- The article discusses the phenomenon of "hallucination" in long-chain reasoning models, revealing that as the reasoning chain extends, the rate of hallucinations increases significantly, indicating a critical flaw in the models' ability to self-correct and maintain accuracy [1][3][13].

Group 1: Research Findings
- A research team from Beijing University of Posts and Telecommunications quantitatively demonstrated the "more thinking, more errors" phenomenon through a "thinking chain audit experiment" [2][3].
- The study found that in long-chain reasoning, reflection does not serve as a correction mechanism but rather legitimizes hallucinations, allowing the model to alter definitions in order to stay semantically consistent with the user's prompt [2][3][13].
- Errors in long-chain reasoning are not isolated incidents but tend to amplify along the reasoning chain, producing a "snowball effect" of inaccuracies [3][4].

Group 2: Methodology
- The research team constructed a controlled knowledge domain based on RFC protocol documents, generated long reasoning chains of 30-60 steps, and inserted reflection operations while tracking confidence changes in real time [7][10].
- The controlled knowledge domain was designed to capture two types of hallucination cases, allowing hallucinations to be reproduced reliably under controlled conditions [9][11].
- The study employed a modeling system that tracks how knowledge is introduced, how feedback is provided, and how knowledge is revised across multiple reasoning steps, addressing the challenge of studying hallucination evolution in complex reasoning trajectories [10][12].

Group 3: Experimental Results
- The experiments revealed that when models encounter embedded errors, 55.9% of cases trigger internal knowledge-fabrication processes [20].
- Reflection in long-chain reasoning devolves into a self-persuasion tool: models reinforce incorrect answers rather than moving closer to the truth [21][25].
- An evaluation of seven mainstream detection methods showed that existing interventions are insufficient to fundamentally eliminate hallucinations, with the best method achieving only 79% accuracy [27][30].
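To make the audit setup in Group 2 concrete, below is a minimal sketch of the kind of "thinking chain audit" described there: a model is stepped through a long reasoning chain, reflection prompts are inserted at chosen steps, and a per-step confidence score is logged so that drift after reflection can be inspected. All names here (ChainAudit, StepRecord, the model's (answer, confidence) interface, toy_model) are hypothetical illustrations, not the authors' actual experimental code.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

@dataclass
class StepRecord:
    step: int            # position in the reasoning chain (e.g. 1..60)
    is_reflection: bool  # whether a reflection prompt was inserted at this step
    answer: str          # the model's intermediate answer at this step
    confidence: float    # self-reported confidence in [0, 1]

@dataclass
class ChainAudit:
    # model: takes a prompt string and returns (answer, confidence).
    # This signature is an assumption for the sketch, not a real API.
    model: Callable[[str], Tuple[str, float]]
    records: List[StepRecord] = field(default_factory=list)

    def run(self, question: str, n_steps: int, reflect_every: int) -> List[StepRecord]:
        """Run an n_steps-long chain, inserting a reflection prompt every reflect_every steps."""
        context = question
        for step in range(1, n_steps + 1):
            is_reflection = step % reflect_every == 0
            if is_reflection:
                prompt = context + "\nReflect on your previous steps and restate your answer."
            else:
                prompt = context + f"\nStep {step}: continue reasoning toward the answer."
            answer, confidence = self.model(prompt)
            self.records.append(StepRecord(step, is_reflection, answer, confidence))
            context = prompt + "\n" + answer  # accumulate the chain for the next step
        return self.records

# Toy stand-in so the sketch runs end to end; a real audit would call an RLLM API
# and parse its self-reported confidence instead.
def toy_model(prompt: str) -> Tuple[str, float]:
    return ("intermediate answer", min(1.0, 0.5 + 0.01 * prompt.count("Step")))

if __name__ == "__main__":
    audit = ChainAudit(model=toy_model)
    for rec in audit.run("Which RFC defines the TCP three-way handshake?", n_steps=12, reflect_every=4):
        print(rec.step, rec.is_reflection, round(rec.confidence, 2))
```

In this framing, the interesting signal is how confidence and answers evolve at and after the reflection steps, which is the quantity the audit experiment reportedly tracks.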