ACL 2025 | Self-Doubt or Self-Correction? Tsinghua Team Reveals the Dark Side of LLMs' Reflection Techniques
机器之心· 2025-07-14 04:08
Core Viewpoint
- The research highlights the limitations of intrinsic self-correction in large language models (LLMs): when prompted to "think again," the models often fail to improve and instead flip to incorrect answers, even on simple factual questions [2][24].

Group 1: Reflection Technology Failures
- The study systematically evaluates reflection failures across a range of LLMs and tasks and finds that failures occur more often than successes, even in advanced models [7][8].
- For instance, on the Decision Making task the reflection failure rate of o1-mini is higher than that of the 4o and 3.5-turbo models [8].
- Recent evaluations of newer ChatGPT models (4.5, 4.1, o4-mini, o3) also show significant reflection failure rates, with o4-mini losing 22.1% in accuracy [9].

Group 2: Reasons for Reflection Failures
- Three primary causes of reflection failures are identified: internal answer fluctuation, prompt bias, and cognitive bias [20][24].
- Internal answer fluctuation: LLMs exhibit self-doubt, repeatedly changing their answers over multi-turn dialogues [12][15].
- Prompt bias: LLMs attend excessively to the reflection prompt rather than the actual question; 76.1% of failures are attributed to this issue [18].
- Cognitive bias: LLMs overthink and generate excessive "think" steps, resulting in decision-making paralysis [20].

Group 3: Mitigation Strategies
- The research proposes two effective mitigation strategies: problem repetition and few-shot fine-tuning [22][24].
- Problem repetition appends the initial question to the reflection prompt so the model stays focused on the original query (see the sketches below) [25].
- Few-shot fine-tuning introduces no new knowledge but corrects the abnormal behaviors, and is the more effective of the two at alleviating reflection failures [25].
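To make the failure mode concrete, below is a minimal sketch of how the reflection-failure measurement described in Group 1 could be reproduced. The `query_model` stub, the substring-based answer check, and the "Are you sure? Think again." wording are illustrative assumptions, not the authors' code or exact prompts.

```python
# Minimal sketch of an intrinsic self-correction ("think again") probe.
# `query_model` is a hypothetical stand-in for any chat-LLM call; the
# reflection prompt wording is an assumption, not the paper's exact prompt.

def query_model(messages: list[dict]) -> str:
    """Send a chat history to an LLM and return its reply (stub)."""
    raise NotImplementedError("plug in your LLM client here")

def reflection_failure(question: str, gold_answer: str) -> bool:
    """Return True if a correct first answer flips to wrong after reflection."""
    history = [{"role": "user", "content": question}]
    first = query_model(history)

    history += [
        {"role": "assistant", "content": first},
        {"role": "user", "content": "Are you sure? Think again and answer once more."},
    ]
    second = query_model(history)

    return gold_answer in first and gold_answer not in second

def failure_rate(dataset: list[tuple[str, str]]) -> float:
    """Fraction of items where reflection turns a correct answer into a wrong one."""
    flips = sum(reflection_failure(q, a) for q, a in dataset)
    return flips / len(dataset)
```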
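And a sketch of the problem-repetition mitigation from Group 3, which appends the original question to the reflection prompt so the model re-reads what it is actually being asked instead of fixating on the reflection cue. Again, the prompt wording and the injected `query_model` callable are assumptions for illustration, not the authors' implementation.

```python
# Sketch of the "problem repetition" mitigation: the original question is
# appended to the reflection prompt to keep the model focused on the query.
# Prompt wording is an illustrative assumption.

def reflect_with_repetition(question: str, first_answer: str, query_model) -> str:
    """Ask the model to reconsider while restating the original question."""
    reflection_prompt = (
        "Please reconsider your previous answer.\n"
        f"The original question was: {question}\n"
        "Answer the original question again."
    )
    history = [
        {"role": "user", "content": question},
        {"role": "assistant", "content": first_answer},
        {"role": "user", "content": reflection_prompt},
    ]
    return query_model(history)
```

Since prompt bias accounts for 76.1% of the observed failures, restating the question in the reflection turn directly targets the dominant cause the study identifies.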