Faced with an AI that reads your message but replies with nonsense, how do you tell true from false? A survey of hallucination in large language models from Harbin Institute of Technology (哈工大) & Huawei!
自动驾驶之心·2025-09-16 23:33

Core Insights
- The article discusses the phenomenon of "hallucination" in large language models (LLMs), which refers to instances where these models generate incorrect or misleading information. It highlights the definitions, causes, and potential mitigation strategies for hallucinations in LLMs [2][77].

Group 1: Definition and Types of Hallucination
- Hallucinations in LLMs are categorized into two main types: factual hallucination and faithfulness hallucination. Factual hallucination includes factual contradictions and factual fabrications, while faithfulness hallucination involves inconsistencies in following instructions, context, and logic [8][9][12].

Group 2: Causes of Hallucination
- The causes of hallucination are primarily linked to the data used during the pre-training and reinforcement learning from human feedback (RLHF) stages. Issues such as erroneous data, societal biases, and knowledge boundaries contribute significantly to hallucinations [17][21][22].
- The article emphasizes that low-quality or misaligned data during supervised fine-tuning (SFT) can also lead to hallucinations, as the model may struggle to reconcile new information with its pre-existing knowledge [23][30].

Group 3: Training Phases and Their Impact
- The training phases of LLMs (pre-training, supervised fine-tuning, and RLHF) each play a role in the emergence of hallucinations. The pre-training phase, in particular, is noted for structural limitations that can increase hallucination risk [26][28][32].
- During SFT, if the model is overfitted to data beyond its knowledge boundaries, it may generate hallucinations instead of accurate responses [30].

Group 4: Detection and Evaluation of Hallucination
- The article outlines methods for detecting hallucinations, including fact extraction and verification, as well as uncertainty estimation techniques that assess the model's confidence in its outputs [41][42]; both ideas are sketched in code after this summary.
- Various benchmarks for evaluating hallucination in LLMs are discussed, focusing on both hallucination assessment and detection methodologies [53][55].

Group 5: Mitigation Strategies
- Strategies to mitigate hallucinations include data filtering to ensure high-quality inputs, model editing to correct erroneous behaviors, and retrieval-augmented generation (RAG) to enhance knowledge acquisition (a brief RAG sketch also follows this summary) [57][61].
- The article also discusses the importance of context awareness and alignment in reducing hallucinations during the generation process [74][75].
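To make the fact-extraction-and-verification idea from Group 4 concrete, here is a minimal sketch. The sentence-level claim splitter, the word-overlap verifier, and the REFERENCE_FACTS set are toy stand-ins of our own, not components from the survey; a real pipeline would use an information-extraction model and an external knowledge base.

```python
# Toy sketch of fact-extraction-and-verification style hallucination detection.
# Every name here (extract_claims, verify, REFERENCE_FACTS) is illustrative only.

REFERENCE_FACTS = {
    "the pacific ocean is the largest ocean on earth",
    "water boils at 100 degrees celsius at sea level",
}


def extract_claims(generated_text: str) -> list[str]:
    """Naively treat each sentence of the model output as one atomic claim."""
    return [s.strip().lower() for s in generated_text.split(".") if s.strip()]


def verify(claim: str, facts: set[str]) -> bool:
    """Check a claim by word overlap with any reference fact (toy verifier)."""
    claim_words = set(claim.split())
    return any(
        len(claim_words & set(f.split())) / max(len(claim_words), 1) > 0.6
        for f in facts
    )


if __name__ == "__main__":
    output = "The Pacific Ocean is the largest ocean on Earth. The Moon is made of cheese."
    for claim in extract_claims(output):
        status = "supported" if verify(claim, REFERENCE_FACTS) else "unsupported (possible hallucination)"
        print(f"{claim!r}: {status}")
```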
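The uncertainty-estimation direction can likewise be illustrated with a sampling-based self-consistency check: query the model several times at non-zero temperature and measure how often its answers agree. The consistency_score function and the hardcoded sample answers below are assumptions for illustration, not the survey's specific method.

```python
# Sketch of sampling-based uncertainty estimation for hallucination detection.
# The hardcoded samples stand in for repeated calls to an actual LLM.

from collections import Counter


def consistency_score(sampled_answers: list[str]) -> float:
    """Fraction of samples that agree with the most common answer.

    Low agreement across samples is one signal (not proof) that the
    answer may be hallucinated and should be verified externally.
    """
    if not sampled_answers:
        return 0.0
    normalized = [a.strip().lower() for a in sampled_answers]
    most_common_count = Counter(normalized).most_common(1)[0][1]
    return most_common_count / len(normalized)


if __name__ == "__main__":
    # Hypothetical samples for "Who wrote 'The Old Man and the Sea'?"
    samples = [
        "Ernest Hemingway",
        "Ernest Hemingway",
        "ernest hemingway",
        "William Faulkner",  # inconsistent sample
        "Ernest Hemingway",
    ]
    score = consistency_score(samples)
    print(f"consistency = {score:.2f}")
    if score < 0.5:
        print("Low agreement across samples: flag answer for verification.")
```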
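For the mitigation side, here is a minimal retrieval-augmented generation (RAG) sketch, assuming a toy keyword-overlap retriever and a hypothetical llm_generate call in place of whatever model API is actually used; none of these names come from the survey.

```python
# Toy RAG sketch: retrieve supporting passages, then ground the prompt on them.

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Rank corpus passages by naive word overlap with the query (toy retriever)."""
    q_words = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda doc: len(q_words & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:k]


def build_rag_prompt(query: str, passages: list[str]) -> str:
    """Place retrieved evidence before the question and constrain the model to it."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using ONLY the context below. "
        "If the context is insufficient, say you do not know.\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )


if __name__ == "__main__":
    corpus = [
        "The survey categorizes hallucinations into factual and faithfulness types.",
        "RLHF aligns model outputs with human preferences.",
        "Retrieval-augmented generation supplies external evidence at inference time.",
    ]
    query = "What does retrieval-augmented generation supply to the model?"
    prompt = build_rag_prompt(query, retrieve(query, corpus))
    print(prompt)
    # response = llm_generate(prompt)  # hypothetical model call, not implemented here
```

The design point this sketch tries to convey is that retrieved evidence is injected into the prompt and the instruction restricts the model to that evidence, which narrows the room for fabricating facts from parametric memory alone.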