Large Model Hallucination
Pan Yunhe's Latest Remarks!
Zheng Quan Shi Bao· 2025-09-27 08:56
Group 1
- The core viewpoint of the article emphasizes that AI+ will initiate a new wave in the platform economy, leading it into a 2.0 era in which platform companies leverage big data and AI to create economic activities through algorithms and models [1]
- Since 2017, China has outlined five directions for AI planning, including big data intelligence, cross-media intelligence, collective intelligence, human-machine hybrid intelligence, and autonomous intelligent systems, with a focus on simulating both cognition and action [1]
- The issue of "big model hallucination," where generated content may not align with facts or logic, is highlighted as a significant concern, especially in engineering and scientific applications [1]

Group 2
- A proposed solution to the hallucination problem is the development of specialized (domain) big models, which are becoming increasingly important [1]
- The evolution of big models can proceed either from general to specialized or from specialized to general, with the latter path potentially more natural and smoother, akin to how individuals develop from specialists into generalists [2]
- The concept of embodied intelligence is expected to further generalize and deepen, expanding the industrial space as robots evolve into more comprehensive intelligent terminals [4]

Group 3
- The intelligent agent concept is anticipated to transcend hardware and software boundaries, becoming systems capable of perception, decision-making, and action [4]
- The evolution of embodied intelligence will progress from visual-action (VA) models to more advanced visual-language-action (VLA) models combined with mechanics [4]
Pan Yunhe's Latest Remarks!
证券时报· 2025-09-27 08:53
Core Insights
- The article discusses the future of AI and its impact on the platform economy, emphasizing the transition to a 2.0 era driven by AI applications across sectors [2]
- It highlights the importance of addressing the "hallucination" problem in large models, which can lead to significant issues in engineering and technology applications [3][4]
- The evolution of embodied intelligence and its potential to expand the industry space through the integration of various intelligent systems is also a key focus [6]

Group 1: AI and Platform Economy
- AI+ is expected to initiate a new wave in the platform economy, transitioning it to a technology service-oriented model [2]
- Since 2017, China has outlined five directions for AI development, including big data intelligence and human-machine hybrid intelligence [2]

Group 2: Challenges in AI Development
- The "hallucination" issue in large models arises from statistically driven "reasonable guesses," which can produce content that is inconsistent with facts or logic [3]
- Solutions to mitigate hallucination include training specialized large models with domain-specific data and reconstructing the development path of large models [4]

Group 3: Embodied Intelligence
- The concept of embodied intelligence is expected to generalize and deepen, with intelligent systems evolving into broader intelligent terminals [6]
- The integration of various intelligent systems, such as robots and drones, is anticipated to significantly expand the industry scale [6]
When AI reads your message but replies with nonsense, how do you tell real from fake? A Harbin Institute of Technology & Huawei survey of large-model hallucination!
自动驾驶之心· 2025-09-16 23:33
Core Insights
- The article discusses the phenomenon of "hallucination" in large language models (LLMs), i.e., instances where these models generate incorrect or misleading information. It covers definitions, causes, and potential mitigation strategies for hallucinations in LLMs [2][77]

Group 1: Definition and Types of Hallucination
- Hallucinations in LLMs are categorized into two main types: factual hallucination and faithfulness hallucination. Factual hallucination includes factual contradictions and factual fabrications, while faithfulness hallucination involves inconsistencies in following instructions, context, and logic [8][9][12]

Group 2: Causes of Hallucination
- The causes of hallucination are primarily linked to the data used during the pre-training and reinforcement learning from human feedback (RLHF) stages. Issues such as erroneous data, societal biases, and knowledge boundaries contribute significantly to hallucinations [17][21][22]
- Low-quality or misaligned data during supervised fine-tuning (SFT) can also lead to hallucinations, as the model may struggle to reconcile new information with its pre-existing knowledge [23][30]

Group 3: Training Phases and Their Impact
- The three training phases of LLMs (pre-training, supervised fine-tuning, and RLHF) each play a role in the emergence of hallucinations. The pre-training phase, in particular, has structural limitations that can raise hallucination risk [26][28][32]
- During SFT, if the model is overfitted to data beyond its knowledge boundaries, it may generate hallucinations instead of accurate responses [30]

Group 4: Detection and Evaluation of Hallucination
- Methods for detecting hallucinations include fact extraction and verification, as well as uncertainty estimation techniques that assess the model's confidence in its outputs (a minimal sketch follows this summary) [41][42]
- Various benchmarks for evaluating hallucination in LLMs are discussed, covering both hallucination assessment and detection methodologies [53][55]

Group 5: Mitigation Strategies
- Strategies to mitigate hallucinations include data filtering to ensure high-quality inputs, model editing to correct erroneous behaviors, and retrieval-augmented generation (RAG) to ground generation in external knowledge [57][61]
- Context awareness and alignment are also important for reducing hallucinations during the generation process [74][75]
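As a rough illustration of the uncertainty-estimation idea mentioned in Group 4, the sketch below flags answers whose per-token log-probabilities (which most LLM APIs can return) imply high sequence perplexity. The function names, the perplexity threshold, and the example values are illustrative assumptions, not taken from the survey.

```python
import math

def sequence_confidence(token_logprobs: list[float]) -> dict:
    """Summarize a model's confidence in one generated answer from its
    per-token log-probabilities."""
    avg_logprob = sum(token_logprobs) / len(token_logprobs)
    # Perplexity of the generated sequence: higher means less confident.
    perplexity = math.exp(-avg_logprob)
    # The single least-confident token often marks where a fabrication may start.
    min_logprob = min(token_logprobs)
    return {"avg_logprob": avg_logprob,
            "perplexity": perplexity,
            "min_token_prob": math.exp(min_logprob)}

def flag_possible_hallucination(token_logprobs, perplexity_threshold=4.0):
    """Flag an answer for verification when its sequence perplexity is high.
    The threshold is a placeholder and would be tuned per model and task."""
    stats = sequence_confidence(token_logprobs)
    return stats["perplexity"] > perplexity_threshold, stats

# Example with made-up log-probabilities for a confidently worded but shaky answer.
is_suspect, stats = flag_possible_hallucination([-0.1, -0.3, -2.9, -1.8, -0.2])
print(is_suspect, stats)
```

In practice such scores are only a first filter; the survey's fact-extraction-and-verification methods would then check the flagged spans against an external source.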
Ant Digital Technologies' Lu Wei: Using AI to Guard the Truth of the Digital World
Nan Fang Du Shi Bao· 2025-09-13 13:55
Core Insights
- The forum focused on the theme of "Regulating AI Content to Build a Clear Ecosystem," highlighting the risks of AI misuse, including identity and credential forgery as well as hallucinations from large models [1][3]

Group 1: AI Risks and Challenges
- AI presents three major authenticity challenges: identity forgery, credential forgery, and hallucinations from large models [3]
- The black market exploits AI technology for deepfake facial generation and the creation of fraudulent documents, creating significant risks for individuals and society [3]
- In a demonstration, two of nine digital facial images were AI-generated, and the audience found it difficult to distinguish them from real images [3]

Group 2: Ant Digital Technologies' Solutions
- Ant Digital Technologies emphasizes a proactive security approach to prevent identity forgery, using its Ant Tianji Laboratory for biometric security in critical scenarios such as account opening and payments [3]
- The company introduced "Guangjian Intelligent Verification" to combat credential forgery, employing a combination of models to verify the authenticity of documents and identities [3]
- To address hallucinations, the company launched the "Ant Tianjian" content safety defense service, which applies a multi-layered defense strategy to filter risks in AI-generated content [3]

Group 3: Future Outlook
- The future of AI will bring more complex security issues, including responsibility boundaries in human-machine interactions and ethical considerations [4]
- There is a call for collaboration among industry, academia, and media to establish a transparent and trustworthy AI governance system [4]
Why is OpenAI's new paper being mocked by the industry as marketing?
Hu Xiu· 2025-09-12 09:16
Core Viewpoint
- OpenAI's recent paper "Why Language Models Hallucinate" argues that hallucination in large models stems primarily from training and evaluation mechanisms that reward guessing and penalize uncertainty, rather than from a failure of model architecture [1][4]

Group 1: Evaluation Mechanisms
- Current evaluation benchmarks encourage models to guess even when uncertain, producing higher error rates despite slight gains in accuracy [3][5]
- OpenAI advocates a shift in evaluation criteria to penalize confident errors and reward appropriate expressions of uncertainty, moving away from a focus solely on accuracy (a toy scoring sketch follows this summary) [3][31]

Group 2: Training Data and Model Behavior
- Large models typically encounter only positive examples during pre-training, with no instances of refusing to answer, so they never learn that behavior [2]
- OpenAI's example shows that older models may have higher accuracy but also higher error rates because they give fewer "no answer" responses [3]

Group 3: Community Response and Criticism
- The technical community has debated the paper heatedly, with some critics arguing that the research lacks novelty and depth [6][7]
- Concerns have been raised about the definition of hallucination, with various potential causes identified but not strictly defined [10]

Group 4: Implications for Future Models
- If the proposed changes are adopted, the focus will shift from marginal accuracy gains to a willingness to redefine evaluation and product rules so that models can express uncertainty [5][34]
- A "low hallucination" model could serve as a more reliable natural-language search engine, improving the safety and reliability of AI applications [20][21]

Group 5: Confidence and Calibration
- The paper discusses the importance of confidence measures in model outputs, suggesting that models should express uncertainty rather than emit confident errors [28][30]
- OpenAI emphasizes that both social and technical measures are needed to change the prevalent practice of penalizing uncertain answers [32][34]

Group 6: Strategic Positioning
- OpenAI's advocacy for these changes may be linked to its strategic goals, including promoting GPT-5's low hallucination rate and its relevance to AI agents and enterprise applications [34][35]
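To make the evaluation argument concrete, here is a minimal sketch of the two grading regimes the article contrasts. It is a toy illustration, not OpenAI's actual benchmark code; the penalty weight and the convention that `None` means "I don't know" are assumptions for the example.

```python
from typing import Optional

def accuracy_only_score(answer: Optional[str], gold: str) -> float:
    """Conventional grading: abstaining scores the same as being wrong,
    so a model maximizes expected score by always guessing."""
    return 1.0 if answer == gold else 0.0

def abstention_aware_score(answer: Optional[str], gold: str,
                           wrong_penalty: float = 1.0) -> float:
    """Grading in the spirit of the paper's proposal: confident errors cost
    points, while declining to answer (answer=None) is scored as neutral."""
    if answer is None:  # model declined to answer
        return 0.0
    return 1.0 if answer == gold else -wrong_penalty

# Under the old metric, guessing on a question the model is unsure about can
# only help; under the new one, guessing pays off only when the model is
# confident enough that the expected gain outweighs the penalty.
for grade in (accuracy_only_score, abstention_aware_score):
    print(grade.__name__, grade("Paris", "Lyon"), grade(None, "Lyon"))
```

This is exactly the incentive shift the paper describes: once wrong answers cost more than abstentions, a well-calibrated model should answer only above a confidence threshold implied by the penalty.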
The hit AI appliance "integrated AI computing machine": how will it greet the Year of the Agent?
Core Insights
- The emergence of integrated AI computing machines, driven by DeepSeek, is making large AI models more accessible to the mass market, but challenges remain in application deployment [1][2]
- Large AI models continue to evolve, with a focus on reducing model hallucinations and adapting to ongoing technological advancements [1][4]

Group 1: Integrated AI Computing Machines
- Integrated AI computing machines are pre-integrated solutions combining hardware, software platforms, models, and applications, lowering the barriers to AI adoption [2][3]
- The current market features a diverse range of integrated AI computing machines, so vendors must differentiate through performance optimization, customization, and business innovation to avoid price wars [2][3]
- Key focus areas include computing power, model accessibility, and application development to meet diverse enterprise needs [2][3]

Group 2: Application and Industry Impact
- Deployment of integrated AI computing machines is not a one-time process; computing power and model capabilities must be adjusted continuously according to business needs [3][4]
- One-stop intelligent platforms such as AIS help enterprises quickly develop industry-specific models and explore potentially valuable scenarios [4][5]
- Current applications are concentrated in knowledge Q&A, customer assistance, and code assistance, but face challenges such as data quality and integration into existing workflows [3][4]

Group 3: Addressing Model Hallucinations
- Model hallucinations arise from the interplay of models, data, and tasks, and are an inherent risk of generative models [5][6]
- Short-term mitigations include retrieval-augmented generation (RAG), supervised fine-tuning, and verification mechanisms (a minimal RAG sketch follows this summary) [5][6]
- Integrated AI computing machines and agents are synergistic, each enhancing the capabilities of the other and leading to more efficient solutions [6]
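As a rough sketch of the RAG mitigation mentioned above: retrieve the passages most relevant to a question from a local document store and constrain the model to answer only from them. The retrieval here is a naive bag-of-words similarity, and `call_llm` is a hypothetical placeholder for whatever chat-completion client the appliance exposes; a real deployment would use embedding search and its own SDK.

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    """Bag-of-words cosine similarity; a real system would use embeddings."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = Counter(query.lower().split())
    ranked = sorted(documents, key=lambda d: cosine(q, Counter(d.lower().split())), reverse=True)
    return ranked[:k]

def answer_with_rag(query: str, documents: list[str], call_llm) -> str:
    """Ground the answer in retrieved passages and ask the model to abstain
    when the passages do not contain the answer."""
    context = "\n".join(retrieve(query, documents))
    prompt = ("Answer using ONLY the context below. If the answer is not there, say so.\n"
              f"Context:\n{context}\n\nQuestion: {query}")
    return call_llm(prompt)  # call_llm: placeholder for the deployment's LLM client

# Usage of the retrieval step alone:
docs = ["The appliance ships with 8 GPUs.", "Warranty covers three years of service."]
print(retrieve("How long is the warranty?", docs, k=1))
```

The grounding prompt is what reduces hallucination: the model is pushed to admit when the knowledge base has no answer instead of guessing.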
Welcoming OpenAI back to the open-source large-model race: a few points I am watching
36Ke· 2025-08-06 07:55
Core Viewpoint
- OpenAI has released two open-source large models, GPT-OSS 120B and GPT-OSS 20B, marking its return to the open-source arena after a six-year hiatus, driven by competitive pressure and the need to serve enterprise clients who prioritize data security [1][4][5]

Group 1: OpenAI's Shift to Open Source
- OpenAI's name originally signified "openness" and "open source," but the company deviated from that path starting in early 2019, limiting model releases on "safety" grounds [1][2]
- Until this release, OpenAI was one of the few leading AI developers without any recent open-source models, alongside Anthropic, which has also not released open-source models [2][5]

Group 2: Reasons for Open Sourcing
- Open-sourcing allows clients to run models locally, enhancing data security by keeping sensitive information off third-party platforms, which is crucial for sectors such as government and finance [3][4]
- Clients can fine-tune open-source models to meet specific industry needs, making them more attractive for sectors with complex requirements [3][4]

Group 3: Competitive Landscape
- The release of GPT-OSS is seen as a response to competitors such as Meta's LLaMA series and DeepSeek, which have gained traction in the enterprise market thanks to their open-source nature [4][5]
- The global landscape now features only two major developers without open-source versions, highlighting a significant industry shift toward open-source models [5]

Group 4: Technical Insights
- The GPT-OSS models are reported to be comparable in performance to GPT-4o3 and use a mixture-of-experts architecture, a common approach among leading models (a routing sketch follows this summary) [6][7]
- Training GPT-OSS consumed significant computational resources, with the 120B-parameter version requiring 2.1 million H100 GPU hours, indicating a substantial infrastructure investment [9][10]

Group 5: Limitations of Open Source
- GPT-OSS is an "open weight" model rather than a fully open-source one, lacking comprehensive training details and the proprietary tools used in its development [8][9]
- The release does not include OpenAI's latest advancements or training methodologies, limiting its impact on the broader AI development landscape [6][10]
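For readers unfamiliar with the mixture-of-experts idea mentioned in Group 4, the sketch below shows generic top-k expert routing: each token activates only a few expert networks, so total parameters can be large while per-token compute stays small. All sizes, the gating scheme, and the use of plain matrices as "experts" are illustrative assumptions; this is not GPT-OSS's actual configuration.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def moe_layer(tokens, gate_w, experts, top_k=2):
    """Route each token to its top_k experts and mix their outputs by gate weight.
    tokens: (n, d) activations; gate_w: (d, n_experts); experts: list of (d, d) matrices."""
    scores = softmax(tokens @ gate_w)                    # (n, n_experts) routing probabilities
    out = np.zeros_like(tokens)
    for i, tok in enumerate(tokens):
        top = np.argsort(scores[i])[-top_k:]             # indices of the chosen experts
        weights = scores[i][top] / scores[i][top].sum()  # renormalize over chosen experts
        for w, e in zip(weights, top):
            out[i] += w * (tok @ experts[e])             # only top_k experts do any work
    return out

# Toy usage: 4 tokens, hidden size 8, 4 experts, 2 active per token.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 8))
gate_w = rng.normal(size=(8, 4))
experts = [rng.normal(size=(8, 8)) for _ in range(4)]
print(moe_layer(tokens, gate_w, experts).shape)          # (4, 8)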
Zidong Taichu open-sources a visual-neuron enhancement method: plug-and-play end to multimodal hallucination | ACL 2025
量子位· 2025-06-27 10:57
Core Viewpoint
- The article presents Visual Head Reinforcement (VHR), a solution to the hallucination phenomenon in large visual language models (LVLMs) that strengthens the model's attention mechanisms so that it draws on visual information rather than language priors [1][2][3]

Group 1: Introduction and Background
- LVLMs often generate factually incorrect outputs because they over-rely on language knowledge instead of the actual visual content, leading to hallucinations [4][5]
- Experiments show that when models are asked to describe images, they frequently mention entities not present in the images, indicating a systematic reliance on language co-occurrence patterns [4][5][7]

Group 2: VHR Methodology
- VHR identifies and strengthens attention heads that are sensitive to visual information, reducing the model's dependence on language priors and significantly lowering hallucination rates [8]
- The Visual Head Divergence (VHD) metric quantifies each attention head's sensitivity to visual inputs, revealing that only a few heads respond strongly to visual information while most rely on language patterns [9][11]
- The VHR procedure filters out abnormal VHD scores, selects and scales the outputs of the top 50% of attention heads ranked by VHD, and applies a layer-wise enhancement strategy to avoid interference (a simplified sketch follows this summary) [14][15][16]

Group 3: Experimental Results
- VHR was tested against multiple benchmarks and outperformed existing methods while remaining efficient, adding minimal extra inference time [16][17]
- The results show that VHR beats baseline methods across evaluations, demonstrating its effectiveness in reducing hallucinations in LVLMs [17]

Group 4: SSL Method
- The article also introduces a Semantic Guided Learning (SSL) method that analyzes the model's internal representation space to mitigate hallucinations by injecting real semantic directions and suppressing hallucination-related projections [19][22]
- The method shows cross-model applicability, improving the robustness of hallucination mitigation across different LVLM architectures [22]
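A minimal sketch of the VHD-then-boost idea described in Group 2. The summary does not give the paper's exact VHD formula or scaling factor, so this sketch assumes VHD can be approximated by how much a head's output changes when image tokens are removed, and uses a fixed boost of 1.2 on the top half of heads; both are assumptions for illustration only.

```python
import numpy as np

def visual_head_divergence(head_out_with_image, head_out_text_only):
    """Illustrative VHD score: how much one attention head's output changes
    when the image tokens are dropped. Large scores mark 'visual heads'."""
    return np.linalg.norm(head_out_with_image - head_out_text_only, axis=-1).mean()

def reinforce_visual_heads(head_outputs, vhd_scores, top_ratio=0.5, scale=1.2):
    """Scale up the outputs of the top-`top_ratio` heads by VHD score, leaving
    the rest untouched, before the heads are concatenated and projected."""
    n_heads = len(head_outputs)
    n_top = max(1, int(n_heads * top_ratio))
    top_heads = set(np.argsort(vhd_scores)[-n_top:].tolist())
    return [out * scale if h in top_heads else out
            for h, out in enumerate(head_outputs)]

# Toy usage: 8 heads, each producing a (seq_len=5, head_dim=16) output.
rng = np.random.default_rng(1)
with_img = [rng.normal(size=(5, 16)) for _ in range(8)]
text_only = [rng.normal(size=(5, 16)) for _ in range(8)]
vhd = np.array([visual_head_divergence(a, b) for a, b in zip(with_img, text_only)])
boosted = reinforce_visual_heads(with_img, vhd)
```

Because the intervention only rescales existing head outputs at inference time, it is plug-and-play in the sense the article describes: no retraining, just a hook inserted into the attention layers.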
Haizhi Technology's Hong Kong IPO: claims globally leading technical strength, yet R&D spending and expense ratio keep falling and trail peers
Xin Lang Zheng Quan· 2025-06-20 07:39
Core Viewpoint
- Haizhi Technology claims to be the first AI company in China to effectively reduce large-model hallucinations through knowledge graphs, but revenue from this AI agent business remains small, accounting for only 17.2% of revenue in 2024 [1][2]

Business Overview
- Haizhi Technology's main business holds only a 1.11% market share, and its AI agent business holds 2.8% [2][3]
- Revenue from the company's Atlas intelligent agent grew from zero in 2022 to 86.55 million RMB in 2024, but still represents a small portion of total revenue [2][4]

Financial Performance
- The company reported revenue of 313 million RMB, 376 million RMB, and 503 million RMB for 2022, 2023, and 2024 respectively, with net losses of 176 million RMB, 266 million RMB, and 94 million RMB, indicating a narrowing loss [4]
- The Atlas graph solution accounted for 100%, 97.6%, and 82.8% of total revenue in those years [4]

R&D Expenditure
- R&D expenses decreased from 86.94 million RMB in 2022 to 60.68 million RMB in 2024, with the R&D expense ratio dropping sharply from 27.8% to 12.1% [6][9]
- The company's R&D spending is significantly lower than that of competitors such as Minglue Technology and Xinghuan Technology, raising questions about its claimed technological advantages [9]

Market Position and Future Outlook
- The market for integrated knowledge-graph AI agents is projected to grow from 200 million RMB in 2024 to 13.2 billion RMB by 2029, a compound annual growth rate of 140% [10]
- If Haizhi Technology can convert this growth into significantly higher AI agent revenue, it could strengthen its market position, although competition from internet giants poses challenges [11]
DeepSeek R1's hallucination rate drops; users call out: we want R2
第一财经· 2025-05-29 15:13
Core Viewpoint
- The updated DeepSeek R1 model has significantly improved, particularly in reducing hallucination rates and handling complex reasoning tasks, positioning it competitively against leading international models [2][9][12]

Group 1: Model Improvements
- The new R1 reduces hallucination rates by approximately 45%-50% compared with the previous version, improving accuracy in tasks such as rewriting, summarization, and reading comprehension [9][12]
- In the AIME 2025 test, the model's accuracy rose from 70% to 87.5%, reflecting stronger mathematical reasoning [12]
- The updated model can generate longer, better-structured written works that align more closely with human writing preferences [12]

Group 2: Benchmark Performance
- The updated R1 achieved top scores across benchmark tests, outperforming all domestic models and approaching international leaders such as o3 and Gemini-2.5-Pro [9][12]
- Performance on coding tasks has also improved significantly, nearly matching OpenAI's o3-high [12]

Group 3: Technical Specifications
- The new R1 has 685 billion parameters and supports a 128K context length in the open-source version, with 64K available via the web, app, and API [13]
- The model is still built on the DeepSeek V3 Base model, with more computational resources applied during training to deepen its reasoning [12][13]