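The "Doctor's ChatGPT" Is Here! Baichuan Releases M2 Plus, Its Strongest Evidence-Enhanced Large Model, with a Hallucination Rate Far Below DeepSeek's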

Core Viewpoint
- Baichuan Intelligent has launched Baichuan-M2 Plus, an evidence-enhanced medical large model whose hallucination rate is significantly lower than that of general-purpose models and whose credibility is described as comparable to that of experienced clinicians [3][15].

Group 1: Product Launch and Features
- Baichuan-M2 Plus is an upgrade of the previously open-sourced Baichuan-M2. Its hallucination rate is significantly lower, and it outperforms both DeepSeek and OpenEvidence on this measure [3][4].
- The model introduces a six-source evidence reasoning (EAR) paradigm, making it suitable for clinical decision support in various healthcare environments, including China, the US, Japan, and the UK [4][22].
- The architecture is designed so that the model draws only on authoritative medical evidence and avoids non-professional information from the internet [6][9].

Group 2: Evidence Framework
- The six-source evidence framework consists of layers that progress from original research to real-world feedback, forming a comprehensive knowledge system [5][8].
- The layers are original research, evidence reviews, guidelines, practical knowledge, public health education, and regulatory information, which together form a robust evidence chain [8][9].

Group 3: Retrieval and Reasoning Mechanisms
- M2 Plus uses the PICO framework to structure medical queries, which improves the precision of evidence retrieval [11][12].
- The model incorporates an "evidence-enhanced training" mechanism that prioritizes citation over speculation, fundamentally changing its response logic [13][15].
- The model evaluates evidence quality so that it prioritizes high-trust information and embeds it directly into its reasoning process [13][15].

Group 4: Performance Metrics
- M2 Plus scored 97 on the USMLE, surpassing the average human score and matching GPT-5, which demonstrates its clinical problem-solving capabilities [19][21].
- In the Chinese medical licensing exam, M2 Plus scored 568, significantly higher than the average passing score, showcasing its mastery of clinical guidelines and practices [21].
- The model also performed well in various international medical qualification exams, achieving over 85% accuracy [20][21].

Group 5: Market Position and Applications
- Baichuan-M2 Plus is positioned as a "doctor's version of ChatGPT," enhancing the usability of AI in serious medical scenarios [22][23].
- The model is integrated into the Baixiao app, providing a tool for doctors to counteract the challenges posed by general models like DeepSeek [23][24].
- The company aims to continuously improve the applicability of AI in real clinical settings through open-source initiatives and API offerings [24].
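
To make the PICO-structured retrieval mentioned under Group 3 more concrete, here is a minimal Python sketch. PICO stands for Population, Intervention, Comparison, Outcome. Everything else in the block is an illustrative assumption and not Baichuan's implementation: the names `PicoQuery`, `Evidence`, `EvidenceLayer`, `match_count`, and `retrieve` are hypothetical, the substring matching is a stand-in for a real retriever, and the enum simply lists the six layers in the order the article gives them without implying any trust ranking.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import IntEnum


class EvidenceLayer(IntEnum):
    """Six evidence layers, numbered in the order the article lists them.

    The numbers are labels only; the article does not state a trust ranking.
    """
    ORIGINAL_RESEARCH = 1
    EVIDENCE_REVIEW = 2
    GUIDELINE = 3
    PRACTICAL_KNOWLEDGE = 4
    PUBLIC_HEALTH_EDUCATION = 5
    REGULATORY_INFORMATION = 6


@dataclass(frozen=True)
class PicoQuery:
    """A clinical question split into its four PICO fields."""
    population: str
    intervention: str
    comparison: str
    outcome: str

    def terms(self) -> list[str]:
        # Lower-cased, non-empty PICO fields used as search terms.
        fields = (self.population, self.intervention, self.comparison, self.outcome)
        return [f.lower() for f in fields if f]


@dataclass(frozen=True)
class Evidence:
    """One evidence item tagged with the layer it comes from (hypothetical schema)."""
    title: str
    layer: EvidenceLayer
    text: str


def match_count(query: PicoQuery, doc: Evidence) -> int:
    """Count how many PICO fields appear verbatim in the document text."""
    text = doc.text.lower()
    return sum(1 for term in query.terms() if term in text)


def retrieve(query: PicoQuery, corpus: list[Evidence], min_matches: int = 2) -> list[Evidence]:
    """Keep documents matching at least `min_matches` PICO fields, best match first."""
    hits = [doc for doc in corpus if match_count(query, doc) >= min_matches]
    # sorted() is stable, so ties keep their original corpus order.
    return sorted(hits, key=lambda doc: -match_count(query, doc))


if __name__ == "__main__":
    corpus = [
        Evidence("Trial A", EvidenceLayer.ORIGINAL_RESEARCH,
                 "In adults with type 2 diabetes, metformin versus sulfonylurea: effect on HbA1c."),
        Evidence("Guideline B", EvidenceLayer.GUIDELINE,
                 "Metformin is first-line therapy; monitor HbA1c regularly."),
        Evidence("Leaflet C", EvidenceLayer.PUBLIC_HEALTH_EDUCATION,
                 "Hand-washing advice for the public."),
    ]
    query = PicoQuery("adults with type 2 diabetes", "metformin", "sulfonylurea", "HbA1c")
    for doc in retrieve(query, corpus):
        print(doc.layer.name, doc.title, match_count(query, doc))
```

Splitting the question into four fields lets the retriever require that a document address several parts of the clinical question at once, which is one plausible reading of how structured queries could sharpen retrieval compared with a free-text search.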