MedBench 4.0
WeDoctor's Medical Large Model Leads the MedBench 4.0 Overall Leaderboard
Huan Qiu Wang · 2026-01-13 04:31
Core Insights
- MedBench has released its latest evaluation results, highlighting the performance of medical AI models, with WeDoctor's medical model leading the rankings [1][3]

Group 1: Evaluation Results
- WeDoctor's medical model achieved a comprehensive score of 60.8, ranking first among various models evaluated [2]
- The evaluation included models from other companies, such as UniGPT-Med-VL and OpenAI's GPT-5, which scored 59.6 and 53.7 respectively [2]

Group 2: MedBench 4.0 Features
- MedBench 4.0 focuses on "practical evaluation breakthroughs" and "ecological open co-construction," covering three main technical paradigms: multi-modal models, large language models, and intelligent agents [2]
- The platform aligns with national guidelines and includes 60 self-constructed evaluation sets, with over 700,000 professional evaluation questions to assess models in various medical scenarios [2]

Group 3: WeDoctor's Model Capabilities
- WeDoctor's model excelled in multi-modal capabilities, particularly in clinical core scenarios such as medical imaging and report analysis, filling a technical gap in Chinese medical multi-modal evaluation [3]
- The model ranked in the top three for evaluations of large language models and intelligent agents, showcasing its leading position in medical AI development [3]

Group 4: Real-World Application
- WeDoctor's model is closely integrated with real-world medical processes, ensuring that its development aligns with clinical needs and standards [4]
- The model's capabilities are applied across various services in WeDoctor's AI hospital, creating a closed-loop conversion from technical capability to commercial value [4]

Group 5: Future Directions
- WeDoctor aims to deepen its applications in medical AI, leveraging its validated model to build a more intelligent and accessible healthcare ecosystem [4]