ReasonMed
DAMO Academy Launches ReasonMed, a Multi-Agent Framework That Sets a New Paradigm for Medical Reasoning Data Generation
机器之心 (Jiqizhixin) · 2025-11-03 04:04
Core Insights
- The article discusses the development of ReasonMed, a new paradigm for generating high-quality medical reasoning data, addressing the challenges in constructing large-scale medical reasoning datasets [2][3][27].

Data Challenges
- There is a scarcity of high-quality medical reasoning data, with existing datasets being limited in scale and lacking a systematic pipeline for large-scale construction [2].
- Current datasets often rely on a single model for generation, failing to leverage diverse knowledge domains from multiple pre-trained models [2].
- The cost of constructing high-quality medical reasoning datasets is prohibitively high, requiring significant computational and human resources [2].

ReasonMed Framework
- ReasonMed integrates knowledge from four authoritative medical question benchmarks, aggregating approximately 195,000 medical questions across various specialties [3].
- The framework employs multiple proprietary models to collaboratively generate and validate medical reasoning paths, enhancing knowledge coverage and logical consistency [3].
- A multi-agent interaction system is designed to validate and optimize reasoning data across multiple dimensions, balancing quality and cost [3].

Data Generation Process
- The data generation process consists of three main steps: data collection, multi-agent reasoning generation and validation, and layered optimization and refinement [12].
- ReasonMed has successfully generated a dataset of 370,000 high-quality medical reasoning samples, significantly outperforming existing public datasets in quality metrics [13].

Model Performance
- Models trained on the ReasonMed dataset, such as ReasonMed-7B and ReasonMed-14B, have demonstrated superior performance on various authoritative medical question benchmarks, achieving an accuracy of 82.0% on PubMedQA, surpassing larger models like LLaMA3.1-70B [22][21].
- The hybrid training strategy combining reasoning paths and summary answers has proven to be the most effective, achieving a comprehensive accuracy of 69.6% [23].

Cost Efficiency
- The layered optimization mechanism of ReasonMed has reduced data construction costs by over 70%, demonstrating a cost-effective approach to generating complex reasoning chains [25].
- The project illustrates a scalable framework for generating reasoning data that can be applied to other knowledge-intensive fields, such as life sciences and materials science [27].

Community Impact
- ReasonMed has garnered positive feedback from the research community, being recognized as a new paradigm for high-quality reasoning data generation and gaining significant attention on platforms like Hugging Face [30].
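
The generation, validation, and layered refinement steps described under Data Generation Process can be illustrated with a rough Python sketch. This is not the ReasonMed implementation: the function names (`build_samples`, `verify`, `refine`, `regenerate`), the tier labels, and the `easy_min` threshold are all hypothetical, chosen only to show how routing by verifier results could keep expensive processing for the harder questions.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical agent signatures (assumptions, not from the article):
Generator = Callable[[str], str]       # question -> reasoning path
Verifier = Callable[[str, str], bool]  # (question, path) -> path accepted?
Refiner = Callable[[str, str], str]    # (question, path) -> corrected path


@dataclass
class Sample:
    question: str
    reasoning: str
    tier: str  # which refinement tier produced this sample


def build_samples(
    question: str,
    generators: List[Generator],
    verify: Verifier,
    refine: Refiner,
    regenerate: Generator,
    easy_min: int = 2,  # assumed threshold, purely illustrative
    keep: int = 1,      # how many paths to keep per question
) -> List[Sample]:
    # Several models each propose a reasoning path for the same question.
    paths = [g(question) for g in generators]
    # A verifier agent filters the proposed paths.
    good = [p for p in paths if verify(question, p)]

    # Layered refinement: cheap handling when most paths pass,
    # costlier handling only when few or none do.
    if len(good) >= easy_min:
        return [Sample(question, p, "easy") for p in good[:keep]]
    if good:
        return [Sample(question, refine(question, p), "medium") for p in good[:keep]]
    return [Sample(question, regenerate(question), "hard")]


# Usage with stub agents standing in for real model calls.
gens = [lambda q: "A because x", lambda q: "B", lambda q: "A because y"]
verify = lambda q, p: p.startswith("A")
refine = lambda q, p: p + " (refined)"
regenerate = lambda q: "A (regenerated)"

out = build_samples("Which option is correct?", gens, verify, refine, regenerate)
print(out[0].tier)  # easy
```

In a sketch like this, the cost saving comes from the routing itself: the expensive `regenerate` call runs only for questions where no proposed path survives verification.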