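Doukou Gynecology Large Model Achieves Another Breakthrough: DingTalk Industry Training Platform + Finely Annotated Data SFT Raises Accuracy from 77.1% to 90.2%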
Tai Mei Ti APP · 2025-07-10 07:49

Core Insights
- The article discusses the limitations of general-purpose large language models in clinical scenarios, particularly in producing accurate medical diagnoses, and argues for specialized training methods such as supervised fine-tuning (SFT) [1][2][3]
- The accuracy of the Doukou Gynecology model rose from an initial 77.1% to 90.2% through targeted SFT [1][3]

Data Quality Control
- The training dataset went through a rigorous selection process: systematic data cleaning, checks that each sample's reasoning is consistent with its result, and verification of the data's logical integrity [2][5]
- Low-quality data, such as samples with clear medical inconsistencies, was excluded [2]

Model Training Phases
- The first phase built a foundational SFT model from 1,300 meticulously labeled gynecological consultation records, reaching an initial accuracy of 77.1% [3]
- The second phase synthesized symptom data and refined the model, reaching a final diagnostic accuracy of 90.2% on six major gynecological symptoms [3][6]

Iterative Optimization
- Optimization was iterative: high-quality samples scoring above 8 were added to the training set for further SFT, forming a cycle of training, evaluation, and retraining [10][18]
- Key performance indicators were monitored throughout the process to ensure the model improved across the board [10]

Evaluation System
- A dual evaluation system combined automated assessment with manual review by medical experts to verify diagnostic accuracy [11][13]
- The automated evaluation used a high-performance language model to score outputs against a structured framework [11]

Challenges and Lessons Learned
- Early reliance on manual labeling slowed data accumulation and raised costs, prompting a shift to a more efficient "machine distillation → expert review → post-training evaluation" pipeline [14][15]
- The model's ability to recognize rare diseases was improved through balanced sampling strategies [15]

Future Directions
- The company plans to explore a collaborative training paradigm that combines SFT with reinforcement learning (RL) to strengthen clinical reasoning [18]
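
As an illustration of the iterative optimization step described above, where samples scoring above 8 are fed back into the training set, the selection logic might look roughly like the following Python sketch. This is not the team's actual code: the `ScoredSample` type, the function names, the 0-10 score scale, and the de-duplication by prompt and response are all assumptions made for the example. Only the "above 8" threshold comes from the article.

```python
from dataclasses import dataclass

# Threshold taken from the article: samples scoring above 8 are kept.
SCORE_THRESHOLD = 8.0


@dataclass(frozen=True)
class ScoredSample:
    """Hypothetical record for one evaluated model output."""
    prompt: str
    response: str
    score: float  # assumed 0-10 score from the automated evaluator


def select_high_quality(candidates, threshold=SCORE_THRESHOLD):
    """Return the candidates whose score is strictly above the threshold."""
    return [s for s in candidates if s.score > threshold]


def extend_training_set(training_set, candidates, threshold=SCORE_THRESHOLD):
    """Return a new list: the old training set plus new high-scoring samples.

    De-duplication on (prompt, response) is an assumption of this sketch,
    added so that repeated rounds do not re-add the same sample.
    """
    seen = {(s.prompt, s.response) for s in training_set}
    merged = list(training_set)
    for sample in select_high_quality(candidates, threshold):
        key = (sample.prompt, sample.response)
        if key not in seen:
            seen.add(key)
            merged.append(sample)
    return merged
```

In a full loop, each round would fine-tune on the current training set, score fresh outputs, and call `extend_training_set` before the next round. The fine-tuning and scoring steps are omitted here because the article does not describe their interfaces.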