Deep Think with Confidence (DeepConf)

Z Tech | September 9 online conversation with a Meta FAIR research scientist: dynamic filtering with confidence to end inefficient reasoning
Z Potentials · 2025-09-06 04:40
Core Viewpoint
- The article discusses the emergence of the Deep Think with Confidence (DeepConf) method, which improves the efficiency and performance of large language models (LLMs) by using the model's internal confidence signals to dynamically filter out low-quality inference trajectories during reasoning [1][10].

Group 1: DeepConf Methodology
- DeepConf addresses the limitations of existing inference methods by using confidence signals from the model itself to filter out low-quality trajectories, improving both inference efficiency and performance [1][10].
- The method can be integrated into existing serving frameworks without additional model training or hyperparameter tuning [8][10].

Group 2: Performance Metrics
- In offline mode, DeepConf@512 achieved 99.9% accuracy with the GPT-OSS-120B model, compared with 97.0% for traditional majority voting [10].
- In online mode, DeepConf can reduce the number of generated tokens by up to 84.7% compared with full parallel inference while also improving accuracy, balancing performance and efficiency [10].

Group 3: Contributors and Research Background
- Jiawei Zhao, a research scientist at Meta FAIR, holds a PhD from Caltech and focuses on optimization methods for LLMs and deep learning [5][6].
- Yichao Fu, a PhD student at UCSD, specializes in LLM inference optimization and has contributed to research on efficient scheduling and on breaking sequential dependencies in LLM inference [8][10].
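
The confidence-based filtering summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the article does not say how confidence is computed or how many trajectories are kept, so the per-trajectory scalar score, the `keep_fraction` parameter, the sliding `window`, and the function names `filter_and_vote` and `should_stop_early` are all assumptions made for illustration. The first function mimics an offline setting (rank finished trajectories by confidence, keep the top share, then take a confidence-weighted vote); the second mimics an online setting (stop a trajectory early once its recent confidence falls below a threshold).

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def filter_and_vote(traces: List[Tuple[str, float]], keep_fraction: float = 0.5) -> str:
    """Offline sketch: keep the most confident traces, then vote.

    Each trace is (final_answer, confidence). The confidence is assumed to be
    a positive scalar already derived from the model's token probabilities;
    how it is derived is outside the scope of this sketch.
    """
    if not traces:
        raise ValueError("need at least one trace")
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")

    # Rank traces from most to least confident and keep the top share.
    ranked = sorted(traces, key=lambda t: t[1], reverse=True)
    n_keep = max(1, int(len(ranked) * keep_fraction))
    kept = ranked[:n_keep]

    # Confidence-weighted majority vote over the surviving traces.
    votes: Dict[str, float] = defaultdict(float)
    for answer, confidence in kept:
        votes[answer] += confidence
    return max(votes, key=lambda a: votes[a])


def should_stop_early(token_confidences: Sequence[float],
                      threshold: float,
                      window: int = 4) -> bool:
    """Online sketch: stop a trace when recent confidence is too low.

    Returns True when the mean confidence over the last `window` tokens
    drops below `threshold` (both values are hypothetical knobs).
    """
    if len(token_confidences) < window:
        return False
    recent = token_confidences[-window:]
    return sum(recent) / window < threshold


if __name__ == "__main__":
    # Toy data: a plain majority vote would pick "17" (3 votes against 2),
    # but the two most confident traces both answer "42".
    toy_traces = [("42", 0.9), ("42", 0.8), ("17", 0.3), ("17", 0.2), ("17", 0.25)]
    print(filter_and_vote(toy_traces, keep_fraction=0.5))
    print(should_stop_early([0.9, 0.8, 0.2, 0.1, 0.1, 0.1], threshold=0.3))
```

In the toy data, filtering changes the outcome because the low-confidence traces that would win a plain majority vote are discarded before voting. The early-stop check is the part that would save tokens in an online setting, since a trajectory flagged by it would not be generated to completion.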