[AI Industry Tracking - Overseas] First Agent Browser Fellou CE Released; Microsoft Launches rStar2-Agent, a 14B Math Reasoning Agent

Please read the disclaimer section at the end of this report. Industry Research Center.

| Authors | Contact | Registration No. |
| --- | --- | --- |
| 李嘉琪 (Li Jiaqi, analyst) | 010-83939821, lijiaqi2@gtht.com | S0880524040001 |
| 刘峰 (Liu Feng, research assistant) | 0755-23976068, liufeng6@gtht.com | S0880124060013 |

Summary: tracking of the latest industry trends, with commentary on the latest industry direction.
- AI industry news: ASML invests in Mistral AI; Microsoft signs a $17.4 billion compute deal with Nebius
- AI application news: the first Agent browser, Fellou CE, is released
- AI large-model news: Microsoft launches rStar2-Agent, a 14B math reasoning Agent
- AI tech frontier: NVIDIA releases Rubin CP ...
Z Tech | A Conversation with a Meta FAIR Research Scientist: Dynamic Filtering with Confidence to End Inefficient Inference
Z Potentials· 2025-09-05 02:27
Core Viewpoint
- The article discusses the emergence of the Deep Think with Confidence (DeepConf) method, which enhances the inference efficiency and performance of large language models (LLMs) by dynamically filtering low-quality inference trajectories using internal confidence signals during the inference process [1][5].

Group 1: DeepConf Methodology
- DeepConf addresses the limitations of existing methods by utilizing the model's internal confidence signals to filter out low-quality inference trajectories, thereby improving both inference efficiency and performance [1][10].
- The method can be seamlessly integrated into existing serving frameworks without requiring additional model training or hyperparameter tuning, making it user-friendly for developers [8][10].
- DeepConf operates in both offline and online modes, allowing for flexibility in application depending on the use case [8].

Group 2: Performance Metrics
- In offline mode, DeepConf@512 achieved 99.9% accuracy on the GPT-OSS-120B model, significantly surpassing the traditional majority-vote accuracy of 97.0% [10].
- In online mode, DeepConf can reduce the number of generated tokens by up to 84.7% compared to full parallel inference while simultaneously improving accuracy, effectively balancing performance and efficiency [10].

Group 3: Contributors and Research Background
- Jiawei Zhao, a research scientist at Meta FAIR and a Caltech PhD, focuses on optimization methods for LLMs and deep learning [5][6].
- Yichao Fu, a PhD student at UCSD, specializes in LLM inference optimization and has contributed to multiple research projects aimed at improving LLM scheduling and breaking sequential dependencies in inference [8][10].
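The confidence signal DeepConf relies on comes from token probabilities the model already emits during decoding, so no extra training is needed. A minimal sketch of scoring one finished trace, using the mean log-probability of its generated tokens as a confidence proxy (the paper's actual definition, based on top-k token log-probabilities and sliding windows, is richer than this; the function and variable names here are illustrative, not the authors' API):

```python
import math

def trace_confidence(token_logprobs):
    """Score one reasoning trace by its average token probability.

    token_logprobs: log-probabilities of the tokens the model actually
    generated. Results near 1.0 mean the model decoded confidently;
    results near 0.0 mean it was hedging at many steps.
    """
    mean_logprob = sum(token_logprobs) / len(token_logprobs)
    return math.exp(mean_logprob)

# A sharply peaked trace scores higher than an uncertain one.
sure = trace_confidence([-0.05, -0.10, -0.02])
unsure = trace_confidence([-1.50, -2.00, -0.90])
```

In the offline mode such a score is computed once per completed trace before voting; in the online mode the same kind of quantity is tracked over a sliding window so a trace can be terminated as soon as its confidence collapses.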
More Accurate than GPT-5? AIME25 Soars to 99.9% and Goes Viral, a First for Open-Source Models
36Kr· 2025-08-25 03:50
Core Insights
- DeepConf, developed by Meta AI and UC San Diego, enables large models to monitor confidence levels in real-time during inference, dynamically eliminating low-confidence paths and weighting high-confidence paths for improved accuracy and efficiency [1][8][9].
- At the AIME 2025 competition, DeepConf achieved a remarkable 99.9% accuracy using open-source models without external tools, while reducing token generation by 85% [2][4][19].

Performance Metrics
- DeepConf demonstrated an average accuracy improvement of approximately 10% across various models and datasets [10][19].
- The method significantly reduced token generation, achieving a reduction of up to 85% while maintaining high accuracy [10][21].

Methodology
- The core idea of DeepConf is to filter inference paths based on confidence signals, balancing quality answers with efficiency [8][9].
- DeepConf operates in two modes: offline and online. In offline mode, it evaluates completed inference paths, while in online mode, it monitors confidence in real-time to terminate low-quality paths [14][31].

Voting Mechanism
- DeepConf employs a confidence-weighted majority voting system, where the contribution of each inference path to the final decision is weighted by its confidence level [29][30].
- The method filters out the lowest-confidence paths before voting, ensuring that only high-confidence paths contribute to the final answer [15][30].

Implementation and Compatibility
- DeepConf is compatible with existing models without requiring additional training or hyperparameter tuning, allowing for easy deployment with minimal code [10][21].
- The system can be integrated into vLLM with approximately 50 lines of code, making it accessible for various applications [10].

Research and Development
- The research team, led by Yichao Fu at UC San Diego, focuses on optimizing algorithms and systems for large language models (LLMs) to enhance their reasoning processes [47].
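The filter-then-vote step described above fits in a few lines. Assuming each trace has already been reduced to an (answer, confidence) pair, a hedged sketch of dropping the lowest-confidence traces and then taking a confidence-weighted majority vote (the function name and the `keep_ratio` parameter are illustrative choices, not the paper's interface):

```python
from collections import defaultdict

def deepconf_vote(traces, keep_ratio=0.9):
    """traces: list of (answer, confidence) pairs from parallel sampling.

    Drop the lowest-confidence traces, then let each survivor vote
    with a weight equal to its confidence score.
    """
    ranked = sorted(traces, key=lambda t: t[1], reverse=True)
    kept = ranked[:max(1, int(len(ranked) * keep_ratio))]
    scores = defaultdict(float)
    for answer, conf in kept:
        scores[answer] += conf
    return max(scores, key=scores.get)

# Filtering changes the outcome here: "B" leads on raw count
# (3 votes vs 2), but after the two low-confidence "B" traces are
# dropped, the weighted vote goes to "A".
winner = deepconf_vote(
    [("A", 0.90), ("A", 0.85), ("B", 0.95), ("B", 0.20), ("B", 0.10)],
    keep_ratio=0.6,
)
```

Because the whole mechanism only post-processes sampled traces and their token probabilities, it slots into an existing serving stack (such as vLLM's logprob output) without touching model weights, which is consistent with the ~50-lines-of-integration claim above.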
Tencent Research Institute AI Digest 20250825
Tencent Research Institute· 2025-08-24 16:01
Group 1
- The core viewpoint of the article is the significant advancements in AI technologies and their implications for various companies and industries, highlighting developments from xAI, Meta, OpenAI, and others [1][2][3][4][5][6][7][8][9][10].

Group 2
- xAI has officially open-sourced the Grok-2 model, which features 905 billion parameters and supports a context length of 128k, with Grok-3 expected to be released in six months [1].
- Meta AI and UC San Diego introduced the DeepConf method, achieving a 99.9% accuracy rate for open-source models while reducing token consumption by 85% [2].
- OpenAI's CEO Sam Altman has delegated daily operations to Fidji Simo, focusing on fundraising and supercomputing projects, indicating a dual leadership structure [3].
- The release of DeepSeek's UE8M0 FP8 parameter precision has led to a surge in domestic chip stocks, enhancing bandwidth efficiency and performance [4].
- Meta is collaborating with Midjourney to integrate its AI image and video generation technology into future AI models, aiming to compete with OpenAI's offerings [5].
- Coinbase's CEO mandated all engineers to use AI tools, emphasizing the necessity of AI in operations, which has sparked debate in the developer community [6].
- OpenAI partnered with Retro Biosciences to develop a micro model that enhances cell reprogramming efficiency by 50 times, potentially revolutionizing cell therapy [7].
- a16z's research indicates that AI application generation platforms are moving towards specialization and differentiation, creating a diverse competitive landscape [8].
- Google's AI energy consumption report reveals that a median Gemini prompt consumes 0.24 watt-hours of electricity, equivalent to one second of microwave operation, with a 33-fold reduction in energy consumption over the past year [9][10].
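The microwave comparison in the Google energy item is easy to sanity-check with unit conversion. A back-of-the-envelope sketch (the ~864 W microwave draw is an assumed figure chosen to make the comparison exact; typical household microwaves draw roughly 800-1000 W, a figure not stated in the report):

```python
# 1 watt-hour = 3600 joules
prompt_energy_j = 0.24 * 3600  # median Gemini prompt: 864 J

# Assumed microwave power draw (not from the report): ~864 W puts
# one prompt at about one second of microwave operation.
assumed_microwave_watts = 864
seconds_equivalent = prompt_energy_j / assumed_microwave_watts
```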