LLM Inference Optimization
[Funding Update] 趋境科技 completes Pre-A round; investors include 国际国方, Hubble Investment (哈勃投资), and others
Sohu Finance · 2026-02-08 11:21
趋境科技 was founded at the end of 2023 and focuses on LLM inference optimization. Its goal is to lower the cost of using large models so that every team willing to innovate has equal access to this era's top AI productivity. The founding team all come from Tsinghua University and have years of academic and industry experience in AI, computer architecture, and system software.

According to information published by Tianyancha (天眼查) on February 3 and compiled by Securities Star (证券之星), Beijing 趋境科技 Co., Ltd. (北京趋境科技有限责任公司) has completed a Pre-A financing round of undisclosed size. Participating investors include 国际国方, Hubble Investment (哈勃投资), 华控技术转移, and 尚势资本.

The above was compiled by Securities Star from public information and generated by an AI algorithm (filing no. 网信算备310104345710301240019); it does not constitute investment advice. Data source: Tianyancha APP ...
No training required: by optimizing only the decoding strategy, the DTS framework improves LLM reasoning accuracy by 6% and shortens reasoning length by 23%
机器之心· 2025-11-21 02:04
Core Insights
- The article discusses advancements in Large Reasoning Models (LRMs) and introduces DTS (Decoding Tree Sketching), a new inference framework that addresses "overthinking" in these models, which produces longer and often incorrect reasoning paths [2][8][26].

Group 1: Problem Identification
- The "overthinking" problem in reasoning models yields longer reasoning chains that are more prone to errors and self-repetition, decreasing accuracy [8][11].
- Existing methods to mitigate this issue often rely on additional training or aggressive pruning, which can be costly and unstable [8][11].

Group 2: DTS Framework
- DTS employs two key strategies: branching at high-uncertainty tokens and stopping as soon as the first path completes, aiming to approximate the shortest correct reasoning path [2][8][26].
- The framework requires no additional training and no modification of model weights, making it a plug-and-play solution [8][26].

Group 3: Empirical Results
- On AIME2024/2025, DTS achieved an average accuracy improvement of 6% and reduced average reasoning length by approximately 23%, along with a 10% decrease in endless-repetition rates [4][20].
- The empirical findings indicate a significant negative correlation between reasoning-chain length and accuracy: shorter reasoning chains often yield higher correctness rates [9][11].

Group 4: Methodology
- The reasoning process is conceptualized as a decoding tree, where nodes represent generated tokens and paths represent complete chains of thought (CoT) [12][13].
- DTS branches only at "key tokens" where uncertainty is high, thereby avoiding unnecessary growth of the decoding tree [15][16].

Group 5: Conclusion and Future Directions
- DTS provides a lightweight optimization route for reasoning models, allowing them to "think less but more accurately" [26][27].
- The approach is expected to integrate with multi-step reasoning, calibration, and uncertainty estimation, paving the way for more efficient and reliable reasoning in LRMs [27].
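The branch-at-key-tokens and stop-at-first-completion ideas described above can be sketched in a toy form. Everything here is illustrative: the hand-built `TOY_LM` table stands in for a real language-model head, and the threshold and top-k values are arbitrary assumptions, not the authors' published settings.

```python
import heapq
import math

# Toy next-token distributions standing in for a real LM head.
# Keys are token prefixes (tuples); values map token -> probability.
TOY_LM = {
    (): {"A": 0.5, "B": 0.5},          # high uncertainty -> a "key token": branch
    ("A",): {"<eos>": 0.9, "A": 0.1},  # low uncertainty -> follow greedy token
    ("B",): {"B": 0.9, "<eos>": 0.1},
    ("B", "B"): {"<eos>": 1.0},
}

def entropy(dist):
    """Shannon entropy (in nats) of a next-token distribution."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0)

def dts_decode(lm, entropy_threshold=0.5, top_k=2, max_len=8):
    """DTS-style sketch: branch only where next-token entropy is high,
    expand paths best-first by cumulative log-probability, and return
    the FIRST path that reaches <eos> (early stopping)."""
    frontier = [(0.0, ())]  # min-heap of (negative log-prob, path)
    while frontier:
        neg_logp, path = heapq.heappop(frontier)
        if path and path[-1] == "<eos>":
            return path  # early stop: first completed chain wins
        if len(path) >= max_len:
            continue
        dist = lm.get(path)
        if not dist:
            continue
        if entropy(dist) > entropy_threshold:
            # Key token: branch on the top-k candidates.
            candidates = sorted(dist, key=dist.get, reverse=True)[:top_k]
        else:
            # Confident step: follow only the greedy token.
            candidates = [max(dist, key=dist.get)]
        for tok in candidates:
            heapq.heappush(
                frontier, (neg_logp - math.log(dist[tok]), path + (tok,))
            )
    return None

print(dts_decode(TOY_LM))  # -> ('A', '<eos>')
```

In this toy run the tree branches once at the root (the only high-entropy step), and the shorter completed chain `("A", "<eos>")` is returned before the longer `("B", "B", "<eos>")` chain finishes, mirroring the "shortest correct path" intuition.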
NVIDIA helps you save money: making LLM reasoning "short but precise," with a 5x speedup
机器之心· 2025-11-04 04:22
Core Insights
- The article discusses the challenges and advancements in reasoning models, particularly the trade-off between reasoning length and accuracy [2][3]
- It highlights the introduction of DLER, a new reinforcement learning method that significantly reduces reasoning length while maintaining accuracy [7][10]

Group 1: DLER Methodology
- DLER addresses the issues arising from length penalties in reinforcement learning training, proposing a simple yet effective training recipe [7]
- The DLER model reduces reasoning length by over 70% while keeping accuracy intact, with DLER-Qwen-R1-7B using an average of 3230 tokens to reach 55.6% accuracy on the AIME-24 benchmark [7][10]

Group 2: Key Findings
- DLER is effective not only for small models but also for large ones, introducing magnitude-selective weight merging to mitigate performance drops during fine-tuning [12]
- The research indicates that improving reasoning efficiency depends more on the choice of optimization algorithm than on the complexity of the penalty design [15]

Group 3: Future Implications
- The findings suggest a shift in how reasoning models are approached, emphasizing smarter and more efficient thinking rather than merely extending reasoning chains [14]
- DLER is positioned as a key technology for the practical deployment of reasoning models, enhancing their speed and utility [14]
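The article names "magnitude-selective weight merging" without giving details. One plausible reading, offered purely as an assumption rather than DLER's published recipe, is to keep only the largest-magnitude fraction of the fine-tuning deltas and fold them back into the base weights, discarding small, likely-noisy updates:

```python
import numpy as np

def magnitude_selective_merge(base, tuned, keep_ratio=0.3):
    """Hypothetical sketch of magnitude-selective weight merging:
    apply only the top `keep_ratio` fraction of fine-tuning deltas
    (by absolute magnitude) to the base weights."""
    delta = tuned - base
    k = max(1, int(keep_ratio * delta.size))
    # Threshold at the k-th largest |delta|; drop everything smaller.
    thresh = np.partition(np.abs(delta).ravel(), -k)[-k]
    mask = np.abs(delta) >= thresh
    return base + delta * mask

# Toy example: a 2x3 "weight matrix" before and after fine-tuning.
base = np.zeros((2, 3))
tuned = np.array([[0.9, 0.01, -0.02],
                  [0.03, -1.2, 0.05]])
merged = magnitude_selective_merge(base, tuned, keep_ratio=0.34)
print(merged)  # only the two largest deltas (0.9 and -1.2) survive
```

The design intuition, consistent with the article's framing, is that after aggressive length-penalized RL fine-tuning, most small weight changes contribute noise while a few large ones carry the behavioral change; merging selectively can recover base-model quality that full fine-tuning degrades.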
Core Insights - The article discusses the challenges and advancements in reasoning models, particularly focusing on the balance between reasoning length and accuracy [2][3] - It highlights the introduction of DLER, a new reinforcement learning method that significantly reduces reasoning length while maintaining accuracy [7][10] Group 1: DLER Methodology - DLER addresses the issues arising from length penalties in reinforcement learning training, proposing a simple yet effective training recipe [7] - The DLER model achieves a reduction in reasoning length by over 70% while keeping accuracy intact, with DLER-Qwen-R1-7B using an average of 3230 tokens to reach 55.6% accuracy on the AIME-24 benchmark [7][10] Group 2: Key Findings - DLER is effective not only for small models but also for large models, introducing magnitude-selective weight merging to mitigate performance drops during fine-tuning [12] - The research indicates that improving reasoning efficiency relies more on the choice of optimization algorithms rather than the complexity of penalty designs [15] Group 3: Future Implications - The findings suggest a shift in the approach to reasoning models, emphasizing smarter and more efficient thinking rather than merely extending reasoning chains [14] - DLER is positioned as a critical technology for the practical deployment of reasoning models, enhancing their speed and utility [14]