Deep Thinking Ratio
Not all tokens are equal: Google proposes true "deep thinking" — a long chain of thought is not the same as deep reasoning
36Kr · 2026-02-25 12:23
Core Insights
- Google's research challenges the long-held belief that longer reasoning chains in large models lead to better inference quality, introducing a new metric called Deep Thinking Ratio (DTR) to assess true cognitive depth rather than mere token count [1][3][9].

Group 1: Research Findings
- The study found a correlation of -0.54 between token length and accuracy across various models, indicating that longer reasoning chains can lead to misdirection and overthinking [3][5].
- DTR measures the proportion of "deep thinking tokens" in a generated sequence; a higher ratio indicates that the output is dominated by core reasoning rather than unnecessary content [8][10].

Group 2: Implementation of DTR
- Google introduced the Think@n strategy, which allows models such as GPT-OSS and DeepSeek-R1 to maintain accuracy while roughly halving computational cost by filtering out low-quality samples early in the reasoning process [2][12].
- In tests, the Think@n strategy achieved 94.7% accuracy for GPT-OSS-120B-medium on the AIME 2025 dataset, surpassing traditional methods while cutting token consumption from 355.6k to 181.9k [12][13].

Group 3: Implications for Model Development
- The findings suggest that model developers should shift focus from merely increasing token length to improving the quality of reasoning, emphasizing deep cognitive processing [1][19].
- The research highlights the potential for significant cost savings and efficiency gains in model inference through the application of DTR and the Think@n strategy [9][12].
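The DTR idea above can be sketched as a simple per-token ratio. This is a minimal illustration, not the paper's implementation: the source does not specify how "deep thinking tokens" are identified, so the `is_deep_token` predicate and the marker-word heuristic below are hypothetical placeholders.

```python
def deep_thinking_ratio(tokens, is_deep_token):
    """Fraction of tokens flagged as deep-thinking.

    `is_deep_token` is a hypothetical per-token classifier; the
    paper's actual criterion for deep-thinking tokens is not
    specified in the summary above.
    """
    if not tokens:
        return 0.0
    deep = sum(1 for t in tokens if is_deep_token(t))
    return deep / len(tokens)

# Toy usage: treat tokens from an assumed set of reasoning
# markers as "deep"; everything else counts as filler.
reasoning_markers = {"therefore", "because", "implies"}
tokens = ["the", "sum", "is", "even", "because", "both", "terms",
          "are", "even", "therefore", "it", "divides", "by", "2"]
dtr = deep_thinking_ratio(tokens, lambda t: t in reasoning_markers)
print(round(dtr, 3))  # 2 deep tokens out of 14 -> 0.143
```

A longer output with the same number of deep-thinking tokens would mechanically lower the ratio, which is the intuition behind rewarding depth rather than length.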
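The Think@n strategy described in Group 2 can be sketched as sample-then-filter: draw n candidate reasoning traces, score each one early, complete only the highest-scoring fraction, and vote over their answers. All interfaces below (`generate_prefix`, `score_prefix`, `complete`, the keep fraction) are assumptions for illustration; the paper's exact scoring and voting rules are not given in the summary.

```python
from collections import Counter

def think_at_n(generate_prefix, score_prefix, complete, n=8, keep=0.5):
    """Hedged sketch of a Think@n-style early-filter strategy.

    Draws n candidate reasoning prefixes, scores each early (e.g.,
    by its deep-thinking ratio), discards the low-scoring portion,
    finishes only the survivors, and majority-votes the answers.
    """
    prefixes = [generate_prefix() for _ in range(n)]
    ranked = sorted(prefixes, key=score_prefix, reverse=True)
    survivors = ranked[: max(1, int(n * keep))]
    answers = [complete(p) for p in survivors]
    winner, _ = Counter(answers).most_common(1)[0]
    return winner

# Toy usage with stubbed candidates: higher early DTR scores
# happen to carry the correct answer in this made-up example.
candidates = iter([
    {"dtr": 0.9, "answer": 42}, {"dtr": 0.8, "answer": 42},
    {"dtr": 0.2, "answer": 7},  {"dtr": 0.1, "answer": 13},
])
result = think_at_n(lambda: next(candidates), lambda p: p["dtr"],
                    lambda p: p["answer"], n=4, keep=0.5)
print(result)  # 42
```

Because the low-scoring half is never completed, only about half the candidate traces consume full-length generation, which is consistent with the roughly halved token cost reported above.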