More Non-Consensus: Can Test-time Scaling Keep Working Miracles Through Brute Force?
机器之心 · 2025-12-07 01:30
Group 1
- The article discusses the industry's ongoing debate over Test-time Scaling (TTS) and its effectiveness in improving the performance of large language models (LLMs) [6][7].
- TTS has drawn significant attention since Q3 2024 as a key paradigm for boosting LLM performance by dynamically allocating additional compute during the inference phase [7][8].
- Research groups including Google and UC Berkeley have studied how increasing test-time compute enhances LLM capabilities, shifting attention toward the inference process [8][9].

Group 2
- The article reviews TTS methods along four dimensions: "What to scale," "How to scale," "Where to scale," and "How well to scale" [8][10].
- "What to scale" concerns the object of expansion, such as chain-of-thought (CoT) length, sample count, path depth, or internal states [9].
- "How to scale" concerns the expansion method, including prompting, search, reinforcement learning (RL), and mixture-of-models approaches [10].

Group 3
- Over the past year the industry has developed a deeper understanding of TTS mechanisms and implementations, though significant disagreement remains over how to improve them [12].
- Research from Fudan University finds that the popular "Sequential" approach of extending CoT does not consistently improve accuracy, and proposes a "Parallel" method as a potential improvement [12][13].
- The "Parallel" method has the model reason along multiple inference paths in parallel and then aggregates those paths to derive the final answer, broadening the breadth of thought [13].

Group 4
- As the industry continues to explore TTS, previously unrecognized limitations of certain approaches are being confirmed [14].
- There is a growing shift toward External (parallel, hybrid, etc.) TTS methods as Internal (Sequential) approaches near their limits [14].
- The future of TTS may lie not in more raw compute but in smarter search techniques, indicating a shift in focus [14][15].
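The "Parallel" aggregation idea described in Group 3 can be sketched in a few lines. This is a minimal illustration in the style of self-consistency voting, not the Fudan team's actual implementation: `sample_answer` is a hypothetical callable standing in for one stochastic reasoning path of an LLM, and majority voting is assumed as the aggregation rule.

```python
from collections import Counter
import random

def parallel_scale(sample_answer, prompt, n_paths=8):
    """Run n_paths independent reasoning paths in parallel (here,
    sequentially for simplicity) and aggregate their final answers
    by majority vote. `sample_answer` is a hypothetical stand-in
    for one stochastic LLM reasoning path."""
    answers = [sample_answer(prompt) for _ in range(n_paths)]
    # Aggregate the paths: the most frequent final answer wins.
    winner, _count = Counter(answers).most_common(1)[0]
    return winner

# Toy demo: a stub "model" whose paths usually agree on 42.
random.seed(0)
def stub_model(prompt):
    return 42 if random.random() < 0.7 else 41

print(parallel_scale(stub_model, "What is 6*7?"))  # prints 42
```

The point of the sketch is that accuracy comes from breadth (many cheap paths plus aggregation) rather than from making any single chain of thought longer, which is the contrast the article draws between Parallel and Sequential scaling.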