ICML 2025 | A New Paradigm for Deep Thinking in Large Models: Alternating "Reasoning-Erasure" Solves All Computable Problems
机器之心 · 2025-05-15 06:04
Core Viewpoint
- The article introduces PENCIL, a deep-thinking paradigm that alternates between generation and erasure to solve complex reasoning tasks efficiently, outperforming traditional Chain-of-Thought (CoT) methods [1][3].

Group 1: PENCIL Paradigm
- PENCIL dynamically erases intermediate results that are no longer needed during reasoning, so the final answer is generated more efficiently [3][6].
- The paradigm addresses three limitations of traditional CoT: exceeding the context window, difficulty retrieving key information, and slower generation as the context grows [5][10].

Group 2: Mechanism and Design
- The erasure mechanism is inspired by logical rewriting rules and by stack frame memory management in functional programming, and it uses special tokens to control the process [8][9].
- PENCIL supports several reasoning modes, which lets it simplify complex thought processes and backtrack efficiently during problem-solving [10][13].

Group 3: Training and Experimental Results
- PENCIL solves larger-scale reasoning problems more accurately than CoT and keeps its accuracy high as problem size increases [15][21].
- Training is more efficient because each token needs a shorter context, which saves significant computational resources [12][17].

Group 4: Theoretical Implications
- In theory, PENCIL can simulate any Turing machine with optimal time and space complexity, which makes it capable of efficiently solving all computable problems [23][24].
- PENCIL keeps its context length polynomial in the problem size, whereas traditional CoT requires an exponential context length [25][28].
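
The erasure mechanism described under Group 2 can be illustrated with a small sketch. The summary only says that "special tokens" manage the process, so the token names `[CALL]`, `[SEP]`, and `[RETURN]`, the helper names `reduce_once` and `run`, and the exact rewriting rule below are assumptions made for illustration rather than the authors' implementation. The assumed rule is stack-frame-like: when a `[RETURN]` is generated, everything from the matching `[CALL]` up to the `[SEP]` (the intermediate thoughts) is erased and only the answer is kept.

```python
# Illustrative sketch only: token names and the rule are assumptions,
# not taken from the summarized article.
CALL, SEP, RETURN = "[CALL]", "[SEP]", "[RETURN]"


def reduce_once(tokens):
    """Apply the assumed rule  C [CALL] T [SEP] A [RETURN]  ->  C A.

    T (intermediate thoughts) is erased; A (the answer) is kept.
    If the context does not end with RETURN, or is malformed,
    it is returned unchanged.
    """
    if not tokens or tokens[-1] != RETURN:
        return list(tokens)
    body = tokens[:-1]  # drop the trailing RETURN
    try:
        # Most recent SEP, then the most recent CALL before it.
        sep = max(i for i, t in enumerate(body) if t == SEP)
        call = max(i for i in range(sep) if body[i] == CALL)
    except ValueError:  # no matching SEP or CALL found
        return list(tokens)
    return body[:call] + body[sep + 1:]


def run(stream):
    """Feed tokens one at a time, erasing after each RETURN.

    Returns the final context and the peak context length observed.
    """
    context, peak = [], 0
    for tok in stream:
        context.append(tok)
        peak = max(peak, len(context))
        context = reduce_once(context)
    return context, peak


if __name__ == "__main__":
    # A scripted token stream standing in for model output.
    stream = ["Q", CALL, "t1", "t2", "t3", SEP, "a", RETURN,
              CALL, "u1", "u2", SEP, "b", RETURN]
    final_context, peak = run(stream)
    print(final_context, peak, len(stream))
```

In this toy run the stream has 14 tokens, but the context never holds more than 8 at once, because each finished sub-computation is collapsed to its answer. This is the intuition behind the claim that erasure keeps the context short while the total amount of generated reasoning grows.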