7M Parameters Beat DeepSeek R1 and Others: Samsung Researcher's Solo Paper Goes Viral, Using Recursion to Upend Large-Model Reasoning
机器之心·2025-10-09 04:43

Core Viewpoint
- The article discusses the emergence of new models in AI reasoning, particularly the Hierarchical Reasoning Model (HRM) and the Tiny Recursive Model (TRM), highlighting their efficiency and performance on complex reasoning tasks despite having far fewer parameters than traditional large models [1][4][29].

Group 1: Hierarchical Reasoning Model (HRM)
- HRM, proposed by researchers from Sapient Intelligence, uses a hierarchical reasoning structure with 27 million parameters and achieves remarkable performance with only 1,000 training samples [1].
- The model's architecture is based on a two-network design (a high-level and a low-level module), which raises the parameter count relative to conventional single-network supervised learning [12].
- HRM is benchmarked across a range of tasks, showing strong accuracy on Sudoku-Extreme and Maze-Hard [25][29].

Group 2: Tiny Recursive Model (TRM)
- TRM, introduced by a researcher at the Samsung Advanced Institute of Technology (SAIT), contains only 7 million parameters yet outperforms much larger models such as o3-mini and Gemini 2.5 Pro on challenging reasoning tasks [4][29].
- The model operates through a recursive reasoning process, iterating up to 16 times to refine its answers, illustrating the principle of "less is more" [6][9].
- TRM's experimental results show superior accuracy on Sudoku-Extreme (87.4%) and competitive performance on other benchmarks relative to HRM [27][29].

Group 3: Experimental Results and Comparisons
- The article compares accuracy between HRM and TRM across several datasets, showing that TRM reaches higher accuracy with fewer parameters [23][29].
- On the ARC-AGI benchmarks, the TRM-Att and TRM-MLP variants outperform HRM, underscoring the advantages of parameter efficiency and generalization [26][29].
- The findings suggest that reducing model complexity while increasing recursive iterations can lead to improved performance, challenging traditional assumptions about model depth and parameter size [15][17].
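The recursive loop described above (depth from repeated iteration rather than from stacked layers or parameters) can be sketched as follows. This is a schematic illustration only, not the author's code: the network (a tanh layer standing in for TRM's single tiny module), the dimensions, and the inner/outer iteration counts are all hypothetical; only the overall pattern — one small shared network repeatedly refining a latent state and an answer for up to 16 steps — comes from the article.

```python
import numpy as np

rng = np.random.default_rng(0)

D = 16  # illustrative embedding width, not from the paper
# One tiny shared weight matrix stands in for TRM's single small network.
W = rng.normal(scale=0.1, size=(3 * D, D))

def tiny_net(x, y, z):
    """One pass of the shared module: mixes the input x, the current
    answer y, and the latent scratchpad z into an update."""
    return np.tanh(np.concatenate([x, y, z]) @ W)

def improve_step(x, y, z, n_latent=6):
    """One refinement step: update the latent state z several times with
    the same network, then use it to revise the answer y."""
    for _ in range(n_latent):
        z = tiny_net(x, y, z)   # "think": refine the reasoning state
    y = tiny_net(x, y, z)       # "act": propose an improved answer
    return y, z

def recursive_infer(x, max_steps=16):
    """Full recursion: repeat the improvement step up to 16 times,
    reusing the same parameters throughout, so effective depth comes
    from iteration count rather than model size."""
    y = np.zeros(D)
    z = np.zeros(D)
    for _ in range(max_steps):
        y, z = improve_step(x, y, z)
    return y

x = rng.normal(size=D)
y = recursive_infer(x)
print(y.shape)  # prints (16,)
```

The design point this illustrates is the one the findings emphasize: the parameter budget stays tiny (a single small module), while capability is scaled by increasing the number of recursive refinement passes over the same answer.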