Hierarchical Reasoning Model (HRM)
"An astonishing shift": U.S. media reports Tsinghua's AI patent count exceeds the combined total of four top U.S. universities, including Harvard and MIT
Xin Lang Cai Jing· 2025-11-19 09:23
Core Insights
- China's artificial intelligence (AI) technology is rapidly advancing, narrowing the gap with the United States, as evidenced by Tsinghua University leading in the number of highly cited AI papers and surpassing top U.S. universities in patent approvals [1][2]

Group 1: Academic Achievements
- Tsinghua University has accumulated 4,986 AI and machine learning-related patents from 2005 to the end of 2024, with over 900 new patents added in the last year [1]
- Globally, China holds over half of the effective patent families in the AI field [1]
- Tsinghua's engineering, AI, and computer science programs consistently rank among the top globally [4]

Group 2: Competitive Landscape
- Despite China's advancements, the U.S. still leads in influential patents and high-performance models, with 40 notable AI models developed by U.S. institutions compared to 15 from China [1]
- The proportion of top global AI researchers from China increased from 10% in 2019 to 26% in 2022, while the U.S. share decreased from 35% to 28% [2]

Group 3: Innovation and Startups
- The success of Chinese AI startups such as DeepSeek demonstrates the capability of Chinese teams to compete in the large language model space [5]
- Tsinghua's Brain and Intelligence Laboratory fosters interdisciplinary education, leading to innovative student projects such as the Hierarchical Reasoning Model (HRM) [5]

Group 4: Government and Institutional Support
- The Chinese government is providing substantial support for AI research through tax incentives, funding subsidies, and policies that encourage innovation [4]
- Tsinghua University is integrating AI technology across all disciplines, making AI research a common activity among students [7]
- The establishment of new AI computing platforms at Tsinghua aims to facilitate research across various fields by providing free computational resources to students [7]
"An astonishing shift! Tsinghua surpasses the combined total of four top U.S. universities"
Guan Cha Zhe Wang· 2025-11-19 07:51
Core Insights
- China's artificial intelligence (AI) technology is rapidly advancing, closing the gap with the United States, as evidenced by Tsinghua University's leading position in global AI research and patent filings [1][2][4]

Group 1: Research and Development
- Tsinghua University has published the highest number of AI papers among global universities, with 4,986 AI-related patents granted from 2005 to the end of 2024, including over 900 new patents in the last year [1][4]
- Despite China's advancements, the U.S. still holds the most influential patents and superior AI models, with 40 notable AI models developed by U.S. institutions compared to 15 from China [1][2]

Group 2: Talent and Innovation
- The proportion of top global AI researchers from China increased from 10% to 26% between 2019 and 2022, while the U.S. share decreased from 35% to 28% [2]
- Tsinghua University is fostering a collaborative environment for AI innovation, with several startups founded by its graduates, such as DeepSeek, which has developed a competitive large language model [5][6]

Group 3: Educational Initiatives
- Tsinghua University is integrating AI technology across various disciplines, providing subsidies for students to access new AI computing platforms for research [6][7]
- The university's Brain and Intelligence Laboratory is producing innovative AI models, such as the Hierarchical Reasoning Model (HRM), which outperforms larger models from U.S. companies on specific tasks [5][6]
Beating DeepSeek R1 and others with 7 million parameters: a viral solo-authored Samsung paper uses recursion to upend large-model reasoning
Ji Qi Zhi Xin· 2025-10-09 04:43
Core Viewpoint
- The article discusses the emergence of new models in AI reasoning, particularly the Hierarchical Reasoning Model (HRM) and the Tiny Recursive Model (TRM), highlighting their efficiency and performance on complex reasoning tasks despite having significantly fewer parameters than traditional large models [1][4][29]

Group 1: Hierarchical Reasoning Model (HRM)
- HRM, proposed by researchers from Sapient Intelligence, uses a hierarchical reasoning structure with 27 million parameters, achieving remarkable performance with only 1,000 training samples [1]
- The model's architecture is based on a two-network design, which increases the parameter count compared to conventional single-network supervised learning [12]
- HRM's performance is benchmarked across various tasks, showing strong accuracy on Sudoku-Extreme and Maze-Hard [25][29]

Group 2: Tiny Recursive Model (TRM)
- TRM, introduced by researchers from the Samsung Advanced Institute of Technology, contains only 7 million parameters and outperforms larger models such as o3-mini and Gemini 2.5 Pro on challenging reasoning tasks [4][29]
- The model operates through a recursive reasoning process, iterating up to 16 times to refine its answers, demonstrating the principle of "less is more" [6][9]
- TRM's experimental results show superior accuracy on Sudoku-Extreme (87.4%) and competitive performance on other benchmarks compared to HRM [27][29]

Group 3: Experimental Results and Comparisons
- The article compares accuracy rates between HRM and TRM across various datasets, showing TRM achieving higher accuracy with fewer parameters [23][29]
- On the ARC-AGI benchmarks, the TRM-Att and TRM-MLP variants outperform HRM, underscoring the advantages of parameter efficiency and generalization [26][29]
- The findings suggest that reducing model complexity while increasing recursive iterations can improve performance, challenging traditional assumptions about model depth and parameter count [15][17]
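The recursive loop described above, in which one tiny network is re-applied to its own answer up to 16 times, can be sketched in miniature. This is an illustrative toy rather than TRM's implementation; the network size, update rules, and state layout are all placeholder assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8  # toy state size (placeholder, not TRM's real dimensions)

# A single tiny shared network: the same weights are reused at every step.
W1 = rng.normal(0.0, 0.3, (3 * D, D))
W2 = rng.normal(0.0, 0.3, (D, D))

def tiny_net(x, y, z):
    """One application of the shared two-layer net to (question, answer, latent)."""
    h = np.tanh(np.concatenate([x, y, z]) @ W1)
    return np.tanh(h @ W2)

def recursive_refine(x, n_steps=16):
    """Refine an answer by re-applying the same tiny network up to 16 times."""
    y = np.zeros(D)  # current answer guess
    z = np.zeros(D)  # latent reasoning state (a "scratchpad")
    for _ in range(n_steps):
        z = tiny_net(x, y, z)  # think: revise the latent scratchpad
        y = tiny_net(x, y, z)  # act: refine the current answer
    return y

answer = recursive_refine(rng.normal(size=D))
```

The point of the design is that depth comes from reuse: a 7M-parameter network unrolled 16 times behaves like a much deeper computation without the extra parameters.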
With only 27 million parameters, this reasoning model surpasses DeepSeek and Claude
Ji Qi Zhi Xin· 2025-06-30 10:23
Core Insights
- The article discusses the need for transformation in the architecture of large language models (LLMs), focusing on the limitations of current chain-of-thought (CoT) techniques, which face challenges such as task complexity, high data requirements, and latency [2][4]

Group 1: Hierarchical Reasoning Model (HRM)
- The Hierarchical Reasoning Model (HRM) is introduced as a novel recurrent architecture inspired by the human brain's layered, multi-timescale processing, achieving high computational depth while maintaining training stability and efficiency [3][6]
- HRM operates through two interdependent recurrent modules: a high-level module for slow, abstract planning and a low-level module for fast, detailed computation, achieving remarkable performance on complex reasoning tasks with only 27 million parameters and 1,000 training samples [4][5]
- HRM requires no pre-training or CoT data, yet performs nearly perfectly on challenging tasks such as complex Sudoku puzzles and optimal pathfinding in large mazes, outperforming larger models with longer context windows [5][6]

Group 2: Design and Mechanisms
- The core design of HRM is based on hierarchical processing and timescale separation, where high-level brain regions integrate information over longer time scales while low-level regions handle immediate sensory information [12][13]
- HRM incorporates feedback loops similar to the brain's dense recurrent connections, enhancing representation accuracy and contextual adaptability while avoiding the problems of backpropagation through time (BPTT) [14][19]
- The model introduces approximate gradients and deep supervision, allowing efficient memory usage and improved training dynamics, in contrast to traditional methods that require extensive memory and time [20][23]
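The two-timescale design described above, with a slow high-level planner and a fast low-level worker, can be sketched in a few lines. This is an illustrative toy, not the paper's architecture; the dimensions, step counts, and update rules are assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
D = 8        # toy state size (placeholder)
T_LOW = 4    # fast low-level steps per slow high-level update (assumed ratio)
T_HIGH = 3   # number of high-level planning cycles (assumed)

W_low = rng.normal(0.0, 0.3, (3 * D, D))   # low-level sees input, own state, plan
W_high = rng.normal(0.0, 0.3, (2 * D, D))  # high-level sees own state, low-level result

def hrm_forward(x):
    """Two coupled recurrent modules running on separate timescales."""
    z_high = np.zeros(D)  # slow, abstract planning state
    z_low = np.zeros(D)   # fast, detailed computation state
    for _ in range(T_HIGH):
        # The low-level module takes several fast steps under a fixed plan...
        for _ in range(T_LOW):
            z_low = np.tanh(np.concatenate([x, z_low, z_high]) @ W_low)
        # ...then the high-level module revises the plan once.
        z_high = np.tanh(np.concatenate([z_high, z_low]) @ W_high)
    return z_high

plan = hrm_forward(rng.normal(size=D))
```

The nesting is what produces the "hierarchical convergence" the article describes: the inner state settles toward a local answer under each plan, and the outer state moves only once per inner cycle.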
Group 3: Performance and Adaptability
- HRM demonstrates hierarchical convergence: the high-level module stabilizes while the low-level module repeatedly converges, leading to rapid convergence and minimal residuals compared to standard deep neural networks [17][36]
- The model features adaptive computation time (ACT), dynamically adjusting computational resources to task complexity and optimizing performance without significant resource expenditure [25][27]
- HRM can seamlessly scale up inference-time computation by adjusting parameters, without retraining or architectural changes, showcasing its flexibility on complex reasoning tasks [28][36]

Group 4: Experimental Results
- Experimental results indicate that HRM excels at complex reasoning tasks, raising questions about the underlying reasoning algorithms it employs, which matters for model interpretability [31][39]
- Visualizations of HRM's reasoning process reveal its strategies on maze and Sudoku tasks, showing a combination of exploration and optimization that resembles depth-first search [31][38]
- HRM's hierarchical structure emerges as a natural characteristic of learning complex reasoning tasks rather than being an inherent property of the model architecture [34]
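The adaptive computation time idea mentioned above amounts to a halting loop: keep iterating until a halting signal fires, so easy inputs use few steps and hard inputs get more. HRM learns its halting policy; the fixed threshold and toy step function below are assumptions made for illustration:

```python
def solve_with_act(step_fn, halt_fn, state, max_steps=16):
    """Generic adaptive-computation-time sketch: iterate a refinement step
    until a halting signal crosses a threshold, up to a step budget."""
    steps = 0
    for steps in range(1, max_steps + 1):
        state = step_fn(state)
        if halt_fn(state) > 0.5:  # assumed fixed halting threshold
            break
    return state, steps

# Toy usage: "reasoning" halves an error term; halt once it drops below 0.1.
# An error of 3.2 needs 6 halvings, so this run stops after 6 of 16 steps.
state, steps = solve_with_act(lambda s: s / 2, lambda s: float(s < 0.1), 3.2)
```

A smaller starting error would halt after fewer steps, which is the point: compute spent scales with input difficulty rather than being fixed in advance.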