Hierarchical Reasoning Model (HRM)
"An astonishing shift," US media reports: Tsinghua's AI patent count exceeds the combined total of four top US universities including Harvard and MIT
Xin Lang Cai Jing· 2025-11-19 09:23
[Text by Ruan Jiaqi, Guancha.cn] China's artificial intelligence technology is iterating at explosive speed, rapidly closing the gap with the United States. According to a Bloomberg report on the 19th, a new analysis shows that Tsinghua University leads all universities worldwide in the number of AI papers ranked among the 100 most-cited globally, and that its granted AI patents exceed the combined total of four top US universities: MIT, Stanford, Princeton, and Harvard.

The report cites data from LexisNexis, the patent-data analytics provider under RELX, showing that from 2005 through the end of 2024 Tsinghua was granted a cumulative 4,986 patents related to artificial intelligence and machine learning, more than 900 of them last year alone. China now accounts for over half of the world's active patent families in the field (a patent family covers all filings for the same core technology across countries and regions).

For now, however, the United States still holds the field's most influential patents and its best-performing models. In patent-impact rankings, for example, Harvard and MIT consistently lead Tsinghua. Stanford's AI Index report likewise shows that US research institutions produced 40 notable AI models in 2024 versus 15 from China, although Chinese institutions are steadily narrowing the gap on some performance benchmarks.

This leaves the United States facing no small challenge in preserving its lead in AI research. The Washington think tank Information Technology and ...
"An astonishing shift! Tsinghua surpasses the combined total of four top US universities"
Guan Cha Zhe Wang· 2025-11-19 07:51
Core Insights
- China's artificial intelligence (AI) technology is rapidly advancing and closing the gap with the United States, as evidenced by Tsinghua University's leading position in global AI research and patent filings [1][2][4]

Group 1: Research and Development
- Tsinghua University leads global universities in highly cited AI papers and was granted 4,986 AI-related patents from 2005 through the end of 2024, including over 900 new patents in the last year alone [1][4]
- Despite China's advances, the U.S. still holds the most influential patents and superior AI models, with 40 notable AI models developed by U.S. institutions in 2024 compared to 15 from China [1][2]

Group 2: Talent and Innovation
- The proportion of top global AI researchers from China increased from 10% to 26% between 2019 and 2022, while the U.S. share decreased from 35% to 28% [2]
- Tsinghua University is fostering a collaborative environment for AI innovation, with several startups founded by its graduates, such as DeepSeek, which has developed a competitive large language model [5][6]

Group 3: Educational Initiatives
- Tsinghua University is integrating AI technology across disciplines and subsidizing student access to new AI computing platforms for research [6][7]
- The university's Brain and Intelligence Laboratory is producing innovative AI models, such as the Hierarchical Reasoning Model (HRM), which outperforms larger models from U.S. companies in specific tasks [5][6]
7 million parameters beats DeepSeek R1 and more: a solo-author Samsung paper goes viral, using recursion to upend large-model reasoning
Ji Qi Zhi Xin· 2025-10-09 04:43
Core Viewpoint
- The article discusses the emergence of new models in AI reasoning, particularly the Hierarchical Reasoning Model (HRM) and the Tiny Recursive Model (TRM), highlighting their efficiency and performance on complex reasoning tasks despite having significantly fewer parameters than traditional large models [1][4][29]

Group 1: Hierarchical Reasoning Model (HRM)
- HRM, proposed by researchers from Sapient Intelligence, uses a hierarchical reasoning structure with 27 million parameters and achieves remarkable performance from only 1,000 training samples [1]
- Its architecture is based on a two-network design, which increases the parameter count relative to conventional single-network supervised learning [12]
- HRM is benchmarked across various tasks, demonstrating strong accuracy on Sudoku-Extreme and Maze-Hard [25][29]

Group 2: Tiny Recursive Model (TRM)
- TRM, introduced by a researcher at the Samsung Advanced Institute of Technology, contains only 7 million parameters yet outperforms much larger models such as o3-mini and Gemini 2.5 Pro on challenging reasoning tasks [4][29]
- The model reasons recursively, iterating up to 16 times to refine its answers, demonstrating the principle of "less is more" [6][9] (see the sketch after this summary)
- TRM's experimental results show superior accuracy on Sudoku-Extreme (87.4%) and competitive performance against HRM on other benchmarks [27][29]

Group 3: Experimental Results and Comparisons
- Accuracy comparisons between HRM and TRM across datasets show TRM achieving higher accuracy with fewer parameters [23][29]
- On the ARC-AGI benchmarks, the TRM-Att and TRM-MLP variants outperform HRM, underscoring the advantages of parameter efficiency and generalization [26][29]
- The findings suggest that reducing model size while increasing recursive iterations can improve performance, challenging traditional assumptions about model depth and parameter count [15][17]
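To make the recursive refinement concrete, here is a minimal PyTorch sketch of a TRM-style loop. It is an illustration under assumptions, not the paper's code: `TinyNet`, the dimensions, and the inner iteration count are hypothetical stand-ins; only the overall pattern (one small network repeatedly refining a latent state `z` and an answer `y`, with up to 16 outer passes) follows the description above.

```python
# Hypothetical TRM-style recursive refinement (a sketch, not the paper's code).
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    """One small network reused at every refinement step."""
    def __init__(self, dim: int):
        super().__init__()
        self.f = nn.Sequential(
            nn.Linear(3 * dim, dim), nn.GELU(), nn.Linear(dim, dim)
        )

    def forward(self, x, y, z):
        # Condition every update on question x, current answer y, latent z.
        return self.f(torch.cat([x, y, z], dim=-1))

def recursive_reason(net, x, y, z, outer=16, inner=6):
    """Outer loop rewrites the answer; inner loop refines the latent state."""
    for _ in range(outer):           # "up to 16 times" per the article
        for _ in range(inner):       # inner count is an assumed free parameter
            z = z + net(x, y, z)     # refine the latent reasoning state
        y = y + net(x, y, z)         # improve the answer from the new state
    return y

# Usage on toy embeddings:
dim = 64
net = TinyNet(dim)
x = torch.randn(2, dim)              # embedded question
y = torch.zeros(2, dim)              # initial answer embedding
z = torch.zeros(2, dim)              # initial latent state
refined = recursive_reason(net, x, y, z)
```

The point the article stresses is visible in the loop itself: effective depth comes from the iteration count rather than the parameter count, which is how a 7-million-parameter network can trade compute for reasoning depth.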
With only 27 million parameters, this reasoning model surpasses DeepSeek and Claude
Ji Qi Zhi Xin· 2025-06-30 10:23
Core Insights
- The article argues that large language model (LLM) architectures need to change, focusing on the limitations of current chain-of-thought (CoT) techniques: brittleness on complex tasks, high data requirements, and latency [2][4]

Group 1: Hierarchical Reasoning Model (HRM)
- HRM is introduced as a novel recurrent architecture inspired by the human brain's hierarchical, multi-timescale processing, achieving high computational depth while maintaining training stability and efficiency [3][6]
- HRM couples two interdependent recurrent modules: a high-level module for slow, abstract planning and a low-level module for fast, detailed computation, achieving remarkable performance on complex reasoning tasks with only 27 million parameters and 1,000 training samples [4][5] (see the first sketch after this summary)
- HRM requires neither pre-training nor CoT data, yet performs nearly perfectly on challenging tasks such as complex Sudoku puzzles and optimal pathfinding in large mazes, outperforming larger models with longer context windows [5][6]

Group 2: Design and Mechanisms
- HRM's core design rests on hierarchical processing and timescale separation: high-level regions integrate information over longer timescales while low-level regions handle immediate sensory information [12][13]
- HRM incorporates feedback loops similar to the brain's dense recurrent connectivity, enhancing representation accuracy and contextual adaptability while avoiding the problems of backpropagation through time (BPTT) [14][19]
- The model introduces approximate (one-step) gradients and deep supervision, keeping memory usage efficient and improving training dynamics, in contrast to traditional methods that require extensive memory and time [20][23] (see the first sketch after this summary)

Group 3: Performance and Adaptability
- HRM demonstrates hierarchical convergence: the high-level module stabilizes slowly while the low-level module repeatedly converges and is reset, yielding rapid overall convergence with minimal residuals compared to plain deep networks [17][36]
- The model features adaptive computation time (ACT), dynamically adjusting computational resources to task complexity and optimizing performance without wasted resources [25][27] (see the second sketch after this summary)
- HRM can extend inference computation simply by adjusting iteration limits, with no retraining or architectural changes, showcasing its flexibility on complex reasoning tasks [28][36]

Group 4: Experimental Results
- Experimental results indicate that HRM excels at complex reasoning tasks, raising the question of which reasoning algorithms it learns internally, which is crucial for model interpretability [31][39]
- Visualizations of HRM's reasoning processes reveal its strategies on maze and Sudoku tasks, showing a combination of exploration and optimization that resembles depth-first search [31][38]
- The hierarchical structure of HRM's representations emerges naturally while learning complex reasoning tasks, rather than being an inherent property of the architecture [34]
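Groups 1 and 2 can be read together as a simple recurrence. Below is a hedged PyTorch sketch of the two-timescale loop plus the one-step (approximate) gradient with deep supervision; the module names, the GRU cells, and the N-cycle/T-step schedule are hypothetical stand-ins, not the released HRM implementation.

```python
# Hypothetical HRM-style two-timescale recurrence with a one-step gradient
# (a sketch under assumptions, not the released implementation).
import torch
import torch.nn as nn

class TwoTimescaleCore(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.low = nn.GRUCell(2 * dim, dim)    # fast, detailed computation
        self.high = nn.GRUCell(dim, dim)       # slow, abstract planning

    def forward(self, x, z_low, z_high, n_cycles=4, t_steps=8):
        for _ in range(n_cycles):              # high-level (slow) cycles
            for _ in range(t_steps):           # low-level (fast) steps
                z_low = self.low(torch.cat([x, z_high], dim=-1), z_low)
            z_high = self.high(z_low, z_high)  # integrate the settled low state
        return z_low, z_high

def supervised_segment(core, head, x, target, z_low, z_high, loss_fn):
    """One deep-supervision segment with an approximate one-step gradient."""
    with torch.no_grad():                      # run the recurrence history-free
        z_low, z_high = core(x, z_low, z_high)
    # Backpropagate through only the final update, avoiding BPTT's memory cost.
    z_low = core.low(torch.cat([x, z_high], dim=-1), z_low)
    z_high = core.high(z_low, z_high)
    loss = loss_fn(head(z_high), target)
    return loss, z_low.detach(), z_high.detach()  # state carries to next segment

# Usage with a toy readout head:
dim = 64
core, head = TwoTimescaleCore(dim), nn.Linear(dim, 10)
x, target = torch.randn(2, dim), torch.randint(0, 10, (2,))
z_low, z_high = torch.zeros(2, dim), torch.zeros(2, dim)
loss, z_low, z_high = supervised_segment(core, head, x, target, z_low, z_high,
                                         nn.functional.cross_entropy)
loss.backward()
```

Training repeats `supervised_segment` several times per example, applying a loss after each segment (deep supervision), while memory stays constant in the number of recurrent steps rather than growing with them as under BPTT.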
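The adaptive computation time point in Group 3 also fits in a few lines. The sketch below reuses `TwoTimescaleCore` from the previous block; the thresholded halting head is a simplification assumed for illustration (the article describes HRM's ACT as learned, closer to a Q-learning-style halting policy), so the names and threshold are hypothetical.

```python
# Hypothetical ACT-style halting around the two-timescale core (simplified:
# a thresholded halt probability stands in for HRM's learned halting policy).
import torch
import torch.nn as nn

class HaltHead(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.q = nn.Linear(dim, 1)

    def forward(self, z_high):
        return torch.sigmoid(self.q(z_high))   # probability of halting now

def infer_with_act(core, readout, halt, x, z_low, z_high,
                   max_segments=16, threshold=0.9):
    """Spend more segments on hard inputs; stop early on easy ones."""
    for used in range(1, max_segments + 1):
        z_low, z_high = core(x, z_low, z_high)        # one reasoning segment
        if halt(z_high).mean().item() > threshold:    # confident enough: stop
            break
    return readout(z_high), used

# Usage (with TwoTimescaleCore and the tensors from the previous sketch):
# logits, segments_used = infer_with_act(core, nn.Linear(64, 10), HaltHead(64),
#                                        x, z_low, z_high)
```

Because the compute budget (`max_segments`) is a runtime argument, inference-time computation can be scaled up without retraining or architectural change, which is exactly the flexibility Group 3 highlights.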