机器之心
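A New Paradigm for Large-Scale, High-Accuracy Quantum Chemistry Simulation: ByteDance's Latest Work Published in a Nature Portfolio Journal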
机器之心· 2025-11-09 11:48
Core Insights
- The article discusses the increasing reliance on computational methods in understanding material properties, particularly in fields like catalysis and clean energy [2]
- A new quantum embedding framework, SIE+CCSD(T), has been developed to combine high-precision quantum chemistry with large-scale simulations, enabling accurate studies of complex materials [3][6]
Group 1: SIE+CCSD(T) Framework
- For the first time, the SIE+CCSD(T) framework makes it possible to apply the "gold standard" CCSD(T) method to real material systems containing thousands of electrons and hundreds of atoms [3][6]
- This framework achieves linear computational scaling up to 392 atoms, demonstrating significant efficiency improvements through GPU optimization [4][6][15]
- The SIE framework can combine different levels of high-precision algorithms, allowing researchers to trade computational speed against accuracy as needed [6][12]
Group 2: Performance and Accuracy
- In a system with 392 carbon atoms and approximately 11,000 orbitals, SIE+CCSD(T) achieved "gold standard" accuracy while maintaining near-linear computational efficiency on GPUs [6][16]
- The method consistently produced results within ±1 kcal/mol of experimental data across various real systems, indicating its reliability and potential as a universal tool [21][24]
- The framework successfully reconciled results from different boundary conditions in large systems, showing convergence in adsorption energy for water on graphene [25]
Group 3: Implications for Surface Chemistry
- The study revealed that water molecules do not exhibit a preferred orientation when adsorbed on graphene, providing important insights for applications in blue energy and water desalination [24][26]
- The SIE+CCSD(T) framework addresses the limitations of traditional quantum chemistry methods, enabling accurate simulations of surface chemistry at a larger scale [8][26]
- The findings contribute to a better understanding of molecular adsorption on surfaces, which is critical for material design and surface mechanism exploration [26]
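For orientation, the adsorption energy referred to above is conventionally defined as the energy of the combined system minus the energies of its separated parts. The subscripts below are notation assumed here for the water-on-graphene case, not symbols taken from the paper:

```latex
E_{\mathrm{ads}} = E_{\mathrm{H_2O + graphene}} - E_{\mathrm{graphene}} - E_{\mathrm{H_2O}},
\qquad
1\ \mathrm{kcal/mol} \approx 43.4\ \mathrm{meV}
```

Under this sign convention a more negative value means stronger binding, and the ±1 kcal/mol window quoted above corresponds to roughly ±43 meV per adsorbed molecule.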
IEEE | Where Are the Limits of LLM Agents? First Survey of "Graph-Augmented LLM Agents (GLA)" Lays Out a Unified Blueprint for Complex Systems
机器之心· 2025-11-09 11:48
Core Insights
- The article discusses the rapid development of LLM Agents and highlights the challenges of fragmentation in research and limitations in capabilities such as reliable planning, long-term memory, and multi-agent coordination [2][3].
Group 1: Introduction of Graph-Augmented LLM Agents
- A recent comprehensive review published in IEEE Intelligent Systems proposes the concept of "Graph-augmented LLM Agent (GLA)" as a new research direction, which utilizes graphs as a universal language to systematically analyze and enhance various aspects of LLM Agents [3][5].
- Compared to pure LLM solutions, GLA demonstrates significant advantages in reliability, efficiency, interpretability, and flexibility [3].
Group 2: Core Challenges and Solutions
- The main challenge for LLM Agents lies in processing structured information and workflows, which can be effectively addressed by using graphs as a natural representation of structured data [6].
- The article outlines how graph structures can enhance the planning capabilities of agents by modeling plans, task dependencies, reasoning processes, and environmental contexts as graphs [11].
Group 3: Memory and Tool Management
- To overcome memory limitations, graph structures provide two effective methods: using interaction graphs to record and organize the agent's interaction history and knowledge graphs to store and retrieve structured factual knowledge [12].
- The "tool graph" can clarify the dependencies between tools, assisting in tool selection and improving the agent's ability to call and combine tools [15].
Group 4: Multi-Agent Systems
- The review categorizes multi-agent collaboration into three paradigms, illustrating the evolution from static to dynamic and adaptive systems [18][22].
- Graph theory methods can optimize multi-agent systems by reducing redundancy in communication and agent numbers, thereby lowering costs [21].
Group 5: Trustworthiness and Safety
- The article discusses the role of graphs in building trustworthy multi-agent systems by systematically analyzing the propagation of biases and harmful information, and utilizing techniques like Graph Neural Networks (GNN) to detect and predict malicious nodes [25].
Group 6: Future Directions
- The review identifies five key future directions for GLA development, including dynamic and continuous graph learning, unified graph abstraction across all modules, multi-modal graphs for integrating various types of information, trustworthy systems focusing on privacy and fairness, and large-scale multi-agent simulations [28].
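The "tool graph" idea from Group 3 can be illustrated with a small dependency graph. This is a minimal sketch, not code from the survey: the tool names are hypothetical, and a plain topological sort stands in for whatever selection logic a real agent would use.

```python
from graphlib import TopologicalSorter

# Hypothetical tool graph: each tool maps to the set of tools whose output it needs.
tool_graph = {
    "search_flights": set(),
    "check_weather": set(),
    "pick_flight": {"search_flights", "check_weather"},
    "book_flight": {"pick_flight"},
}

def call_order(graph):
    """Return one valid order in which every tool runs after its dependencies."""
    return list(TopologicalSorter(graph).static_order())

order = call_order(tool_graph)
print(order)
```

Making the dependencies explicit means the agent never has to guess that `book_flight` must come last; the ordering falls out of the graph structure.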
Which Attention is All You Need?
机器之心· 2025-11-09 01:30
Core Insights
- The article discusses the ongoing innovations and challenges in the Attention mechanism within AI and Robotics, highlighting the need for breakthroughs in algorithm design to address computational complexities and enhance performance [5][7].
Group 1: Attention Mechanism Innovations
- The industry is focusing on optimizing the Attention mechanism due to the computational complexity of O(N^2) associated with standard self-attention, which poses a fundamental obstacle for efficient long-sequence modeling [9].
- Two main paths for improving Attention have emerged: Linear Attention, which aims to reduce complexity to O(N), and Sparse Attention, which seeks to limit calculations to a subset of important tokens [10][13].
- Kimi Linear, a recent development, has shown significant improvements over traditional full attention methods, achieving up to 75% reduction in KV cache requirements and processing contexts of up to 1 million tokens six times faster than full attention [11][12].
Group 2: Linear Attention Approaches
- Linear Attention can be categorized into three main types: Kernelized methods, forgetting mechanisms, and in-context learning, each aiming to optimize the attention process while maintaining performance [10][11].
- The Kimi Linear architecture, which incorporates a channel-wise gating mechanism, optimizes memory usage in RNNs and demonstrates superior performance across various scenarios [12].
- The design of Kimi Linear includes a hierarchical mixed architecture that combines linear and full attention layers, enhancing its efficiency and effectiveness [12].
Group 3: Sparse Attention Strategies
- Sparse Attention focuses on pre-selecting a subset of important tokens for attention calculations, utilizing methods such as fixed patterns, block-sparse, and clustering approaches [13][14].
- DeepSeek's NSA and DSA represent significant advancements in Sparse Attention, with DSA employing a token-wise sparse strategy that dramatically reduces attention complexity while maintaining performance [16][17].
- In tests, DSA has achieved a reduction in attention complexity from O(L^2) to O(Lk), resulting in cost reductions of 60%-70% during both pre-filling and decoding phases [17].
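The O(N^2) to O(N) reduction behind kernelized Linear Attention comes from reordering a matrix product. The sketch below is a generic, non-causal illustration and not Kimi Linear's architecture; the `elu(x) + 1` feature map is an assumption chosen because it keeps all scores positive.

```python
import numpy as np

def feature_map(x):
    # elu(x) + 1: equals x + 1 for x > 0 and exp(x) otherwise, so it is always positive.
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(Q, K, V):
    """Kernelized attention that never forms the N x N score matrix."""
    Qf, Kf = feature_map(Q), feature_map(K)
    kv = Kf.T @ V          # (d, d_v): keys and values summarized once
    z = Kf.sum(axis=0)     # (d,): normalizer summary
    return (Qf @ kv) / (Qf @ z)[:, None]

def quadratic_reference(Q, K, V):
    """Same result computed the O(N^2) way, for comparison only."""
    Qf, Kf = feature_map(Q), feature_map(K)
    A = Qf @ Kf.T          # (N, N) score matrix
    return (A / A.sum(axis=1, keepdims=True)) @ V
```

Both functions return the same output; the first costs time proportional to N because `Kf.T @ V` is computed once and reused for every query, while the second builds the full N x N matrix.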
Breaking the LLM Forgetting Bottleneck: Google's "Nested Learning" Lets AI Keep Evolving Like the Human Brain
机器之心· 2025-11-08 06:10
Core Insights
- Google has introduced a new machine learning paradigm called Nested Learning, which allows models to continuously learn new skills without forgetting old ones, marking a significant advancement towards AI that evolves like the human brain [1][3][4].
Group 1: Nested Learning Concept
- Nested Learning treats machine learning models as a series of interconnected optimization sub-problems, enabling a more efficient learning system [6][11].
- The approach bridges the gap between model architecture and optimization algorithms, suggesting they are fundamentally the same and can be organized into hierarchical optimization systems [7][16].
- This paradigm allows for different components of a model to update at varying frequencies, enhancing the model's ability to manage long-term and short-term memory [15][20].
Group 2: Implementation and Architecture
- Google has developed a self-modifying architecture called Hope, based on Nested Learning principles, which outperforms existing models in language modeling and long-context memory management [8][24].
- Hope is an evolution of the Titans architecture, designed to execute infinite levels of contextual learning and optimize its memory through a self-referential process [24][26].
Group 3: Experimental Results
- Evaluations show that Hope exhibits lower perplexity and higher accuracy in various language modeling and common-sense reasoning tasks compared to other architectures [27][30].
- The performance of different architectures, including Hope, Titans, and others, was compared in long-context tasks, demonstrating the effectiveness of the Nested Learning framework [30].
Group 4: Future Implications
- Nested Learning provides a theoretical and practical foundation for bridging the gap between current LLMs' limitations and the superior continuous learning capabilities of the human brain, paving the way for the development of self-improving AI [30].
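The idea of components updating at different frequencies can be pictured with a toy example. This is only an illustrative sketch under stated assumptions, not Google's method or the Hope architecture: the two parameter blocks, the quadratic loss, and the `slow_every` schedule are all hypothetical.

```python
import numpy as np

def train(steps=12, slow_every=4, lr=0.1):
    """Toy two-timescale loop: `fast` updates every step, `slow` every `slow_every` steps."""
    target = np.array([1.0, -2.0])
    fast = np.zeros(2)
    slow = np.zeros(2)
    slow_grad = np.zeros(2)            # gradient accumulated for the slow block
    updates = {"fast": 0, "slow": 0}
    for t in range(1, steps + 1):
        # Gradient of 0.5 * ||fast + slow - target||^2 with respect to either block.
        grad = (fast + slow) - target
        fast -= lr * grad              # high-frequency update
        updates["fast"] += 1
        slow_grad += grad
        if t % slow_every == 0:        # low-frequency update on the averaged gradient
            slow -= lr * slow_grad / slow_every
            slow_grad[:] = 0.0
            updates["slow"] += 1
    return fast, slow, updates

fast, slow, updates = train()
print(updates)
```

The fast block reacts to every step while the slow block only absorbs an averaged signal, which is one simple way to separate short-term from long-term adaptation.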
Is Quantum Mechanics About to Abandon the Imaginary Unit i?
机器之心· 2025-11-08 06:10
Core Viewpoint
- Recent research suggests that quantum mechanics may be rewritten using only real numbers, challenging the long-standing reliance on imaginary numbers in the field [1][7][11].
Group 1: Historical Context
- Quantum mechanics was established over a century ago to explain the strange behavior of atoms and fundamental particles, achieving significant success [2].
- The core equations of quantum mechanics include the imaginary unit i, which has been a point of contention among physicists [3][4].
Group 2: Recent Developments
- In 2021, a study indicated that imaginary numbers were essential to quantum theory, but subsequent research in 2025 proposed a real-number equivalent that is fully compatible with standard quantum theory [8][11][15].
- Several teams have developed real-number formulations of quantum theory, raising questions about the necessity of imaginary components [15][38].
Group 3: Experimental Evidence
- A modified Bell experiment demonstrated that the correlations between entangled particles exceeded the limits set by real-number theories, suggesting that imaginary numbers are crucial for accurate quantum descriptions [29][30].
- Despite statistical evidence supporting the necessity of imaginary numbers, skepticism remains regarding the conclusions drawn from these experiments [31][32].
Group 4: Philosophical Implications
- The debate continues over why real-number formulations are more complex and whether they can fully replicate the results of traditional quantum mechanics [42][43].
- Some researchers argue that even if imaginary numbers are not strictly necessary, they provide a more elegant and intuitive framework for quantum mechanics [44][49].
Group 5: Future Directions
- Ongoing research aims to uncover the unique properties of quantum mechanics that make imaginary numbers particularly suitable, with some theorists suggesting that spin may play a role [51][52].
- The quest for a simpler axiomatic framework for quantum mechanics continues, as researchers seek to understand why the traditional formulation remains dominant [53].
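As background to the real-versus-complex debate, the textbook way to trade i for real numbers is to replace each complex number with a 2x2 real matrix. The sketch below shows only that standard encoding; it is not the construction used in the 2025 research, and the helper names are chosen here for illustration.

```python
import numpy as np

# Real 2x2 matrix that plays the role of the imaginary unit: J @ J = -I.
J = np.array([[0.0, -1.0],
              [1.0,  0.0]])

def as_real(z):
    """Encode the complex number a + bi as the real matrix a*I + b*J."""
    return z.real * np.eye(2) + z.imag * J

def as_complex(m):
    """Decode a matrix of the form a*I + b*J back to a + bi."""
    return complex(m[0, 0], m[1, 0])

z1, z2 = 1 + 2j, 3 - 1j
product = as_complex(as_real(z1) @ as_real(z2))
print(product)  # (5+5j), the same as z1 * z2
```

The encoding doubles the dimension of every object, which hints at why real-number formulations tend to look more complicated even when they reproduce the same arithmetic.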
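2025 Open-Competition ("Jiebang Guashuai") Program Launched for Innovation Tasks in the AI Industry and AI-Enabled New Industrialization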
机器之心· 2025-11-08 06:10
Core Viewpoint
- The Ministry of Industry and Information Technology (MIIT) of China has initiated a project to promote the integration of artificial intelligence (AI) into new industrialization, focusing on key areas such as AI industry development, "AI + manufacturing," and intelligent products and equipment [2][4].
Group 1: Task Content
- The initiative aims to discover and cultivate a batch of key technologies and products that are strong in technological innovation, quick to reach application, and effective as demonstrations, accelerating the deep integration of AI with industry [5].
Group 2: Recommendation Conditions
- Applicants must be legally registered entities within China and cannot reapply for projects that have already been recognized in previous rounds. Recommendations should prioritize projects with outstanding innovation capabilities and good industrialization prospects [7].
Group 3: Work Requirements
- Applicants are required to register and submit materials by November 20, 2025. Recommendation units must confirm their recommended lists by November 30, 2025, with specific limits on the number of projects that can be recommended by different regions and entities [9][10].
Group 4: Support for Winning Units
- Local governments are encouraged to leverage their development advantages to provide support in terms of policy funding, scenario opening, and application promotion for the winning units [11].
SimKO: Mitigating Probability Over-Concentration in RLVR Training to Optimize pass@K Performance
机器之心· 2025-11-08 04:02
Core Insights
- The article discusses the limitations of existing Reinforcement Learning with Verifiable Rewards (RLVR) methods in enhancing the performance of large language models: pass@1 improves, but pass@K declines relative to the base model [2][3][12].
Group 1: Problem Analysis
- The decline in exploration capability of RLVR methods is attributed to the models concentrating probabilities on a single reasoning path, thus sacrificing the ability to explore diverse correct solutions [3][12].
- Current RLVR algorithms, such as GRPO and DAPO, reinforce the probability of correct answers while punishing incorrect ones, leading to a concentration of probability on rank-1 candidates and inhibiting exploration of other potential correct paths [8][23].
- The use of entropy as a diversity metric is limited, as it does not accurately reflect the shape of the probability distribution, which can lead to misleading conclusions about the model's exploration capabilities [9][12].
Group 2: Proposed Solution
- The research team introduces SimKO (Simple Pass@K Optimization), a new algorithm designed to improve pass@K performance by addressing the issue of probability concentration [4][17].
- SimKO employs an asymmetric gradient adjustment strategy, applying label smoothing to correct paths while imposing precise penalties on incorrect paths, thus balancing exploration and exploitation [17][23].
- The algorithm identifies key tokens with high entropy in reasoning paths, applying updates only to these critical nodes to enhance the model's exploration capabilities [18][20].
Group 3: Experimental Results
- SimKO was evaluated on multiple mathematical reasoning benchmarks, demonstrating significant improvements in pass@K performance while maintaining or slightly enhancing pass@1 accuracy [21][27].
- In comparison to GRPO, SimKO showed a 31.6% increase in pass@1 and a 26.3% increase in pass@128 on in-distribution tasks, while also performing well on out-of-distribution tasks [26][27].
- The results indicate that SimKO effectively mitigates the issue of probability concentration, thereby enhancing the model's exploration ability and improving overall performance metrics [26][27].
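Since the discussion above turns on pass@1 versus pass@K, it may help to see how pass@K is commonly estimated. The sketch below uses the standard unbiased estimator from the code-generation evaluation literature; whether the SimKO authors compute it exactly this way is an assumption.

```python
from math import comb

def pass_at_k(n, c, k):
    """Unbiased pass@k estimate from n sampled answers of which c are correct.

    It is the probability that a random size-k subset of the n samples
    contains at least one correct answer.
    """
    if n - c < k:
        return 1.0  # too few incorrect samples to fill a subset of size k
    return 1.0 - comb(n - c, k) / comb(n, k)

# One correct answer out of ten samples: a low pass@1 but a much higher pass@5.
print(pass_at_k(10, 1, 1), pass_at_k(10, 1, 5))
```

A model that puts all its probability on one reasoning path gains nothing from extra samples when that path is wrong, which is why pass@K is sensitive to the probability concentration described above.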
Open-Source Agent Framework with 64K Stars Fully Rebuilt! Major OpenHands Upgrade Takes On OpenAI and Google
机器之心· 2025-11-08 04:02
Core Insights
- The OpenHands development team announced the completion of the architectural restructuring of the OpenHands Software Agent SDK, evolving from V0 to V1, which provides a practical foundation for prototyping, unlocking new custom applications, and large-scale reliable deployment of agents [1][2].
Design Principles
- OpenHands V1 introduces a new architecture based on four design principles that address the limitations of V0:
1. Sandbox execution should be optional rather than universally applicable, allowing for flexibility without sacrificing security [9].
2. Default statelessness with a single source of truth for session state, ensuring isolation of changes and enabling deterministic replay and strong consistency [10].
3. Strict separation of concerns, isolating the core of the agent into a "software engineering SDK" for independent evolution of research and applications [11].
4. Everything should be composable and safely extensible, with modular packages that support local, hosted, or containerized execution [12][13].
Ecosystem and Features
- OpenHands V1 is a complete software agent ecosystem, including CLI and GUI applications built on the OpenHands Software Agent SDK [15][16].
- The SDK features a deterministic replay capability, an immutable configuration for agents, and an integrated tool system that supports both local prototyping and secure remote execution with minimal code changes [18][20].
Comparison with Competitors
- The team compared OpenHands SDK with OpenAI, Claude, and Google SDKs, highlighting that OpenHands uniquely combines 16 additional features, including native remote execution and multi-LLM routing across over 100 vendors [21][22].
Reliability and Evaluation
- OpenHands SDK's reliability and performance are assessed through continuous testing and benchmark evaluations, with automated tests costing only $0.5–3 per run and completing in 5 minutes [24][25].
- The SDK demonstrates competitive performance in software engineering and general agent benchmarks, achieving a 72% solution rate on SWE-Bench and a 67.9% accuracy on GAIA using Claude Sonnet 4.5 [29][30].
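The "single source of truth" and "deterministic replay" principles above are typically realized with an append-only event log. The sketch below illustrates that general pattern only; `Event`, `replay`, and the event kinds are hypothetical names and are not part of the OpenHands SDK API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    """Immutable log entry (hypothetical shape, for illustration only)."""
    kind: str      # "message" or "tool_result"
    payload: str

def replay(events):
    """Rebuild session state purely from the event log, with no hidden state."""
    state = {"messages": [], "tool_results": []}
    for ev in events:
        if ev.kind == "message":
            state["messages"].append(ev.payload)
        elif ev.kind == "tool_result":
            state["tool_results"].append(ev.payload)
    return state

log = [
    Event("message", "fix the failing test"),
    Event("tool_result", "pytest: 1 failed"),
    Event("message", "patched utils.py"),
]
print(replay(log))
```

Because state is a pure function of the log, replaying the same events always yields the same state, which is what makes debugging and resuming a session reproducible.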
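Utopai Teams Up with LG and a Middle Eastern Sovereign Fund to Double Down on Korean Entertainment; New Model Shakes Up the AI Video Landscape!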
机器之心· 2025-11-08 04:02
Core Viewpoint
- The article discusses the evolution of AI video generation technology, highlighting the transition from short video creation to long-form narrative filmmaking, with a focus on the collaboration between Utopai Studios and Stock Farm Road to enhance the internationalization of Korean cinema [1][2][24].
Group 1: AI Video Generation Technology
- Current mainstream models like Sora 2 and Google Veo 3 excel in generating short video segments but struggle with long-form narratives [1].
- Utopai Studios aims to address the challenges of AI understanding and managing the narrative logic of long films, moving from short video generation to industrial-level long-form production [6][24].
Group 2: Collaboration and Investment
- Utopai Studios has partnered with Stock Farm Road to establish a joint venture with a capital scale of several billion dollars, focusing on the internationalization of Korean film [2][4].
- The collaboration is backed by significant figures, including Brian Koo from LG Group and Amin Badr-El-Din from the UAE sovereign fund [3].
Group 3: Technical Innovations
- Utopai's innovative approach involves a layered collaborative architecture where an autoregressive model handles planning and a diffusion model manages rendering, enhancing the AI's narrative capabilities [11][12].
- The training methodology shifts from 2D pixel statistics to 3D physical rules, allowing the model to understand depth, material properties, and motion trajectories [14].
Group 4: Quantifiable Advantages
- Utopai has developed an internal evaluation system that surpasses traditional metrics by focusing on narrative quality, including consistency across multiple shots, adherence to script instructions, and improved production efficiency [18][19][20].
- The system maintains character identity and scene continuity over extended sequences, ensuring logical progression in storytelling [18].
Group 5: Future of AI in Filmmaking
- The partnership signifies a paradigm shift in AI filmmaking, where AI evolves from a mere tool to a creative partner capable of understanding a director's vision [21][22].
- The integration of AI technology in filmmaking is expected to lower production costs and expand creative possibilities, allowing for grand narratives that were previously deemed unfeasible [24].
The Catwalk Is Mastered: How Many Technical Hurdles Remain for "Embodied Intelligence"?
机器之心· 2025-11-08 02:30
Group 1
- The recent debut of XPeng's humanoid robot IRON, showcasing stable gait and fluid posture, has sparked widespread discussion about the maturity of humanoid robot technology [4][5]
- Global investment in robotics startups has exceeded $8.5 billion in the first three quarters of 2025, with notable funding such as Figure's $1 billion Series C round, valuing the company at $39 billion [6]
- The domestic market is also experiencing a surge, with funding reaching 23.2 billion yuan in the first five months of 2025, surpassing the total for 2024 [6]
Group 2
- Renowned roboticist Rodney Brooks argues that humanoid robots are still in an early hype phase and emphasizes the importance of overcoming technical challenges for future development [6]
- Current humanoid robots struggle with dexterity: they rely heavily on visual input and lack real-time tactile feedback, so their manipulation amounts to "blind touch" [6][8]
- Brooks predicts that even by 2036, the dexterity of deployable humanoid robots will still be significantly inferior to that of human hands [8]
Group 3
- To transition humanoid robots from laboratory settings to practical applications, challenges such as dexterous manipulation and bipedal stability must be addressed, alongside environmental perception and action planning [9]
- The human hand has 27 degrees of freedom, while advanced robotic hands typically have only around 20, and the gap in sensory capabilities is wider still [9]
- The industry is developing new tactile sensors to enhance robots' ability to perform fine manipulations similar to humans [9]