Contextual Intelligence
From "Token" to "词元": Foundation Models and the Interaction Entry Point in the Omni-Modal Era
量子位· 2026-03-27 05:10
Core Viewpoint
- The article discusses the official establishment of "词元" as the standard Chinese translation of "Token" by the national terminology authority, highlighting the massive daily consumption of Tokens in China and the shift from discrete text to continuous perception in AI systems [1][37]

Group 1: Token Standardization and Industry Trends
- The term "词元" was promoted by Professor Qiu Xipeng of Fudan University in 2021, emphasizing its role as a fundamental unit in language processing while avoiding confusion with natural-language "words" [3]
- The deployment of Agents in multi-modal scenarios is changing how Tokens are generated and consumed, reshaping the capabilities and cost structures of next-generation AI systems [1][10]
- Companies focusing on unified Token structures and contextual intelligence are attracting significant capital, as seen in the recent funding of MoSi Intelligent [4][36]

Group 2: Technological Pathways and Innovations
- MoSi Intelligent is pursuing a less common path: starting from voice technology and moving toward a unified Token structure for multi-modal information processing [7][9]
- Voice was chosen as the breakthrough point because of its higher information density and its natural alignment with real-world human-computer interaction [9][10]
- The development of SpeechGPT and SpeechTokenizer demonstrates the feasibility of integrating continuous speech signals into a unified Token space, enabling a cohesive understanding of both spoken and written language [14][17]

Group 3: Advancements and Future Directions
- The release of AnyGPT marks a significant step in unifying voice, text, images, and video into a single discrete Token system, paving the way for comprehensive multi-modal models [18][19]
- MoSi Intelligent's ongoing work, such as MOSS-TTSD and NEX, showcases the competitive edge of a unified architecture that extends to Agent and productivity scenarios [21][22]
- The company is building a team with deep research and engineering capabilities, supported by the Shanghai Institute of Intelligent Technology, which accelerates its technology transfer [27][31]

Group 4: Market Positioning and Commercialization
- MoSi Intelligent's multi-modal model open platform is in full public testing, providing API services for enterprise-level demand across various sectors [35][36]
- The company emphasizes integrated capability from foundation models to vertical applications, aiming for a dual-driven growth model spanning Token production, distribution, and application [36][38]
- The official recognition of "词元" signals a shift toward a more regulated industry, in which future model capability will depend increasingly on architectural innovation and talent density rather than parameter scaling alone [37][38]
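The core idea behind mapping continuous speech into a unified discrete Token space, as described for SpeechTokenizer above, can be illustrated with a toy vector-quantization sketch. The codebook values and frames here are invented for illustration; a real speech tokenizer learns its codebook from data and operates on framed audio features, not raw numbers.

```python
# Toy sketch of discretizing continuous features into token IDs by
# nearest-neighbor lookup in a codebook (vector quantization), the core
# mechanism that lets continuous speech share a discrete Token space
# with text. All values below are made up for illustration.

def quantize(frames, codebook):
    """Map each continuous frame to the ID of its nearest codebook vector."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return [min(range(len(codebook)), key=lambda i: dist(f, codebook[i]))
            for f in frames]

codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 4 "speech tokens"
frames = [[0.1, -0.1], [0.9, 0.2], [0.4, 0.9]]               # continuous features
tokens = quantize(frames, codebook)
print(tokens)  # → [0, 1, 2]
```

Once speech is reduced to such discrete IDs, it can be interleaved with text tokens in a single vocabulary, which is what enables a unified model to treat spoken and written language uniformly.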
Embodied Intelligence: 2026 as the "Wall-Breaking Year" for Robots
Xin Lang Cai Jing· 2026-02-27 07:06
Core Insights
- The article discusses the significant advancements in humanoid robotics, highlighting the transition from basic movement capabilities to a deeper understanding of human intentions and environmental contexts, marking a pivotal moment for the industry [2][3]

Group 1: Technological Breakthroughs
- The evolution of embodied intelligence in robots is expected to reach a critical point by 2026, characterized by a dual-driven model of "brain evolution + body iteration" that enables robots to understand physical-world dynamics much as humans do [2]
- The integration of embodied intelligence with large models will shift robots from traditional programming to a fully autonomous "perception-decision-execution" process, making task-oriented AI agents the focal point of industry competition [3]

Group 2: Application Scenarios
- By 2026, embodied intelligence will expand from isolated pilot projects to comprehensive integration across industrial, domestic, commercial, and medical sectors, with the penetration rate in industrial applications anticipated to exceed 15% [3]

Group 3: Safety and Collaboration Standards
- The introduction of international safety standards such as IEC 62849:2025 will address the safety concerns of robots in domestic and public environments, while advances in flexible materials will further enhance robot safety [4]

Group 4: Energy Efficiency and Endurance
- Innovations in solid-state batteries and energy-recovery systems are projected to extend the operating time of humanoid robots beyond 16 hours, with some models incorporating solar charging for additional efficiency [5]

Group 5: Mass Production and Cost Reduction
- The humanoid robotics industry is expected to move from laboratory prototypes to large-scale production by 2026, with an estimated global output of over 50,000 humanoid robots and a projected cost reduction of 35%-45% relative to 2025 [6]

Group 6: Urban Services and Infrastructure
- Smart cities such as Singapore and Hangzhou are leading the deployment of robot-friendly infrastructure, facilitating delivery, inspection, and cleaning robots and improving urban management efficiency [7]

Group 7: Trust and Ethical Challenges
- The primary challenge for embodied-intelligence robots in 2026 will be societal acceptance and trust, particularly around privacy and data security, prompting the adoption of edge-intelligence paradigms that keep sensitive data local [8]
- A layered responsibility system for robots will address liability in case of accidents, promoting the development of insurance and ethical standards within the industry [8]

Conclusion
- The year 2026 is anticipated to be transformative for embodied-intelligence robots, shifting their role from mere performers to essential participants in daily life, with Chinese companies and research institutions at the forefront of this evolution [9]
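The autonomous "perception-decision-execution" process mentioned above can be sketched as a minimal closed loop. The toy world model, sensor reading, and policy below are invented placeholders, not any real robot API; a production system would put a learned model behind the decision step.

```python
# Minimal sketch of a perception-decision-execution loop: the agent senses
# the world, decides, acts, and the action feeds back into the world state.
# World model and policy are toy stand-ins invented for illustration.

def perceive(world):
    """Sensing step: reduce raw world state to an observation."""
    return {"obstacle_ahead": world["obstacle_dist"] < 1.0}

def decide(observation):
    """Decision step: a stand-in for a learned policy or large model."""
    return "turn" if observation["obstacle_ahead"] else "forward"

def execute(world, action):
    """Actuation step: the chosen action changes the world state."""
    if action == "forward":
        world["obstacle_dist"] -= 0.5
    else:
        world["obstacle_dist"] = 5.0  # turned away, obstacle is far again
    return world

world = {"obstacle_dist": 2.0}
actions = []
for _ in range(4):
    obs = perceive(world)
    action = decide(obs)
    world = execute(world, action)
    actions.append(action)
print(actions)  # → ['forward', 'forward', 'forward', 'turn']
```

The point of the loop structure is that no trajectory is pre-programmed: behavior emerges from repeatedly closing the sense-decide-act cycle against the current state of the environment.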
NeurIPS 2025 Is Well Worth Watching! See You in Beijing on November 22
机器之心· 2025-11-16 07:30
Core Insights
- The evolution of AI is transitioning from "capability breakthroughs" to "system construction" by 2025, with a focus on reliability, interpretability, and sustainability [2]
- NeurIPS 2025 will be held from December 2 to 7 in San Diego, USA, with a record 21,575 submissions and an acceptance rate of 24.52%, indicating a growing global AI academic ecosystem [2]
- The event aims to serve the Chinese AI community through keynote speeches, paper sharing, roundtable discussions, and poster sessions [3]

Event Details
- The "NeurIPS 2025 Paper Sharing Conference" will take place on November 22, 2025, from 09:00 to 17:30 at the Crowne Plaza Hotel in Zhongguancun, Beijing [5][6]
- The agenda includes keynote speeches, paper presentations, and poster exchanges, providing a platform for academic and industry collaboration [3][10]

Keynote Speakers
- The morning keynote will be delivered by Professor Qiu Xipeng of Fudan University on "Contextual Intelligence: Completing the Key Puzzle of AGI" [14][16]
- The afternoon keynote speaker is Fan Qi of Nanjing University, with the topic yet to be announced [17]

Paper Presentations
- Presented papers cover topics such as data mixing, multimodal adaptation, and reinforcement learning in large language models [9][11][23]
- Notable presentations include "Data Mixing Can Induce Phase Transitions in Knowledge Acquisition" and "Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model" [9][11]
Qiu Xipeng (Fudan University / 上海创智学院): Context Scaling, the Next Act on the Road to AGI
机器之心· 2025-06-15 04:40
Core Viewpoint
- The article discusses Context Scaling as a crucial step toward Artificial General Intelligence (AGI), emphasizing the need for AI to understand and adapt to complex, ambiguous contexts rather than merely increasing model size or data volume [2][21]

Summary by Sections

Evolution of Large Models
- The evolution of large models is summarized in three acts:
  1. The first act is the success of model scaling, stacking data and parameters to compress knowledge, which produced models such as ChatGPT and MOSS [6]
  2. The second act is post-training optimization, enhancing decision-making through methods such as reinforcement learning and multi-modal approaches, exemplified by models such as o1/o3 and DeepSeek-R1 [6][7]
  3. The third act, Context Scaling, aims to improve model capability by tackling the challenge of defining context, particularly in complex and nuanced situations [8][21]

Context Scaling
- Context Scaling is defined as the ability of AI to understand and adapt to rich, complex, and dynamic contextual information, which is essential for making reasonable judgments in ambiguous scenarios [8][9]
- The concept of "tacit knowledge" is introduced: the implicit understanding that humans possess but find difficult to articulate, and that AI must learn to capture [11][12]

Three Technical Pillars
- Context Scaling rests on three key capabilities:
  1. Strong interactivity: AI must learn from interaction, understanding social cues and cultural nuances [14][15]
  2. Embodiment: AI needs a sense of agency to perceive and act within its environment, which can be tested in virtual settings [16]
  3. Anthropomorphization: AI should resonate emotionally with humans, understanding complex social interactions and cultural sensitivities [17]

Challenges and Integration
- Context Scaling does not replace existing scaling methods; it complements them by focusing on the quality and structure of input data [18]
- It also redefines the environment for reinforcement learning, moving beyond simple state-action-reward loops to include rich contextual information [20]

Conclusion
- The exploration of Context Scaling aims to unify various technological paths under the core goal of contextual understanding, which is seen as essential for navigating the complexities of the real world and a potential key to achieving AGI [22]
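The redefined reinforcement-learning environment described above, one that surfaces rich context rather than a bare state-action-reward loop, can be sketched minimally. The environment, its "situation" tag, and the reward rule below are toy constructs invented for illustration, not any real RL framework API.

```python
# Toy sketch of an RL environment extended with context: each step returns
# not just the next state and reward, but a context dict carrying ambient
# information (a "situation" tag) and the history of past interactions.
# A plain MDP would expose only (state, reward).

class ContextualEnv:
    def __init__(self):
        self.state = 0
        self.history = []          # contextual memory of past interactions
        self.situation = "formal"  # ambient context a bare MDP would omit

    def step(self, action):
        self.state += action
        # Reward depends on context, not only on state and action:
        reward = action if self.situation == "formal" else -action
        self.history.append((self.state, action, reward))
        context = {"situation": self.situation,
                   "history": list(self.history)}
        return self.state, reward, context

env = ContextualEnv()
state, reward, context = env.step(1)
state, reward, context = env.step(2)
print(state, reward, len(context["history"]))  # → 3 2 2
```

The design point is that the same action can earn a different reward under a different situation, so an agent that ignores the context channel cannot learn the task; this is the sense in which context becomes part of the environment definition rather than an afterthought.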