Artificial General Intelligence
Complex Tensor Computation in a Single Pass of Light: An Important Step Toward General-Purpose AI Hardware
Ke Ji Ri Bao· 2025-11-16 23:47
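According to a report published on the 14th in the journal Nature Photonics, an international research team led by Finland's Aalto University has developed a new method that completes complex tensor operations in a single pass of light, carrying out key computational steps of deep learning at the speed of light. This is an important step toward general-purpose artificial intelligence (AI) hardware, and it offers an entirely new route past the performance bottlenecks of existing computing platforms. To further scale up computing capacity, the team also used multi-wavelength light, letting different colors of light carry different dimensions of the data so that higher-order tensor operations can be handled. Another major advantage of the method is its simplicity. All computation takes place during the passive propagation of light, with no active control or electronic switching, which makes it better suited to low-power, highly parallel optical platforms. (Source: Science and Technology Daily) To this end, the team developed a new method called "single-shot tensor computing at the speed of light", which performs mathematical operations through the natural propagation of light waves in space, without relying on electronic circuits and without any active modulation. Key deep learning steps such as convolution, matrix multiplication, and attention can be completed simultaneously in the instant that light passes through the system. The core innovation is to encode digital data into the amplitude and phase of light, turning digital information into physical properties of the light field. When these light fields interact, matrix and tensor operations are carried out naturally. The mechanism is like inspecting and sorting parcels at customs, which normally means passing them one by one through several machines with different functions and then sorting them into the correct bins. The optical computing method can merge all the parcels and all the machines ...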
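The amplitude-and-phase encoding described above can be illustrated numerically. The sketch below is only a toy model under stated assumptions, not the research team's optical setup: complex numbers stand in for light-field samples, multiplication by a complex weight stands in for the interaction between fields, and the function names `encode`, `decode`, and `matvec_by_superposition` are hypothetical.

```python
import cmath
import math


def encode(value):
    """Encode a signed real number as a complex field sample.

    Assumption for this sketch: the magnitude goes into the amplitude,
    and the sign goes into the phase (0 for positive, pi for negative).
    """
    phase = 0.0 if value >= 0 else math.pi
    return abs(value) * cmath.exp(1j * phase)


def decode(field):
    """Read a signed real number back from amplitude and phase."""
    return abs(field) * math.cos(cmath.phase(field))


def matvec_by_superposition(matrix, vector):
    """Toy matrix-vector product.

    Each input field is scaled by a complex weight (a stand-in for the
    interaction between light fields), and the contributions to one
    output are added coherently, as overlapping fields would superpose.
    """
    inputs = [encode(x) for x in vector]
    outputs = []
    for row in matrix:
        weights = [encode(w) for w in row]
        outputs.append(sum(w * f for w, f in zip(weights, inputs)))
    return outputs


if __name__ == "__main__":
    fields = matvec_by_superposition([[1, -2], [3, 4]], [5, -6])
    print([round(decode(f), 6) for f in fields])  # [17.0, -9.0]
```

Every output is formed in one pass over the inputs, which loosely mirrors the "single pass of light" idea. A physical system would additionally have to deal with noise, detection, and calibration, none of which is modeled here.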
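As the AI Wave Surges, Beijing Hits the Accelerator! The 2025 AI+ Conference Uses Scenario-Driven Innovation to Ignite New Quality Productive Forces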
Huan Qiu Wang Zi Xun· 2025-11-16 14:22
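Source: Huanqiu.com [Huanqiu.com technology report] From November 15 to 17, 2025, the 2025 AI+ Conference, themed "AI's Next Decade: Scenario-Driven × New Quality Engine", was held at the Zhongguancun International Innovation Center in Beijing. The conference invited many leading industry experts, top companies, entrepreneurs, and investor representatives to map out the future of "AI+" together. It was co-hosted by the National High-Tech Zone AI Industry Collaborative Innovation Network, the production team of the China Media Group program 《赢在AI+》, the Institute for Sustainable Social Value at Tsinghua University, the interdisciplinary science institute of Renmin University of China, the AI Research Center of the CCID Research Institute, and Zhongguancun Development Group. Imagining new trends: a conversation on AI's next ten years. Turing Award laureate Yao Qizhi (Andrew Chi-Chih Yao), academician of the Chinese Academy of Sciences and dean of both the Institute for Interdisciplinary Information Sciences and the College of AI at Tsinghua University, delivered a keynote titled "Future Trends in Artificial Intelligence". He said that years of scientific and technological progress and accumulation have given AI development new momentum. The emergence of large models is transforming every industry and has brought a new wave of AI with boundless opportunities. The most important direction for future AI development is AGI, artificial general intelligence, whose prospects are vast and whose impact will be far-reaching. We must keep innovating and making breakthroughs. China has no shortage of application talent or scenarios; what matters most is cultivating more top-tier innovative talent. From the outset, China's AI development has emphasized integration with the real economy. Li Meng, chairman of the Chinese Society for Sustainable Development and former vice minister of science and technology, noted in his remarks that China's AI development ...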
Yao Qizhi and Wang Xingxing Speak Out! Foreseeing AI's "Next Decade"
Xin Lang Cai Jing· 2025-11-16 09:51
Core Viewpoint - The future development of artificial intelligence (AI) centers on achieving a satisfactory form of artificial general intelligence (AGI), which will significantly affect science, strategy, and economic competition [2][3]. Group 1: Directions Towards AGI - The path to AGI will focus on four key directions: the continued evolution of large models, embodied general intelligence, AI for science, and AI safety governance [5][8]. - Over the past five years, China has made remarkable progress in large model development, reaching an internationally competitive level [7]. - Embodied intelligence is crucial for enhancing robots' capabilities, allowing them to perform tasks that were previously difficult because of their rigidity [8]. - AI for science is expected to transform scientific research methods within the next 5 to 10 years, making collaboration between scientists and AI essential for competitive advantage [9]. Group 2: Risks and Governance - The development of AI poses significant safety risks, since systems may slip out of control or act against human intentions [10][11]. - AI algorithms are inherently non-robust, uncertain, and hard to interpret, which can affect societal values and ethics [11]. - Addressing the "survival risk" associated with AI requires developing provably safe AI systems that draw on theories from cryptography and game theory [12]. Group 3: Future of Robotics - Over the next decade, robots are expected to evolve from mere tools into life partners capable of understanding the world and performing a variety of tasks [14][17]. - Robots will increasingly collaborate with humans in industrial settings and assist in community services such as elderly care [17]. - The robotics industry will benefit from open-source collaboration, which accelerates technological progress and reduces innovation costs [17]. 
Group 4: Market Potential - The AI market is projected to reach a trillion-dollar scale as it empowers various industries, with open-source initiatives playing a crucial role in fostering commercial growth [19][20]. - The focus on intelligent terminals as potential AI entry points highlights the importance of integrating AI into everyday life, particularly in the automotive sector [22].
Dexmal (原力灵机) Raises Nearly RMB 1 Billion Across Two Rounds, Led Respectively by Alibaba and NIO Capital
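On recent progress: in October this year, Dexmal open-sourced Dexbotic, a PyTorch-based VLA toolbox that offers one-stop research services for practitioners in embodied intelligence, and launched DOS-W1 (Dexbotic Open Source-W1), an open-source robot hardware product that substantially lowers the barrier to using robots and makes them easier to maintain and modify. The company also partnered with Hugging Face to release RoboChallenge, the world's first large-scale real-robot evaluation platform for embodied intelligence. Dexbotic, DOS-W1, and RoboChallenge are forming a deep synergy, actively advancing the embodied-intelligence robotics industry across software, hardware, and standards. On technical capability, the company took part in RoboTwin, one of the core competitions of the CVPR 2025 collaborative intelligence workshop, and tied for first place in the first-round simulation platform contest; at the ICRA 2025 global robot vision-tactile fusion challenge (ManiSkill-ViTac 2025), it won gold in two tracks, "tactile-only manipulation" and "tactile sensor design". Looking ahead, Dexmal said it will accelerate coordinated innovation across algorithms, hardware design, and closed-loop scenarios in embodied intelligence, and speed up the arrival of artificial general intelligence in the physical world. Recently, the embodied-intelligence company Dexmal announced the completion of an A+ funding round worth several hundred million yuan, 阿 ...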
A 10,000-Character Long Read Summarizing the Latest Advances in Multimodal Large Models (Modality Bridging Edition)
Zi Dong Jia Shi Zhi Xin· 2025-11-15 03:03
Core Insights - The article discusses the emergence of Multimodal Large Language Models (MLLMs) as a significant research focus, highlighting their ability to perform multimodal tasks such as story generation from images and OCR-free mathematical reasoning, which points to a potential pathway towards artificial general intelligence [2][4]. Group 1: MLLM Architecture and Training - MLLMs typically undergo large-scale pre-training on paired data to align different modalities, using datasets such as image-text pairs or automatic speech recognition (ASR) datasets [2]. - The Perceiver Resampler module maps variable-sized spatiotemporal visual features from a vision encoder to a fixed number of visual tokens, reducing computational complexity in visual-text cross-attention [6][8]. - The training process follows a two-phase strategy: the first phase learns vision-language representations on top of a frozen image encoder, while the second phase learns vision-to-language generation on top of a frozen LLM [22][24]. Group 2: Instruction Tuning and Data Efficiency - Instruction tuning is crucial for enhancing the model's ability to follow user instructions, with the introduction of learned queries that interact with both visual and textual features [19][26]. - The article emphasizes the importance of diverse and high-quality instruction data to improve model performance across various tasks, including visual question answering (VQA) and OCR [44][46]. - Data efficiency experiments indicate that high performance can be maintained with a smaller training dataset, suggesting room for further improvements in data utilization [47]. Group 3: Model Improvements and Limitations - LLaVA-NeXT shows improvements in reasoning, OCR, and world knowledge, surpassing previous models in several benchmarks [40]. 
- Despite advancements, limitations remain, such as the model's inability to handle multiple images effectively and the potential for generating hallucinations in critical applications [39][46]. - The article discusses the need for efficient sampling methods and the balance between data annotation quality and model processing capabilities to mitigate hallucinations [48].
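The Perceiver Resampler idea summarized above, mapping a variable number of visual features to a fixed number of visual tokens, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not any specific model's implementation: it uses single-head cross-attention with no learned projections, random numbers stand in for the learned latent queries, and the names `resample` and `latent_queries` are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def resample(visual_features, latent_queries):
    """Cross-attend a fixed set of latent queries over visual features.

    Returns one output token per latent query, so the output length
    depends only on len(latent_queries), not on how many visual
    features the vision encoder produced.
    """
    dim = len(latent_queries[0])
    scale = 1.0 / math.sqrt(dim)
    tokens = []
    for query in latent_queries:
        # Scaled dot-product score between this query and every feature.
        scores = [scale * sum(q * k for q, k in zip(query, feat))
                  for feat in visual_features]
        weights = softmax(scores)
        # The output token is the attention-weighted average of the features.
        tokens.append([sum(w * feat[d] for w, feat in zip(weights, visual_features))
                       for d in range(dim)])
    return tokens


if __name__ == "__main__":
    random.seed(0)
    dim, num_latents = 8, 4
    # Stand-in for learned latent queries (an assumption of this sketch).
    latents = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(num_latents)]
    for num_features in (10, 50):
        feats = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(num_features)]
        print(num_features, "features ->", len(resample(feats, latents)), "tokens")
```

Because the language model then only ever attends over the fixed set of tokens, the cost of visual-text cross-attention no longer grows with image resolution or frame count. Production modules typically add learned projections, multiple heads, and feed-forward layers, all omitted here.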
Unitree Robotics (宇树科技) Completes IPO Counseling, Plans Domestic Initial Public Offering and Listing
Shi Shuo Xin Yu· 2025-11-15 02:03
Core Viewpoint - Unitree Robotics (宇树科技) is actively preparing for its IPO, which is expected to be one of the largest and best-known listings of a domestic technology company in China in recent years [3]. Group 1: Company Overview - Unitree focuses on civilian robotics; its 2024 revenue structure is approximately 65% quadruped robots, 30% humanoid robots, and 5% component products [4]. - About 80% of its quadruped robots are used in research, education, and consumer fields, while the remaining 20% are applied in industrial sectors such as inspection and firefighting [4]. Group 2: IPO Preparation - Unitree has completed IPO counseling with CITIC Securities, which confirmed that the company has the governance structure, accounting practices, and internal control systems required of a listed company [2]. - The company is expected to submit its listing application documents in the fourth quarter of this year [3]. Group 3: Product Development - On October 20, Unitree launched its new-generation full-size humanoid robot, the Unitree H2, which raises the number of joints from 19 to 31, an increase of about 63% that improves its flexibility of movement [6]. - Founder Wang Xingxing said the H2 represents a shift from "moving machines" to "usable partners" and is intended to serve people safely and in a friendly manner [6]. Group 4: Industry Insights - Wang Xingxing noted that as AI technology advances, robots will depend less and less on hardware performance, since modern AI algorithms are more tolerant of hardware errors and inconsistencies [8]. - He emphasized that achieving embodied intelligence could bring robots closer to artificial general intelligence (AGI), which could perform a wide range of tasks that humans need done [8].
Dexmal (原力灵机) Raises Nearly RMB 1 Billion in Two Rounds; CEO Hails from Tsinghua's "Yao Class"
Sou Hu Cai Jing· 2025-11-14 05:40
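Source: 猎云网 Recently, the embodied-intelligence company Dexmal (原力灵机) announced the completion of an A+ funding round worth several hundred million yuan, with Alibaba as the sole investor. Its earlier Series A round was led by NIO Capital, with 洪泰基金, Lenovo Capital and Incubator Group, 锡创投, and 正景基金 co-investing; existing shareholder Legend Capital made an oversized follow-on investment, and Qiming Venture Partners and 九坤创投 also followed on. The two rounds total nearly RMB 1 billion, and the funds will mainly go toward research, development, and deployment of intelligent robot software and hardware. Founded in March 2025, Dexmal is an innovative company focused on the research, development, and deployment of embodied-intelligence software and hardware. Its CEO, Tang Wenbin, is a graduate of Tsinghua University's "Yao Class" and the first-ever gold medalist of the "Yao Award", and is also co-founder and CTO of Megvii (旷视科技). The core team combines top-tier AI academic credentials with more than ten years of experience bringing AI-native products to scale. It has built up deep experience in algorithm R&D, hardware R&D, data management, engineering innovation, and real-world deployment, and it has the combination of "algorithms + hardware + scenarios" that is rare in the industry. Notably, the team has also accumulated extensive deployment experience in AI logistics robots. Drawing on its strengths in smart logistics robotics and flexible warehouse automation products, the company has advanced the intelligent upgrading of the logistics industry while securing a first-mover advantage in software, hardware, and products for its work in embodied intelligence. Building on these strengths, the company's self-developed end-to-end multimodal embodied-intelligence large model, MMLA, can, through deep fusion of multiple sensors ...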
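The 2nd Zhongguancun Embodied Intelligence Robot Application Conference 2025: Decoding the Full Process, Joining the Feast of an Industry Boom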
Ji Qi Ren Da Jiang Tang· 2025-11-13 15:00
Core Insights - The article highlights the significance of the 2025 Second Zhongguancun Embodied Intelligence Robot Application Conference, emphasizing its role in shaping the future of intelligent technology and industry needs [1][3]. Event Overview - The conference will take place on November 19, 2025, at the Zhongguancun National Independent Innovation Demonstration Zone Conference Center, gathering over 400 top scientists, entrepreneurs, and government representatives [6][19]. - It aims to create a value bridge from laboratory innovation to industrial-level implementation, focusing on breaking industry bottlenecks and activating industrial momentum [3][17]. Agenda Highlights - The opening ceremony will feature keynotes on topics such as "Embodied Intelligence Perception and Operation" and "New Production Forces in the Intelligent Era" by leading experts from Tsinghua University and Beihang University [8][11]. - A roundtable forum will discuss the transformation from competition to market, addressing the adaptation of technology to real business needs [10][17]. Technical Insights - The conference will include discussions on the latest breakthroughs in embodied intelligence, focusing on practical applications and ecological construction to drive industrial momentum [17][18]. - Key industry leaders will share experiences on enabling humanoid robots with human-like interaction capabilities and the future of self-evolving robots [18]. Industry Engagement - The event will serve as a hub for resource connection, featuring exhibitions from 13 well-known industry companies and award-winning teams, showcasing cutting-edge technologies and products [18][19]. - The conference aims to facilitate a comprehensive service loop from policy guidance to execution, enhancing the overall ecosystem of the embodied intelligence industry [3][17].
Fei-Fei Li's Latest Long Essay Takes Silicon Valley by Storm
Liang Zi Wei· 2025-11-11 00:58
Core Viewpoint - Spatial intelligence is identified as the next frontier for AI, with the potential to revolutionize creativity, robotics, scientific discovery, and more [2][4][10]. Group 1: Definition and Importance of Spatial Intelligence - Spatial intelligence is described as a foundational aspect of human cognition, enabling interaction with the physical world and driving reasoning and planning [20][21]. - The evolution of spatial intelligence is linked to the development of perception and action, which are crucial for understanding and interacting with the environment [12][13][14]. - Historical examples illustrate how spatial intelligence has driven significant advancements in civilization, such as Eratosthenes' calculation of the Earth's circumference and the invention of the spinning jenny [18][19]. Group 2: Current Limitations of AI - Current AI models, including multimodal large language models (MLLMs), have made progress in spatial perception but still fall short of human capabilities [23][24]. - AI struggles with tasks involving physical representation and interaction, lacking the holistic understanding that humans possess [25][26]. Group 3: World Models as a Solution - The concept of "world models" is proposed as a new generative model that can surpass the limitations of current AI by understanding, reasoning, generating, and interacting with complex virtual or real worlds [28][30]. - World models should possess three core capabilities: generative, multimodal, and interactive [31][34][38]. - The development of world models is seen as a significant challenge that requires innovative methodologies to coordinate semantic, geometric, dynamic, and physical aspects [39][41]. Group 4: Applications and Future Potential - The potential applications of spatial intelligence span various fields, including creativity, robotics, science, healthcare, and education [56][57]. 
- In creativity, platforms like World Labs' Marble are enabling creators to build immersive experiences without traditional design constraints [52][53]. - In robotics, achieving spatial intelligence is essential for robots to assist in various environments, enhancing productivity and human collaboration [60][62]. Group 5: Vision for the Future - The vision for the future emphasizes the importance of AI enhancing human capabilities rather than replacing them, with spatial intelligence playing a crucial role in this transformation [47][50]. - The exploration of spatial intelligence is framed as a collective effort that requires collaboration across the AI ecosystem, including researchers, innovators, and policymakers [51][63].
Breaking! Intel's Chief Technology Officer Jumps Ship
Shi Shuo Xin Yu· 2025-11-11 00:29
Core Viewpoint - The departure of Intel CTO Sachin Katti for OpenAI has drawn significant attention across the tech industry, particularly regarding Intel's AI business strategy and future developments in artificial general intelligence (AGI) infrastructure [1][5]. Group 1: Leadership Changes - During his tenure at Intel, Sachin Katti held several key positions, including Senior Vice President (SVP), Chief Technology Officer (CTO), and Chief AI Officer, and played a crucial role in shaping Intel's AI strategy and product roadmap [3]. - Following Katti's departure, Intel CEO Lip-Bu Tan will personally oversee the AI business to ensure a smooth transition and continued progress on related initiatives [5]. Group 2: Background of Sachin Katti - Sachin Katti has a strong academic background, with a Ph.D. in Electrical Engineering and Computer Science from MIT and a bachelor's degree from the Indian Institute of Technology, Bombay [4]. - Before joining Intel, Katti was a professor at Stanford University, recognized for his pioneering research in wireless communication and network coding, for which he earned several prestigious awards [4]. - Katti is also a successful entrepreneur who co-founded Kumu Networks and Uhana; the latter focused on advanced AI solutions for mobile network optimization before being acquired by VMware [4][5]. Group 3: Industry Impact - Katti is acknowledged as a leader in the telecommunications sector, having co-chaired the O-RAN Alliance's Technical Steering Committee and promoted the global adoption of open, intelligent radio access networks [5].