Emu3.5
Published in Nature! BAAI Launches Emu3, an All-Round AI Model That Unifies Multimodal Learning
生物世界· 2026-01-31 03:05
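Written by Wang Cong | Edited by Wang Duoyu | Layout by Shui Chengwen. Can an AI model, like a human, understand text, images, video and even actions at the same time? Until now, the AI field has needed different models for different tasks, for example diffusion models to generate images and compositional architectures to handle vision-language understanding. Now the Beijing Academy of Artificial Intelligence (BAAI, the Zhiyuan Research Institute) has released a large multimodal model, Emu3, that may change this. The study, titled "Multimodal learning with next-token prediction for large multimodal models", was published online in Nature on 28 January 2026. Huang Tiejun, Wang Zhongyuan and Wang Xinlong of BAAI are the co-corresponding authors. Reportedly, this is also the first time a large-model result led by a Chinese research institution has been published in the main Nature journal. Relying solely on next-token prediction (NTP), Emu3 unifies large-scale multimodal learning across text, images and video. It not only matches specialized models on generation and understanding tasks, but also demonstrates strong capabilities such as video generation and robot manipulation. For building scalable, unified multimodal intelligent systems, this result ...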
Where Is the Next Breakthrough for AI Applications?
Bei Jing Shang Bao· 2025-12-10 15:44
Core Insights
- The report indicates that AI moved from the "tool era" to the "partner era" in 2025, and that the development trend for 2026 is becoming clearer [1]
Infrastructure
- The report emphasizes the importance of computing power and chips, identifying the computing economy as the primary engine of the intelligent industry, with unprecedented global demand for AI computing power driving the construction of large-scale data centers [3]
- These data centers are evolving from traditional server hosting into AI-company-led facilities that integrate massive computing, storage, and network resources [3]
- Cloud computing vendors are shifting investment from general-purpose computing resources to dedicated infrastructure that meets AI demands, leading to strategic partnerships with AI companies [3]
- AI-native demand is reshaping chip innovation: GPU dominance is being challenged by NPUs, and ASIC/FPGA technologies are growing [3]
- China is accelerating the construction of a self-controlled computing ecosystem, with domestic "chip + SDK + framework" solutions validated in the training of trillion-parameter-scale models [3]
Model Innovation
- Pre-training determines the tier a large model reaches, while architectural innovation influences the level pre-training can achieve; mixture-of-experts (MoE) models are becoming mainstream under computational constraints [4]
- In 2025, large models enter the "inference time" phase, with breakthroughs in multimodal deep reasoning and adaptive reasoning [4]
- The report highlights a surge in research and development in physical AI and embodied intelligence, with world models and VLA frameworks becoming focal points [4]
- UBTECH announced a new order for humanoid robots worth more than 50 million yuan, showcasing integration with large AI models [4]
Application Landscape
- AI is reshaping traffic entry points, shifting from "people finding services" to "services finding people" and marking a new interaction paradigm [7]
- AI agents are developing closed-loop capabilities in perception, planning, decision-making, and execution, gradually replacing traditional apps [7]
- The new generation of AI systems can process and understand multiple types of information simultaneously, improving performance in complex scenarios and opening new possibilities for creative content generation and intelligent interaction [7]
- The report predicts that as the technology matures over the next 2-3 years, AI will become a standard tool across industries, shifting from a competitive advantage to a necessity [7]
- The AI hardware sector is also gaining attention, with lightweight models and edge computing bringing AI capabilities to mobile devices, cars, and IoT devices [7]
- More smart devices are gaining local AI processing capabilities, addressing data privacy, network latency, and cost efficiency issues [7]
Even 10 Billion Isn't Enough to Burn! Robotics CEOs Offer a New Verdict: Embodied Intelligence Can No Longer Copy the LLM Playbook
Sou Hu Cai Jing· 2025-11-22 02:41
Core Insights
- The event highlighted the latest advances in embodied intelligence from the Zhiyuan Research Institute, focusing on the importance of world models and the development of a comprehensive embodied brain system [2][3]
Group 1: Zhiyuan's Full-Stack Layout
- Zhiyuan introduced the native multimodal world model Emu3.5, which expanded training data from 15 years of video to 790 years and increased parameter count from 8 billion to 34 billion, while speeding up video and image generation [5]
- The institute is building an embodied intelligence system that spans heterogeneous robot bodies, including RoboBrain, RoboOS, and RoboBrain-0, deployed across various robot forms for tasks ranging from navigation to complex interaction [5]
Group 2: Key Elements of Embodied Intelligence
- The role of world models in embodied intelligence was debated, with experts emphasizing the need for models that predict the next state based on the robot's form and goals, rather than merely generating videos [7][10]
- There is a consensus that embodied intelligence should not follow the current language-first paradigm but should instead adopt a structure centered on action and perception [10][12]
- The importance of real data was highlighted, with discussion of the need to combine real, simulated, and video data for robots to learn effectively [15][17]
Group 3: Investment Priorities
- When asked how to allocate 10 billion, experts prioritized talent acquisition, computational power, and data engines as key investment areas [19][21]
- Views differed on the importance of infrastructure versus model development, with some advocating a focus on building a comprehensive data engine for continuous digitalization [21][22]
Group 4: Humanoid Robots and Hardware Limitations
- The debate on whether humanoid robots represent the ultimate form of embodied intelligence concluded that neither models nor hardware define each other; rather, the specific application scenarios dictate the requirements [22][24]
- Experts suggested adopting a layered structure for embodied intelligence, in which higher-level models can be reused across different robot forms while lower-level models must be tailored to specific hardware [23][24]
Conclusion
- The discussions at the event signaled an active search for ways to achieve a closed-loop system in embodied intelligence, emphasizing that models, hardware, and scaling need to evolve together [24]
Altman Denies OpenAI Will Go Public Next Year; China Mobile Transfers 41.98 Million Shares at Zero Cost
21 Shi Ji Jing Ji Bao Dao· 2025-11-04 03:27
Group 1: OpenAI Developments
- OpenAI CEO Altman denied rumors that the company will go public next year, stating that the board has set no specific date and made no decision regarding an IPO, though he believes one will eventually happen [2]
- OpenAI's annual revenue significantly exceeds the rumored $13 billion [2]
- OpenAI signed a $38 billion computing power procurement agreement with Amazon Web Services (AWS), its first collaboration with a leading global cloud infrastructure provider other than Microsoft [5]
Group 2: Corporate Actions and Financial Moves
- China Mobile announced a zero-consideration transfer of 41.98 million shares to China National Petroleum Corporation, reducing its stake from 69.05% to 68.85% [3]
- Boeing completed the sale of part of its digital aviation solutions business to Thoma Bravo for $10.55 billion, optimizing its capital structure and allowing it to focus on its core business [8]
- Wuhan Weinan Battery Asset Co., Ltd. completed a 670 million yuan Series C financing round, with participation from NIO and CATL, to support battery-asset-related business and technology development [12]
Group 3: Technology and Innovation
- Microsoft CEO Nadella indicated the company may resume hiring in the next year, contingent on existing employees learning to collaborate with AI [4]
- Xiaopeng Motors' CEO He Xiaopeng announced plans to mass-produce robots by 2026, emphasizing the importance of integration and of overcoming challenges in cost, safety, and consistency [6]
- The Zhiyuan Research Institute released the Emu3.5 multimodal world model, significantly expanding training data and improving inference speed, which it described as marking a new era in multimodal AI [13]
Group 4: Market Trends and Strategic Moves
- Elon Musk announced the upcoming launch of a new encrypted communication platform, XChat, which will integrate with the existing X social platform [7]
- Qualcomm and MediaTek are accelerating their adoption of TSMC's N2P process technology to compete with Apple in chip production [11]
- Tesla's AI team is progressing on the AI 5 chip for smart assisted driving, with future versions AI 6 and AI 7 expected to follow [10]
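AI-Faked Jensen Huang Livestream Draws More Than 5x the Viewers of NVIDIA's Official Stream; OpenAI Plans 2027 IPO at a Valuation of Up to $1 Trillion | Weekly AI News Roundup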
36氪· 2025-11-01 09:45
Group 1
- Adobe launched its advanced image generation and editing model Firefly Image 5, which supports native 4-megapixel output, and introduced new generative AI tools for applications such as Photoshop and Premiere Pro [2][3]
- Zhiyuan Research Institute released the Emu3.5 multimodal model, trained on over 10 trillion tokens, with video training data increasing from 15 years to 790 years and parameter count rising from 8 billion to 34 billion [2]
Group 2
- Figma acquired AI generation company Weavy to create a new "node-based" AI design paradigm, giving designers more creative control [6]
- OpenAI plans to go public in 2027 at a potential valuation of $1 trillion, expecting revenue to double this year to $12.7 billion and to continue growing rapidly [6][9]
- YouTube is undergoing a restructuring focused on AI applications, offering voluntary buyout options to employees considering leaving [7]
Group 3
- Google Labs introduced Pomelli, an AI marketing tool designed to help small businesses quickly create social media campaigns by extracting brand information from their websites [4]
- Synthesia, a UK-based AI video generation unicorn, completed a $200 million funding round, reaching a valuation of $4 billion and serving around 60,000 enterprises [9]
- Ant Group's AI health application AQ ranked 7th on a list of China's AI-native applications, with a compound growth rate of 83.4%, significantly outpacing the industry average of 13.5% [8]
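Post-90s Mathematician Wang Hong Wins Major Prize; Chen Tianqiao to Invest $1 Billion in Computing Power to Support Discovery-Oriented Intelligence; Pop Mart Opens First Middle East Store; OpenAI Responds to Reports of IPO Preparations | Bang Morning Briefing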
创业邦· 2025-10-31 00:08
Group 1
- The 2025 Hurun Women Entrepreneurs List was released, with Zhong Huijuan from Hansoh Pharmaceutical becoming China's richest woman for the first time, with a wealth of 141 billion yuan [1]
- Young mathematician Wang Hong from Guangxi won the 2025 Salem Prize, which is considered a precursor to the Fields Medal, and was also awarded at the World Chinese Mathematicians Conference [1]
- OpenAI is reportedly preparing for an IPO, with a potential valuation of up to $1 trillion, which could be one of the largest IPOs in history [2]
Group 2
- Li Cao from Leap Motor clarified that the company focuses on self-research of core technologies and respects Huawei as a benchmark for China's technological independence [2]
- Xiaomi's "Giant Energy Saving" series was clarified by executives as a product line name rather than a performance metric, with energy efficiency exceeding national standards [4]
- JD.com launched a promotional campaign offering free food delivery as part of its 11.11 shopping festival, with a total of 1 million free orders available [6]
Group 3
- JD.com founder Liu Qiangdong treated 150,000 full-time delivery riders to KFC as a reward for their hard work during the 11.11 sales event [8]
- Chen Tianqiao announced a $1 billion investment in computing power to support innovative AI research, emphasizing the importance of discovery in AI [8]
- Giant Network responded to the departure of its former CEO, stating that the company is focused on reducing internal conflicts and improving decision-making efficiency [10]
Group 4
- Didi announced a freight payment guarantee, committing to fully cover drivers' unpaid earnings if not received within seven days after order completion [10]
- Pop Mart opened its first store in the Middle East, which operates 24 hours a day, marking a significant expansion for the brand [10]
- Taobao is set to launch a "Taobao Convenience Store" project, offering a wide range of products online with a focus on quality and service standards [13]
Group 5
- The skincare brand "LAN" responded to consumer concerns about compliance with regulatory standards, stating that their product registrations are valid [13]
- Apple CEO Tim Cook avoided questions regarding iPhone Air production cuts during a recent earnings call, maintaining the company's policy of not disclosing specific model sales [13]
- The NBA approved Mark Walter as the new owner of the Los Angeles Lakers, with a total valuation of $10 billion for the team [14]
Group 6
- Ford announced an additional investment of $170 million in Argentina for the production of hybrid Ranger vehicles, set to begin in 2027 [14]
- Wikipedia subtly criticized Elon Musk's AI-driven encyclopedia GrokiPedia, emphasizing its human-operated nature in a fundraising announcement [14]
- Tesla is recalling 6,197 Cybertruck vehicles in the U.S. due to potential issues with the installation of off-road light bars [17]
Group 7
- YouTube is undergoing a restructuring focused on AI applications, offering voluntary buyout options to employees considering leaving the company [17]
- Volkswagen reported a net loss of €1.072 billion in Q3 2025, with a significant decline in profits attributed to increased electric vehicle production and additional costs [18]
- Nvidia plans to invest up to $1 billion in AI startup Poolside, potentially increasing its valuation significantly [18]
Group 8
- Intel is in preliminary talks to acquire AI chip startup SambaNova Systems, with potential valuation lower than its previous funding round [18]
- Shunwei Capital led a multi-million yuan angel round investment in Zhefei Aviation Technology, indicating continued interest in the aviation sector [18]
- Pyromind Dynamics completed a $10 million seed round financing to expand its team and product development in the reinforcement learning sector [18]