World Model
Tencent Research Institute AI Express 20260311
Tencent Research Institute · 2026-03-10 16:01
Group 1
- Anthropic has introduced a multi-agent code review system for Claude Code; after deployment, the proportion of PRs receiving substantial review feedback rose from 16% to 54% [1]
- Among large PRs exceeding a thousand lines, 84% receive review comments, with an average of 7.5 issues found; fewer than 1% of review results are marked incorrect [1]
- The review system uses token-based billing, costing between $15 and $25 per review, and lets team and enterprise users customize review rules [1]
Group 2
- AMI Labs, founded by Turing Award winner Yann LeCun, has completed a $1.03 billion seed round at a valuation of $3.5 billion; former FAIR engineering director Alex LeBrun serves as CEO [2]
- The company aims to build a world model based on the JEPA architecture, focusing on high-reliability scenarios in industrial control, robotics, wearables, and healthcare [2]
- Alexey Sutskever, the proposer of the DiT architecture, has joined as Chief Scientist; the first practical application is expected to require at least a year of research [2]
Group 3
- Microsoft has launched Copilot Cowork, which integrates fully with Excel, Word, PowerPoint, and Outlook and uses the Anthropic Claude model for reasoning [3]
- Key functions include automatically organizing the weekly schedule, preparing an entire client meeting agenda from a single command, and executing a complete plan from competitive analysis to product launch [3]
- Pricing is an additional $30/month on top of the M365 enterprise version, with a new E7 package available for $99/month; the product is currently in a limited customer research preview [3]
Group 4
- Tencent's Hunyuan 3D team has open-sourced WorldCompass, the first reinforcement learning post-training framework for world models, which addresses cases where pre-trained world models fail to follow instructions [4]
- The framework features three core innovations: slice-level sampling to reduce computational complexity, interaction-following scoring based on a 3D base model, and efficient RL optimization algorithms [4]
- Interaction accuracy in composite action scenarios has improved from 20% to 55%, with better scores on the Stanford WorldScore benchmark [4]
Group 5
- Zhipu has launched AutoClaw, a one-click local installer for macOS and Windows that provides full OpenClaw capabilities and automatic integration with instant messaging tools [6]
- The tool includes the Pony-Alpha-2 model, optimized for OpenClaw scenarios, which improves task execution and integrates AutoGLM Browser-Use capabilities [6]
- It ships with over 50 mainstream skills and APIs covering content creation, office tasks, coding, marketing, and finance, and supports various model APIs [6]
Group 6
- Reports indicate that the U.S. military used Palantir's Maven system, with the Claude model embedded, during the U.S.-Iran conflict, analyzing over 150 information streams on the first day [7]
- The Maven system integrates data from satellite images, drone footage, and intercepted communications, allowing Claude to generate target suggestions and precise coordinates in real time [7]
- The military has reportedly struck over 3,000 targets; a Georgetown University study shows that a workload that previously required 2,000 personnel can now be handled by just 20 [7]
Group 7
- Figure has released an update in which its robot autonomously tidies a living room using the Helix 02 system, performing tasks such as disinfecting surfaces and organizing items [8]
- The Helix 02 system features a three-layer architecture for semantic reasoning, perception conversion, and control based on extensive human motion data [8]
- The team did not develop new algorithms or customize scenarios; the system learns new tasks simply from additional data [8]
Group 8
- The AI system OALL has launched O-DataMap, which maps experimental data from papers worldwide into a navigable two-dimensional coordinate system [9]
- The map lets users assess the heat and maturity of research fields, trace the knowledge lineage of individual studies, and evaluate research gaps based on input ideas [9]
- The map grows in real time as the AI pipeline continuously analyzes new papers, providing insights into the influence of researchers across fields [9]
Group 9
- The latest a16z global AI product Top 100 report shows ChatGPT leading with 900 million weekly users, while Claude's paid subscriptions have increased by over 200% [10]
- ChatGPT is expanding into over 85 categories, including travel and shopping, while Claude focuses on professional users with integrated financial terminals and developer infrastructure [10]
- OpenClaw has become the highest-starred project on GitHub, surpassing React and Linux, indicating a shift in the competitive landscape of AI products [10]
Group 10
- A discussion between Fields Medalist Terence Tao and OpenAI's Mark Chen highlighted that AI is transforming mathematics into a more industrialized field, with significant reductions in error rates [11]
- Tao noted that AI has become a daily research tool to which complex calculations are outsourced, and that it has already solved several long-standing mathematical problems with minimal human oversight [11]
- Chen emphasized that formal verification systems in mathematics serve as natural judges for reinforcement learning, enabling a mechanism for "infinite cheap trial and error" [11]
AI Video Generation: An AGI Precursor?
Alex Kantrowitz· 2026-02-27 19:20
You can think of a video model that can generate you 10 seconds, 20 seconds of a realistic scene. It's sort of a model of the physical world. Intuitive physics, we'd sometimes call it in physics land. And it's sort of intuitively understood how liquids and objects behave in the world. And obviously one way to exhibit understanding is to be able to generate it, at least being accurate enough to be satisfying to the human eye. Obviously, it's not com ...
China Humanoid Robot: Gala visibility likely to fuel adoption surge
2026-02-24 14:16
We maintain our global humanoid robot shipment forecasts of 51,000 units in 2026E and 76,000 units in 2027E. This represents a multi-fold increase from the estimated 15,000-20,000 units in 2025, primarily driven by dedicated-purpose commercial deployments ahead in addition to entertaining stage performances, scientific study, education and data factory demand. The market is likely to react positively to key humanoid robot supply chain stocks for a few trading days, anticipating an adoption surge in the comi ...
X @Balaji
Balaji· 2026-02-14 07:54
RT a16z (@a16z) "You can't just think your way into an advantage." Balaji Srinivasan on why AGI may be embodied: "Every drone, every self-driving car, every humanoid pulls in all that [sensory] data. They actually can have a world model." "The very Abrahamic western concept of a single AGI that will tell us what to do might have been plausible three years ago; I don't think it's really that plausible now." "AGI is gated by the physical world." @balajis on Network State Podcast ...
Fei-Fei Li's Counter-Consensus Judgment
Huxiu App · 2026-02-08 09:42
Core Insights
- The article presents a counter-consensus viewpoint from Fei-Fei Li, emphasizing that large language models alone cannot lead to Artificial General Intelligence (AGI), and that spatial intelligence is a more foundational path [4][5][6].
Group 1: AGI Route Debate
- Language is not the entirety of intelligence and is not its foundation; spatial intelligence, which has evolved over 500 million years, is crucial for AI development [5][6].
- If AI only possesses language capabilities, it will remain confined to the digital realm; true AGI requires understanding and interaction with the three-dimensional physical world [6].
Group 2: Redefining World Models
- The newly introduced spatial intelligence model, Marble, can process multimodal inputs and create a navigable, interactive 3D world with physical consistency, differing from traditional video models [7][8].
- Marble has applications in various fields, including game development, visual effects, and even therapeutic settings for conditions like OCD [8].
Group 3: Scaling Law and Data Challenges
- The slower development of physical world AI compared to language models is attributed to the noise in physical data and the difficulty in large-scale data acquisition [8][9].
- World Labs employs a hybrid data strategy, combining existing internet data with synthetic and real-world data to overcome these challenges [8][9].
Group 4: General Robotics vs. Autonomous Driving
- General robotics is viewed as a higher-dimensional challenge compared to autonomous driving, which operates primarily in a 2D space [10][11].
- The core task of general robots involves interaction in 3D space, which presents significant technical challenges [10][11].
Group 5: AI as a Fundamental Infrastructure
- AI is likened to electricity, with its success not measured by model size but by its ability to empower civilization and improve individual lives [11][12].
- The goal of World Labs is to integrate spatial intelligence into various industries, aiming for significant advancements by 2026 [12].
Google World Model AI Accelerates Waymo Robotaxi Expansion
PYMNTS.com· 2026-02-06 23:32
Core Insights
- Waymo is enhancing its self-driving technology through the development of the Waymo World Model, which is based on Google DeepMind's Genie 3, aimed at improving real-world service scalability [1][2]
Group 1: Waymo World Model
- The Waymo World Model utilizes Genie 3's extensive world knowledge to simulate various scenarios, including extreme weather and safety-critical events [3][4]
- This model allows engineers to modify simulations using simple language prompts and driving inputs, enhancing the controllability of the simulations [3][4]
Group 2: Impact of Genie 3
- Genie 3 is designed to create 3D environments governed by physics, enabling AI agents to learn through exploration of virtual worlds rather than relying on static datasets [5]
- Google DeepMind launched an experimental prototype, Project Genie, which allows users to interact with world-generation features [6]
Group 3: Market Reaction and Investment
- Following the announcement of Genie 3, the video game industry experienced a significant market value loss due to concerns over AI's capability to generate video games [7]
- Waymo successfully raised $16 billion in a funding round, resulting in a post-money valuation of $126 billion, with Alphabet remaining its majority investor [7]
X @Demis Hassabis
Demis Hassabis· 2026-01-29 17:23
Thrilled to launch Project Genie, an experimental prototype of the world's most advanced world model. Create entire playable worlds to explore in real-time just from a simple text prompt - kind of mindblowing really! Available to Ultra subs in the US for now - have fun exploring! https://t.co/2XDy0V0BW0 ...
X @Herbert Ong
Herbert Ong· 2026-01-23 19:41
Well done @olivercameron! This is huge! Feels like the beginning of Star Trek's holodeck? Instantly create any video or simulated world you want. I can see it impacting so many industries. Odyssey (@odysseyml): Introducing Odyssey-2 Pro, a frontier world model that generates long-running, interactive simulations in 720p! We're also launching the first world model API, to enable devs to build magical apps. We're now in the GPT-2 era of world models. Let the explosion of apps commence! https://t.co/oth5V9cv5M ...
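Backed by Huawei Hubble, with Three Funding Rounds in Just Half a Year Since Its Founding: What Makes This Company the "Dark Horse of World Models"?
Robot Lecture Hall · 2026-01-20 09:11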
Core Viewpoint
- Manifold AI, founded by a former key member of SenseTime, aims to redefine embodied intelligence through its World Model technology, enabling robots to not only perceive but also predict physical interactions in their environment [1][4][12].
Group 1: Financing and Growth
- Manifold AI has completed over 300 million yuan in financing within just seven months of its establishment, showcasing a rapid fundraising pace that reflects strong market interest in "Physical AI" [2][7].
- The company has successfully raised funds in three rounds: a seed round led by Inno Angel Fund, followed by two angel rounds, each exceeding 100 million yuan [4][7].
- The latest funding round included notable investors such as Meihua Venture Capital, Junlian Capital, and Huawei Hubble, indicating a strong backing from the industry [1][9].
Group 2: Technology Development
- Manifold AI's technology focuses on World Model Action (WMA), which allows robots to predict physical state changes based on first-person perspective videos, moving beyond traditional visual-language models (VLM) [12][14].
- The company's WorldScape model enables robots to simulate and interact with their environment autonomously, marking a shift from mere execution of pre-set codes to possessing "brain-like" capabilities [14][15].
- Manifold AI is developing multiple specialized models, including DriveScape for autonomous driving, RoboScape for physical interaction, and AirScape for drones, all built on the foundational WorldScape model [15].
Group 3: Future Aspirations
- The company aims to equip over 10% of robots in the market with its "Manifold Brain," pushing the boundaries of Physical AI agents [19][20].
- The long-term vision includes transitioning World Models from experimental stages to practical applications in warehouses, factories, and homes within the next three years [20][21].
- The strategy emphasizes creating a universal embodied world model while simultaneously commercializing sub-domain models to generate revenue and support further development [20].
We Are Recruiting Partners in These Areas (World Models / 4D Annotation / RL)
Heart of Autonomous Driving · 2026-01-12 09:20
Core Viewpoint
- The autonomous driving industry has entered its second phase, requiring more dedicated individuals to address its challenges and pain points [2].
Group 1: Industry Direction
- The main focus areas include but are not limited to: autonomous driving product management, 4D annotation/data loop, world models, VLA, large models for autonomous driving, reinforcement learning, and end-to-end solutions [4].
Group 2: Job Description
- The positions are primarily aimed at training collaborations in autonomous driving, targeting B-end (enterprises, universities, research institutes) and C-end (students, job seekers) for course development and original article creation [5].
Group 3: Contact Information
- For discussions regarding compensation and collaboration methods, interested parties are encouraged to add the WeChat contact wenyirumo for further communication [6].