DeepMind
More and more students have been asking about world model positions lately...
自动驾驶之心· 2026-01-22 00:51
Core Viewpoint
- The article emphasizes the growing demand for autonomous driving positions, particularly in world models, end-to-end systems, and VLA, and highlights the importance of practical experience and advanced knowledge in these domains [2][4].
Course Overview
- The course on world models in autonomous driving is being launched in collaboration with industry experts, focusing on various algorithms and applications, including Tesla's world model and the Marble project by Fei-Fei Li's team [2][4].
- The course aims to provide a comprehensive understanding of world models, covering their development history, current applications, and different approaches such as pure simulation, simulation + planning, and generative sensor input [7].
Course Structure
- **Chapter 1: Introduction to World Models** This chapter reviews the relationship between world models and end-to-end autonomous driving, discussing the evolution and current applications of world models, as well as various streams within the field [7].
- **Chapter 2: Background Knowledge of World Models** This chapter covers foundational knowledge related to world models, including scene representation, Transformer technology, and BEV perception, which are crucial for understanding subsequent chapters [8][12].
- **Chapter 3: General World Model Exploration** This chapter focuses on popular models such as Marble and Genie 3, and on the latest discussions around VLA + world model algorithms, providing insights into their core technologies and design philosophies [9].
- **Chapter 4: Video Generation-Based World Models** This chapter delves into video generation algorithms, starting with notable works like GAIA-1 & GAIA-2 and extending to recent advancements, ensuring a balance between classic and cutting-edge research [10].
- **Chapter 5: OCC-Based World Models** This chapter concentrates on OCC generation methods, discussing three major papers and a practical project, and highlighting their applicability in trajectory planning and end-to-end systems [11].
- **Chapter 6: World Model Job Specialization** This chapter shares practical insights from the instructor's experience, addressing industry applications, pain points, and interview preparation for related positions [12].
Learning Outcomes
- The course is designed to elevate participants to a level equivalent to one year of experience as a world model algorithm engineer, covering key technologies and enabling practical application in projects [15].
Tencent Research Institute AI Express 20260122
腾讯研究院· 2026-01-21 16:01
Group 1
- DeepSeek's Model 1 has been discovered in the FlashMLA codebase, potentially indicating an upcoming release, featuring a 512-dimensional architecture and support for NVIDIA's Blackwell architecture [1]
- Liquid AI has launched the open-source inference model LFM2.5-1.2B-Thinking, which operates on a liquid neural network architecture and requires only 900MB of memory on mobile devices, achieving a score of 88 on MATH-500 [2]
- An xAI engineer revealed that AI is being tested as a "colleague" in the MacroHard project, working at eight times human speed, and that the company is considering utilizing idle computing power from approximately 4 million Tesla vehicles in North America [3]
Group 2
- Research indicates that models like DeepSeek-R1 can spontaneously form multi-role debate mechanisms, significantly improving accuracy through internal social dialogue [4][5]
- Medical SAM3, a new model developed by the University of Central Florida, allows for expert-level segmentation in medical imaging using only text prompts, raising average accuracy from 11.9% to 73.9% across 33 datasets [6]
- Anthropic's CEO predicts that AI will fully take over software engineering roles within 6-12 months, with a significant portion of entry-level jobs expected to disappear in the next 1-5 years [7]
Group 3
- The Sequoia xbench team reported that top agents can handle over 60% of 104 daily tasks, indicating that foundational agent capabilities have become commoditized [8]
- OpenAI's CFO discussed the maturation of multi-agent systems by 2026, emphasizing that AI bubbles should be measured by API call volumes rather than stock prices, with productivity increases of 27-33% for cutting-edge companies [9]
Teaming up with Johnson & Johnson! AI drug discovery company founded by a Nobel laureate announces a major collaboration!
Xin Lang Cai Jing· 2026-01-21 12:42
Core Insights
- Isomorphic Labs, an AI pharmaceutical company spun off from DeepMind, has announced a collaboration with Johnson & Johnson's Janssen Pharmaceuticals to combine AI design capabilities with drug development expertise [1][5]
- The partnership will focus on "cross-modal, multi-target research collaboration," with Isomorphic responsible for computer predictions and design, while the U.S. team will handle experimental testing and project development [1][5]
- Isomorphic has also signed significant research collaborations with Novartis and Eli Lilly, totaling over $3 billion [6]
Company Developments
- Isomorphic Labs was established at the end of 2021 and is based on the technology and team from the renowned AlphaFold2 [6]
- The company recently raised $600 million in funding led by Thrive Capital, with participation from GV and Alphabet, aimed at supporting its AI drug design engine and advancing internal projects into clinical stages [5]
- In May 2024, Isomorphic and DeepMind jointly released AlphaFold3, a tool for predicting the structure and interactions of all of life's molecules, enhancing drug development capabilities [8]
Future Outlook
- The founder of Isomorphic, Demis Hassabis, emphasized in a recent interview that AI is the ultimate tool for scientific exploration, capable of solving complex scientific challenges, with AlphaFold's success as a testament [10]
- There is optimism that AI will usher in a new golden age of scientific discovery across various fields, including materials science, physics, mathematics, and weather forecasting [10]
X @何币
何币· 2026-01-21 00:56
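The landscape of the AI race is being redefined! Through GRID, the world's largest intelligence network, Sentient is proving that open source can fully surpass the Silicon Valley giants. Strong track record: ROMA & ODS: the open-source frameworks developed by Sentient have comprehensively outperformed Perplexity, OpenAI (Search Preview), and Gemini on deep search and complex reasoning tasks. On DeepMind's benchmark, their accuracy reaches 75.3%! SERA: built specifically for cryptocurrency research, it beats the GPT series and Perplexity Finance at research in the Web3/DeFi space, making it the king of on-chain intelligence. Breakthrough technology: fingerprinting (OML). Is open-source AI hard to monetize? The "fingerprinting" research released by Sentient addresses this pain point! Creators can keep their code open while proving model ownership and monetizing at scale, genuinely funding the long-term development of the open-source ecosystem. By the numbers: the Sentient Chat waitlist has passed 2 million people! Winner of the 2025 Minsky Award for "AI Startup of the Year". Four papers accepted at NeurIPS 2025, a show of serious research strength. The future of open-source AGI is here: Sentient is running at the fastest pace with less funding and a more open stance. #Sentien ...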
Looking through the content scene of 2025 for coordinates and anchors leading to 2026 | 声东击西
声动活泼· 2026-01-20 10:04
Core Insights
- The article discusses the rapid technological advancements and the resulting societal shifts, emphasizing the generational differences in perceptions and experiences of these changes [3][4][5].
Group 1: Technological Advancements
- The pace of technological progress, particularly in AI, quantum computing, and robotics, is described as unprecedented, leading to a sense of urgency and excitement in the tech community [7][8].
- There is a belief that while technology creates new job opportunities, it also generates anxiety about job displacement, particularly among white-collar workers [9][10].
- The article highlights the need for individuals to adapt to changing skill requirements as AI reshapes job roles and workflows [10][11].
Group 2: Societal Concerns and Consumer Behavior
- The concept of "谷子经济" (Guzi Economy) is introduced, reflecting a trend where consumer behavior is influenced by emotional connections to products and characters, rather than just technological advancements [19][20].
- The phenomenon of "Labubu" is discussed as a case study of consumer excitement driven by social media, illustrating how ordinary people engage with new trends [15][16].
- The article notes that despite the rapid advancement of AI, many consumers still seek tangible, physical products that provide emotional comfort and connection [20][21].
Group 3: Generational Perspectives
- Younger generations, referred to as "digital natives," exhibit a natural acceptance of AI and are more focused on how to utilize it effectively in their lives [35][36].
- There is a notable concern among youth regarding global issues such as climate change, indicating a heightened awareness and engagement with societal challenges [42][44].
- The article suggests that maintaining curiosity and a desire to explore the world will be crucial for the younger generation as they navigate an uncertain future [49][50].
Group 4: Governance and Power Dynamics
- The article discusses the shifting power dynamics between traditional nation-states and large tech companies, which are increasingly exercising authority typically reserved for governments [24][25].
- It highlights the need for new governance mechanisms to address the challenges posed by technological advancements and the emergence of a "cloud empire" [26][27].
- The implications of AI on political processes and public discourse are examined, emphasizing the potential for technology to influence societal norms and values [26][27].
AI-Driven Drug Discovery: Principles, Applications, and Future Trends
2026-01-20 01:50
Summary of AI-Driven Drug Development Conference Call
Industry Overview
- The conference call focuses on the application of Artificial Intelligence (AI) in the pharmaceutical industry, particularly in drug discovery and development processes [1][2][3].
Core Insights and Arguments
- **AI Enhancements in Drug Development**: AI significantly improves the efficiency and success rates of drug development processes, traditionally characterized by lengthy and costly stages [2][3]. For instance, AlphaFold enhances protein structure prediction speed and accuracy, accelerating target discovery [2].
- **AI vs. Traditional Methods**: Unlike traditional Computer-Aided Drug Design (CADD), which relies on physical rules, AI-driven drug discovery (AIDD) utilizes vast datasets for direct predictions, bypassing complex physical computations [3][4].
- **Evaluation of AI Capabilities**: To assess a company's AI capabilities in drug development, it is crucial to examine the use of advanced algorithms like deep learning, the quality of data, successful case studies, and ongoing innovation [5][6].
- **Specific Applications of AI**: AI applications in pharmaceuticals include generating drug structures, gene diagnostics, and automating tasks like report writing through large models (e.g., ChatGPT) and smaller, specialized models [7][8].
Important but Overlooked Content
- **Graph Neural Networks (GNN)**: GNNs are effective for small molecule structure data but struggle with complex molecules due to increased computational demands [9][13]. The need for new encoders to represent complex small molecules is emphasized [14].
- **Multimodal Learning**: This approach integrates various data types (images, text, fingerprints) to enhance drug development efficiency, as demonstrated in KRAS target research [15].
- **Market Trends**: Current AIDD companies exhibit diverse technical characteristics, with some focusing on generative adversarial networks (GANs) and others on traditional CADD while incorporating deep learning [16]. The future of AI in pharmaceuticals is expected to involve more complex small molecule designs and stricter confidentiality to protect technological advantages [17].
- **Agent Applications**: The use of intelligent agents in workflow design is emerging, allowing for autonomous process design and execution, which can significantly enhance efficiency [20].
Future Trends
- The pharmaceutical industry is likely to see a rise in the complexity of small molecule designs, the mainstreaming of multimodal fusion technologies, and the emergence of new encoders and deep learning algorithms to meet evolving demands [17][18].
Thirteen years of groundwork, one sudden overtaking: the real story of Google AI's rise
3 6 Ke· 2026-01-19 11:25
Core Insights
- The article narrates the journey of Google in the AI sector, highlighting its comeback from setbacks to achieving significant milestones with the launch of products like Nano Banana and Gemini App, showcasing the importance of talent, time, and long-term vision in technology development [1][49][52].
Group 1: Key Events and Milestones
- In August 2025, Google's image generator Nano Banana topped the LMArena charts, leading to a surge in global user engagement, generating billions of images [3][49].
- By September 2025, the Gemini App became the most downloaded app on the Apple App Store, with monthly active users increasing from 450 million in July to 650 million by October [49].
- In November 2025, Google released the Gemini 3 model, surpassing ChatGPT in multiple benchmarks, resulting in a significant increase in stock price [3][49].
Group 2: Historical Context and Strategic Moves
- The origins of Google's AI success can be traced back to a secret auction in December 2012 at Lake Tahoe, where Google acquired DNNresearch for $44 million, marking a pivotal moment in its AI strategy [6][10][22].
- The acquisition of DeepMind in January 2014 for approximately $600 million further solidified Google's position in AI, bringing in top talent and innovative technology [24].
- The introduction of the Transformer model in June 2017 revolutionized AI, laying the groundwork for subsequent advancements in large language models [30][32].
Group 3: Challenges and Responses
- Google's cautious approach to AI, particularly in the chatbot domain, led to missed opportunities, exemplified by the delayed release of Bard, which resulted in a significant drop in stock value after a failed launch in February 2023 [35][38].
- The return of co-founder Sergey Brin to active involvement in AI development was a crucial turning point, leading to strategic talent acquisitions and the eventual merger of Google Brain and DeepMind in April 2023 [39][42].
Group 4: Technological Advancements
- The development of the TPU (Tensor Processing Unit) began in 2013, which later became a key competitive advantage for Google, enabling efficient AI model operations [28][48].
- By the end of 2025, Google had developed the Ironwood chip, achieving a performance of 4,614 TFLOPs per chip, significantly enhancing its computational capabilities [47][48].
Group 5: Themes and Conclusions
- The overarching themes of talent and time are emphasized throughout Google's journey, illustrating that strategic investments in human capital and patience in technology development can lead to eventual success [53][55].
- The article concludes that despite challenges, Google's ability to adapt and innovate demonstrates that even major tech companies can recover and thrive in competitive landscapes [57][58].
L4 Data Closed Loop | Model × Data: Data Infrastructure for the Physical AI Era
自动驾驶之心· 2026-01-19 09:04
Core Viewpoint
- The article emphasizes that in the pursuit of general physical intelligence, the model serves as the ceiling while the data infrastructure acts as the floor, highlighting the importance of both elements working in tandem as a competitive barrier [1].
Group 1: Shift in Talent Demand
- There has been a noticeable shift in the autonomous driving and AI sectors, with a growing emphasis on recruiting talent for "data infrastructure" [2].
- Leading companies like Tesla and Wayve are now focusing on extracting data from large-scale fleets rather than relying solely on manually written rules [3].
- The consensus is that while model algorithms are becoming rapidly replaceable, the foundational infrastructure for data extraction and defining quality remains a significant competitive advantage [5].
Group 2: Evolution of Physical AI
- The article outlines three evolutionary stages of "Physical AI" using references from popular anime, illustrating the progression from early simulation to advanced world models [7].
- The first stage involves basic simulation and remote teaching, while the second stage incorporates augmented reality with real-world data [10][11].
- The third stage envisions a world model that allows for accelerated training in a virtual environment, significantly enhancing AI learning capabilities [13].
Group 3: Data Infrastructure Layers
- The article describes a multi-layered approach to building a robust data infrastructure for autonomous driving, which includes metrics for physical world perception, data classification, and automated evaluation systems [16][20][22].
- The first layer focuses on creating a metric system to gauge physical world interactions, while the second layer emphasizes transforming raw data into structured, high-value information [18][20].
- The third layer involves tagging data for specific scenarios, enabling the creation of a comprehensive "question bank" for training AI models [21].
Group 4: Future of Physical AI
- The article posits that as the industry moves towards end-to-end solutions and physical AI, the foundational infrastructure becomes increasingly valuable [27].
- Unlike text-based models, physical AI requires real-world data to avoid catastrophic errors, necessitating a closed-loop system for calibration [28].
- The future development model is expected to rely on a world model as a generator and the data infrastructure as a discriminator, ensuring that AI systems are guided by real-world parameters [29][36].
DeepMind's CEO runs four sets of numbers: in this round of the AI race, where is the money actually going?
3 6 Ke· 2026-01-18 02:21
Core Insights
- The current focus in the AI sector has shifted from enhancing capabilities to maximizing profitability, as highlighted by the new CNBC podcast featuring Google DeepMind's CEO, Demis Hassabis [1][2].
Group 1: AGI Capabilities
- Hassabis emphasizes that current large models exhibit significant shortcomings, particularly in their ability to generalize and learn continuously, which he refers to as "jagged intelligences" [2][4].
- True AGI must possess the ability to independently formulate questions and hypothesize about the world, rather than merely responding to queries [3][4].
- DeepMind is transitioning its focus from large language models (LLMs) to developing AI that understands the world, as demonstrated through projects like Genie, AlphaFold, and Veo [6][9].
Group 2: Commercialization Strategies
- The commercial viability of AI models is not solely about their strength but also about their cost-effectiveness and deployment efficiency [10][11].
- DeepMind's strategy includes creating both Pro and Flash versions of models to cater to different user needs, ensuring broader accessibility [11][12].
- Hassabis advocates for integrating AI into everyday devices, moving beyond traditional web interfaces to enhance user interaction [15][16].
Group 3: Energy Challenges
- As AI capabilities expand, energy consumption becomes a critical concern, with Hassabis stating that increased intelligence will require more power [20][21].
- The industry faces a significant bottleneck in energy supply, which could hinder the practical application of AGI [22][23].
- DeepMind aims to leverage AI to address energy challenges, focusing on both generating new energy sources and improving energy efficiency [24][27].
Group 4: Competitive Landscape
- The competitive dynamics in AI have shifted, with companies needing to focus on integration and deployment rather than just technological advancements [29][30].
- DeepMind has consolidated its teams to streamline AI development and deployment, enhancing efficiency and speed in bringing products to market [33][37].
- The ability to effectively utilize energy resources will be a key determinant of success in the AI sector, as highlighted by Hassabis [36][38].
DeepMind's CEO is talking to Google's CEO "daily" as the lab steps up its competition with OpenAI
Xin Lang Cai Jing· 2026-01-16 08:01
Core Insights
- In early 2025, investors questioned Google's ability to keep pace with OpenAI in the AI race, but by the end of the year, Alphabet's stock achieved its best performance since 2009 [1]
- Google's resurgence in AI is largely attributed to DeepMind, which was acquired in 2014 for approximately £400 million [1][9]
- DeepMind's CEO, Demis Hassabis, emphasized the company's role as the "core engine" of Google's AI development and noted adjustments made to accelerate product deployment in a competitive environment [1][9]
Company Adjustments
- In 2023, Google merged its Google Brain research division with DeepMind, laying the groundwork for the success of its flagship AI assistant, Gemini [3][11]
- Key personnel changes, including the promotion of Josh Woodward to oversee Gemini-related operations, have also contributed to this shift [3][11]
- Despite being behind OpenAI after the launch of ChatGPT in November 2022, Google has made strides in product commercialization and rapid deployment of AI technologies [3][11]
Competitive Landscape
- The current market competition is described as "fierce," with many industry veterans acknowledging it as one of the most intense periods in tech history [2][10]
- Google faces competition not only from OpenAI but also from other companies like Amazon, Perplexity, and Anthropic [1][9]
Product Development
- Hassabis stated that the Gemini series models developed by DeepMind can be quickly integrated into various Google products, with a smoother deployment process observed over the past year [4][12]
- The launch of Gemini 2.5 in March 2025 and Gemini 3 in November 2025 received high praise for their performance [4][11]
Strategic Communication
- Hassabis and Google CEO Sundar Pichai communicate almost daily to discuss strategic matters and technology development, highlighting DeepMind's significance in Google's overall planning [5][13]
- This ongoing dialogue facilitates real-time adjustments to product roadmaps and long-term goals, aiming for the rapid and safe realization of artificial general intelligence [6][13]
Market Dynamics
- There is ongoing debate about whether the current AI boom represents a bubble, with significant investments flowing into AI startups, many of which have high valuations despite underdeveloped products [7][14]
- Hassabis acknowledged that while some areas of AI may exhibit bubble-like characteristics, the technology itself is poised to be transformative for humanity [8][15]
- He compared the current AI hype to the internet bubble of the late 1990s, suggesting that valuable companies will emerge from this period despite potential market corrections [8][15]
Long-term Positioning
- Hassabis expressed the need to ensure that the company is well-positioned to thrive regardless of future market conditions, whether they involve continued growth or a potential bubble burst [16]
- He believes that the integration of AI with Google's core business places the company in a favorable position to benefit from future developments in the industry [16]