Tian Yuandong's 2025 Year-End Review: On Being Laid Off and Research Directions for 2026
自动驾驶之心· 2026-01-06 00:28
I've been too busy lately, so the year-end review had to wait until after January 1; in any case, actually starting to write is a good thing. Author | Tian Yuandong @ Zhihu. Editor | 大模型之心Tech. Original link: https://zhuanlan.zhihu.com/p/1990809161458540818

On being laid off

When I was asked to join the Llama4 firefighting effort at the end of January 2025, as someone who has always worked on reinforcement learning, I first drew up a 2x2 reward matrix and worked through the following four possibilities (even though at the time, given the enormous pressure from above, refusing was all but impossible):

| | Agree to help | Refuse to help |
| --- | --- | --- |
| Llama4 succeeds | Become a hero | Get marginalized |
| Llama4 fails | Did my best for the company | Blamed for not stepping up when the company needed it |

My thinking was that if we went to help, then even if the project ultimately failed, we would at least have done our best and have a clear conscience. Unfortunately, what actually happened was a fifth possibility outside my calculations, which also left me ...
Tian Yuandong's 2025 Year-End Review: Fought Fires on Llama 4 but Was Laid Off; Now Co-Founder of a Stealth Startup
机器之心· 2026-01-04 08:05
Machine Heart (机器之心) report. Last October, layoffs in Meta's AI division swept up a large number of people, among them the well-known Chinese scientist Tian Yuandong and members of his team. In the past few days, Tian shared his 2025 year-end review. He first recounted his experience "firefighting" on the Llama 4 project, his subsequent layoff, and his future work plans; he then reviewed his main 2025 research directions, including large-model reasoning and opening up the black box of models; finally, he discussed AI-driven social change, the restructuring of productivity, and how individual value persists. What follows is Tian Yuandong's original Zhihu post.

2025 Year-End Review (Part 1)

On being laid off

When I was asked to join the Llama4 firefighting effort at the end of January 2025, as someone who has always worked on reinforcement learning, I first drew up a 2x2 reward matrix and worked through the following four possibilities (even though at the time, given the enormous pressure from above, refusing was all but impossible):

| | Agree to help | Refuse to help |
| --- | --- | --- |
| Llama4 succeeds | Become a hero | Get marginalized |
| Llama4 fails | Did my best for the company | Blamed for not stepping up when the company needed it |

My thinking was that if we went to help, then even if the project ultimately failed, we would at least have done our best and have a clear conscience. Unfortunately, what actually happened ...
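The reasoning behind the 2x2 reward matrix above is a dominance argument: both outcomes of helping are qualitatively better than both outcomes of refusing, so helping wins regardless of the project's success probability. A minimal sketch, with numeric payoff values invented purely for illustration (the original post gives only qualitative outcomes):

```python
# Hypothetical numeric payoffs for the qualitative 2x2 matrix above.
# Values are illustrative assumptions, not from the original post.
payoffs = {
    ("help", "success"): 2,    # become a hero
    ("help", "fail"): 1,       # did my best for the company
    ("refuse", "success"): -1, # get marginalized
    ("refuse", "fail"): -2,    # blamed for not stepping up
}

def expected_payoff(action: str, p_success: float) -> float:
    """Expected payoff of an action given the probability the project succeeds."""
    return (p_success * payoffs[(action, "success")]
            + (1 - p_success) * payoffs[(action, "fail")])
```

With any values ordered this way, the worst helping outcome still beats the best refusing outcome, so "help" dominates at every success probability.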
LeCun Exposes Meta's Benchmark Gaming; Tian Yuandong: I Didn't Expect This Ending
量子位· 2026-01-04 05:21
Core Viewpoint - The article discusses the fallout from the release of Meta's Llama 4, highlighting internal conflicts and the departure of key figures like LeCun and Tian Yuandong, who are now pursuing entrepreneurial ventures due to dissatisfaction with Meta's direction in AI development [1][3][22]. Group 1: Llama 4 and Internal Conflicts - Llama 4 faced significant criticism and allegations of cheating in benchmark tests, leading to a loss of confidence from Meta's leadership [1][10]. - The release of DeepSeek, a competing AI model, pressured Meta to accelerate its AI investments, resulting in internal turmoil and a shift in team dynamics [4][6]. - The communication breakdown within the team was exacerbated by differing priorities, with LeCun's team wanting to innovate while leadership preferred proven technologies [7][8]. Group 2: Departures and New Ventures - LeCun and Tian Yuandong both announced their intentions to start new companies after leaving Meta, with LeCun focusing on world models and Tian Yuandong on new AI initiatives [27][33]. - LeCun's new venture, Advanced Machine Intelligence (AMI), aims to explore advanced machine intelligence through open-source projects, while he will serve as the executive chairman [27][30]. - Tian Yuandong expressed a desire to co-found a startup, indicating a trend among former Meta employees to seek new opportunities outside the company [33]. Group 3: Future Directions in AI - LeCun's focus on the V-JEPA architecture aims to enhance AI's understanding of the physical world through video and spatial data, with expectations for significant progress within 12 months [32]. - The article emphasizes the need for AI to move beyond language limitations, as highlighted by LeCun's critique of the current focus on large language models [25][26].
CUHK-Shenzhen's Han Xiaoguang: 3DGen, a Battle for Humanity's Sense of Security | GAIR 2025
雷峰网· 2025-12-13 09:13
"Why can't building world models rely on 'alchemy' alone?" By Wu Tong; edited by Lin Juemin. At the Chinese University of Hong Kong, Shenzhen, assistant professor Han Xiaoguang's lab is named GAP, for "Generation and Analysis of Pixels, Points and Polygons." In hindsight, the name also hints at his ambition to bridge the gap between the real and virtual worlds. When Han joined the university in 2018, he was its only faculty member focused on computer graphics. In 2024, he ventured from 3D reconstruction into embodied intelligence and world models, once again entering uncharted territory. On Xiaohongshu, his account @韩晓光 carries a two-line bio: assistant professor at CUHK-Shenzhen's School of Science and Engineering; graphics and 3D vision. He treats Xiaohongshu both as a publishing platform and as a place to organize his own thinking, openly discussing professional questions such as "is explicit 3D still necessary" and "why do world models need interpretability," and recording insights from discussions with students. This direct, plain-spoken sharing has attracted readers interested in the essence of the technology, and reflects a conscious effort by young faculty like Han to break down academic boundaries. From one angle, building world models requires understanding how the real world operates, and his online interactions are themselves an ongoing, small-scale "world simulation." In Han's telling, his research evolved naturally: from 3D reconstruction to dynamic generation to building virtual environments for robots, the core has always been "the generation and understanding of 3D content." ...
NVIDIA Open-Sources Its Latest VLA: Can It Break the L4 Autonomous Driving Deadlock?
钛媒体APP (TMTPost) · 2025-12-02 13:01
Core Insights - NVIDIA has officially open-sourced its latest autonomous driving Vision-Language-Action (VLA) model, Alpamayo-R1, which can process vehicle camera images and text instructions to output driving decisions [2][3] - The Alpamayo-R1 model emphasizes "explainability," providing reasons for its decisions, which aids in safety validation and regulatory review [3][4] - The VLA model is seen as the next core technology in intelligent driving, with various companies, including Li Auto, Xpeng Motors, and Great Wall Motors, already implementing it in production [3][4] Group 1: Model Features and Benefits - Traditional end-to-end models are often "black boxes," making them difficult to interpret, especially in complex scenarios [4] - VLA introduces a language modality as an intermediary layer, enhancing the model's ability to handle complex situations and providing a more human-like decision-making process [4][5] - The Alpamayo-R1 model has shown significant performance improvements, including a 12% enhancement in trajectory planning performance and a 25% reduction in near-collision rates [5][6] Group 2: Industry Impact and Ecosystem Development - NVIDIA aims to position itself as the "Android" of the autonomous driving sector, moving beyond being just a hardware supplier [6][8] - The company has announced plans to deploy 100,000 Robotaxis starting in 2027, collaborating with firms like Uber and Mercedes to create the world's largest L4 autonomous driving fleet [7][8] - The open ecosystem proposed by NVIDIA could facilitate data sharing among companies, potentially accelerating technological advancements in the industry [8][9] Group 3: Challenges and Future Considerations - Despite the advancements, the Alpamayo-R1 model requires high-performance hardware to meet automotive-grade latency, indicating a dependency on NVIDIA's hardware solutions [10][11] - The effectiveness of VLA technology is still under evaluation, and there are concerns about the limitations 
imposed by NVIDIA's platform on developers [11][12] - The successful commercialization of L4 autonomous driving will also depend on regulatory frameworks and the ability to balance data privacy with operational safety [11][12]
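The "language modality as an intermediary layer" idea above can be made concrete with a toy sketch: the policy maps perceived scene elements plus a text instruction to both an action and a natural-language rationale. This is a hypothetical illustration of the structure only, not Alpamayo-R1's actual API; all names below are invented:

```python
from dataclasses import dataclass

@dataclass
class DrivingDecision:
    action: str     # e.g. "brake", "decelerate", "keep_lane"
    rationale: str  # the language intermediary: a human-readable reason

def vla_decide(scene_tokens: list[str], instruction: str) -> DrivingDecision:
    """Toy stand-in for a VLA policy: rule-based here, learned in practice.

    Pairing every action with a rationale is what makes the decision
    auditable for safety validation and regulatory review.
    """
    if "pedestrian" in scene_tokens:
        return DrivingDecision("brake", "Pedestrian detected ahead; braking for safety.")
    if "slow" in instruction.lower():
        return DrivingDecision("decelerate", "Instruction requests reduced speed.")
    return DrivingDecision("keep_lane", "No hazards perceived; maintaining lane.")
```

In a real VLA model, both fields would be decoded by the network; the point is that the explanation is a first-class output rather than a post-hoc guess, which is what distinguishes this design from opaque end-to-end models.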
Tencent Research Institute: Weekly Top 50 AI Keywords
腾讯研究院· 2025-11-22 02:33
Group 1: Core Insights - The article presents a weekly roundup of the top 50 keywords related to AI developments, highlighting significant trends and innovations in the industry [2][3]. Group 2: Key Categories and Developments - **Computing Power**: - "Super Node Operating System" by openEuler and "NVLink Collaboration" by Arm are notable advancements in computing infrastructure [3]. - **Models**: - Key model updates include "Grok 4.1" by xAI, "Gemini 3" and "Gemini 3 Pro Image" by Google, and "GPT-5.1 Update" by OpenAI, indicating ongoing enhancements in AI capabilities [3]. - **Applications**: - Various applications are emerging, such as "SIMA 2" by DeepMind, "EverMemOS" by Shengda, and "MedGPT" by Future Doctors, showcasing the diverse use cases of AI technology [3][4]. - **Technology**: - "Space Supercomputing" by Zhongke Tiansuan represents advancements in computational technology for space applications [4]. - **Perspectives**: - Insights from industry leaders include discussions on AI interpretability by OpenAI, future outlooks on Grok by xAI, and the real bottlenecks in AI as highlighted by Andrew Ng [4]. - **Capital**: - Significant investments are noted, such as Bezos's focus on physical AI startups and Microsoft's investment in Anthropic, indicating strong financial backing for AI innovation [4]. - **Events**: - A global outage event by Cloudflare and the entrepreneurial departure of Yann LeCun are significant occurrences impacting the AI landscape [4].
AI Morning Briefing | Scammers Use AI to Abuse "Refund-Only" Policies; Huawei to Unveil Breakthrough AI Technology
观察者网 (Guancha) · 2025-11-17 02:02
Group 1: Leadership Changes at Apple - Tim Cook may step down as CEO of Apple as early as next year after 14 years in the role, with John Ternus, the current Senior Vice President of Hardware Engineering, being the likely successor [1] - Ternus has been with Apple since 2001 and has played a significant role in the engineering design of major hardware products [1] - Apple typically announces major personnel changes after its January earnings report, allowing new management to acclimate before key events like WWDC and the iPhone launch [1] Group 2: E-commerce Fraud Trends - A new type of fraud involving AI-generated fake images for "refund only" claims is emerging in the e-commerce sector, with consumers using AI tools to create images of defective products [2] - Some individuals are reportedly learning this fraudulent technique for a fee, claiming they can successfully obtain refunds multiple times [2] Group 3: Semiconductor Supply Chain Issues - Several smartphone manufacturers, including Xiaomi, OPPO, and vivo, have paused their procurement of storage chips due to soaring prices, with some companies having less than two months of inventory [3] - The price of DRAM chips has increased by nearly 50%, leading manufacturers to hesitate in accepting these quotes [3] - The demand for storage chips has surged due to the AI model wave, with data centers willing to pay over 30% more than smartphone manufacturers for the same products [3] Group 4: Huawei's AI Technology Announcement - Huawei is set to unveil a breakthrough AI technology on November 21, aimed at improving the utilization efficiency of computing resources from an industry average of 30%-40% to 70% [4] - This technology will unify resource management across various computing hardware, enhancing support for AI training and inference [4] - The upcoming technology shares similarities with the core technology of Israeli AI startup Run:ai, which was acquired by NVIDIA for $700 million [4] Group 5: AI-Driven 
Scientific Discovery - A team from Peking University has developed the AI-Newton system, which can rediscover fundamental physical laws without prior knowledge [5] - The system identifies an average of 90 physical concepts and 50 general laws in test cases, showcasing its potential for autonomous scientific discovery [5] Group 6: OpenAI's Research on Model Interpretability - OpenAI has released new research on model interpretability, proposing a sparse model with fewer connections but more neurons to enhance understanding of internal mechanisms [6] - The research identifies the "minimal loop" for specific tasks, suggesting that larger sparse models can generate more powerful yet simpler models [6]
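The sparse-model idea in Group 6 — fewer connections, making internal circuits easier to trace — can be illustrated with a crude magnitude-pruning sketch. This is a toy stand-in, not OpenAI's actual training method, which trains sparsity in from the start rather than pruning after the fact:

```python
import numpy as np

def sparsify(weights: np.ndarray, keep_fraction: float) -> np.ndarray:
    """Zero out all but the largest-magnitude weights.

    The surviving connections form a much smaller graph, which is the
    kind of structure that makes a "minimal circuit" for a task
    feasible to inspect by hand.
    """
    flat = np.abs(weights).ravel()
    k = max(1, int(keep_fraction * flat.size))
    # k-th largest magnitude becomes the pruning threshold
    threshold = np.partition(flat, -k)[-k]
    mask = np.abs(weights) >= threshold
    return weights * mask
```

Post-hoc pruning like this usually degrades quality more than training a sparse model directly, which is why the research cited above treats sparsity as a design constraint rather than a compression step.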
Silicon Valley VCs Are Collectively Betting on a Batch of "Rebel" AI Labs: $2.5 Billion in One Month; AI Research Needs a New Paradigm Outside the Giants
犀牛财经 (Xiniu Caijing) · 2025-11-13 07:43
Core Insights - A new wave of investment is emerging in "AI laboratories," referred to as neolabs, which aim to redefine AI research paradigms rather than replicate the paths of giants like OpenAI and Anthropic [1] - Five neolab startups have raised or negotiated up to $2.5 billion in funding within the past month, indicating a significant shift in capital allocation towards fundamental research [1] - The giants' dominance has created a paradox where their scale and processes hinder rapid experimentation, presenting an opportunity for smaller, agile teams to explore innovative theories [1] Neolab Startups - Isara, founded by former OpenAI researcher Eddie Zhang, is developing a software system for thousands of AI agents to collaborate on complex tasks, with a target valuation of $1 billion [2] - Humans&, founded by ex-xAI researcher Eric Zelikman, aims to create emotionally intelligent AI and is in discussions for $1 billion funding at a $4 billion valuation [3] - Periodic Labs, founded by a former OpenAI research head, focuses on automating scientific research, while Reflection AI, founded by ex-DeepMind researchers, challenges the closed-source model of giants [6] Investment Trends - Investors are drawn to neolabs not only out of curiosity but also because they offer a "safer risk" profile, with the potential for a "half-exit" by selling to giants like Amazon or Microsoft [5] - The trend indicates a shift from a competition of singular capabilities to a focus on multi-agent collaboration, long-term learning, and explainability in AI research [6] Challenges Ahead - The high cost of computing resources remains a significant challenge for neolabs, as giants dominate the high-end GPU supply chain [7] - There is a lack of mature evaluation systems for long-term tasks and agent collaboration quality, complicating the assessment of these new AI systems [7] - Neolabs must establish viable business models that connect foundational research to industry applications, 
ensuring a closed loop of "research-product-revenue" to avoid becoming mere incubators for larger companies [7]
Explainability Challenges in Commercial Banks' Use of Large Language Models | Finance and Technology
清华金融评论· 2025-09-07 10:13
Core Viewpoint - The integration of large language models (LLMs) into the banking sector is driving digital transformation, but the inherent opacity of these models presents significant challenges in explainability, necessitating the establishment of a transparent and trustworthy AI application framework to ensure safe and compliant operations [3][4]. Regulatory Constraints on Explainability - Financial regulatory bodies are increasingly emphasizing the need for transparency in AI models, requiring banks to disclose decision-making processes to meet compliance standards and protect consumer rights, which serves as a primary external constraint on LLM applications [6]. - In scenarios like credit approval that directly affect customer rights, algorithmic decisions must provide clear justifications to ensure fairness and accountability. Regulations such as the EU's General Data Protection Regulation (GDPR) mandate transparency in automated decision-making, and domestic regulators also require banks to explain reasons for credit application rejections [7]. - Global regulatory trends are converging towards the necessity for AI model explainability, with frameworks like Singapore's FEAT principles and China's guidelines emphasizing fairness, ethics, accountability, and transparency. The upcoming EU AI Act will impose strict transparency and explainability obligations on high-risk financial AI systems [8]. Technical Explainability Challenges of LLMs - The architecture and operational mechanisms of LLMs inherently limit their technical explainability, as their complex structures and vast parameter counts create a "black box" effect [10]. - The attention mechanism, once thought to provide insights into model behavior, has been shown to have weak correlations with the importance of features in model predictions, undermining its reliability as an explanation tool. 
The sheer scale of parameters complicates traditional explanation algorithms, making it difficult to analyze high-dimensional models effectively [11]. - The phenomenon of "hallucination," where LLMs generate plausible but factually incorrect content, exacerbates the challenge of explainability. This issue leads to outputs that cannot be traced back to reliable inputs or training data, creating significant risks in financial contexts [12].
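To see why attention weights were ever read as explanations, and why that reading can mislead: attention produces a probability distribution over inputs, which looks like a per-token importance score. A minimal sketch of scaled dot-product attention weights (illustrative only; real LLM attention is multi-head and layered, which is part of why single-weight readings correlate weakly with true feature importance):

```python
import numpy as np

def attention_weights(query: np.ndarray, keys: np.ndarray) -> np.ndarray:
    """Scaled dot-product attention weights over a set of key vectors.

    The output sums to 1, so it is tempting to read each weight as
    'how much this input mattered' -- the reading the text above
    cautions against.
    """
    scores = keys @ query / np.sqrt(query.size)
    exp = np.exp(scores - scores.max())  # stable softmax
    return exp / exp.sum()
```

The weights are a valid distribution, but a high weight on a token does not imply the prediction would change if that token were removed, which is the gap between attention maps and genuine feature-importance measures.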
Father of Google Brain Opens Up for the First Time: A Break-Room Chat Sparked a Trillion-Dollar Empire, and AI Self-Improvement Nears a Threshold
36Kr · 2025-08-25 03:35
Core Insights - Jeff Dean, a key figure in AI and the founder of Google Brain, shared his journey and insights on the evolution of neural networks and AI in a recent podcast interview [1][2][3] Group 1: Early Life and Career - Jeff Dean had an unusual childhood, moving frequently and attending 11 schools in 12 years, which shaped his adaptability [7] - His early interest in computers was sparked by a DIY computer kit purchased by his father, leading him to self-learn programming [9][11][13] - Dean's first significant encounter with AI was during his undergraduate studies, where he learned about neural networks and their suitability for parallel computing [15][17] Group 2: Contributions to AI - Dean proposed the concepts of "data parallelism/model parallelism" in the 1990s, laying groundwork for future developments [8] - The inception of Google Brain was a result of a casual conversation with Andrew Ng in a Google break room, highlighting the collaborative nature of innovation [22][25] - Google Brain's early achievements included training large neural networks using distributed systems, which involved 2,000 computers and 16,000 cores [26] Group 3: Breakthroughs in Neural Networks - The "average cat" image created by Google Brain marked a significant milestone, showcasing the capabilities of unsupervised learning [30] - Google Brain achieved a 60% relative error rate reduction on the Imagenet dataset and a 30% error rate reduction in speech systems, demonstrating the effectiveness of their models [30] - The development of attention mechanisms and models like word2vec and sequence-to-sequence significantly advanced natural language processing [32][34][40] Group 4: Future of AI - Dean emphasized the importance of explainability in AI, suggesting that future models could directly answer questions about their decisions [43][44] - He noted that while LLMs (Large Language Models) have surpassed average human performance in many tasks, there are still areas where they have 
not reached expert levels [47] - Dean's future plans involve creating more powerful and cost-effective models to serve billions, indicating ongoing innovation in AI technology [50]