机器之心

Three Months, From Zero: Hand-Building a TPU That Can Both Infer and Train, and It's Open Source
机器之心· 2025-08-24 04:02
Core Viewpoint
- Recent advances in large-model technology have renewed interest in AI-specific chips, particularly Google's TPU, which has evolved significantly since its deployment in 2015 and is now in its 7th generation [1][9].

Group 1: TPU Overview
- The TPU is a specialized chip designed by Google to accelerate machine-learning inference and training, focusing on executing mathematical operations efficiently [9].
- Its architecture performs matrix multiplication efficiently, which constitutes a significant portion of the computation in deep-learning models [14][31].

Group 2: TinyTPU Project
- The TinyTPU project was initiated by engineers from Western University in Canada to create an open-source ML inference and training chip, motivated by the lack of a complete open-source codebase for such accelerators [5][7].
- The project emphasizes a hands-on approach to learning hardware design and deep-learning principles, avoiding reliance on AI tools for coding [6].

Group 3: Hardware Design Insights
- The team adopted a design philosophy of exploring unconventional ideas before consulting external resources, leading them to re-invent many of the key mechanisms used in the TPU [6].
- The hardware design process involves understanding clock cycles, describing hardware in Verilog, and implementing a systolic-array architecture for efficient matrix multiplication [10][12][26].

Group 4: Training and Inference Mechanisms
- The TinyTPU architecture supports continuous inference via a double-buffering mechanism, loading new weights while the current computation is still running [61][64].
- Training reuses the same architecture as inference, with additional modules for gradient calculation and weight updates, enabling efficient neural-network training [71][118].

Group 5: Control and Instruction Set
- The control unit of TinyTPU employs a custom instruction set architecture (ISA) to manage control signals and data flow, improving the efficiency of operations [68][117].
- The ISA has grown to 94 bits, ensuring that all necessary control flags and data fields are accounted for without compromising performance [117].
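The systolic-array matrix multiplication mentioned above can be illustrated with a small simulation. This is a hedged sketch of the general output-stationary scheme, written in Python rather than Verilog, and not TinyTPU's actual design: each cell accumulates one output element, and operands arrive in skewed order so that a[i, k] and b[k, j] meet at cell (i, j) on the right clock cycle.

```python
import numpy as np

def systolic_matmul(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Simulate an output-stationary systolic array computing a @ b.

    Cell (i, j) owns accumulator acc[i, j]; the skew t - i - j models
    the staggered (wavefront) arrival of operands across the grid.
    """
    n, k = a.shape
    k2, m = b.shape
    assert k == k2, "inner dimensions must match"
    acc = np.zeros((n, m), dtype=a.dtype)
    # Enough cycles for the most-delayed operands to reach cell (n-1, m-1).
    for t in range(n + m + k - 2):
        for i in range(n):
            for j in range(m):
                step = t - i - j  # which k-index reaches cell (i, j) this cycle
                if 0 <= step < k:
                    acc[i, j] += a[i, step] * b[step, j]
    return acc
```

In hardware each cell does one multiply-accumulate per cycle in parallel; the triple loop here only serializes that behavior for clarity.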
Video Generation vs. Spatial Representation: Which Path Should World Models Take?
机器之心· 2025-08-24 01:30
机器之心PRO · Member Newsletter, Week 34. This week we break down 2 noteworthy developments in AI & Robotics:

1. Video generation vs. spatial representation: which path should world models take? Do the high-quality frames produced by video prediction really mean the model understands physics and causality? Can modeling directly in latent space avoid pixel-level noise while preserving decision-making and planning ability? Could a hybrid route become the optimal path for future world models? As generative models and latent-representation techniques mature, can AGI's "thought-experiment sandbox" truly be deployed on physical-world tasks? ...

2. Talent grab or compute race? A former Llama reasoning lead explains AI's real ceiling. What actually sets the ceiling of the AI industry: the inspiration of genius researchers, or exponentially growing compute? If compute growth slows, will the industry face a stagnation inflection point? Can high-level conceptual ideas drive model leaps without systematic experimental validation? Does the ceiling on model generalization depend on upgrading models, or on designing higher-quality new test questions? ...

The full newsletter contains 2 topic deep-dives plus 30 AI & Robotics news briefs from this week: 12 on technology, 8 domestic, and 10 international. This issue totals 20,464 characters and can be previewed free up to 9%; unlocking consumes 288 微信 ...
First-Place Solution Made Public: Purdue Wins the Code-Agent Security Competition with a 90% Attack Success Rate
机器之心· 2025-08-23 10:51
How secure is your AI coding assistant? Perhaps far more fragile than you think. Several recent studies [1-2] show that even safety-aligned large language models can inadvertently generate vulnerable code in ordinary development scenarios, planting seeds for later exploitation; in the hands of malicious users, such models can also significantly accelerate the construction and iteration of malware, lowering the barrier to attack and shortening development cycles. Many of these risks stem from subtle flaws in the model's reasoning chain, not merely from explicit problems at the input/output level.

In Amazon's security competition for code agents (the Amazon Nova AI Challenge), the Purdue University team PurCL took first place as the red team with an attack success rate above 90%, winning a $250,000 prize. During the competition, the 12-member team spent eight months and a million dollars building a full-pipeline red-teaming system based on AI cognitive modeling, which is now open for researchers in the field to use. Their research found that the key to aligning code models lies in scaling alignment techniques to complex, real-world domains and in improving the safety relevance of model reasoning.

The Amazon code-model security competition
The Amazon Nova AI Challenge is a competition focused on the security of code produced by large models. The organizers invited top research teams worldwide to submit proposals, ultimately funding 10 of 90 teams to compete; over six months, each team received $250,000 in research funding and ...
Major OpenAI Finding: GPT-4b micro Reworks Nobel-Winning Research, Improving Yamanaka-Factor Reprogramming Efficiency 50-fold
机器之心· 2025-08-23 10:51
Core Viewpoint
- The collaboration between OpenAI and Retro Bio aims to enhance the efficiency of stem-cell reprogramming through a new model, GPT-4b micro, which improves the reprogramming efficiency of the Yamanaka factors 50-fold compared with standard methods [2][3][26].

Group 1: Collaboration and Investment
- OpenAI announced its partnership with Retro Bio to develop GPT-4b micro, which focuses on enhancing the Yamanaka factors for stem-cell reprogramming [2].
- Sam Altman personally invested $180 million in Retro Bio prior to this collaboration [3].

Group 2: Technological Advancements
- GPT-4b micro shares a similar architecture with GPT-4o but uses a novel training method and a custom biological dataset, allowing scientists to redesign proteins to their needs [9].
- The model handles a context length of up to 64,000 tokens, a first for protein-sequence models, and exhibits scaling laws similar to language models, indicating predictable improvement with larger datasets [12].

Group 3: Research Findings
- The Retro team built a wet-lab screening platform using human fibroblasts, where GPT-4b micro proposed diverse "RetroSOX" sequences that outperformed wild-type SOX2 in expressing pluripotency markers [14][15].
- For KLF4, the model generated enhanced RetroKLF variants with a hit rate close to 50%, far higher than traditional methods [18].
- Combining the best RetroSOX and RetroKLF variants produced notable increases in early and late pluripotency markers, with late markers appearing days earlier than with standard OSKM combinations [20].

Group 4: Clinical Potential and Validation
- Over 30% of cells began expressing key pluripotency markers within 7 days using mRNA delivery, and over 85% activated endogenous expression of critical stem-cell markers by day 12 [24].
- The engineered variants showed robust genomic stability and the ability to differentiate into all three germ layers, supporting their potential for cell-therapy applications [24].

Group 5: Future Outlook
- OpenAI's work illustrates that specialized models can yield rapid breakthroughs in scientific research, potentially solving in days problems that previously took years [32].
"Only Participating, Not Competing," Yet Second Only to Unitree in Medal Count: How Did This Behind-the-Scenes Player Do It?
机器之心· 2025-08-23 10:51
Core Viewpoint
- The article highlights the emergence of "Accelerated Evolution" as a significant player in the humanoid-robotics industry, showcasing its innovative approach and competitive edge against established companies such as Yushu Technology [1][12][30].

Group 1: Competition and Achievements
- The 2025 World Humanoid Robot Games showcased a mix of humor and advanced technology, with robots competing in various events, including soccer [1].
- Yushu Technology's robots, G1 and H1, won the most medals, while the startup Accelerated Evolution and its T1 robot secured the third-most medals [3][4].
- In the pure-AI soccer event, teams predominantly used the T1 robot, indicating its dominance and a shift toward a standardized competition platform [5][6].

Group 2: Technological Philosophy
- Accelerated Evolution pursues a "product + ecosystem" strategy, emphasizing a robust developer ecosystem rather than competing on product features alone [12][25].
- The company prioritizes strengthening robots' physical capabilities before integrating advanced AI, in contrast to the industry trend of rapidly deploying AI models [17][18].
- Choosing soccer as a testing ground develops practical skills applicable to real-world scenarios, such as dynamic balance and autonomous decision-making [24].

Group 3: Market Position and Strategy
- Accelerated Evolution aims to become a platform company in humanoid robotics, akin to Apple's ecosystem, by providing a comprehensive platform for developers [25][29].
- The company has built significant production capacity, delivering hundreds of robots within a year while maintaining a strong market presence [35][36].
- Its team combines experts from top institutions and tech companies, pairing hardware expertise with software-development skills for a competitive advantage [36][37].

Group 4: Future Outlook
- China's humanoid-robotics market is projected to grow significantly, with estimates reaching 6 trillion yuan by 2050, indicating vast potential for companies like Accelerated Evolution [37][38].
- The company is positioned to address market segments including education, research, and home assistance, aiming for a comprehensive approach to robotics [38][41].
- The field's ongoing evolution suggests Accelerated Evolution is not just competing for medals but shaping the future of personal robotics [44].
Chain-of-Agents: OPPO's New Paradigm for General Agent Models, SOTA on Multiple Leaderboards, with Model, Code, and Data Fully Open-Sourced
机器之心· 2025-08-23 04:42
To address the bottlenecks above, this paper proposes a new agentic reasoning paradigm: Chain-of-Agents (CoA). Unlike traditional TIR models, which support only a single agent's "think-act-observe" pattern, the CoA framework can flexibly define agents with multiple roles and tools and dynamically activate them within a single model, achieving end-to-end multi-agent collaboration.

The corresponding author, 周王春澍, heads OPPO's Personalized AI Lab; his main research directions are AI personalization, autonomous agent evolution and reinforcement learning, and memory systems for large models and agents. The paper's core contributors all come from the AI agent team of OPPO's Personalized AI Lab.

In recent years, research exemplified by multi-agent systems (MAS) has made notable progress, demonstrating strong capabilities on complex problem-solving tasks such as deep research and coding assistance. Existing multi-agent frameworks complete complex tasks through the collaboration of agents with clearly defined roles and diverse tools, showing clear advantages. However, current MAS still face some key limitations:

Meanwhile, the recently emerging tool-integrated reasoning (TIR) models, by explicitly weaving tool use into the reasoning process, have significantly improved the performance of single-agent frameworks (such as ReAct) on information-retrieval tasks. Traditional TIR models, however, cannot directly support native training and collaboration for multi-agent systems.

High computational overhead: frequent, redundant communication between agents and complex workflow designs lead to inefficiency.
Limited generalization ...
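The single-agent "think-act-observe" pattern that TIR models build on can be sketched in a few lines of Python. This is a hedged illustration of the generic ReAct-style loop, not OPPO's CoA implementation; `llm` and `tools` are stand-in interfaces invented for the example:

```python
def react_loop(task, llm, tools, max_steps=5):
    """Minimal single-agent think-act-observe loop (ReAct-style sketch).

    `llm` maps the transcript so far to either ("act", tool_name, arg)
    or ("finish", answer); `tools` maps tool names to callables.
    """
    transcript = [("task", task)]
    for _ in range(max_steps):
        decision = llm(transcript)          # think
        if decision[0] == "finish":
            return decision[1]
        _, tool_name, arg = decision
        observation = tools[tool_name](arg)  # act
        transcript.append(("act", tool_name, arg))
        transcript.append(("obs", observation))  # observe
    return None  # step budget exhausted
```

CoA's departure from this pattern, per the summary above, is that a single model dynamically activates multiple role- and tool-specific agents inside one end-to-end loop rather than running one fixed agent.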
Coinbase Forces All Employees to Adopt AI Tools, Firing Those Who Refuse
机器之心· 2025-08-23 04:42
Core Viewpoint
- The article discusses Coinbase's controversial decision to fire engineers who refused to adopt AI programming tools, reflecting the company's stance that AI is essential to its operations [5][11].

Group 1: AI Adoption in Programming
- AI use in programming has become standard among developers, with Google claiming that 50% of its code is AI-generated [2].
- A growing community of developers, known as Vibe Coders, relies entirely on AI for coding, while some programmers still prefer traditional methods [4].

Group 2: Coinbase's Decision
- Coinbase CEO Brian Armstrong announced the firing of engineers who did not use AI programming tools, noting that the company had purchased enterprise licenses for GitHub Copilot and Cursor [6].
- Armstrong was shocked by engineers' slow adoption rate and imposed a mandatory trial period for the tools, leading to the dismissal of those who did not comply [8][10].

Group 3: Reactions and Implications
- The decision sparked significant online discussion and mixed reactions from the tech community, including claims that the prevalence of AI programming is overestimated [13][14].
- Armstrong acknowledged that his approach was high-pressure and poorly received by some employees, but he aimed to convey that using AI is not optional [11].
Talent Grab or Compute Race? Former Llama Reasoning Lead Explains AI's Real Ceiling
机器之心· 2025-08-23 01:30
This article comes from the PRO member newsletter; follow 「机器之心PRO会员」 at the end of the article for more topic deep-dives.

In a podcast interview with Interconnects, former Llama reasoning lead Ross Taylor said bluntly that "the chaos and poaching in the current AI race are just noise; in the end it will all be drowned out by exponential compute." Starting from this judgment, he explains how the exponential compute curve sets the ceiling of the AI race, why being systematic and persistent in experimentation matters more than simply hunting for geniuses, and why high-quality evaluation is what truly pushes the frontier of model capability.

Contents
01. Frontier labs flip directions daily; why hasn't progress slowed?
What actually sets the AI industry's ceiling: the inspiration of genius researchers, or exponentially growing compute? If compute growth slows, will the industry face a stagnation inflection point? ... Can high-level conceptual ideas drive model leaps without systematic experimental validation? ...
02. Can sky-high "transfer fees" for geniuses really buy the next model leap?
03. Does the ceiling of model generalization depend not on upgrades but on designing new test questions?

2. In Ross's view, the industry's bottleneck is not management but compute. "Letting models think longer and deeper" translates, in engineering terms, almost directly into "scaling up GPU/TPU capacity."
3. As an example, he cited the recent IMO (International Mathematical Olympiad) ...
KDD 2025 Best Paper Runner-Up | EI-BERT: An Ultra-Compact Language-Model Compression Framework
机器之心· 2025-08-22 07:55
The paper's first author, 王茂林, is a PhD student at City University of Hong Kong, advised by Professor 赵翔宇. Collaborators include 储俊, 臧晓玲, 赵耀, 谢锶聪, and 钟文亮 of Ant Group. The paper won the 2025 KDD ADS Track Best Paper Award Runner-Up.

Background and motivation
In the era of mobile computing, deploying efficient natural-language-processing models on resource-constrained edge devices poses major challenges. These scenarios typically demand strict privacy compliance, real-time responsiveness, and multi-task capability. Existing BERT compression techniques only reach 15-20MB, far short of the strict 4MB memory budget of mobile devices. In financial applications in particular, on-device AI processing is critical for protecting user privacy while also ensuring real-time responses of roughly 300 milliseconds. This gap underscores the urgent need for an extreme-compression framework.

Method: a multi-stage extreme-compression framework
The EI-BERT framework achieves extreme compression through three key steps: hard token pruning, which intelligently selects important vocabulary to sharply reduce storage; cross-distillation, which ensures efficient knowledge transfer beyond the limits of traditional methods; and modular quantization, which applies INT8 quantization to further optimize storage. The cross-distillation method innovatively puts the teacher model "in the student model's shoes," achieving precise knowledge transfer through parameter integration and a dynamic teacher-student adaptation mechanism. This method ...
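The INT8 quantization step can be illustrated with a minimal sketch. This assumes symmetric per-tensor quantization, a common scheme chosen here for illustration, not necessarily EI-BERT's exact modular-quantization recipe: each float32 weight is stored as one signed byte plus a shared scale, a roughly 4x storage reduction that helps push a model toward a budget like the 4MB cited above.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor INT8 quantization: w ~= scale * q."""
    max_abs = np.abs(w).max()
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float32 tensor from INT8 codes."""
    return q.astype(np.float32) * scale
```

Per-channel scales and calibration on real activations typically reduce the quantization error further; the per-tensor version above only shows the core idea.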
The World's First AI-Native Game Engine Evolves Again: If GTA 6 Won't Arrive, We'll AI One Up
机器之心· 2025-08-22 07:55
Core Viewpoint
- The article discusses the delay of GTA 6 and the advancement of AI-driven game engines, focusing on the evolution of the Mirage engine from version 1 to version 2, which aims to create interactive worlds similar to GTA [1][22].

Group 1: Mirage Game Engine Development
- Mirage 1 was the first real-time, world-model-driven, AI-native UGC game engine, but its scene generation was limited [3][4].
- Mirage 2 shipped just over a month after Mirage 1, with significant improvements in flexibility, intelligence, and performance [5][8].
- The new version lets users create, experience, and modify any game world instantly, supporting image uploads and real-time dialogue for world modification [8][17].

Group 2: Performance Enhancements
- Mirage 2 makes notable gains in generation quality, improving object proportions, scene understanding, and overall scene precision [19][21].
- Interaction latency has been cut to 200 milliseconds, enabling smoother gameplay on a single consumer-grade GPU [18][20].
- The engine supports scenes in a variety of styles, enhancing user experience and engagement [10][13].

Group 3: Comparison with Competitors
- Mirage 2 is positioned against DeepMind's Genie 3, offering more interactive capabilities such as running, jumping, and attacking, with a longer interaction horizon [17][18].
- Despite these advances, Mirage 2 still struggles with visual consistency and character-control precision, particularly during rapid scene changes [21][24].

Group 4: Future Prospects
- Mirage 2's rapid, one-month development raises the question of how far AI-driven UGC game engines may advance in the nine months before GTA 6 is released [22].