Just In: Microsoft Unveils Maia 200, Its New Generation of In-House AI Silicon
机器之心· 2026-01-27 04:00
机器之心 editorial team. We woke up to the latest progress on Microsoft's in-house AI chips: Maia 200, the next-generation AI chip Microsoft originally planned to release in 2025, finally debuted today! (Pictured: Microsoft CEO Satya Nadella.) According to Microsoft's official introduction, Maia 200 is a powerful AI inference accelerator designed to significantly improve the economics of AI token generation. Built on TSMC's 3nm process, Maia 200 features native FP8/FP4 tensor cores and a redesigned memory system with 216GB of HBM3e memory, 7TB/s of bandwidth, and 272MB of on-chip SRAM, plus data-movement engines that keep data flowing efficiently and quickly for large-scale models. These make Maia 200 the highest-performing first-party silicon in any hyperscale platform: its FP4 performance is three times that of the third-generation Amazon Trainium, and its FP8 performance exceeds that of Google's seventh-generation TPU. Maia 200 is also Microsoft's most efficient inference system to date, delivering 30% better performance per dollar than the latest-generation hardware in the company's current fleet. Maia 200 is a key component of Microsoft's heterogeneous AI infrastructure and will power multiple large models, including OpenAI's latest GPT-5.2, providing Micro ...
A Sora for Cross-Border E-Commerce: The World's First AI-Native E-Commerce Video Multi-Agent Arrives
机器之心· 2026-01-27 04:00
Editor | Youli. Your next video team doesn't have to be human. If you work in e-commerce, this moment will feel familiar: after half a month of finding a crew, polishing scripts, and shooting footage, you finally list a potential bestseller on Amazon or TikTok. Then, late one night, you scroll past a competitor's viral video on TikTok; as an industry insider, you instantly recognize a flood of traffic you want to capture. But a rough tally says otherwise: hiring models and photographers, booking a venue, waiting on editing, the whole pipeline is expensive and takes at least half a month to finish... By the time your video is done, the traffic window has closed and the would-be bestseller has become inventory. In moments like these, you've surely fantasized: if only there were a tool that could skip the entire shooting process and directly generate a video that converts. You may have thought of Sora. When Sora first launched, the whole industry celebrated, believing that moment had finally arrived. Then reality hit hard: Sora understands the physical world, understands light and shadow, and can generate stunning footage, but it doesn't understand business. It doesn't know what a "click-through rate" is, let alone a "selling point." And at a cost of several dollars per second, mass production remained a luxury. But it's 2026 now, and amid the technology sprint, what Sora couldn't do, a Chinese product called inSai Hilight, released by 营赛 AI, has done. No shooting footage required, no complex prompts; just "toss" in a ...
Where Do Large Models Go Wrong, and How Do You Fix Them? This Interpretability Survey Explains It All
机器之心· 2026-01-27 04:00
Over the past few years, mechanistic interpretability (MI) has let researchers trace how information flows and representations form inside the Transformer "black box": from individual neurons to attention heads to cross-layer circuits. But in many settings, what researchers really care about is not just "why did the model answer this way," but also "can it be made more stable, more accurate, more efficient, and safer." Against this backdrop, a research team from the University of Hong Kong, Fudan University, LMU Munich, the University of Manchester, Tencent, and other institutions has jointly released a survey on "Actionable Mechanistic Interpretability." Through a three-stage "Locate, Steer, and Improve" paradigm, the paper systematically reviews how to turn MI from a "microscope" into a "scalpel," offering a concrete methodology for aligning large models, enhancing their capabilities, and improving their efficiency. From "microscope" to "scalpel": a paradigm shift. Although large language models (LLMs) have demonstrated strong capabilities across many tasks in recent years, their internal workings remain largely opaque and are often treated as a "black box." Around understanding this black box, mechanistic interpretability (Mechanistic Interpretability, ...
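At its core, the "Locate, Steer" part of the paradigm is arithmetic on activations. The sketch below is an illustrative assumption, not the survey's actual method: it uses the common contrastive-means heuristic to locate a candidate feature direction, then shifts a hidden state along it. All vectors, function names, and the scaling factor are invented for illustration.

```python
# Minimal, framework-free sketch of "Locate, Steer" (illustrative only).

def mean(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def locate_direction(pos_acts, neg_acts):
    """Locate: candidate feature direction as the difference of mean
    activations on contrastive prompt sets (a common MI heuristic)."""
    mp, mn = mean(pos_acts), mean(neg_acts)
    return [p - q for p, q in zip(mp, mn)]

def steer(hidden, direction, alpha=1.0):
    """Steer: shift a hidden state along the located direction."""
    return [h + alpha * d for h, d in zip(hidden, direction)]

# Toy 3-dimensional activations from prompts with/without some behavior.
pos = [[1.0, 0.0, 2.0], [1.2, 0.2, 1.8]]
neg = [[0.0, 0.0, 1.0], [0.2, 0.2, 1.0]]

d = locate_direction(pos, neg)       # direction separating the two sets
h = steer([0.5, 0.5, 0.5], d, 2.0)   # push a state toward the behavior
print(d, h)
```

In a real model, the "hidden state" would be a residual-stream activation at a chosen layer, and the "Improve" stage would validate that steering actually changes downstream behavior without degrading unrelated capabilities.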
Where Does DeepSeek-R1's Reasoning Intelligence Come From? New Google Research: Multiple Roles Arguing Inside the Model
机器之心· 2026-01-26 04:08
Core Insights
- The article discusses the significant leap in reasoning capabilities of large models over the past two years, highlighting the advancements made by models like OpenAI's o series, DeepSeek-R1, and QwQ-32B in complex tasks such as mathematics and logic [1][2]
- It emphasizes that the improvement in reasoning ability is not merely due to increased computational steps but rather stems from a complex, multi-agent-like interaction structure termed "society of thought," where models simulate internal dialogues among different roles to arrive at correct answers [2][3]

Group 1: Reasoning Mechanisms
- The research indicates that reasoning models exhibit higher diversity of perspectives compared to baseline models, activating a broader range of features related to personality and expertise during reasoning tasks [2][3]
- Controlled reinforcement learning experiments show that even with reasoning accuracy as the only reward signal, base models spontaneously increase dialogic behaviors, suggesting that socialized thinking structures enhance exploration of solution spaces [3][4]

Group 2: Dialogic Behaviors
- The study identifies four types of dialogic behaviors in reasoning trajectories: question-answer sequences, perspective shifts, viewpoint conflicts, and viewpoint harmonization, which collectively enhance cognitive strategies [7][8]
- The Gemini-2.5-Pro model's evaluations show high consistency with human scoring, indicating reliable identification of these dialogic behaviors [9][13]

Group 3: Social Emotional Roles
- The analysis categorizes social emotional roles in reasoning trajectories into 12 types, which are further summarized into four high-level categories, demonstrating a balanced interaction among roles rather than isolated usage [10][22]
- The Jaccard index is used to measure the co-occurrence of roles, revealing that models like DeepSeek-R1 organize different roles in a more coordinated manner during reasoning processes [10][22]

Group 4: Cognitive Behaviors
- The study identifies four cognitive behaviors that influence reasoning accuracy, including information provision, information inquiry, positive emotional roles, and negative emotional roles [11][12]
- The consistency of the Gemini-2.5-Pro model's evaluations with human scoring reinforces the reliability of these cognitive behavior classifications [13]

Group 5: Experimental Findings
- The findings demonstrate that even with similar reasoning trajectory lengths, models exhibit a higher frequency of dialogic behaviors and social emotional roles, particularly in complex tasks [16][23]
- Experiments show that guiding dialogic features positively impacts reasoning accuracy, with a notable increase from 27.1% to 54.8% in a specific task when dialogic surprise features are positively reinforced [24][29]

Group 6: Reinforcement Learning Insights
- A self-taught reinforcement learning experiment indicates that dialogic structures can spontaneously emerge and accelerate the formation of reasoning strategies when only correct answers are rewarded [30]
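The Jaccard index mentioned above is a standard set-overlap measure: |A ∩ B| / |A ∪ B|. As a hedged sketch of how role co-occurrence could be quantified this way (the role names and trajectories below are invented for illustration, not taken from the study):

```python
# Illustrative sketch: Jaccard index over sets of roles tagged in two
# reasoning trajectories. Role labels here are hypothetical.

def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| between two sets (0.0 if both empty)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

# Roles hypothetically tagged in two reasoning trajectories.
traj1 = {"questioner", "skeptic", "verifier"}
traj2 = {"questioner", "verifier", "encourager"}

print(jaccard(traj1, traj2))  # 2 shared roles out of 4 distinct -> 0.5
```

A high average Jaccard index across trajectories would indicate that roles tend to appear together in a coordinated way, rather than each trajectory leaning on a single isolated role.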
50 Million Users, $50 Million ARR: The World's Top AI Creation-and-Consumption Platform Wants to Be the Roblox of the AI Era
机器之心· 2026-01-26 04:08
Editor | Yang Wen. In 2026, the AI large-model arms race continues. Companies race to release ever more powerful model versions, competing on parameter counts, inference speed, and benchmark scores, and the whole industry has fallen into a near-feverish "performance obsession." Under this logic, most people assume that if the technology is strong enough, users will come. Yet the market has delivered a counterintuitive signal: on the user side, there is an "intelligence surplus." In his first public appearance after becoming Tencent's Chief AI Scientist, Yao Shunyu said that for consumer users, most people most of the time simply don't need intelligence this strong. Menlo Ventures partner @deedydas has expressed the same view: the broader user base doesn't care much about a model's intelligence level. The facts bear this out. To this day, many users still create with "outdated" models such as Studio Diffusion 1.5. Perhaps this shows, from another angle, that what users actually consume is style and emotion, not model version numbers. Over the past two years, SeaArt has sustained rapid annual growth in both users and revenue. In 2024, the platform's user base grew 7.7x year over year and revenue grew 5.5x. In 2025, by pushing into multimodal and video creation scenarios, platform traffic and revenue both grew 4-5x over the same period in 2024 ...
Everything You Need to Know About Token Compression for Multimodal Large Models, in One Article
机器之心· 2026-01-26 04:08
In recent years, multimodal large models have shown strong performance in visual perception, long-video question answering, and more, but this cross-modal fusion comes at an enormous computational cost. High-resolution images and long videos produce tens of thousands of visual tokens, driving up memory usage and latency and limiting model scalability and local deployment. This pressing need gave rise to MLLM Token Compression, which quickly became a research hotspot, producing roughly 200 papers in this niche within two years. But as work rapidly accumulated, the field's methods became sprawling and hard to categorize, and in concrete deployment scenarios the sheer variety of methods makes choosing one difficult. Against this backdrop, researchers from Peking University, the University of Science and Technology of China, and other institutions first systematically categorized methods by compression position, then discussed which compression mechanism to choose for specific deployment scenarios, and finally examined current challenges and promising directions. GitHub: https://github.com/yaolinli/MLLM-Token-Compression Paper: https://www.techrxiv.org/doi/full/10.36227/techrxiv.176823010.07236701/v1 Figure 1. Token compression in MLLMs ...
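One common family of compression mechanisms in this literature is token pruning: score each visual token's importance and keep only the top fraction. The sketch below is a stand-in assumption, not a method from the survey; real approaches typically score tokens by attention weights (e.g., to the [CLS] or text tokens) rather than the vector norm used here for simplicity.

```python
# Illustrative sketch of visual-token pruning by an importance score.
# The L2-norm scoring is a simplifying assumption for demonstration.
import math

def prune_tokens(tokens, keep_ratio=0.25):
    """Keep the top keep_ratio fraction of tokens by L2 norm."""
    k = max(1, int(len(tokens) * keep_ratio))
    scored = sorted(tokens,
                    key=lambda t: math.sqrt(sum(x * x for x in t)),
                    reverse=True)
    return scored[:k]

# Eight toy "visual tokens" (dimension 2); compress to 25%.
vis = [[0.1, 0.2], [3.0, 1.0], [0.0, 0.1], [2.0, 2.0],
       [0.3, 0.3], [0.2, 0.0], [1.5, 0.5], [0.1, 0.1]]
kept = prune_tokens(vis, 0.25)
print(len(kept))  # 2 tokens survive out of 8
```

The "compression position" taxonomy the survey uses matters here: pruning can happen in the vision encoder, at the projector between modalities, or inside the LLM's layers, with different latency/accuracy trade-offs at each position.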
AAAI 2026 Outstanding Paper Award | ReconVLA: Embodied Intelligence Research Wins a Top AI Conference's Best Paper Award for the First Time
机器之心· 2026-01-26 03:08
In the long-standing AI research landscape, embodied intelligence, despite being critical to robotic manipulation, automated systems, and real-world applications, has often been viewed as a "systems-engineering-driven" research direction, rarely considered capable of decisively shaping AI's core modeling paradigms. In recent years, Vision-Language-Action (VLA) models have made notable progress in multi-task learning and long-horizon manipulation. However, across extensive experiments we found that a basic but long-overlooked problem severely caps their performance: visual attention struggles to focus stably and precisely on task-relevant targets. Take the instruction "place the blue block on the pink block": the model must continuously lock onto the "blue block" and the "pink block" against a complex background. In practice, many VLA models' visual attention is nearly uniformly distributed; unlike humans, who focus on the target objects, VLA models are easily distracted by irrelevant objects or background, causing grasping or placement failures. ReconVLA winning an AAAI Outstanding Paper Award sends a clear and important signal: enabling agents to "see, think, and act" in the real world has become one of the core problems of AI research. This is the first time in the history of embodied intelligence (Embodied Intelligence / Vision-Language-Action) research that a top AI ...
Netizens Go All In: Clawdbot Goes Viral, With One User Stockpiling 40 Mac Minis to Run It
机器之心· 2026-01-26 03:08
All weekend, everyone's feeds were flooded with an AI assistant called Clawdbot. Riding its coattails back to fame: the Mac mini. No one expected this little desktop, long gathering dust in a corner, to become a social media traffic magnet thanks to an open-source AI project. Some people ordered a Mac mini just to keep Clawdbot online 24 hours a day; others dug out the old Mac mini once used as a coaster, put it back on the desk, and powered it up: https://x.com/BahaGkc/status/2015007581404570071 Compared with buying a single Mac mini to try it out, one determined user went all in, buying 40 Mac minis in one go, all dedicated to running Clawdbot and all bound to a Claude Max subscription. In his own words, this wasn't impulse spending but a bet on the future: "You have to invest in yourself to succeed. This is my choice. It's 2026 now; don't get left behind." Others choose to run Clawdbot on an idle Raspberry Pi. Since the app can access software and system resources across the entire computer, the safer approach is not to run it directly on your primary work machine, but to deploy it on a standalone device, or at least to isolate ...
Caught in a Pincer Between Google and Anthropic, OpenAI Faces a "Life-or-Death Choice"
机器之心· 2026-01-25 08:08
Core Insights
- OpenAI is shifting its strategic focus from consumer products to enterprise markets, aiming to provide comprehensive support for business clients [1][3][10]
- The company plans to implement a new business model that includes taking a percentage of profits generated by clients using its AI technologies [1][3]
- OpenAI's CEO, Sam Altman, has indicated a desire to become a one-stop shop for AI needs, offering a range of products from chatbots to programming tools [5][10]

Group 1: Strategic Shift
- In 2026, OpenAI is prioritizing enterprise-level solutions alongside its consumer offerings [1][3]
- The company is expected to release new AI hardware designed by former Apple design chief Jony Ive, potentially a wearable device [4]
- A recent gathering led by Altman included executives from major companies like Disney, signaling OpenAI's intent to enhance its enterprise services [5]

Group 2: Competitive Landscape
- OpenAI is facing increasing competition from Anthropic, which has gained attention with its programming and office automation tools [6][7]
- The market share of OpenAI has decreased from nearly 90% to around 65% over the past year, with Google Gemini emerging as a significant competitor [12][13]
- Other competitors, including DeepSeek, Meta, Claude, Perplexity, and Grok, are also gradually capturing market share [14]

Group 3: Product Development and Revenue
- OpenAI's revenue from enterprise clients is currently about 40%, with expectations to rise to 50% by the end of the year [10]
- The company is enhancing its products, including ChatGPT, by adding collaborative features and industry-specific applications [9]
- OpenAI has restructured its sales approach to streamline the process for enterprise clients, focusing on a single sales representative for multiple products [9]
No PhD, No Papers: How Did These People Break Into OpenAI and Other Top AI Labs Through "Unconventional Paths"?
机器之心· 2026-01-25 04:01
Core Insights
- The article emphasizes that individuals without traditional academic backgrounds can still secure opportunities in leading AI research labs like OpenAI through personal effort and strategic actions [2][25]

Group 1: Success Stories
- Keller Jordan, who graduated from UC San Diego without any published papers, improved a research paper by a Google researcher, which led to a collaboration and a published paper [5][6]
- Keller's project, NanoGPT speed run, gained significant attention in the community, showcasing his ability to optimize a Transformer model and document his work thoroughly [6][7]
- Sholto Douglas transitioned from McKinsey to AI by engaging in independent research and asking insightful questions on GitHub, which caught the attention of a Google engineer and led to an interview opportunity [10][11]
- Andy L. Jones, a semi-retired quantitative trader, wrote a self-published paper that impressed xAI's Igor Babuschkin, leading to his recruitment at Anthropic [14][19]
- Kevin Wang, a student with a strong recommendation and a notable paper at NeurIPS, successfully joined OpenAI, highlighting the importance of mentorship in the recruitment process [21][23]

Group 2: Industry Trends
- The article notes that AI research is becoming increasingly closed, with fewer public projects, but improving existing work remains a viable way to demonstrate capability [6]
- It highlights that many successful researchers in AI are not active on social media or traditional academic platforms, yet they contribute significantly to advancements in the field [13]
- The current era presents unique opportunities in AI research, where individuals can influence technology development while also receiving competitive compensation [26][28]
- The article concludes that a PhD is not a strict requirement for becoming a successful researcher or engineer; proactive engagement and impactful independent projects are key [28][29]