量子位
A niche architecture wins big! An editing mechanism pushes a 100B diffusion model to 892 tokens per second!
量子位· 2026-02-11 01:55
Core Viewpoint - The article discusses the emergence of the LLaDA2.1 model from Ant Group, which has achieved a remarkable speed of 892 tokens per second in complex programming tasks, marking a significant advancement over traditional autoregressive models [1][3][11].

Group 1: Model Performance and Features
- LLaDA2.1 operates at a 100-billion-parameter scale and has transitioned from a research model to a practical tool, demonstrating superior efficiency [3][4].
- The model introduces a dual-mode decoding strategy, allowing users to switch between Speedy Mode and Quality Mode with a single configuration change, enhancing usability [9][10].
- In Speedy Mode, LLaDA2.1 achieves a peak speed of 892 tokens per second on the HumanEval+ benchmark, while in Quality Mode it surpasses previous models on various reasoning tasks [11][31].

Group 2: Technical Innovations
- The model employs an Error-Correcting Editable (ECE) mechanism, enabling it to generate drafts quickly and then refine them, addressing the limitations of traditional diffusion models [16][21].
- LLaDA2.1 successfully implements reinforcement learning (RL) at the 100-billion scale, enhancing its performance on instruction-following tasks and demonstrating that diffusion models can achieve both speed and understanding [23][26].
- The introduction of the EBPO algorithm allows for efficient training and editing, a significant milestone in applying RL to diffusion models [25][28].

Group 3: Competitive Advantage
- LLaDA2.1's benchmark results show a significant advantage over mainstream autoregressive architectures, achieving high speeds without compromising quality [29][30].
- The model maintains quality even in Speedy Mode, demonstrating robustness and a balance between speed and accuracy [32].
- A lighter 16 billion parameter Mini version has been released, achieving peak speeds exceeding 1500 tokens per second, indicating potential for more lightweight deployments [33].
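The draft-then-refine loop behind an ECE-style decoder can be sketched in a few lines. This is a toy illustration, not Ant Group's implementation: `parallel_draft`, `confidence`, and `refine` are hypothetical stand-ins for one parallel diffusion pass, a token-level confidence scorer, and a targeted re-decoding step.

```python
# Toy sketch of an Error-Correcting Editable (ECE) decoding loop:
# emit a full draft in parallel, then repeatedly re-decode only the
# positions the model is least confident about.

def parallel_draft(length, vocab):
    # Stand-in for one parallel diffusion pass: guess every token at once.
    return [vocab[i % len(vocab)] for i in range(length)]

def confidence(tokens):
    # Stand-in scorer: pretend "??" tokens are low confidence.
    return [0.1 if t == "??" else 0.9 for t in tokens]

def refine(tokens, positions):
    # Stand-in editor: re-decode only the flagged positions.
    return [("fixed" if i in positions else t) for i, t in enumerate(tokens)]

def ece_decode(length, vocab, threshold=0.5, max_rounds=3):
    tokens = parallel_draft(length, vocab)
    for _ in range(max_rounds):
        weak = {i for i, s in enumerate(confidence(tokens)) if s < threshold}
        if not weak:          # every token is confident: stop editing
            break
        tokens = refine(tokens, weak)
    return tokens

print(ece_decode(4, ["def", "??", "foo", "??"]))
```

The point of the loop is that editing cost scales with the number of weak positions, not with sequence length, which is why a draft-then-edit decoder can stay fast without freezing early mistakes in place.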
Humans spent 100 years drawing brain maps; AI needs only a few hours, and even charted new brain regions
量子位· 2026-02-10 11:59
Tingyu, reporting from Aofei Temple | QbitAI official account

Good news: AI can now help scientists draw brain maps! Recently, a neuroscience team at UC San Francisco proposed a new machine-learning algorithm, CellTransformer, which took only a few hours to classify and map the brain atlases of 5 mice.

The gene data from these five mouse brains covers 10.4 million cells, each carrying hundreds of genes. With this algorithm, the team not only cleanly delineated known regions of the mouse brain but also charted new brain regions. More striking still, the technique may well be applied to humans.

The latest brain-mapping technology: CellTransformer. Brain atlas mapping is an old discipline, and the traditional approach was laborious: scientists drew lines on brain images with pencils to separate different regions. The Allen Mouse Brain Common Coordinate Framework, released in 2020, was drawn this way. Based on brain data from 1,675 mice and covering more than 1,000 distinct brain regions, it is highly valuable. But atlases built on such heavily hand-crafted features inevitably share one problem: subjectivity. Neuroanatomist Yongsoo Kim of Penn State College of Medicine ...
A Chinese Nano Banana? Qwen-Image-2.0 makes a splash: it digests 1K-token prompts, and Chinese image generation finally stops struggling
量子位· 2026-02-10 11:59
Mengyao, reporting from Aofei Temple | QbitAI official account

Long prompts turn to mush, complex instructions get dropped, and Chinese text warps into freestyle distortion... who understands the pain of AI image generation?! Stop struggling, because today's AI can reliably digest ultra-long text prompts of 1K tokens.

Complex instructions are no problem either. With OpenClaw blowing up lately, I simply had the AI roll out a cyber-infographic poster about it (pretty impressive, right?). Chinese rendering holds up too: even the Lanting Xu, a famously difficult text, is reproduced character for character, with layout and brush strokes intact. Think that's all? NONONO, because it can also do multi-image editing. I tossed it a single photo and it handed back a studio-grade nine-panel portrait set!! (suddenly feeling like I just saved a pile of money...)

The one doing all this work is Alibaba's newly released image generation and editing model, Qwen-Image-2.0. 1K-token prompts, complex instructions, Chinese rendering, image editing, and 2K resolution all at once, and in international benchmarks it already ranks just behind Nano Banana Pro. In AI image generation, the real frustration was never writing prompts, but writing so much that the AI couldn't take it in; a great prompt had nowhere to shine! Who knows whether the Qwen team ...
Ant Group invests in a Shanghai embodied intelligence company
量子位· 2026-02-10 07:00
Core Viewpoint - The article discusses the rapid development and investment activity in embodied intelligence, highlighting a significant investment by Ant Group in Shanghai-based startup Daxiao Robotics, a notable entry into this sector for 2026 [2][5][6].

Investment Activities
- Ant Group has led a financing round for Daxiao Robotics, which has gained attention in both academic and industrial circles [2][3][8].
- The round included participation from Qiming Venture Partners, JinJing Capital, and others, with the capital aimed at advancing Daxiao's ACE embodied full-stack R&D paradigm and accelerating development of its Kairos 3.0 world model [8][9].

Market Trends
- Investment events in the embodied intelligence sector surged from 173 in the previous year to 447 in 2025, with total funding rising from 13.7 billion to 55.4 billion yuan, year-on-year growth of over 250% and 400%, respectively [5].

Daxiao Robotics' Approach
- Daxiao Robotics has introduced the ACE embodied full-stack R&D paradigm, which emphasizes a human-centered approach in contrast to traditional robot-centric development paths [11][10].
- The ACE paradigm treats environmental data collection as a foundational capability, using multi-modal hardware to gather diverse information for training embodied models [13].

Technological Innovations
- The Kairos 3.0 model aims to create a unified understanding framework across robotic embodiments, integrating physical laws and human behavior patterns to enhance the system's predictive capabilities [14][16].
- Daxiao Robotics addresses common challenges in the field, such as data scarcity and poor generalization, by prioritizing data entry and world modeling before deploying embodied brain modules [18].
Team Composition
- Daxiao Robotics boasts a strong leadership team, including Xiaogang Wang, a top-ranked computer scientist, and Dacheng Tao, a distinguished professor with significant contributions to AI [19][21][27][31].
- The founding team includes researchers from prestigious institutions, enhancing the company's capability to tackle advanced AI challenges [33].
GLM-5 architecture revealed as Zhipu climbs 60% in two days: it adopts the same sparse attention as DeepSeek
量子位· 2026-02-10 07:00
Mengchen, reporting from Aofei Temple | QbitAI official account

Whether or not Pony Alpha is Zhipu's, the next-generation flagship model GLM-5 is on its way. GitHub code confirms it, and details of the new architecture have surfaced.

GLM-5 adopts the DeepSeek-V3/V3.2 architecture, including the sparse attention mechanism (DSA) and multi-token prediction (MTP), with 745B total parameters, twice the size of the previous-generation GLM-4.7. The telltale vLLM change:

-     if model_arch == "DeepseekV32ForCausalLM":
+     if model_arch in ["DeepseekV32ForCausalLM", "GlmMoeDsaForCausalLM"]:
          from vllm.platforms import current_platform
          capability = current_platform.get_device_capability()

(vllm/config/specu ...
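The idea behind DeepSeek-style sparse attention (DSA) is that each query attends only to a small top-k subset of keys selected by a cheap relevance score, instead of every past token. The sketch below is an illustrative assumption about the mechanism in miniature, not GLM-5's or vLLM's actual kernel; shapes and scoring are made up for clarity.

```python
# Minimal sketch of top-k sparse attention: score all keys cheaply,
# keep only the top_k, and softmax-average values over that subset.
import math

def sparse_attention(q, keys, values, top_k=2):
    # Cheap relevance score: dot product of the query with each key.
    scores = [sum(a * b for a, b in zip(q, k)) for k in keys]
    kept = sorted(range(len(keys)), key=lambda i: -scores[i])[:top_k]
    # Softmax over the kept subset only; all other positions are masked out.
    m = max(scores[i] for i in kept)
    w = {i: math.exp(scores[i] - m) for i in kept}
    z = sum(w.values())
    dim = len(values[0])
    return [sum(w[i] * values[i][d] for i in kept) / z for d in range(dim)]

keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
vals = [[1.0], [2.0], [3.0]]
print(sparse_attention([1.0, 0.2], keys, vals))
```

Because the softmax and weighted sum run over only `top_k` positions, the per-query cost stops growing with context length, which is what makes this attractive at 745B scale.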
ChatGPT starts testing ads! OpenAI finally turns toward the money
量子位· 2026-02-10 07:00
Youzi, reporting from Aofei Temple | QbitAI official account

What was bound to happen has happened! ChatGPT plus ads is finally settled: OpenAI just officially announced it is testing ad features in ChatGPT's free and Go tiers across the US.

Users are not the only ones leaving bad reviews; rival Anthropic is pouring fuel on the fire, spending millions of dollars on a Super Bowl ad just to mock OpenAI's decision: ads are invading AI, but not Claude. (doge) Unsurprisingly, the comment sections are still full of complaints... So why push ads onto the table anyway? The answer is hidden in the 25-minute podcast OpenAI released alongside the announcement: to support free users, a.k.a. it needs money.

Launching the ad feature. As for why it is launching ads, OpenAI puts it this way: to make AGI universally accessible. Advertising is, after all, a mature business model: Google, Facebook, and Instagram all started out free and only later turned profitable through targeted ads. For low-spending users in particular, ads are the key to conversion; Netflix, for example, introduced an ad-supported tier in its latest $8-per-month plan. According to the official announcement, this rollout covers only the free tier and the Go tier ($8/month) in the US, leaving other subscription tiers unaffected. And on the widely shared concern about ad contamination of ChatGPT's content, the company also gave an explicit ans...
One brain for every modality: Baidu publishes the ERNIE 5.0 technical report
量子位· 2026-02-10 05:33
Cressy, reporting from Aofei Temple | QbitAI official account

Nearly three months after the model's release, the technical report for Baidu's ERNIE 5.0 has finally arrived.

Its base uses an Ultra-Sparse MoE architecture: the parameter count reaches the trillion scale, yet fewer than 3% of parameters are actually activated at inference time, making it the first publicly disclosed unified autoregressive model at this scale. And architecturally it rejects "stitching": it achieves native autoregressive unification of four modalities, with every modality running in the same Transformer backbone from scratch.

ERNIE 5.0's scorecard is also impressive: a VBench video-semantics score of 83.40, an AISHELL-1 speech-recognition character error rate as low as 0.31, and a MATH reasoning score of 73.89, a genuine all-rounder. After reading the report, netizens called ERNIE's approach very interesting.

MoE routing ignores modality. To break down the barriers between data of different modalities, ERNIE 5.0's core architecture adopts a Modality-Agnostic Expert Routing mechanism. Unlike traditional divide-and-conquer models, it tears down the artificial modality walls and no longer pre-labels data as "vision" or "language". In ERNIE 5.0, the team built a shared expert pool ( ...
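Modality-agnostic routing can be illustrated with a minimal top-k router over a shared expert pool. Everything below is a made-up stand-in (the toy expert vectors, dot-product affinity, and `top_k=2` choice are assumptions for illustration), not Baidu's code; the point is only that the router scores every token against one shared pool and never sees a modality label.

```python
# Minimal sketch of modality-agnostic expert routing: every token, whether
# it came from text, image, audio, or video, is scored against one shared
# expert pool; no modality tag ever reaches the router.
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def route(token_vec, expert_vecs, top_k=2):
    """Pick the top_k experts for one token by dot-product affinity."""
    logits = [sum(t * e for t, e in zip(token_vec, vec)) for vec in expert_vecs]
    probs = softmax(logits)
    # Highest-probability experts win, regardless of the token's modality.
    return sorted(range(len(probs)), key=lambda i: -probs[i])[:top_k]

experts = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
print(route([0.9, 0.1], experts))   # a "text-looking" token
print(route([0.1, 0.9], experts))   # an "image-looking" token
```

Note that specialization can still emerge: tokens with different statistics naturally land on different experts, but the split is learned from the data rather than imposed by a modality label.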
1,700 OpenClaw tips, and I learned them the Duolingo way!
量子位· 2026-02-10 05:33
Jinlei, reporting from Aofei Temple | QbitAI official account

AI is here, and the way we learn has changed too. Take the recently popular OpenClaw (formerly Clawdbot, then Moltbot): there are already more than 1,700 tips just for using it. With that many Skills in one GitHub project, how is anyone supposed to remember them all? That's a big problem~ But as we said, AI changes everything.

Now you can download that GitHub repo as a PDF and feed it straight to an AI. Once you start learning, the tutorial first walks through OpenClaw's fundamentals, illustrations included. Meanwhile, to ensure accuracy, the course has a built-in side-by-side study feature: you can check each point against the corresponding passage in the uploaded file as you go. After every section comes a short quiz; for example, it might ask: what kinds of low-quality content does the ClawHub registry filter out when screening skills? Once you answer, the AI gives feedback and an explanation based on the correct answer.

Before long, the AI has built a complete Duolingo-style course on the topic. Click in and you see a 10-lesson tutorial covering OpenClaw's knowledge framework along with flashcards. This way, massive amounts of new knowledge ...
Huawei releases the industry's first diffusion language model Agent, up to 8x faster in some scenarios!
量子位· 2026-02-10 05:33
Yunzhong, reporting from Aofei Temple | QbitAI official account

On the "last three kilometers" between large models and the real world, the Agent has become the most representative ticket of entry. But the current consensus has shifted subtly: whether an Agent is strong is no longer judged by whether it can "answer correctly", but by whether, when facing multi-turn reasoning, tool calls, and complex collaboration, it can reliably finish the task via the shortest path and the smallest interaction budget.

Against this backdrop, a foundational question long overlooked by the industry surfaces: when an Agent's framework, tools, data, and training recipe are all held constant, does merely changing the language model's generation paradigm (Autoregressive vs Diffusion) systematically change the Agent's planning and behavior patterns?

Recently, a research team from Huawei's Noah's Ark Lab, Huawei's Advanced Computing and Storage Lab, UCL, Nanyang Technological University, Tsinghua University, and Peking University gave the most controlled-experiment-style answer to date in their latest work, "DLLM Agent: See Farther, Run Faster". Project page: https://noah-dllm.github.io/

Key findings at a glance: under completely identical Agent workflows, training data, and interaction budgets, the study finds that merely swapping the "base" for a diffusion LLM (DLLM), the A ...
QbitAI is hiring editors and writers
量子位· 2026-02-10 05:33
Core Viewpoint - The article emphasizes the ongoing AI boom and invites individuals to join the company "Quantum Bit" (QbitAI), which focuses on tracking AI advancements and has established itself as a leading content platform in the industry [1].

Group 1: Job Opportunities
- The company is hiring in three main directions: AI Industry, AI Finance, and AI Product, with positions available for both experienced professionals and fresh graduates [2][4].
- Positions are open at various levels, including editors, lead writers, and chief editors, with a focus on matching roles to individual capabilities [6].

Group 2: Job Responsibilities
- AI Industry: tracking innovations in infrastructure such as chips, AI infrastructure, and cloud computing, as well as interpreting technical reports from conferences [6][7].
- AI Finance: covering venture capital, financial reports, and capital movements within the AI industry, requiring strong analytical skills and a passion for interviews [11].
- AI Product: monitoring AI applications and hardware developments, producing in-depth evaluations of AI products, and engaging with industry experts [11].

Group 3: Benefits and Work Environment
- Employees will have the opportunity to engage with cutting-edge AI technologies, enhance their work efficiency through new tools, and build personal influence in the AI field [6].
- The company offers competitive salaries and comprehensive benefits, including social insurance, meal allowances, and performance bonuses, along with a dynamic and open team culture [6].

Group 4: Company Growth and Reach
- By 2025, Quantum Bit aims to have over 2.4 million subscribers on WeChat and more than 7 million users across platforms, with daily readership exceeding 2 million [12].