量子位
After a full 21 months, the Doubao large model officially enters the 2.0 era!
量子位· 2026-02-14 08:13
Core Insights
- The article discusses the launch of Doubao Model 2.0, the largest update in 21 months, showcasing significant advancements in AI capabilities [2][8].

Group 1: Model Enhancements
- Doubao Model 2.0 exhibits improvements in multi-modal understanding, enterprise-level agent capabilities, reasoning, and coding skills [9][10].
- The model achieved top scores in various benchmarks, including MathVista and LogicVista, outperforming its predecessor Seed1.8 and competing models like GPT-5.2 and Claude [11][12].

Group 2: Performance Metrics
- In mathematical reasoning benchmarks, Doubao Model 2.0 scored 89.8 on MathVista and 90.5 on MathKangaroo, indicating a significant performance boost [11].
- The model also excelled in perception and recognition tasks, scoring 98.6 on VLMsAreBlind and 86.0 on RealWorldQA [12].

Group 3: Practical Applications
- Doubao Model 2.0 demonstrates strong performance on complex tasks such as coding and physics simulations, effectively handling intricate projects like a 3D Monopoly game and interactive applications [16][21].
- The model's enhanced reasoning and coding abilities allow it to solve complex mathematical problems and assist in project completion, indicating its potential for enterprise applications [28][30].

Group 4: Market Positioning
- The timing of the Doubao Model 2.0 release suggests a strategic move to capitalize on advancements in data quality and training efficiency, positioning it favorably in the competitive AI landscape [33].
- The model's cost-effectiveness is highlighted: it maintains high performance without significant latency, making it suitable for enterprise use in customer service and data analysis [35][36].
Tsinghua's new framework teaches large models to "close-read and skim"! 12x end-to-end speedup, benchmark scores doubled
量子位· 2026-02-14 08:13
Submitted by the RAM team | 量子位 QbitAI. Let large models read the way humans do: close reading plus skimming delivers a double leap in performance and efficiency. In long-context scenarios, the quadratic computational complexity of the Transformer architecture makes inference slow down sharply, yet humans handle long documents with ease — we do not read an entire novel word by word, but close-read the key plot points and skim the background descriptions. A joint research team from Tsinghua University, Peng Cheng Laboratory, and Alibaba's Future Life Lab found that existing task-aware compression methods not only hit an efficiency bottleneck — either loading the full text at once (inefficient) or compressing step by step autoregressively (slow) — but also struggle to balance "retaining key information" with "preserving natural-language interpretability". Inspired by human reading cognition, they propose a new framework, RAM (Read As HuMan), the first to bring a hybrid "close reading + skimming" strategy into context compression. It not only achieves excellent results on multiple long-text benchmarks, but also delivers a 12x end-to-end speedup on inputs averaging 16K tokens. Reading like a human: close-read the important content, skim the background. The team drew inspiration from cognitive science: when humans read, they allocate attention dynamically — content highly relevant to the goal receives close reading, preserving full semantic detail, while secondary background information is skimmed, quickly extracting its core meaning. ...
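The close-read/skim split described above can be sketched minimally as a relevance-gated chunk selector. This is an illustrative toy, not RAM's actual method: the function name, the cosine-similarity scoring, and the `close_frac` parameter are all assumptions — the real framework operates on compressed representations, not a string placeholder.

```python
import numpy as np

def ram_style_compress(chunks, chunk_embs, query_emb, close_frac=0.3):
    """Toy close-read/skim split: chunks most relevant to the query are
    kept verbatim ("close read"); the rest collapse to a short
    placeholder ("skim"). Scoring and names are illustrative only."""
    q = query_emb / np.linalg.norm(query_emb)
    c = chunk_embs / np.linalg.norm(chunk_embs, axis=1, keepdims=True)
    sims = c @ q                                  # cosine relevance per chunk
    k = max(1, int(round(len(chunks) * close_frac)))
    keep = set(np.argsort(-sims)[:k])             # indices to close-read
    return [chunks[i] if i in keep else "[skimmed]"
            for i in range(len(chunks))]

out = ram_style_compress(
    ["key plot", "bg a", "bg b"],
    np.array([[1.0, 0.0], [0.0, 1.0], [0.1, 1.0]]),
    np.array([1.0, 0.0]),
    close_frac=0.34,
)
# only the query-aligned chunk survives verbatim
```

The efficiency argument is that the downstream model then attends over far fewer verbatim tokens, which is where the reported end-to-end speedup would come from.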
量子位 is hiring editors and writers
量子位· 2026-02-14 08:13
Core Viewpoint
- The article emphasizes the ongoing AI boom and invites individuals to join the company "Quantum Bit" (量子位), which focuses on tracking AI advancements and has established itself as a leading content platform in the industry [1].

Group 1: Job Opportunities
- The company is hiring in three main directions: AI Industry, AI Finance, and AI Product, with positions available for both experienced professionals and fresh graduates [2][4].
- Positions are full-time and based in Beijing, with roles at various levels open for application [2][4].

Group 2: Job Responsibilities
- AI Industry direction: focuses on innovations in infrastructure, including chips, AI infrastructure, and cloud computing [6].
- AI Finance direction: involves tracking venture capital and financial reports in the AI sector, monitoring capital movements within the industry [6].
- AI Product direction: concentrates on AI applications and hardware advancements [6].

Group 3: Benefits and Growth Opportunities
- Employees will have the chance to engage with the latest AI technologies, enhance their work efficiency through new AI tools, and build personal influence by creating original content [6].
- The company offers competitive salaries and comprehensive benefits, including social insurance, meal allowances, and performance bonuses [6].

Group 4: Company Achievements
- As of 2025, Quantum Bit has over 2.4 million subscribers on WeChat and more than 7 million users across platforms, with daily reading volume exceeding 2 million [12].
- The company is recognized by third-party data platforms as the top new-media outlet in the AI and frontier technology sector [12].
The most hardcore "kiss" this Valentine's Day! Chinese AI cracks the 300-year-old kissing number problem, setting records across multiple dimensions
量子位· 2026-02-14 08:13
Core Viewpoint
- The article discusses the breakthrough in solving the Kissing Number Problem using AI, specifically through a system called PackingStar, which has achieved significant advancements in high-dimensional geometry [1][10][49].

Group 1: Kissing Number Problem Overview
- The Kissing Number Problem asks how many equal-sized spheres can touch another sphere without overlapping in n-dimensional space [2][4].
- The problem has historical significance, originating from a 1694 debate between Newton and Gregory about the arrangement of spheres in three-dimensional space [5][6].
- Recent advancements have been limited, with only seven substantial progressions in nearly 50 years [9].

Group 2: Breakthrough Achievements
- The PackingStar system, developed by a collaborative team from the Shanghai Science and Technology Institute, Peking University, and Fudan University, has set new records for dimensions 25 through 31 [10][11].
- The system has also discovered over 6,000 new configurations across various dimensions and broken long-standing records in generalized kissing numbers [10][11].

Group 3: Methodology and AI Integration
- PackingStar transforms the high-dimensional geometric problem into a multi-agent game, allowing AI to explore potential structures autonomously [18][24].
- The approach represents the positions of spheres with a cosine matrix, a formulation well suited to parallel computation on GPUs [18][24].
- The system employs a collaborative mechanism between two agents to fill, prune, and reconstruct geometric structures, significantly reducing the complexity of high-dimensional exploration [25][31].

Group 4: Implications for Mathematics and AI
- The discoveries made by PackingStar challenge traditional human intuitions about symmetry in geometric structures, revealing many non-symmetric configurations that yield better results [27][28].
- The project exemplifies a shift in AI's role from merely assisting in calculations to actively participating in scientific exploration, marking a new phase in AI for Science [64][65].
- The results have implications across mathematical fields, connecting concepts from sphere packing, number theory, and group theory [34][60].

Group 5: Infrastructure and Future Directions
- The project highlights the importance of robust AI infrastructure, which is crucial for tackling complex mathematical problems that require extensive computational resources [39][40].
- The development of custom CUDA operators and an automatic checkpointing system improved the efficiency and stability of long-duration tasks [42][46].
- The success of PackingStar suggests a promising future for AI in mathematics: previously unsolvable problems may become accessible through innovative AI methodologies [49][60].
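The "cosine matrix" representation above has a concrete meaning: normalize the touching-sphere centers to unit vectors, and a configuration is a valid kissing arrangement exactly when every off-diagonal entry of the Gram (cosine) matrix is at most 1/2 (pairwise angles of at least 60 degrees). The sketch below verifies this for the classical 24-vector configuration in 4 dimensions; the function name and tolerance are mine, and this is a validity check, not PackingStar's search procedure.

```python
import itertools
import numpy as np

def is_valid_kissing_config(G, tol=1e-9):
    """A set of unit vectors is a valid kissing configuration iff every
    off-diagonal entry of its cosine (Gram) matrix is <= 1/2, i.e.
    every pair of sphere-center directions is >= 60 degrees apart."""
    off = G - np.eye(len(G))
    return bool(np.all(off <= 0.5 + tol))

# The 24 minimal vectors of the D4 lattice: every placement of two
# +-1 entries among 4 coordinates, normalized to unit length.
vecs = []
for i, j in itertools.combinations(range(4), 2):
    for si in (1, -1):
        for sj in (1, -1):
            v = np.zeros(4)
            v[i], v[j] = si, sj
            vecs.append(v / np.sqrt(2))
V = np.array(vecs)
G = V @ V.T                      # cosine matrix of the configuration
# len(V) == 24, the known kissing number in dimension 4
```

Because the whole constraint lives in one matrix inequality, candidate configurations batch naturally onto GPUs, which is presumably why the team chose this parameterization.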
"Stanford AI Town" lands $100 million right out of the gate! Fei-Fei Li and Karpathy both invested
量子位· 2026-02-14 04:12
闻乐 from 凹非寺, 量子位 | QbitAI. "The OpenClaws" have turned 2026 into an all-out agent war. And now the team behind Smallville — that early "true god" of agents, the AI town where 25 agents chatted, spread gossip, and fell in love on their own — has officially announced a startup. The project went viral when it was open-sourced on GitHub, and now, as a company, the buzz is just as high. The company, named Simile, has raised $100 million led by Index Ventures, with Karpathy and Fei-Fei Li also joining the round. When the AI town first appeared, this "academic toy" was hailed as a pioneering result in generative agents, proving that large models could "come alive" in a virtual world. The startup now aims to scale the 25 agents into a platform of thousands, helping humans with risk forecasting for decision-making. In case anyone doesn't know what the AI town is, here's a quick recap. AI Town: In 2023, the Stanford team's paper "Generative Agents" built a pixel-art virtual space called Smallville. They placed 25 GPT-based AIs in the town, giving each only a simple identity description — an outgoing café owner, a helpful pharmacist, and so on. The result: the 25 agents lived fully autonomously in the town. Waking up to brush their teeth, making breakfast, going to work, gossiping in the afternoon, and in the evening ...
GPT-5.2 rewrites the particle physics textbook! A 32-term expression humans couldn't compute by hand, solved by AI with a one-line formula
量子位· 2026-02-14 04:12
Core Viewpoint
- The article discusses a groundbreaking discovery in particle physics: a long-held conclusion about gluon scattering amplitudes has been overturned, thanks to GPT-5.2 identifying a key formula that was later proven by an OpenAI internal model [1][2][15].

Group 1: Discovery and Research Process
- A scattering amplitude previously believed to be zero for a specific type of gluon scattering is shown to be nonzero under certain kinematic conditions [2][14].
- The research team, led by Harvard's Andrew Strominger, initially struggled with calculations whose complexity grew exponentially with the number of parameters [4][19].
- The team turned to OpenAI to see if AI could assist, leading GPT-5.2 to propose a crucial formula that simplified the problem [6][7].

Group 2: Technical Details of the Findings
- Scattering amplitudes are central to particle physics, providing the quantum probabilities of particle collisions, but their calculations are notoriously difficult [9].
- The article recalls the historical work of Parke and Taylor, who derived a complex expression for scattering amplitudes that was later simplified to a single line [11][13].
- GPT-5.2 identified a simplification when restricting to a specific kinematic region, leading to a dramatic reduction in the complexity of the expressions involved [26][28].

Group 3: Validation and Implications
- The formula proposed by GPT-5.2 was proven by an OpenAI internal model after more than 12 hours of computation, confirming its validity [29][30].
- The research team verified the proof manually, checking various consistency conditions, although these properties were not immediately apparent from the formula itself [32][33].
- This marks the third instance of GPT-5.2 making an original contribution to fundamental science, indicating AI's potential to uncover previously unknown physical laws [34][40].

Group 4: Future Directions
- While the current formula already represents a dramatic simplification, even more concise expressions may yet be discovered [38][39].
- Future work is expected to explore the implications of these findings in broader contexts, including gravitational amplitudes and supersymmetry [37].
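For context on the "single line" attributed to Parke and Taylor above: their celebrated result is that the tree-level maximally-helicity-violating (MHV) amplitude for n gluons, with exactly two gluons i and j of negative helicity, collapses to one ratio of spinor inner products (up to coupling constants and the momentum-conserving delta function — this is the standard textbook form, quoted here for background, not the new formula from the article):

```latex
A_n\!\left(1^{+},\dots,i^{-},\dots,j^{-},\dots,n^{+}\right)
  \;\propto\;
  \frac{\langle i\,j\rangle^{4}}
       {\langle 1\,2\rangle\,\langle 2\,3\rangle\cdots\langle n\,1\rangle}
```

Each factor $\langle a\,b\rangle$ is a spinor inner product of adjacent external momenta; the denominator runs cyclically over all n gluons. The new result described in the article concerns amplitudes long believed to vanish outright, which this formula does not cover.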
A new speed for trillion-parameter reasoning models! Ant open-sources Ring-2.5-1T: IMO gold-medal level — strong; hybrid linear architecture — fast!
量子位· 2026-02-14 01:15
Core Viewpoint
- Ant Group has launched the world's first open-source hybrid-linear-architecture trillion-parameter model, Ring-2.5-1T, which excels in mathematical-logical reasoning and long-horizon autonomous execution [2][3].

Group 1: Model Capabilities
- Ring-2.5-1T achieved a gold-medal-level score of 35 on the IMO and an impressive 105 on the CMO, significantly surpassing national training team standards [3].
- The model can independently handle complex tasks such as search and coding, demonstrating robust task-execution abilities [3][8].
- It breaks the industry assumption that deep reasoning requires sacrificing inference speed and memory, achieving a 3x increase in throughput while reducing memory usage to below 1/10 during long-sequence generation [5][7][16].

Group 2: Architectural Innovations
- The model employs a hybrid linear attention architecture evolved from Ring-flash-linear-2.0, using a 1:7 design of Multi-Head Latent Attention (MLA) combined with Lightning Linear Attention [9].
- Incremental training methods preserved strong reasoning capabilities while achieving linear inference speed, converting parts of the original GQA layers to Lightning Linear Attention [12].
- The activated parameter count increased from 51 billion to 63 billion, yet inference efficiency improved significantly compared with Ling 2.0 [15].

Group 3: Training Mechanisms
- A dense reward mechanism was introduced to strengthen logical reasoning, focusing on the rigor of the reasoning process; it significantly reduced logical flaws and improved advanced proof techniques [18].
- The model underwent large-scale asynchronous agentic reinforcement learning, enhancing its autonomous execution in long-chain tasks [18].

Group 4: Practical Applications
- In practical tests, Ring-2.5-1T successfully solved complex abstract-algebra proof problems, demonstrating high logical sensitivity and rigorous reasoning [20][24].
- The model also showcased its programming skills by writing a high-concurrency thread pool in Rust, correctly managing memory safety and concurrency [27].
- In an official demo, Ring-2.5-1T developed a miniature operating system, further proving its system-level programming capabilities [31].

Group 5: Broader AI Developments
- Ant Group also released the diffusion language model LLaDA2.1 and the multimodal model Ming-flash-omni-2.0, which significantly boost inference speed and provide unique token-editing and reverse-reasoning capabilities [33][36].
- The goal is a reusable foundation for developers, enabling multimodal applications without the need to piece together various models [39][40].
- The company aims to tackle complex challenges in video temporal understanding, intricate image editing, and real-time long-audio generation, indicating a commitment to advancing multimodal AI technology [41].
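The throughput and memory claims above hinge on a property of linear attention that a few lines of NumPy can demonstrate. This is a deliberately bare sketch — unnormalized, with no feature map or gating, so it is not Ant's actual Lightning Linear Attention kernel — but it shows the core trade: a causal linear-attention layer carries a fixed d×d recurrent state instead of a KV cache that grows with sequence length.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 6, 4
Q, K, V = rng.normal(size=(3, n, d))

# Recurrent form: the state S accumulates k_t v_t^T; each output is
# q_t @ S. Memory per step is O(d^2), independent of sequence length.
S = np.zeros((d, d))
rec = []
for q, k, v in zip(Q, K, V):
    S += np.outer(k, v)          # constant-size state update
    rec.append(q @ S)
rec = np.array(rec)

# Mathematically equivalent quadratic form with an explicit causal
# mask -- this is the O(n^2) / growing-cache computation it replaces.
mask = np.tril(np.ones((n, n)))
full = ((Q @ K.T) * mask) @ V
# rec and full agree elementwise
```

A hybrid design like the reported 1:7 MLA-to-linear mix would keep a minority of full-attention layers for precision while most layers enjoy this constant-memory recurrence, which is consistent with the memory and throughput numbers quoted.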
OpenClaw receives acquisition offers from Meta and OpenAI at the same time! Zuckerberg holed up for a week to test it himself; Altman dangles compute as bait
量子位· 2026-02-13 13:19
Core Viewpoint
- OpenClaw, a new AI project by Peter Steinberger, has attracted significant interest from major tech companies like Meta and OpenAI, with both offering enticing collaboration terms to leverage its capabilities [1][3][4].

Group 1: Company Interest and Offers
- Peter Steinberger, the creator of OpenClaw, has received offers from Meta's Mark Zuckerberg and OpenAI's CEO, showcasing the competitive interest in the project [1][3].
- Meta's approach involved direct engagement, with Zuckerberg personally testing OpenClaw and providing feedback, while OpenAI offered substantial computational resources as an incentive [3][22].
- Other companies, including Microsoft, are also vying for a stake in OpenClaw, indicating its high value in the tech landscape [4][14].

Group 2: OpenClaw's Popularity and Development
- OpenClaw has surpassed 189,000 stars on GitHub within a month, highlighting its viral growth and user interest [10].
- The project began as a personal endeavor: Steinberger built the prototype in just one hour out of boredom during retirement, leading to its rapid development and subsequent success [34][39].
- It has evolved from a simple tool into a significant player in the AI space, with capabilities allowing autonomous problem-solving by AI agents [41][46].

Group 3: Future Directions and Collaboration
- Steinberger is considering collaboration with larger companies rather than starting a new venture, citing the resources needed to maintain OpenClaw and the benefits of working within established organizations [29][30].
- He emphasizes keeping OpenClaw open-source, potentially following a model similar to Chrome and Chromium, which balances open access with proprietary features [27][30].
- The collaboration aims to enhance the project's reach and impact, allowing more users to benefit from its capabilities while ensuring Steinberger remains involved in its development [32][64].

Group 4: Industry Insights and Predictions
- Steinberger predicts that 80% of applications will eventually disappear, replaced by APIs as AI agents take over user interactions, fundamentally changing the programming landscape [56][59].
- He believes that while AI will not completely replace programmers, it will significantly alter the role of developers, making it easier for more people to create tools through AI [61][63].
- The competition among major tech firms over OpenClaw reflects the industry's rapid evolution and the increasing importance of AI across applications [66].
A humanoid robot deploying drones that take to the sky and dive into the sea! A bit too cyberpunk, isn't it?
量子位· 2026-02-13 13:19
Core Viewpoint
- The article highlights advancements in humanoid robots and drones, focusing on China Telecom TeleAI's innovative work on integrated intelligent systems that enhance operational efficiency in complex environments [6][7].

Group 1: Technology and Innovation
- China Telecom's TeleAI has introduced a humanoid robot, TeleBot-M, and a versatile drone, TeleAqua-Bee, showcasing a significant leap in embodied intelligence [9][16].
- TeleBot-M features a lightweight design with a single arm and six-degree-of-freedom legs, enabling stable movement and interaction with drones [10].
- The robot is powered by a self-developed high-performance neural system, TeleBotOS, which ensures smooth operation and precise control even under heavy computational load [12].
- TeleBot-M's learning capabilities are enhanced by a simulation platform that allows it to adapt to complex environments through reinforcement learning [14][15].

Group 2: Functional Capabilities
- TeleAqua-Bee is a compact drone that operates both in the air and underwater, with a flight time of 10 minutes and a diving endurance of 30 minutes down to a depth of 10 meters [20][21].
- The TeleAqua family includes various models designed for specific tasks, such as the TeleAqua-H8, which can carry a 5 kg payload and operate for extended periods [24].

Group 3: Connectivity and Communication
- TeleAI's AI Flow architecture ensures seamless data transmission between robots and drones, even in challenging environments with limited connectivity [29][30].
- GVC technology compresses video data significantly by transmitting semantic information rather than raw pixels, enhancing communication capabilities [31][32].

Group 4: Applications and Future Prospects
- The integration of these technologies aims to facilitate operations in hazardous environments, such as disaster response and industrial inspection, where human presence is risky [40][41].
- The vision for the future is a network of intelligent agents that collaborate effectively, enhancing operational capabilities across various sectors [45].
Fine-tuned on 32K, handling a million tokens: 21x inference speedup, 10x peak GPU-memory savings, constant memory consumption
量子位· 2026-02-13 13:19
Submitted by the CoMeT team | 量子位 QbitAI. What happens when a large model tries to process an ultra-long document of one million tokens? The answer: memory explodes and computation collapses. Whether analyzing an entire codebase, digesting a book-length research report, or sustaining ultra-long multi-turn dialogue, an LLM's long-context capability is key to its path toward higher-level intelligence. Yet the Transformer architecture's inherent bottlenecks — computational complexity that scales quadratically with context length and a linearly growing KV cache — leave it helpless on ultra-long sequences: a "gold-devouring beast" that can neither compute nor store enough. To cope, existing schemes either compress the context, which is inherently lossy and inevitably drops information, or adopt recurrent mechanisms, whose models tend to be "forgetful", failing to retain key information spanning the whole text or to recall details that just occurred. △ After training on 32K contexts, CoMeT can do precise needle-in-a-haystack retrieval within 1M tokens, with inference speed and memory footprint far better than full-attention models. Having it both ways: the "collaborative memory" architecture. CoMeT's trick is that instead of solving everything with a single mechanism, it designs a dual-track collaborative memory system, so the model can both "remember firmly" and "see clearly". 1. Global Memory: a memory vault with a "gate". To address long-term forgetting ...
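The "memory vault with a gate" idea above can be sketched as a fixed-size slot memory with a learned write gate. Everything here is an illustrative assumption — the class name, the gating formula, and `Wg` are mine, not CoMeT's published equations — but it captures the mechanism's point: the memory footprint stays constant no matter how many tokens stream through.

```python
import numpy as np

class GatedGlobalMemory:
    """Toy fixed-size global memory with a write gate, loosely inspired
    by the "memory vault with a gate" description. Names, shapes, and
    the gating rule are assumptions for illustration only."""
    def __init__(self, slots, dim, seed=0):
        rng = np.random.default_rng(seed)
        self.M = np.zeros((slots, dim))            # memory never grows
        self.Wg = rng.normal(scale=0.1, size=(2 * dim, dim))

    def write(self, h):
        # Sigmoid gate in (0, 1), per slot and feature: how much of the
        # old memory to keep versus the incoming hidden state h.
        H = np.tile(h, (self.M.shape[0], 1))
        pre = np.concatenate([self.M, H], axis=1) @ self.Wg
        g = 1.0 / (1.0 + np.exp(-pre))
        self.M = g * self.M + (1.0 - g) * H        # gated blend
        return self.M

mem = GatedGlobalMemory(slots=8, dim=16)
rng = np.random.default_rng(1)
for _ in range(1000):                              # long-stream stand-in
    mem.write(rng.normal(size=16))
# mem.M is still (8, 16): constant memory regardless of stream length
```

Because each write is a convex combination of old memory and new input, the gate can protect long-range information from being overwritten — the failure mode the article attributes to plain recurrent schemes.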