The humblebrag year-in-review from 2025's biggest AI winner, co-written by Hassabis and Jeff Dean
QbitAI · 2025-12-24 00:42
Core Insights
- The article emphasizes that 2025 marks a significant year for AI advancements, particularly in reasoning, collaboration, and scientific discovery, led by Google [1][3][9]

Group 1: AI Development and Integration
- Google has made substantial progress in reasoning, multi-modal understanding, model efficiency, and generative capabilities, significantly enhancing model performance [15][4]
- The Gemini series, particularly Gemini 3 Pro, has set new standards in multi-modal reasoning and achieved top scores in various benchmark tests, including a 23.4% record in MathArena Apex [18][19]
- AI has been deeply integrated into Google's core products, transforming from a tool to a practical asset for users [5][10][23]

Group 2: Generative Media and Creative Tools
- 2025 is highlighted as a transformative year for generative media, with AI providing unprecedented capabilities for video, image, audio, and virtual world generation [24][25]
- Google has collaborated with creative professionals to develop tools like Flow and Music AI Sandbox, enhancing creative workflows [25][21]

Group 3: Scientific and Mathematical Advancements
- AI has significantly contributed to advancements in life sciences, health, natural sciences, and mathematics, empowering researchers with new tools and resources [27][28]
- The AI system AlphaFold, which addresses protein folding, has been widely adopted by researchers globally, marking a milestone in scientific research [28]

Group 4: Quantum Computing and Physical World Research
- Google has made notable advancements in quantum computing and energy-efficient technologies, including the launch of a new TPU designed for the reasoning era [33][32]
- The company has also made strides in robotics and visual understanding, integrating AI agents into both physical and virtual environments [33]

Group 5: Addressing Global Challenges
- Google's AI-driven scientific progress is being applied to tackle critical global challenges, including climate resilience, public health, and education [36][38]
- The company has developed advanced forecasting models that enhance decision-making in various sectors, including weather prediction [36]

Group 6: Responsibility and Safety
- Google emphasizes the importance of combining research breakthroughs with responsibility and safety, continuously improving tools and frameworks to mitigate risks [42][43]
- The Gemini 3 model is noted as the safest model to date, undergoing comprehensive safety assessments [44]

Group 7: Collaboration and Open Ecosystem
- Google advocates for cross-sector collaboration to responsibly advance AI, establishing partnerships with leading AI labs and educational institutions [46][45]
- The company aims to continue promoting cutting-edge technology safely and responsibly for the benefit of humanity [47]
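A new king of AI coding arrives! MiniMax M2.1 takes the multilingual programming SOTA
QbitAI · 2025-12-23 13:40
By 克雷西, reporting from Aofeisi (凹非寺)
QbitAI | WeChat official account: QbitAI

MiniMax's latest flagship Coding & Agent model, M2.1, has just been released.

On one side there is fresh progress, with its Hong Kong Stock Exchange listing hearing passed; on the other, new models keep shipping at speed, and this one is SOTA.

This time it put out a hard-hitting report card: on Multi-SWE-bench, the leaderboard that measures multilingual software engineering ability, it scored 49.4% with only 10B activated parameters, surpassing top international competitors such as Claude Sonnet 4.5 and taking the global SOTA.

What it sets out to solve is the serious "lopsided skills" problem of earlier models. Lopsided here means that past models did reasonably well at Python scripts or web front-end pages, but their performance tended to fall off a cliff once backend architecture or low-level logic was involved.

M2.1's core advance is that it has finally broken through this problem and learned backend development conventions.

The release of M2.1 also shows that MiniMax is keeping up a high-frequency R&D cadence while moving its listing process forward.

Better at the low level: SOTA with 10B activated parameters

M2.1 turns its understanding of engineering context into deep adaptation to the developer toolchain. It can not only generate code but also work fluently with mainstream programming tools such as Cursor and Claude Code, carrying out precise fixes (Fix) in existing codebases or ...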
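The ultimate AI Werewolf showdown! GPT, Qwen, and DeepSeek brawl it out while top human players sweat
QbitAI · 2025-12-23 04:16
By 鹭羽, reporting from Aofeisi (凹非寺)
QbitAI | WeChat official account: QbitAI

I am speechless! I watched a game of Werewolf and ended up drenched in sweat...

Half an hour of nonstop intensity, impossible to look away: a disastrous opening in which a deep-cover wolf brazenly claims to be the Seer, a charging wolf who dies from talking too much, and special roles scoring big with a peaceful night every single night.

And now you tell me these players are all AI???

Leave it to Taobao to know how to have fun. In WhoisSpy.ai, the AI Werewolf free-for-all they recently put together, large models are tearing through the games.

"Teacher D" (D老师), Qwen, Kimi, and GLM each turn into scheming players locked in push-and-pull mind games, be like: ...

To be fair, although these agents seem to have very different personalities, every one of them is in fact a top-tier Werewolf player.

The barrier to entry is not high either: you can build one yourself.

Itching to try? (ahem)

No more suspense: this is an AI Werewolf competition I recently came across, run by Taobao no less, the first "University Students vs. Developers Showdown".

In more detail, Taobao put out a call inviting university students and AI developers to bring their own agents for a real head-to-head contest, to see whose agent thinks more rigorously and reasons through the logic better.

All-rounder Kimi: combat power MAX, sixth sense Next Level.

Honest guy DeepSeek: I may be just a villager, and I may only know how to coast, but I believe in following the right people down the right path. Let's go!

Comedian Qwe ...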
Training 100 million Gaussians on a single GPU to reconstruct a 25 km² city: a CPU "add-on" breaks the 3DGS memory wall
QbitAI · 2025-12-23 04:16
Core Viewpoint
- The article discusses the introduction of CLM (CPU-offloaded Large-scale 3DGS training), a system that allows for city-scale 3D reconstruction using a single consumer-grade GPU, specifically the RTX 4090, by offloading memory-intensive parameters to CPU memory, significantly lowering hardware requirements for large-scale neural rendering [1][21].

Group 1: 3D Gaussian Splatting (3DGS) Challenges
- 3DGS has become a crucial technology in neural rendering due to its high-quality output and rendering speed, but it faces significant challenges when applied to complex scenes like urban blocks, primarily due to GPU memory limitations [2].
- A high-precision 3DGS model typically contains tens of millions to over a hundred million Gaussian points, with each point requiring substantial memory for parameters, gradients, and optimizer states. Even high-end GPUs like the RTX 4090, with 24GB of memory, can only handle about 15-20 million points, which is insufficient for city-scale scenes [2][3].

Group 2: CLM Design Principles
- CLM is based on the observation that only a small fraction of Gaussian points are actively used during each rendering pass, with less than 1% of points accessed in large scenes [3].
- The system design of CLM involves dynamically loading Gaussian parameters from CPU memory as needed, rather than keeping all parameters in GPU memory [4].

Group 3: Key Mechanisms of CLM
- **Attribute Segmentation**: CLM retains only "key attributes" (10 parameters) necessary for visibility checks in GPU memory, while the remaining 80% of "non-key attributes" are stored in CPU memory and loaded on demand [6][7].
- **Pre-rendering Visibility Culling**: Unlike traditional methods, CLM calculates visible Gaussian point indices before rendering, reducing unnecessary GPU computations and memory usage by only loading visible points from CPU memory [9][10].
- **Efficient CPU-GPU Collaboration**: CLM employs a multi-layered design to mitigate data transfer delays, including micro-batching, caching mechanisms, and intelligent scheduling to maximize efficiency and minimize communication overhead [12][13][14][15].

Group 4: Performance Results
- CLM technology significantly increases model size, allowing for the training of 102.2 million Gaussian points on the "MatrixCity BigCity" dataset, a 6.7-fold increase compared to traditional methods, which maxed out at 15.3 million points [16].
- The quality of reconstruction improves with more parameters, achieving a PSNR of 25.15dB for the 102.2 million point model, compared to 23.93dB for the smaller model [18].
- Despite communication overhead, CLM maintains a training throughput of 55% to 90% of the enhanced baseline on the RTX 4090, and up to 86% to 97% on the slower RTX 2080 Ti [19].

Group 5: Broader Implications
- CLM represents a significant advancement in addressing deployment bottlenecks in 3DGS training, integrating CPU resources into the training process without the need for multi-GPU setups, thus providing a cost-effective solution for large-scale scene reconstruction [21].
- The growing demand for efficient and low-cost 3D reconstruction tools in applications like digital twins and large-scale map reconstruction highlights the importance of CLM's approach in optimizing existing computational resources [21].
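
The attribute split and pre-rendering culling described in Group 3 can be illustrated with a toy sketch. This is not the CLM implementation: it assumes the common 3DGS layout of 59 floats per Gaussian (position 3, scale 3, rotation 4, opacity 1, degree-3 spherical-harmonic colour 48), whereas the summary states only the 10 key parameters and the roughly 80% non-key share. The names `make_scene`, `visible_indices`, and `load_visible` are hypothetical, the cone test stands in for real frustum culling, and plain Python lists stand in for GPU and CPU buffers.

```python
import math
import random

# Assumed layout of a standard 3DGS Gaussian with degree-3 spherical harmonics.
KEY_DIMS = 3 + 3 + 4      # position, scale, rotation quaternion (kept "on the GPU")
NON_KEY_DIMS = 1 + 48     # opacity + SH colour coefficients (kept "in CPU memory")


def make_scene(n, seed=0):
    """Return (key, non_key): per-Gaussian attribute lists for a toy scene."""
    rng = random.Random(seed)
    key, non_key = [], []
    for _ in range(n):
        position = [rng.uniform(-100.0, 100.0) for _ in range(3)]
        scale = [0.1, 0.1, 0.1]
        rotation = [1.0, 0.0, 0.0, 0.0]  # identity quaternion
        key.append(position + scale + rotation)
        non_key.append([rng.gauss(0.0, 1.0) for _ in range(NON_KEY_DIMS)])
    return key, non_key


def visible_indices(key, cam_pos, cam_dir, max_dist, min_cos):
    """Hypothetical cone test on positions only; cam_dir must be unit length."""
    visible = []
    for i, attrs in enumerate(key):
        offset = [attrs[k] - cam_pos[k] for k in range(3)]
        dist = math.sqrt(sum(c * c for c in offset))
        if dist == 0.0 or dist > max_dist:
            continue
        cos = sum(offset[k] * cam_dir[k] for k in range(3)) / dist
        if cos >= min_cos:
            visible.append(i)
    return visible


def load_visible(key, non_key_cpu, idx):
    """Gather full rows for visible Gaussians only (stands in for the CPU-to-GPU copy)."""
    return [key[i] + non_key_cpu[i] for i in idx]


key_gpu, non_key_cpu = make_scene(10_000)
idx = visible_indices(key_gpu, (0.0, 0.0, 0.0), (0.0, 0.0, 1.0),
                      max_dist=50.0, min_cos=0.5)
batch = load_visible(key_gpu, non_key_cpu, idx)
print(f"visible: {len(idx)} of {len(key_gpu)}")
print(f"non-key share of parameters: {NON_KEY_DIMS / (KEY_DIMS + NON_KEY_DIMS):.0%}")
```

The point of the design is that the visibility test touches only the small key-attribute buffer, so the bulk of each Gaussian's data is read from CPU memory only for the indices that survive culling.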
The first year of agents in production, with Agent Infra as a key link | A conversation with Tencent Cloud & Dify
QbitAI · 2025-12-23 04:16
Core Viewpoint
- The year 2025 is anticipated to be the "Agent Year," marking a significant shift in the industry towards practical applications of Agent technology [1][2].

Group 1: Development and Challenges of Agents
- The Agent technology has transitioned from a nascent stage to practical engineering applications throughout the year [3][7].
- Key challenges in the implementation of Agents include the need for a robust engineering approach to manage complex systems and the importance of Agent Infrastructure (Infra) [6][21].
- The industry recognizes the value of Agents as they effectively address real-world problems, moving from theoretical discussions to tangible applications [6][12].

Group 2: Perspectives from Industry Leaders
- Industry experts highlight a clear divide between traditional narratives from Silicon Valley and practical applications seen in smaller businesses, indicating a shift towards realism in Agent development [8][10].
- The emergence of AI coding tools is noted as a significant development, changing software engineering paradigms and serving as a universal interface for Agents [7][34].
- The consensus among experts is that the capital market is seeking new organizational methods, as the previous internet era's benefits have been largely exhausted [12][13].

Group 3: Engineering and Infrastructure
- The concept of Agent Infra is crucial for managing the uncertainties inherent in Agent systems, with a focus on creating a safe and effective operational environment [21][22].
- The development of safety sandboxes and observability tools is essential for addressing the risks associated with autonomous Agent operations [22][23].
- The distinction between essential complexity and incidental complexity in enterprise problem-solving is emphasized, with a focus on building a common subset of solutions for various challenges [27][28].

Group 4: Future Trends and Directions
- Future developments in Agent Infra are expected to focus on ensuring safe and reliable operations while optimizing the intelligence of Agents through continuous data utilization [38][39].
- The integration of memory management and semantic context is highlighted as a key area for enhancing Agent capabilities [40].
- The industry anticipates a significant transformation in mobile development ecosystems as Agents become mainstream, necessitating a shift in development methodologies and collaborative practices [41][44].
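QbitAI is hiring editors and writers
QbitAI · 2025-12-23 04:16
By the editorial team, reporting from Aofeisi (凹非寺)
QbitAI | WeChat official account: QbitAI

The AI boom is still surging, but if you do not yet know how to take part... why not come to QbitAI?

We are a content platform centered on tracking new developments in AI. After 8 years of accumulation, we have top-tier influence, broad and well-recognized industry resources, and one of the best positions for observing and learning at the forefront of the era.

We are currently hiring in three directions, and hope you are (or can become) a content expert in one of them:
- AI industry: covers innovation at the infrastructure layer, including chips, AI Infra, and cloud computing;
- AI finance: covers venture investment and earnings reports in AI, tracking capital movements along the industry chain;
- AI products: covers progress of AI in applications and hardware devices.

By joining us, you can gain:
- A place at the crest of the AI wave: first-hand exposure to the latest technologies and products in AI, and a complete framework for understanding AI.
- Hands-on use of new AI tools: apply new AI technologies and tools to your work to improve efficiency and creativity.
- Personal influence: by writing exclusive original con ...

Job details are as follows. All positions are full-time and based in Zhongguancun, Beijing. Roles at every level of ability are open, and you are welcome to apply in line with your background and experience.

Positions are open to:
- Experienced hires: editor, lead writer, and editor-in-chief levels, with the role matched to ability;
- Campus hires: fresh graduates; internships are accepted and can convert to full-time.

AI industry track, responsibilities: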
We visited China's top-100 Grade-A tertiary hospitals and found that 40% chose the same AI company
QbitAI · 2025-12-23 03:01
Core Viewpoint
- The article discusses the challenges and opportunities in the medical AI sector, particularly focusing on the company Yunzhisheng, which has established itself as a leading player in the field by successfully integrating AI solutions into hospitals and demonstrating significant operational efficiency improvements [1][9][64].

Group 1: Medical AI Challenges
- Patients increasingly consult AI chatbots before visiting doctors, leading to communication challenges in clinical settings [2][3].
- The high "hallucination rate" of general AI models in medical contexts can reach up to 40%, raising concerns about their reliability [4][5].
- Medical AI must navigate stringent requirements for stability, acceptance across various healthcare systems, and the high costs associated with medical errors [12][18][19].

Group 2: Yunzhisheng's Position
- Approximately 40% of top-tier hospitals in China have adopted Yunzhisheng's medical AI solutions, indicating its strong market presence [9][22].
- The company has deployed its solutions in 400 hospitals, with a nearly 90% direct usage rate of generated medical records, significantly reducing doctors' time spent on documentation [22][23][25].
- Yunzhisheng's medical AI solutions are designed to integrate seamlessly into existing workflows, enhancing efficiency without adding to the workload of healthcare professionals [58][60].

Group 3: Technological Advancements
- Yunzhisheng's latest model, "Shanhai·Zhimed 5.0," employs a dual-core system capable of processing structured information and multimodal inputs, enhancing its diagnostic capabilities [34][36].
- The model's architecture includes a three-layer data paradigm that improves its understanding of medical contexts and reduces hallucination rates to below 3% [42].
- The company has consistently ranked at the top of medical AI evaluation platforms, demonstrating its technological superiority [45][46].

Group 4: Business Growth and Market Trends
- Yunzhisheng's medical business revenue reached 0.70 billion, a 22.3% increase year-on-year, highlighting its growth trajectory [66].
- The average revenue per medical client has more than doubled, indicating a significant increase in customer value [66].
- The broader market for medical AI is expected to grow, with increasing investments and policy support aimed at integrating AI into healthcare workflows [79][82][86].
Jackson Yee's (易烊千玺) green Huawei phone has really gone AI
QbitAI · 2025-12-23 00:15
Core Viewpoint
- The article discusses the launch of Huawei's nova 15 series, highlighting its innovative features, design, and pricing strategy, which aims to attract a younger audience while enhancing AI capabilities in photography and communication.

Product Overview
- The nova 15 series includes three models: standard, Pro, and Ultra, all equipped with HarmonyOS 6 [4]
- The Ultra and Pro versions feature a horizontal stacked design and dual star lens module, resembling "two big eyes" [5]
- The series introduces the Kirin 9 series chip for the first time in the Pro and Ultra versions, aligning performance with Mate and Pura series [6]
- Pricing starts at 4199 yuan for the Ultra version, 3499 yuan for the Pro version, and 2699 yuan for the standard version [7][10]

AI Capabilities
- The nova 15 series incorporates advanced AI features, particularly in photography, with the Ultra and Pro versions featuring a dual red maple imaging system [14]
- Color accuracy has improved by 120%, and spatial resolution has increased by 100,000 times due to the red maple original color lens [17]
- AI assists in photo composition and includes a unique "color dipping" feature that allows users to apply colors and styles from online images to their own photos [21][22]
- The series also offers AI-driven photo editing capabilities and one-click content creation [25][28]

Communication Features
- The phone includes a call summary feature that automatically generates key points after a call, which can be synced to a memo [31]
- Dual-direction call noise reduction is optimized for noisy environments like subways and malls [33]
- A family fraud prevention feature allows family members to share risk information and assist in handling suspicious calls [35]

Design and Specifications
- The nova 15 series maintains a recognizable design with a focus on youthfulness and high identification [39]
- The Ultra version features a 2.5D flat screen and is available in four colors: "vibrant green," "good match purple," "zero degree white," and "fantasy night black" [40]
- Both the Ultra and Pro versions come with a 6500mAh battery and support for 50W wireless fast charging [42][51]
- The Ultra version has a thickness of 6.8mm and weighs approximately 209g [43]
- The rear camera system includes three 50MP RYYB lenses, supporting variable aperture and optical stabilization [45]
- The series is equipped with Kunlun glass and supports IP68 & IP69 dust and water resistance [47]

Performance Improvement
- According to Huawei's lab tests, the overall performance of the nova 15 series has improved by 62% compared to the previous generation [53]
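Just before ringing the IPO bell, Zhipu takes the open-source coding model SOTA overnight
QbitAI · 2025-12-23 00:15
On benchmarks such as AIME 25 and Humanity's Last Exam (HLE), GLM-4.7 scores above GPT-5.1; its SWE-Bench score reaches 73.8% (+5.8%), a new high for open-source models.

By 鱼羊 and henry, reporting from Maihaosi (麦蒿寺)
QbitAI | WeChat official account: QbitAI

With 2025 counting down, the stream of new SOTA models shows no sign of slowing.

Overnight, the coding SOTA crown changed hands. The new holder is open-sourced at launch and once again comes from a Chinese large-model company: Zhipu AI's GLM-4.7.

In this update, the technical report is all about Coding, Coding, and more Coding.

The most visible effect of the improved capability: the official demo shows it writing a Plants vs. Zombies game with ease.

Both the official chatbot and the API are already live, so you can try it online right now.

Demo time

On front-end generation quality, GLM-4.7 shows a clear upgrade: cleaner page structure and a clearer component hierarchy. Compared with GLM-4.6, the output looks more like a modern web UI, with better-looking elements.

All in all, with this release, the Christmas and New Year holiday mood is fully in place (doge).

In expressing complex geometric structures and spatial relationships, GLM-4.7 maintains good structural consistency and detail stability. The quality of generated 3D assets has also improved markedly.

For slides (PPT) and visual materials, GLM-4.7 produces a clear heading hierarchy, elements ...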
Why are agents always dragons in demos but worms in real-world use?
QbitAI · 2025-12-22 09:30
Core Viewpoint
- The article discusses the limitations of AI agents in real-world applications compared to their impressive demonstrations, emphasizing that adaptability is a key factor for improvement [1].

Summary by Sections

Definition and Functionality of Agents
- Agents are defined as AI systems that can plan, utilize tools (such as search engines and databases), and remember information to complete complex tasks independently [3].

Adaptability Framework
- The core bottleneck in current agent systems is adaptability, specifically how models adjust their behavior based on feedback signals [6].
- A 2x2 classification framework is proposed to categorize existing adaptation methods into four paradigms based on two dimensions: who is optimized (the agent or the tools) and where the feedback signal comes from (tool execution results or agent output evaluations) [7][8][9].

Four Paradigms of Adaptation
- **A1 Paradigm**: Agents learn from feedback based on tool execution, such as whether code runs successfully [10].
- **A2 Paradigm**: Uses the agent's final output as the optimization signal, exemplified by models like DeepSeek-R1 that train reasoning capabilities through reinforcement learning [11].
- **T1 Paradigm**: Tools are pre-trained independently and then called by the agent, allowing for plug-and-play functionality [12].
- **T2 Paradigm**: Tools optimize themselves based on the agent's output, creating a symbiotic relationship [13].

Benefits of Classification
- This classification helps developers avoid trial and error when improving AI capabilities, allowing for targeted adaptations based on specific needs [15].
- It also clarifies trade-offs: modifying AI (A1/A2) is flexible but costly, while modifying tools (T1/T2) is cheaper but limited by the AI's inherent capabilities [16].

Key Findings on Data Efficiency
- The T2 paradigm demonstrates significantly higher data efficiency compared to the A2 paradigm. For instance, Search-R1, which uses A2, requires approximately 170,000 training samples, while T2 only needs 2,400 samples, achieving comparable results [18][19][20].

Frontiers in Adaptability Research
- The article identifies four cutting-edge directions for agent adaptability research:
- **Co-Adaptation**: Aims for agents and tools to optimize together within the same learning cycle, presenting challenges in credit assignment [21].
- **Continual Adaptation**: Addresses the need for agents to continuously learn new skills without forgetting old ones in a changing environment [23].
- **Safe Adaptation**: Highlights concerns that large models may erode safety measures established during supervised fine-tuning, making them more vulnerable to attacks [25].
- **Efficient Adaptation**: Focuses on resource-constrained scenarios, discussing techniques like LoRA and FlashRL for efficient learning [27].

Additional Resources
- The article mentions that a GitHub repository has been opened to continuously collect related papers and resources, serving as a guide for developers building agent systems [29].
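
The 2x2 framework above can be written down as a small lookup table, which makes the two axes and the data-efficiency comparison concrete. This is an illustrative sketch, not code from the article or its repository: the names `Target`, `Signal`, `Paradigm`, `classify`, and `sample_ratio` are hypothetical, and placing T1 in the tool-execution cell is an assumption, since the summary says only that T1 tools are pre-trained independently of the agent.

```python
from dataclasses import dataclass
from enum import Enum


class Target(Enum):
    """Who gets optimized."""
    AGENT = "agent"
    TOOL = "tool"


class Signal(Enum):
    """Where the feedback signal comes from."""
    TOOL_EXECUTION = "tool execution result"
    AGENT_OUTPUT = "agent output evaluation"


@dataclass(frozen=True)
class Paradigm:
    name: str
    target: Target
    signal: Signal
    note: str


# Cell assignments follow the summary; T1's cell is an assumption (see lead-in).
PARADIGMS = {
    (Target.AGENT, Signal.TOOL_EXECUTION): Paradigm(
        "A1", Target.AGENT, Signal.TOOL_EXECUTION,
        "agent learns from tool feedback, e.g. whether code runs"),
    (Target.AGENT, Signal.AGENT_OUTPUT): Paradigm(
        "A2", Target.AGENT, Signal.AGENT_OUTPUT,
        "agent is optimized on its final output"),
    (Target.TOOL, Signal.TOOL_EXECUTION): Paradigm(
        "T1", Target.TOOL, Signal.TOOL_EXECUTION,
        "tool is pre-trained independently, then plugged in"),
    (Target.TOOL, Signal.AGENT_OUTPUT): Paradigm(
        "T2", Target.TOOL, Signal.AGENT_OUTPUT,
        "tool is optimized from the agent's output"),
}


def classify(target: Target, signal: Signal) -> Paradigm:
    """Look up the paradigm for a (who is optimized, signal source) pair."""
    return PARADIGMS[(target, signal)]


def sample_ratio(baseline_samples: int, efficient_samples: int) -> float:
    """How many times more training samples the baseline needs."""
    return baseline_samples / efficient_samples


paradigm = classify(Target.TOOL, Signal.AGENT_OUTPUT)
print(paradigm.name, "-", paradigm.note)
print(f"A2 vs T2 training samples: about {sample_ratio(170_000, 2_400):.0f}x")
```

Using the sample counts quoted in the summary, the ratio function gives a rough sense of the gap between the A2 and T2 examples; it says nothing about compute cost per sample, which the summary does not report.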