机器之心
Nvidia and AMD May Raise Prices Starting This Month: the 5090 Could Go from $2,000 to $5,000
机器之心· 2026-01-01 03:42
Core Viewpoint
- GPU price increases by Nvidia and AMD are becoming a certainty, with adjustments expected to begin in early 2026 [1][3]

Group 1: Price Increase Details
- Nvidia and AMD plan to raise GPU prices gradually over the coming months, with AMD starting in January and Nvidia in February [3]
- The hike will first affect consumer-grade GPUs such as Nvidia's GeForce RTX 50 series and AMD's Radeon RX 9000 series; the flagship RTX 5090 is expected to rise from its official price of $1,999 to around $5,000 this year [4][6]

Group 2: Cost Structure and Drivers
- The primary driver is the rapid growth of memory costs within the GPU cost structure, with memory now accounting for over 80% of overall manufacturing cost [7][8]
- The procurement cost of the 16GB GDDR7 memory used in the RTX 5070 Ti surged from $65-80 in May 2025 to $210-260 by December 2025, making current GPU prices hard to maintain [8]

Group 3: Impact on AI and Other Products
- The increase will likely extend across all product lines, including GPUs for AI data centers and servers, as new contracts signed in 2026 will reflect the higher memory prices [6][9]
- Nvidia's flagship AI GPU H200, priced between $30,000 and $40,000, is also expected to see further increases this year due to rising memory costs [9]

Group 4: Market Reactions
- Asus has announced price increases on some products starting January 5, citing rising DRAM and storage costs driven by AI demand [10]
- Dell has previously indicated a price increase of around 30%, reflecting similar market conditions [14]
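As a quick sanity check on the memory-cost figures above, the quoted GDDR7 ranges imply roughly a 3.2x jump in seven months. This is straight arithmetic on the article's numbers; nothing else is assumed:

```python
# Quoted procurement ranges for the RTX 5070 Ti's 16GB GDDR7 (USD).
may_low, may_high = 65, 80      # May 2025
dec_low, dec_high = 210, 260    # December 2025

# Cost multiple at the low and high ends of the quoted ranges.
mult_low = dec_low / may_low     # 210 / 65  ~ 3.23x
mult_high = dec_high / may_high  # 260 / 80  = 3.25x

print(f"GDDR7 cost grew {mult_low:.2f}x-{mult_high:.2f}x in seven months")
```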
AAAI 2026 Oral | Giving Multistream Data a "Private Tutor Plus Outside Help": No Panic When Drift Arrives
机器之心· 2026-01-01 03:42
The authors of this work are En Yu, Jie Lu, Kun Wang, Xiaoyu Yang, and Guangquan Zhang, all from the Australian Artificial Intelligence Institute (AAII) at the University of Technology Sydney (UTS).

In real open, dynamic environments such as smart cities, social media, and the industrial IoT, data is often produced concurrently as multiple streams (Multistream). The real world is no perfect laboratory, however: these streams tend to be heterogeneous, their distributions shift in different ways, and they carry complex, asynchronous concept drift.

How can a model "specialize" in the characteristics of a single stream, "draw on collective strengths" by exploiting inter-stream correlations, and still adapt to distribution change?

The UTS research team proposes a new drift-aware collaborative-assistance mixture-of-experts learning framework, CAMEL (Collaborative Assistance Mixture of Experts Learning). CAMEL brings the mixture-of-experts (MoE) model into streaming learning: through a collaboration mechanism between "private experts" and "assistance experts", plus automated expert lifecycle management, it addresses the key problems of heterogeneous multistream learning. The work has been accepted as an Oral paper at AAAI 2026.

01 Introduction

In real application scenarios, data is usually produced as continuous and unbounded data streams ...
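The private/assistance split can be illustrated with a deliberately tiny online-learning sketch. This is a toy analogue of the mechanism described above, not the paper's CAMEL implementation: the expert classes, gating rule, and learning rates below are all invented for the example.

```python
import numpy as np

class ToyExpert:
    """One linear expert trained online with least-mean-squares."""
    def __init__(self, dim, lr=0.05):
        self.w = np.zeros(dim)
        self.lr = lr

    def predict(self, x):
        return self.w @ x

    def update(self, x, y):
        self.w += self.lr * (y - self.predict(x)) * x

class ToyCamelStream:
    """A stream with one private expert plus gated 'assistance' experts
    borrowed from other streams (a toy analogue of CAMEL's mechanism)."""
    def __init__(self, dim, assist_experts):
        self.private = ToyExpert(dim)
        self.assist = list(assist_experts)   # owned and trained by other streams
        self.gate = np.zeros(1 + len(self.assist))

    def predict(self, x):
        preds = np.array([self.private.predict(x)]
                         + [e.predict(x) for e in self.assist])
        g = np.exp(self.gate - self.gate.max())   # numerically stable softmax
        return preds @ (g / g.sum()), preds

    def learn(self, x, y):
        yhat, preds = self.predict(x)
        # Shift gate mass toward experts with low instantaneous error,
        # so assistance fades in or out as distributions drift.
        self.gate -= 0.1 * (preds - y) ** 2
        self.private.update(x, y)                 # only the private expert trains here
        return yhat
```

Under drift, an assistance expert that keeps tracking the current concept retains gate weight while stale experts are down-weighted; the full framework additionally automates expert creation and retirement, which this sketch omits.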
A New Breakthrough in "Video World Models": AI Generates 5 Continuous Minutes Without the Picture Breaking Down
机器之心· 2025-12-31 09:31
Core Insights
- The article discusses the rise of AI-generated video and the challenge of producing videos that not only look realistic but also obey the laws of the physical world, the focus of the "Video World Model" [2]
- The LongVie 2 framework is introduced as a solution that generates high-fidelity, controllable videos up to 5 minutes long, addressing the limitations of existing models [2][6]

Group 1: Challenges in Current Video Models
- Current video world models share a common failure mode: as generation length increases, controllability, visual fidelity, and temporal consistency all decline [6]
- Quality degradation in long video generation is nearly unavoidable, with visual decay and logical inconsistencies becoming significant bottlenecks [2][12]

Group 2: LongVie 2 Framework
- LongVie 2 employs a three-stage progressive training strategy to enhance controllability, stability, and temporal consistency [9][14]
- Stage 1 focuses on dense and sparse multimodal control, combining dense signals (such as depth maps) and sparse signals (such as keypoint trajectories) to provide stable, interpretable world constraints [9]
- Stage 2 introduces degradation-aware training, in which the model learns to generate stably despite imperfect inputs, significantly improving long-term visual fidelity [13]
- Stage 3 incorporates historical-context modeling, explicitly integrating information from previous segments to ensure smoother transitions and reduce semantic breaks [14]

Group 3: Performance Metrics
- LongVie 2 demonstrates superior controllability over existing methods, reaching state-of-the-art (SOTA) levels on various metrics [21][29]
- Ablation studies validate the three-stage training approach, showing improvements in quality, controllability, and temporal consistency across multiple indicators [26]

Group 4: LongVGenBench
- The article introduces LongVGenBench, the first standardized benchmark dataset for controllable long video generation, containing 100 high-resolution videos each over 1 minute long [28]
- The benchmark aims to enable systematic research and fair evaluation in long video generation [28]
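Stage 2's degradation-aware idea generalizes well beyond video: condition the model on deliberately corrupted inputs during training, but supervise it against clean targets, so that inference-time imperfections don't compound. Below is a minimal numeric sketch of that recipe; the Gaussian-noise-plus-dropout corruption and all names are assumptions for illustration, not LongVie 2's actual degradation model:

```python
import numpy as np

rng = np.random.default_rng(0)

def degrade(control, noise_std=0.1, drop_p=0.2):
    """Simulate imperfect conditioning: additive noise plus randomly
    dropped regions of the control signal (e.g. a depth map)."""
    noisy = control + rng.normal(0.0, noise_std, control.shape)
    keep = rng.random(control.shape) > drop_p   # drop ~20% of entries
    return noisy * keep

def degradation_aware_loss(model_forward, control, target):
    """The model sees the corrupted control signal but is supervised
    against the clean target, forcing robustness to bad inputs."""
    pred = model_forward(degrade(control))
    return float(np.mean((pred - target) ** 2))

# Toy 'model': identity over its conditioning signal.
control = np.ones((4, 4))
loss = degradation_aware_loss(lambda c: c, control, control)
# loss > 0: even a perfect copier is penalized until it learns to
# ignore the corruption, which is the training pressure Stage 2 adds.
```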
More Than Twice DeepEP! Infinigence AI's FUSCO Breaks Through the MoE Communication Bottleneck with "In-Flight Reshaping", Built for the Agent Boom
机器之心· 2025-12-31 09:31
Core Viewpoint
- The article discusses the growing adoption of the Mixture-of-Experts (MoE) architecture in large models such as ChatGPT and Gemini, and the communication and data-rearrangement challenges it creates, particularly under high concurrency and long contexts [1][2]

Group 1: MoE Architecture and Challenges
- MoE models introduce significant global distributed data exchange due to their sparse structure and expert parallelism, leading to performance bottlenecks in existing communication libraries such as DeepEP [2]
- Communication and data-rearrangement overhead grows with the scale of expert parallelism, making distributed data shuffling a critical performance bottleneck in training and inference [11][14]

Group 2: Introduction of FUSCO
- FUSCO, developed by Infinigence AI in collaboration with several universities, optimizes MoE communication by fusing communication with data-layout transformation, eliminating redundant rearrangement [3][4]
- Experiments show FUSCO improves communication performance by up to 3.84x over NCCL and 2.01x over DeepEP, with the gap widening as concurrent requests and text length increase [4][44]

Group 3: FUSCO Design and Functionality
- FUSCO performs data rearrangement during the communication process itself, maximizing GPU and network-bandwidth utilization while minimizing additional memory operations [16][27]
- Its communication interface is built around logical segments, allowing precise data access and placement without intermediate buffering or post-communication rearrangement [21][23]

Group 4: Performance Evaluation
- In tests on 64 GPUs, FUSCO demonstrated significant communication-efficiency gains across various traffic configurations, reducing communication overhead and improving load balancing [44][45]
- End-to-end, FUSCO improved training and inference performance by up to 1.39x over NCCL and 1.19x over DeepEP [47][48]
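The rearrangement overhead FUSCO attacks can be seen in a single-process caricature of MoE dispatch: before any all-to-all, tokens must be permuted into contiguous per-expert buckets, and the inverse permutation must be applied after the combine. The sketch below shows those two extra memory passes as explicit steps; it is illustrative only, since FUSCO's contribution is precisely to fuse this shuffling into the communication itself rather than run it as separate kernels:

```python
import numpy as np

rng = np.random.default_rng(0)
num_tokens, hidden, num_experts = 8, 4, 3

tokens = rng.normal(size=(num_tokens, hidden))
expert_of = rng.integers(0, num_experts, size=num_tokens)  # router decisions

# Dispatch-side rearrangement: gather tokens into per-expert buckets.
order = np.argsort(expert_of, kind="stable")
bucketed = tokens[order]                                   # extra memory pass 1
send_counts = np.bincount(expert_of, minlength=num_experts)

# ... an all-to-all would ship each bucket to its expert's rank here ...

# Combine-side rearrangement: invert the permutation to restore order.
inverse = np.empty_like(order)
inverse[order] = np.arange(num_tokens)
restored = bucketed[inverse]                               # extra memory pass 2

assert np.allclose(restored, tokens)
```

As expert parallelism scales, these gather/scatter passes and the all-to-all grow together, which is why fusing them, as the article describes, pays off at high concurrency.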
Just Announced: Zhi Hui Jun's Humanoid Robot Q1 Is Small Enough to Fit in a Backpack
机器之心· 2025-12-31 08:11
Core Viewpoint
- The article discusses the launch of Q1, billed as the world's first small-sized humanoid robot, by Zhiyuan Robotics, which aims to redefine personal robotics by combining full-size robot capabilities with a compact design [1][4][24]

Group 1: Product Features and Innovations
- Q1 is designed to retain the capabilities of full-sized humanoid robots while significantly reducing research costs and the barriers to physical interaction [6][15]
- The robot features whole-body control (WBC), coordinating many degrees of freedom for precise task execution [15][22]
- Q1 uses a modular hardware design, enabling easy replacement of parts and user customization through 3D printing [8][12]

Group 2: Market Position and Strategy
- Zhiyuan Robotics targets both academic research teams and the hardcore hobbyist market, providing open development tools and interfaces [8][24]
- The company's valuation has grown to 15 billion RMB within three years, and it has made strategic moves including acquiring a controlling stake in a listed company to pivot toward robotics [24][26]
- The launch of Q1 is expected to make humanoid robots more accessible to ordinary users, expanding the market for personal robotics [27][28]

Group 3: Technical Challenges and Achievements
- Developing small humanoid robots like Q1 poses significant challenges, including demanding integration requirements and advanced manufacturing processes [20][21]
- Q1's QDD (quasi-direct-drive) joints represent a breakthrough in miniaturization, achieving high torque density in a compact form [18][22]
- The company has reached a milestone of 5,000 units produced, indicating strong demand and successful scaling of production [26]
A 7B Diffusion Language Model Hits 1,000+ tokens/s on a Single Sample! Shanghai Jiao Tong University and Huawei Release LoPA
机器之心· 2025-12-31 08:11
Core Insights
- The article covers a breakthrough for diffusion large language models (dLLMs): a new decoding algorithm, LoPA (Lookahead Parallel Decoding), that significantly raises inference speed and parallelism [2][3][36]

Group 1: LoPA Algorithm Features
- LoPA achieves high parallelism, raising tokens generated per step (TPF) from 3.1 to 10.1, surpassing traditional methods [3][7]
- The algorithm is plug-and-play, requiring no retraining or fine-tuning of the model [8]
- It introduces a lookahead parallel decoding mechanism that actively explores different token-filling orders to avoid local optima [9]
- The accompanying LoPA-Dist system maximizes hardware utilization, supporting both CUDA and Ascend platforms [10]

Group 2: Performance Metrics
- LoPA reaches a single-sample throughput of 1073.9 tokens/s on the Huawei Ascend 910C platform, significantly outperforming baseline models [3][33]
- Integrated with D2F-Dream, LoPA achieves a TPF of 10.1 on the GSM8K benchmark, drastically reducing total inference steps [28][31]
- These results show the system converts algorithmic parallelism into substantial real-time acceleration, exceeding 1,000 tokens/s on dedicated engines [34]

Group 3: System Design and Optimization
- The LoPA-Dist distributed inference system employs a new branch-parallelism strategy that can be combined with existing tensor-parallelism methods [25]
- It is optimized per hardware platform: LoPA-Dist-NV targets low-latency scenarios, while LoPA-Dist-Ascend targets high-throughput serving [26]

Group 4: Future Directions
- The team plans to explore applying LoPA to other dLLM architectures, such as SDAR, to further advance efficient generative models [36]
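The lookahead idea, choosing which masked positions to commit first instead of following one fixed order, can be caricatured in a few lines. Everything below is invented for illustration (a brute-force search over fill orders with a hand-made confidence function); the real LoPA algorithm scores candidate branches with the dLLM itself and runs them in parallel:

```python
import itertools

positions = [0, 1, 2]  # masked positions left to fill

def confidence(pos, filled):
    """Toy confidence of the best token at `pos` given which positions
    are already filled; stands in for a dLLM's denoising predictions."""
    base = {0: 0.9, 1: 0.5, 2: 0.8}[pos]
    if pos == 1 and {0, 2} <= filled:
        return 0.95  # the middle token becomes easy once its neighbours are known
    return base

def order_score(order):
    """Product of step confidences when filling positions in `order`."""
    filled, score = set(), 1.0
    for pos in order:
        score *= confidence(pos, filled)
        filled.add(pos)
    return score

# Lookahead: explore fill orders instead of committing to left-to-right.
best = max(itertools.permutations(positions), key=order_score)
# Left-to-right scores 0.9 * 0.5 * 0.8 = 0.36; deferring position 1
# until its neighbours are fixed scores 0.9 * 0.8 * 0.95 ~ 0.68.
```

Escaping the fixed order is what lets the decoder avoid the local optima the article mentions, and committing several high-confidence positions per step is what drives TPF above one.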
Reshaping Voice Security: Shanghai Jiao Tong University and VUI Labs Develop a High-Performance, Highly Generalizable Speech Anti-Spoofing Large Model
机器之心· 2025-12-31 04:09
Against the backdrop of rapidly advancing generative AI, synthetic speech has become realistic enough to be indistinguishable from the real thing, and the accompanying risks of voice fraud and information forgery grow ever more severe. As a countermeasure, speech anti-spoofing has become a research focus in information security.

Current anti-spoofing models, however, face a serious "generalization challenge": many models that perform well on specific laboratory datasets suffer sharp drops in detection performance when confronted with generation algorithms never seen before in the real world. This "generalization bottleneck" severely limits the practical value of anti-spoofing technology in complex, ever-changing real-world scenarios.

To tackle this problem, the Auditory Cognition and Computational Acoustics Lab at Shanghai Jiao Tong University and VUI Labs (宇生月伴) jointly published new research proposing a data-centric research paradigm. The study probes the underlying logic connecting training-data distribution and model generalization, and through systematic empirical study and strategy optimization builds a speech anti-spoofing large model with both high performance and strong generalization.

From this perspective, the paper aims to explore two core questions through systematic empirical analysis:
Scaling laws:
Paper title: A Data-Centric Approach to Generalizable Speech Deepfake Detection
Paper link: https://arxiv.org/pdf/2512.18210
Core perspective: from single-source construction to multi-source aggregation ...
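Generalization in this field is usually reported as equal error rate (EER) on unseen spoofing algorithms: the operating point where false rejections of genuine speech and false acceptances of spoofed speech are equal. Below is a minimal EER computation over made-up scores; the metric is standard, but the function and scores are not from the paper:

```python
import numpy as np

def eer(genuine, spoof):
    """Sweep decision thresholds and return the error rate at the point
    where false-rejection and false-acceptance rates are closest."""
    best_gap, best_rate = float("inf"), 1.0
    for t in np.sort(np.concatenate([genuine, spoof])):
        frr = float(np.mean(genuine < t))   # genuine rejected as fake
        far = float(np.mean(spoof >= t))    # spoof accepted as genuine
        if abs(frr - far) < best_gap:
            best_gap, best_rate = abs(frr - far), (frr + far) / 2
    return best_rate

# Toy countermeasure scores: higher = more likely genuine.
genuine_scores = np.array([0.9, 0.8, 0.85, 0.7, 0.95])
spoof_scores = np.array([0.2, 0.3, 0.1, 0.4, 0.75])

print(f"EER = {eer(genuine_scores, spoof_scores):.1%}")
```

A model that generalizes poorly shows a low EER on familiar spoof types but a much higher EER on unseen generators, which is exactly the gap the paper's data-centric, multi-source training targets.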
Seeing Far, Thinking Clearly: The "AI China" Machine Heart 2025 Annual Awards Officially Unveiled
机器之心· 2025-12-31 04:09
Core Insights
- The article emphasizes the rapid evolution of large models in 2025, with advances in model architecture, training paradigms, and inference strategies pushing the boundaries of the technology [3]
- Next-generation models such as GPT-5 and Gemini 3 enhance core capabilities in understanding, generation, and reasoning, making the contours of general intelligence clearer [4]
- The article stresses the importance of identifying AI technologies that provide long-term value, focusing on their ability to reshape production methods and establish foundational capabilities over time [4][5]

Industry Developments
- The domestic AI landscape in 2025 is described as vibrant: Chinese large models are closing the gap with international leaders, surpassing them in some areas, while accelerating in open source, engineering, and application adaptation [4]
- The article presents the "AI China" Machine Heart 2025 Annual Selection, which records the advancement of Chinese artificial intelligence and outlines a promising future for technological innovation [6]

Rankings and Recognitions
- The article announces the top 10 companies/institutions with the strongest AI technical capabilities in 2025 [7]
- It lists the top 20 leading AI enterprises, showcasing the industry's key players [11][13]
- The best large models and large-model products are also recognized, with a top-20 list in each category [16][20]
NUS Professor Yang You on the Bottlenecks of Intelligence Growth: Perhaps This Is How We Will Achieve AGI?
机器之心· 2025-12-31 04:09
Core Insights
- The essence of intelligence growth lies not in architectural changes but in how computational power is converted into intelligence [6][7]
- The current paradigm (Transformer + massive compute) faces a bottleneck in fully exploiting growing computational resources, producing diminishing returns on pre-training [6][8]
- Future directions should focus on breakthroughs in foundational paradigms rather than mere engineering optimizations [8][9]

Group 1: Current State of Intelligence
- There is no clear definition of intelligence, and even top experts struggle to define AGI (artificial general intelligence) [15][16]
- The core of intelligence is framed as prediction and creation, with significant advances still needed to approach AGI [17][18]

Group 2: Bottlenecks in Intelligent Development
- The main bottleneck in intelligence growth is the inefficiency of converting computational power into usable intelligence [19][20]
- Pre-training remains the largest contributor to model intelligence and consumes the most computational resources [20][21]
- Current model architectures, particularly Transformers, cannot fully leverage the continuous growth in compute [33]

Group 3: Future Directions
- Higher-precision computing and more advanced optimizers are needed to enhance model intelligence [45]
- Exploring scalable model architectures and loss functions is crucial for better utilization of computational resources [45]
- The industry must find ways to "consume" more energy per unit of time and convert it effectively into intelligence [42][45]
An Angel Investor in Moore Threads: Forty Observations on Recent AI
机器之心· 2025-12-30 12:10
Core Viewpoint
- The article discusses the emergence of the AI economy, highlighting its rapid development and the structural changes it brings to industries and society as a whole [3][4]

Group 1: AI Economic Characteristics
- The AI industry exhibits non-linear, non-uniform growth: AI-related economic activity is advancing at unprecedented scale while traditional industrial activity keeps its usual pace [3]
- Industry leaders such as Elon Musk and Jensen Huang predict significant economic transformation from AI, including a potential fivefold increase in global GDP to $500 trillion [4]

Group 2: Scaling Law and AI Development
- The scaling law remains a foundational principle for large-model development, with current research focused on when and under what conditions it will converge [7]
- Key metrics indicate that the inference cost of large language models falls 90% every 12 months, while their capability doubles roughly every seven months [7]

Group 3: Digital Layer and Economic Impact
- A "digital layer" is proposed as crucial infrastructure for the AI economy, consisting of personal AI assistants and specialized AI agents that deepen understanding of consumers and producers [10][16]
- This digital layer is expected to significantly reduce transaction costs and improve economic efficiency by automating information collection, decision-making, and action [17][18]

Group 4: Employment and Workforce Changes
- The emergence of AI employees is anticipated, with organizations likely to see changes in management, recruitment, and collaboration between human and AI workers [30]
- A shift toward task-centered work systems is expected to raise economic efficiency by decomposing jobs into smaller, manageable tasks that AI can perform [26]

Group 5: Global Economic Dynamics
- The global distribution of GDP may shift as AI capabilities become more uniform across countries, potentially altering traditional international divisions of labor [35]
- Countries will need to assess their energy, computing-power, data, and algorithm capabilities to integrate AI effectively into their economies [38]
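The two trend figures quoted in Group 2 compound quickly; straight arithmetic on the article's numbers shows what they imply over a two-year horizon (no other assumptions):

```python
# Inference cost falls 90% every 12 months: a 10x cut per year,
# hence a 100x cut over 24 months.
relative_cost_after_24m = 0.1 ** 2            # 1% of today's cost

# Capability doubles every 7 months: ~3.28x per 12 months.
capability_multiple_per_year = 2 ** (12 / 7)

print(f"cost in 24 months: {relative_cost_after_24m:.0%} of today")
print(f"capability in 12 months: {capability_multiple_per_year:.2f}x")
```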