Scaling Law
Altman sounds a Code Red: have large models hit a dead end?
36Kr · 2025-12-03 04:31
Yesterday, OpenAI CEO Sam Altman issued an internal memo declaring a company-wide "Code Red" state of emergency. On the surface, this is OpenAI's emergency response to two formidable competitors, Google and Anthropic. The deeper problem is that OpenAI is running into a technical dilemma the entire industry cannot avoid: training costs keep soaring and models keep growing, yet performance gains are increasingly marginal.

According to Stanford's 2025 AI Index Report, between 2019 and 2022 each 10x increase in training cost bought an average 25%-35% improvement on mainstream benchmarks. After 2023, the same 10x in cost bought only 10%-15%. Worse still, since 2024 even doubling training cost often yields less than a 5% gain; the return on investment is falling off a cliff. Performance across the leading labs' models is converging, as if they had collectively hit some invisible ceiling.

This raises a question fiercely debated in AI academia and industry alike: have large language models reached a dead end? According to the semiconductor analysis firm SemiAnalysis, since the release of GPT-4o in May 2024, OpenAI's top researchers have not successfully completed a single large-scale, full pre-training run. That would mean GPT-5 and GPT-4o are not separated by a genuine ...
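The diminishing returns described above match the familiar power-law shape of scaling laws, where loss falls as a power of compute. A minimal sketch of that intuition (the constants `a` and `b` below are made-up illustrative values, not figures from the Stanford report):

```python
# Illustrative scaling-law sketch: loss L(C) = a * C**(-b).
# a and b are assumed constants for illustration only, not fitted
# to any real model family.
a, b = 10.0, 0.05

def loss(compute):
    return a * compute ** (-b)

# Each 10x step in compute shrinks loss by the same *ratio* (10**-b),
# so the absolute improvement per 10x step keeps getting smaller.
for c in [1e21, 1e22, 1e23, 1e24]:
    print(f"compute={c:.0e}  loss={loss(c):.3f}")
```

Because the gain per step is a constant ratio of an ever-smaller loss, the absolute improvement bought by each additional 10x of spend shrinks monotonically, which is one simple way to read the 25%-35% vs 10%-15% numbers above.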
ChatGPT's third birthday, and Google has prepared its "funeral"
Huxiu APP · 2025-12-02 23:55
This article comes from the WeChat public account 新智元 (AI Era); author: 新智元; editors: 好困, 定慧; header image: AI-generated.

Rewind to three years ago today, December 1, 2022: a relatively quiet Wednesday. A San Francisco nonprofit lab called OpenAI quietly released a research preview named "ChatGPT". No grand launch event, no Jobs-style keynote, just a plain chat box. Nobody knew at the time that this chat box would go on to change the world completely.

ChatGPT is long past being the chatbot that occasionally botched a math problem; it, its successors, and its rivals have become the "oxygen" humanity breathes in the digital AI world. Yet alongside the technology's exponential leaps, a hard-to-name collective anxiety is spreading around the globe, touching everyone.

[Image: what ChatGPT looked like three years ago]

Three years later, looking back from today, December 1, 2025, the world has been thoroughly remade. In these three years, around ChatGPT and generative AI, we have witnessed unprecedented fervor entangled with panic: Silicon Valley charges ahead and Wall Street chases the profits just as feverishly, while ordinary people and workers across industries are filled with anxiety and unease.

As The Atlantic put it, we are living in "the world ChatGPT built": an era of instability in which everyone is nervously waiting for the next shoe to drop. Young people ...
From open-source champion to challenging the world's best: DeepSeek's new models offer an answer
Guancha (观察者网) · 2025-12-02 11:38
Core Insights
- DeepSeek has released two official models, DeepSeek-V3.2 and DeepSeek-V3.2-Speciale: the former balances reasoning ability and output length for everyday use, while the latter strengthens long-form reasoning and mathematical proof [1][2][4]
- The open-source large-model ecosystem has grown significantly, and DeepSeek's advances challenge closed-source models, particularly in light of the recent release of Google Gemini 3.0, which has raised the competitive bar [2][15]
- DeepSeek's models aim to bridge the gap between open-source and closed-source models through innovative architecture and training strategies, despite having far less compute than the industry giants [8][15][16]

Model Performance
- DeepSeek-V3.2 performs on par with GPT-5 and slightly below Google's Gemini 3 Pro on reasoning tasks [6][7]
- The Speciale version outperforms Gemini 3 Pro on several reasoning benchmarks, including the American Invitational Mathematics Examination (AIME) and the Harvard-MIT Mathematics Tournament (HMMT) [7][8]
- Speciale is designed for rigorous mathematical proof and logical verification, making it a specialized tool for complex reasoning tasks [6][8]

Technological Innovations
- DeepSeek employs a novel DSA (DeepSeek Sparse Attention) mechanism to optimize computational efficiency, enabling effective long-context processing without sacrificing performance [8][12]
- "Interleaved Thinking" has been integrated into DeepSeek's models, tightening the loop between reasoning and tool use, which is crucial for AI agents [9][12]
- The focus on agent capabilities marks a strategic shift toward actionable AI, moving beyond chat-based interaction to complex task execution [13][14]

Industry Context
- The competitive landscape is shifting; DeepSeek acknowledges a widening gap between open-source and closed-source models on complex tasks [15][16]
- DeepSeek plans to address its limitations by increasing pre-training compute and optimizing model efficiency, indicating a clear path for future improvement [16][19]
- The release of DeepSeek-V3.2 is seen as a significant achievement for the open-source community, suggesting the gap with leading closed-source models is narrowing [16][19]
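The details of DeepSeek's DSA are not given in this summary. As a rough intuition for sparse attention in general, here is a minimal top-k sketch in NumPy (illustrative only, not DeepSeek's actual mechanism): each query attends only to its k highest-scoring keys instead of the full context, cutting the work spent on irrelevant positions.

```python
import numpy as np

# Minimal top-k sparse attention sketch (illustrative; NOT DeepSeek's DSA).
# Each query keeps only its k highest-scoring keys before the softmax.
def topk_sparse_attention(Q, K, V, k):
    scores = Q @ K.T / np.sqrt(Q.shape[-1])          # (n_q, n_k) raw scores
    # kth largest score per query row; everything below it is masked out.
    kth = np.partition(scores, -k, axis=-1)[:, -k:].min(axis=-1, keepdims=True)
    masked = np.where(scores >= kth, scores, -np.inf)
    weights = np.exp(masked - masked.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))    # 4 queries, head dim 8
K = rng.normal(size=(16, 8))   # 16 keys
V = rng.normal(size=(16, 8))
out = topk_sparse_attention(Q, K, V, k=4)
print(out.shape)  # (4, 8)
```

With k equal to the full key count this reduces to ordinary dense attention; the savings come from choosing k much smaller than the context length.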
From chiplets to racks: open interconnect in the large-model wave
Semiconductor Industry Observer (半导体行业观察) · 2025-12-02 01:37
Core Insights
- The article emphasizes the importance of open interconnect standards such as UCIe, CXL, UAL, and UEC in the AI infrastructure landscape, highlighting their roles in strengthening hardware ecosystems and meeting the challenges of large-model training and inference [2][10]

Group 1: Background and Evolution
- The CXL Alliance was established in March 2019 to tackle heterogeneous XPU programming and memory-bandwidth expansion, with Alibaba as a founding member [4]
- The UCIe Alliance was formed in March 2022 to create an open die-to-die interconnect standard, with Alibaba as the only board member from mainland China [4]
- The UEC Alliance was established in July 2023 to address the inefficiencies of traditional Ethernet in AI and HPC environments, with Alibaba joining as a General member [4]
- The UAL Alliance was formed in October 2024 to meet the growing demand for scale-up networks driven by larger models and longer inference contexts, with Alibaba joining as a board member [4]

Group 2: Scaling Laws in AI Models
- The article outlines three phases of scaling laws: pre-training scaling, post-training scaling, and test-time scaling, with focus shifting toward test-time scaling as models move from development into application [5][8]
- Test-time scaling introduces new challenges for AI infrastructure, particularly around latency and throughput requirements [8]

Group 3: UCIe and Chiplet Design
- UCIe is positioned as a critical standard for chiplet interconnects, addressing cost, performance, yield, and process-node optimization in chip design [10][11]
- Chiplet-based designs offer improved yield, per-chiplet process-node optimization, cross-product reuse, and market scalability [14][15][17]
- UCIe's protocol stack is designed for the specific needs of chiplet interconnects: low latency, high bandwidth density, and support for various packaging technologies [18][19][21]

Group 4: CXL and Server Architecture
- CXL aims to redefine server architectures by enabling memory pooling and extending host memory capacity through CXL memory modules [29][34]
- Key CXL features include memory pooling, a unified memory space, and host-to-host communication, which improve AI infrastructure efficiency [30][35]
- CXL still faces challenges, such as latency stemming from PCIe PHY limitations and the complexity of implementing CXL.cache [34][35]

Group 5: UAL and Scale-Up Networks
- UAL is designed to support scale-up networks, allowing efficient memory semantics with reduced protocol overhead [37][43]
- The UAL protocol stack comprises protocol, transaction, data-link, and physical layers, enabling high-speed communication and memory operations [43][45]
- UAL's architecture aims to provide a unified memory space across multiple nodes, addressing the distinctive communication needs of large AI models [50][51]
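The yield advantage of chiplets mentioned above can be illustrated with the standard Poisson defect-density yield model. The defect density and die areas below are assumed illustrative numbers, not figures from the article:

```python
import math

# Simple Poisson yield model: Y = exp(-A * D), for die area A in cm^2
# and defect density D in defects/cm^2. All numbers are illustrative
# assumptions, not data from the article.
D = 0.1  # assumed defects per cm^2

def die_yield(area_cm2, defect_density=D):
    return math.exp(-area_cm2 * defect_density)

mono = die_yield(8.0)   # one monolithic 8 cm^2 die
chip = die_yield(2.0)   # one 2 cm^2 chiplet (same logic split into 4 pieces)

# Chiplets are tested individually before packaging (known-good-die),
# so a defect scraps only a small 2 cm^2 piece, not the whole 8 cm^2.
scrap_mono = 8.0 * (1 - mono)        # silicon lost per 8 cm^2 monolithic attempt
scrap_chip = 4 * 2.0 * (1 - chip)    # silicon lost per four 2 cm^2 chiplets

print(f"monolithic yield {mono:.3f}, per-chiplet yield {chip:.3f}")
print(f"scrap per 8 cm^2 fabricated: monolithic {scrap_mono:.2f} cm^2, chiplets {scrap_chip:.2f} cm^2")
```

The smaller die exponentially improves per-piece yield, which is the core of the cost argument for chiplets that UCIe-style die-to-die interconnects make practical.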
ChatGPT's third birthday, and Google has prepared its "funeral"
36Kr · 2025-12-01 07:20
Core Insights
- The launch of ChatGPT by OpenAI three years ago marked a significant turning point in AI technology, evolving from a simple chatbot into a critical component of digital life [1][6][34]
- The rapid advance of AI has produced a mix of excitement and anxiety among the public, with concerns about job displacement and AI's implications across industries [8][21]
- Google's recent launch of Gemini 3 is seen as a strategic move to reclaim dominance in AI, challenging OpenAI's previous lead [10][21]

Group 1: Evolution of AI Technology
- Over the past three years, OpenAI has consistently led AI advances with models such as GPT-3.5, GPT-4o, and GPT-5, setting new standards in speed, accuracy, and reasoning ability [12][13]
- Multimodal AI, exemplified by GPT-4o and Midjourney, has extended AI capabilities beyond text to images, audio, and video [17][21]
- Gemini's user engagement has surged, with monthly active users rising from roughly 400 million in May to 650 million [21][23]

Group 2: Market Dynamics and Competition
- OpenAI's market share remains significant at over 800 million users, but Gemini's user engagement has overtaken ChatGPT's [23][27]
- The competitive landscape has shifted, with incumbents like Google leveraging their resources to challenge OpenAI's position [21][27]
- OpenAI's CEO faces immense pressure to accelerate monetization and maintain stability amid fierce competition [27][28]

Group 3: Financial Strategies and Risks
- OpenAI is pursuing an aggressive financial strategy, planning to invest $1.4 trillion in computing power over the next eight years, far exceeding its current revenue [28][31]
- Much of the financial burden is borne by OpenAI's partners, with estimates putting nearly $1 trillion of debt behind its collaborations [29][31]
- Analysts predict substantial borrowing will be necessary to fulfill OpenAI's contracts, raising concerns about the sustainability of its financial model [32]
An AI bubble?
GOLDEN SUN SECURITIES · 2025-11-30 06:26
Investment Rating
- The report maintains an "Overweight" rating for the industry, indicating a positive outlook for investment opportunities [5]

Core Insights
- Advances such as the release of DeepSeekMath-V2 and Google's Gemini 3 Pro show that the potential of large models is far from fully realized; continued algorithmic innovation and the scaling law are the key drivers dispelling the notion of an "AI bubble" [18][19]
- Alibaba's recent results show strong growth in AI-related products: Alibaba Cloud revenue up 34%, with external commercialization revenue accelerating to 29% growth. The company argues AI is not a bubble, as demand for AI solutions is robust and backed by solid return potential [19][20]

Summary by Sections

AI Innovations
- DeepSeek launched a new mathematical-reasoning model, DeepSeekMath-V2, which uses a self-verifying training framework and has reached gold-medal level in competitions [11]
- Google's Gemini 3 Pro underscores the importance of high-quality training data, showing the scaling law remains effective in AI model development [16]

Alibaba's AI Strategy
- Alibaba's CEO stated there is no "AI bubble" over the next three years, citing strong demand and reasonable return potential; the company is pursuing both AI-to-B and AI-to-C strategies [22][23]
- Demand for AI capabilities is increasing across industries, with Alibaba Cloud's AI-related product revenue growing for nine consecutive quarters [20][21]

Market Dynamics
- The global AI server supply chain is experiencing shortages, with a significant expansion cycle required to meet growing demand; this supply-demand imbalance is expected to persist for the next two to three years [22][23]
- The report suggests monitoring companies involved in computing power, such as Cambricon and Huagong Information, as potential investment opportunities [4][25]
Communications sector report: high-end optical chips in short supply, domestic substitution accelerating
Zhongyuan Securities · 2025-11-28 08:48
Investment Rating
- The report maintains an "Outperform" rating for the communications industry [1]

Core Insights
- The high-end optical chip market is experiencing a supply shortage, accelerating domestic substitution [1]
- Demand for optical chips is driven by the rapid development of AI and the need for high-speed communication networks [4][45]
- The Chinese government is actively supporting the optical chip industry through various policies and initiatives [42]

Summary by Sections

1. Scaling Law and AI Impact
- The scaling law demonstrates a positive correlation between model performance and the scale of models, data, and computational resources [10]
- North American cloud providers are increasing capital expenditure significantly, up 76.9% year on year in Q3 2025 [13]
- Chinese cloud providers are also ramping up AI infrastructure investment, with capital expenditure up 32.2% [14]

2. Optical Chips as Core Components
- Optical chips are critical to modern high-speed communication networks, directly determining transmission efficiency [27][30]
- The optical chip market is expected to grow at a 17% CAGR from 2025 to 2030, with total sales projected to rise from approximately $3.5 billion in 2024 to over $11 billion by 2030 [46]
- Demand for high-speed optical chips is rising as modules transition from 800G to 1.6T, necessitating more advanced chip technologies [4][41]

3. Domestic Market Dynamics
- The domestic optical chip industry is transitioning from low-end to high-end production, supported by government policies [42]
- The AI data-center market is becoming the core growth driver for the optical chip industry, with significant investment in high-speed optical modules [45]
- The telecommunications market is also evolving, with increasing demand for high-speed, integrated, intelligent communication networks [51]

4. Investment Recommendations
- The report suggests focusing on companies such as Yuanjie Technology and Shijia Photon, which are well positioned to benefit from the growing optical chip market [5]
With Gemini 3, Google is upending both OpenAI and Nvidia at once
36Kr · 2025-11-26 10:39
Core Insights
- Google's Gemini 3 launch signals a major shift in the AI landscape, challenging the dominance of Nvidia and OpenAI with a self-sufficient AI stack that reduces reliance on external hardware and software [1][10][24]

Group 1: Impact on the AI Industry
- Gemini 3 disrupts the established narrative in which Nvidia was the sole provider of the essential hardware (GPUs) for AI development, positioning Google as a formidable competitor [10][24]
- OpenAI's reliance on scaling laws is challenged by Gemini 3's approach, which emphasizes native reasoning over mere parameter scaling [5][23]
- The AI industry is entering a phase where companies must integrate hardware, software, and talent rather than simply scale existing models [44][56]

Group 2: Technological Advancements
- Gemini 3 achieves a level of multimodal understanding that lets it process information more intuitively, closer to human cognition [20][23]
- Google's TPU (Tensor Processing Unit) is purpose-built for AI workloads, improving performance and efficiency relative to Nvidia's offerings [26][34]
- The Ironwood TPU, designed for high-throughput, low-latency AI inference, marks a leap in Google's hardware capability, enabling direct competition with Nvidia's GPUs [30][34]

Group 3: Market Dynamics
- Google's strategy includes selling TPU technology directly to major companies, aiming to capture a share of Nvidia's revenue, which could significantly alter the competitive landscape [24][26]
- Nvidia's stock price has reacted negatively to the emergence of Gemini 3, reflecting investor concern about its market position [7][66]
- The financial dynamics are shifting: Nvidia leverages its high profit margins to invest in retaining clients, while Google works to reduce dependence on Nvidia hardware [66]
Machinery equipment sector note: Google Gemini 3 exceeds expectations, bullish on the growth of AI compute demand
Soochow Securities · 2025-11-26 06:35
Investment Rating
- The report maintains an "Overweight" rating for the mechanical equipment industry [1]

Core Insights
- The release of Google Gemini 3 has exceeded market expectations, showcasing superior benchmark scores and multimodal understanding [1]
- Gemini 3 leads significantly in benchmarks, scoring 37.5% on HLE (no tools) versus Gemini 2.5 Pro's 21.6% and GPT-5.1's 26.5% [2]
- The model's "generative UI" capability dynamically generates customized, interactive interfaces, a step toward AI agents [2]
- Google DeepMind emphasizes that the scaling law remains effective: more data and computational power are key to enhancing model intelligence [3]
- Demand for computational power is expected to keep growing; the report favors hardware investment opportunities along the Google, NVIDIA, and domestic compute chains [3]
- PCBs and liquid cooling are growing in importance in servers: PCB usage and layer counts are expected to rise with integration levels, and liquid cooling is becoming essential for the thermal management of high-power server cabinets [4]

Summary by Sections

Investment Recommendations
- In PCB equipment, recommended companies include Dazhu CNC and Chipone Microelectronics, with attention to consumables makers Zhongtung High-tech and Dingtai High-tech [5]
- In server liquid cooling, Hongsheng Co. is the key recommendation, with attention to Yingweike [5]
ZTE publishes a paper charting AI's next frontier research directions
Jiqizhixin (机器之心) · 2025-11-26 01:36
Core Insights
- The AI industry faces unprecedented bottlenecks as large-model parameters reach the trillion scale: the low efficiency of the Transformer architecture, high computational costs, and disconnection from the physical world are increasingly prominent [2][4][38]
- ZTE's recent paper, "Insights into Next-Generation AI Large Model Computing Paradigms," analyzes the core dilemmas of current AI development and outlines potential exploratory directions for the industry [2][38]

Current State and Bottlenecks of LLMs
- The performance of large language models (LLMs) depends heavily on scaling laws, which tie ultimate performance to computational power, parameter count, and training-data volume [4][5]
- Building advanced foundation models requires substantial computational resources and vast training data, creating high sunk costs in the training process [5][6]
- Transformer efficiency is low, with significant memory-access demands, and current hardware struggles to parallelize certain non-linear functions [6][7]

Challenges in Achieving AGI
- Current LLMs suffer from hallucination and poor interpretability, problems often masked by the capability gains that scaling delivers [9][10]
- Whether existing LLMs truly understand the physical world remains contested, with critics pointing to their reliance on "brute-force scaling" and lack of intrinsic learning and decision-making capabilities [9][10]

Engineering Improvements and Optimizations
- Various algorithmic and hardware improvements are being explored to raise the efficiency of autoregressive LLMs, including attention-mechanism optimizations and low-precision quantization techniques [12][13][14]
- Innovations in cluster systems and distributed computing paradigms are being implemented to accelerate training and inference for large models [16][17]

Future Directions in AI Model Development
- The industry is exploring next-generation AI models that move beyond the next-token-prediction paradigm, focusing on models grounded in physical first principles and energy dynamics [24][26]
- New computing paradigms, such as optical, quantum, and electromagnetic computing, are being investigated to overcome traditional computational limits [29][30]

ZTE's Exploration and Practices
- ZTE is innovating at the micro-architecture level, using advanced technologies to enhance AI-accelerator efficiency, and exploring new algorithms based on physical first principles [36][38]
- The company is also pursuing hardware-software integration to create more efficient AI systems, contributing to the industry's shift toward sustainable development [38]
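Low-precision quantization, one of the efficiency techniques the paper surveys, can be sketched minimally as symmetric per-tensor int8 quantization. This is the generic technique, not ZTE's specific method:

```python
import numpy as np

# Minimal symmetric per-tensor int8 quantization sketch (illustrative;
# the generic technique, not ZTE's specific implementation).
def quantize_int8(x):
    scale = np.abs(x).max() / 127.0                   # map largest |value| to 127
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)      # a toy weight matrix
q, s = quantize_int8(w)
w_hat = dequantize(q, s)

# int8 storage is 4x smaller than float32; the price is a rounding error
# of at most half a quantization step (scale / 2) per element.
print("max abs error:", float(np.abs(w - w_hat).max()), " half step:", s / 2)
```

Shrinking weights from 32-bit floats to 8-bit integers cuts memory traffic by 4x, which targets exactly the memory-access bottleneck of the Transformer architecture described above.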