Is the Current Path a Dead End? OpenAI and Amazon Consider Changing How Large Models Are Trained
美股研究社· 2026-01-26 10:27
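Source: 硬AI
As competition in artificial intelligence enters deep water, leading researchers in the industry are questioning the prevailing model training paradigm.
Researchers from OpenAI, Thinking Machines Lab, and Amazon are exploring a fundamental shift: abandoning the now-standard "pre-train first, post-train afterward" pipeline in favor of a training approach that introduces curated, task-specific data earlier, in order to address shortcomings of current models such as inefficiency and the "split-brain problem."
The potential shift is strongly advocated by Amazon's David Luan, among others. Its core argument is that the current general-purpose training path, which first gives a model broad world knowledge (such as poetry or gardening) and then fine-tunes it for specific tasks (such as writing code or handling customer refunds), does not always make logical sense. The researchers argue that if a model's end use is already known, curated data highly relevant to the task should be introduced during pre-training, so that training serves the final goal more directly.
If put into practice, this methodological change would profoundly reshape how the AI industry develops models. It would mean not only that development teams might no longer need to be artificially divided into pre-training and post-training groups, but also that the market would move from "one general-purpose model for every scenario" toward an era of "specialized models built on different datasets." Such a shift would force developers, early in training, to subject the data to ...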
Is the Current Path a Dead End? OpenAI and Amazon Consider Changing How Large Models Are Trained
硬AI· 2026-01-25 11:33
Core Viewpoint
- The article discusses a fundamental shift in AI training paradigms: abandoning the "pre-train then fine-tune" model in favor of introducing curated, task-specific data earlier in the training process, a change that could reshape the AI development landscape [2][3][4].
Group 1: Restructuring Training Logic
- Current AI training practices mimic human learning, but their efficiency is being questioned, particularly the extensive pre-training on unrelated domains, which wastes resources [6].
- The proposed approach would expose the model to task-relevant curated data during pre-training itself, potentially eliminating the need for separate teams for different training phases [6][8].
Group 2: Rise of Specialized Models and Organizational Restructuring
- The shift toward specialized models will require developers to decide early which data to include, and those decisions directly shape the model's capabilities and limitations [8].
- OpenAI is already adapting to this demand by routing queries to different models and developing specialized versions such as GPT-5-Codex, indicating a market trend away from a single universal model [4][9].
Group 3: Hardware Breakthroughs and Capital Investment
- Hardware innovation is accelerating, with companies such as Neurophos raising $110 million to develop photonic chips aimed at improving AI computational efficiency [11].
- OpenAI is also investing in its own infrastructure, reporting significant progress on its custom inference chips and on the Stargate infrastructure project, which is over 50% complete [11].
Group 4: Industry Consolidation and Competitive Dynamics
- The AI sector is seeing active mergers and acquisitions, with Lightning AI merging with Voltage Park and Yelp acquiring Hatch for $300 million, reflecting a trend toward consolidation [13].
- Major players such as Apple and Google are negotiating to enhance their AI capabilities, with Apple planning to leverage cloud infrastructure for an updated Siri by 2027 [13][14].
Is the Current Path a Dead End? OpenAI and Amazon Consider Changing How Large Models Are Trained
Hua Er Jie Jian Wen· 2026-01-23 06:42
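As competition in artificial intelligence enters deep water, leading researchers in the industry are questioning the prevailing model training paradigm.
Researchers from OpenAI, Thinking Machines Lab, and Amazon are exploring a fundamental shift: abandoning the now-standard "pre-train first, post-train afterward" pipeline in favor of a training approach that introduces curated, task-specific data earlier, in order to address shortcomings of current models such as inefficiency and the "split-brain problem."
The potential shift is strongly advocated by Amazon's David Luan, among others. Its core argument is that the current general-purpose training path, which first gives a model broad world knowledge (such as poetry or gardening) and then fine-tunes it for specific tasks (such as writing code or handling customer refunds), does not always make logical sense. The researchers argue that if a model's end use is already known, curated data highly relevant to the task should be introduced during pre-training, so that training serves the final goal more directly.
If put into practice, this methodological change would profoundly reshape how the AI industry develops models. It would mean not only that development teams might no longer need to be artificially divided into pre-training and post-training groups, but also that the market would move from "one general-purpose model for every scenario" toward an era of "specialized models built on different datasets." Such a shift would force developers to filter data more rigorously early in training, which in turn determines a model's strengths and weaknesses in specific domains.
Signs of this divergence have already appeared in the market. Ope ...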