Pre-training Scaling Law
Gemini 3 pre-training lead warns: the model war has shifted from algorithms to engineering! Synthetic data becomes the core of generational leaps, and Google's secret weapon for overpowering OpenAI and Meta is revealed
AI前线· 2025-12-26 10:26
By Gao Yunyi. At the end of 2025, the large-model industry's "year-end showdown" officially began, with every vendor unveiling its trump cards. Amid this fierce contest, Gemini 3 broke through as the undisputed leader, redrawing the industry's perceived boundaries the moment it arrived.

On November 18, Gemini 3 swept multiple authoritative benchmarks, outperforming all comparable models worldwide with the posture of "the world's strongest multimodal understanding", "the deepest interactive agent", and a "reasoning monster". Google CEO Sundar Pichai personally endorsed it, calling it "the most intelligent model to date". The AI community erupted, and everyone was asking the same question: what is the secret behind Gemini 3's strength?

An initial clue came on launch day. Oriol Vinyals, VP of Research and Deep Learning at Google DeepMind, "spoiled" it on Twitter: "The core secret behind Gemini 3 comes down to two things: better pre-training and better post-training." That blunt statement instantly made "pre-training" and "post-training" the industry's hottest topics.

As a veteran researcher who moved from reinforcement learning to representation learning, Sebastian Borgeaud has deep pre-training credentials: from the Transformer architecture, to BERT and XLNet, to DeepMind's first large language model paper, Goph ...
Guotai Haitong | Overseas Tech: Gemini 3, TPU, and On-Device AI Application Update — Multimodal Model Upgrades Accelerate On-Device AI Adoption, TPU Challenges the Compute Landscape
Report source: The content above is excerpted from a published securities research report by Guotai Haitong Securities. Report title: Gemini 3, TPU, and On-Device AI Application Update — Multimodal Model Upgrades Accelerate On-Device AI Adoption, TPU Challenges the Compute Landscape; report date: 2025.12.02. Report authors: Qin Heping (analyst), registration no. S0880523110003; Diao Yunpeng (research assistant), registration no. S0880125070016.

Important notice: The content of this subscription account is intended only for clients with a signed research-service agreement with Guotai Haitong Securities. Because access restrictions cannot currently be applied to this material, and in accordance with the Measures for the Administration of the Suitability of Securities and Futures Investors, if you are not such a client, please unfollow this account and do not subscribe to, receive, or use any information in it, so that service quality can be ensured and investment risk controlled. We sincerely apologize for any inconvenience and thank you for your understanding and cooperation. For any questions, please contact us via the details at the end of the report.

Report summary: Models: the pre-training Scaling Law still holds. Compute: TPU helps Google build a full-stack AI ecosystem and may complement NVIDIA GPUs over the long term. Applications: multimodal reasoning capability makes on-device GUI control feasible; Doubao's phone assistant is first to ship; we are positive on Google's full-stack integration, Apple's control of its operating system, and Alibaba's model capability. Gemini validates the pre-training Scaling Law ...
AI Outlook: New Scaling, New Paradigm, New TAM
HTSC· 2025-06-10 01:43
Group 1: Global AI Outlook
- The report highlights a new paradigm in AI development characterized by new scaling, new architecture, and new total addressable market (TAM) opportunities [1]
- The demand for computing power is expected to rise due to advancements in both training and inference, potentially unlocking new TAMs [1][3]
- The report maintains a positive outlook on AI industry investments, anticipating that global AI applications will enter a performance-harvesting phase [1]

Group 2: Model Development
- The pre-training scaling law is anticipated to open a new starting point for model development, with significant architectural innovations being explored [2][23]
- The report notes that the classic Transformer architecture has reached a parameter-scale bottleneck, with existing public data nearly exhausted [2][20]
- Major tech companies are experimenting with new architectures, such as Tencent's Hunyuan TurboS and Google's Gemini Diffusion, which may accelerate scaling-law advancements [23][24]

Group 3: Computing Power Demand
- The report identifies a clear long-term upward trend in computing power demand, driven by both training and inference needs [3][32]
- New scaling paths are emerging in the post-training phase, and ongoing exploration of new architectures may reignite the pre-training demand narrative [3][33]
- The deployment of large-scale computing clusters, such as OpenAI's Stargate, is expected to support continued exploration of pre-training [38]

Group 4: Application Development
- The report indicates that the rapid advancement of agent applications is leading global AI applications into a performance-harvesting phase [4][67]
- The commercialization of agent products is accelerating, with domestic AI applications iterating quickly and entering the market [4][67]
- The report emphasizes that agent applications are evolving from simple tools into complex solutions, with significant growth expected across sectors [5][68]

Group 5: Business Model Transformation
- The shift from traditional software delivery to outcome-based delivery is highlighted as a key trend, with quantifiable ROI accelerating the adoption of agent applications [5]
- Sectors with inherent advantages, such as consumer-facing scenarios (advertising, e-commerce) and AI in marketing/sales, are expected to lead in commercialization [5][67]
- The report notes that AI applications in HR are transitioning from efficiency tools to strategic hubs, indicating a broader transformation of business models [5][67]
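The "pre-training scaling law still holds" thesis that runs through both reports can be made concrete with a loss formula of the Chinchilla form. The sketch below is purely illustrative and is not drawn from either report; the constants are the fitted values published by Hoffmann et al. (2022), and the function simply shows that predicted loss falls monotonically as parameters N and training tokens D grow:

```python
# Illustrative sketch of a pre-training scaling law in the Chinchilla form:
#   L(N, D) = E + A / N^alpha + B / D^beta
# Constants below are the Hoffmann et al. (2022) fits, used here only
# to illustrate the qualitative claim that more compute keeps paying off.
E, A, B, alpha, beta = 1.69, 406.4, 410.7, 0.34, 0.28

def loss(N: float, D: float) -> float:
    """Predicted pre-training loss for N parameters trained on D tokens."""
    return E + A / N**alpha + B / D**beta

# Scaling both model size and data 10x lowers the predicted loss:
small = loss(7e9, 140e9)    # ~7B params, 140B tokens
big = loss(70e9, 1.4e12)    # ~70B params, 1.4T tokens
assert big < small
```

Under this functional form the loss never plateaus at finite N and D, which is the quantitative content of "the scaling law still holds"; the reports' caveat about public data being nearly exhausted concerns the availability of D, not the validity of the formula.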