Scaling Law
Who Says the Scaling Law Has Hit Its Limit? New Research: Small Per-Step Improvements Compound into Exponential Growth
机器之心· 2025-09-16 04:01
Core Viewpoint
- The article discusses the ongoing debate over diminishing returns from scaling AI models, particularly large language models (LLMs). It presents a new perspective: even as single-step accuracy improves more slowly, these incremental gains can compound into exponential growth in the length of tasks a model can complete, which may hold greater economic value in real-world applications [1][3].

Group 1: Scaling Law and Economic Value
- The scaling law indicates that while metrics such as test loss may show diminishing returns, the real-world value of LLMs often comes from their ability to complete longer tasks. Larger models compound small improvements in single-step accuracy into exponential increases in achievable task length, as illustrated in the sketch after this summary [3][6].
- The paper titled "The Illusion of Diminishing Returns: Measuring Long Horizon Execution in LLMs" argues that the economic value of an AI agent derives from the length of the tasks it can complete, rather than from short-task benchmarks that may suggest progress has stalled [5][19].

Group 2: Long-Horizon Execution Challenges
- Long-horizon task execution has historically been a significant weakness of deep learning models. The paper notes that while LLMs have improved on complex reasoning tasks, they still struggle to execute long tasks reliably [6][11].
- The authors argue that failures in long-horizon execution are often misattributed to deficiencies in reasoning or planning, when in fact execution itself remains a critical and under-researched challenge [7][22].

Group 3: Self-Conditioning Effect
- The study identifies a self-conditioning effect: in long tasks, the model's per-step error rate rises as its own earlier mistakes accumulate in the context, so errors compound. This contrasts with human performance, where practice typically leads to improvement [9][30].
- The authors found that simply scaling up model size does not mitigate the self-conditioning effect, which can cause performance to decline over extended tasks [29][32].

Group 4: Impact of Thinking Models
- Recent thinking models can correct for the self-conditioning limitation, enabling significantly longer single-round task execution. For instance, the GPT-5 thinking version can execute over 1000 steps, far surpassing competitors [10][36].
- The research emphasizes reasoning before acting: models that use thinking chains execute longer tasks better than those that do not [36][37].

Group 5: Experimental Insights
- The experiments show that increasing model size significantly raises the number of rounds a model can execute successfully, demonstrating a clear scaling trend [27][28].
- The findings suggest that while larger models improve task execution, they still face challenges from self-conditioning, which remains a critical area for future research [29][37].
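The compounding claim in Group 1 can be made concrete with a back-of-the-envelope calculation. The sketch below illustrates the general argument rather than the paper's own model: it assumes each step succeeds independently with probability p and asks how long a task can be finished at a fixed 50% task-level success rate (the function `horizon_at_success_rate` is a name made up for this example).

```python
# Illustrative assumption (not the paper's exact model): if a model succeeds at
# each step independently with probability p, the chance of finishing an H-step
# task is p**H, so the horizon reachable at a fixed task success rate grows
# roughly like 1/(1 - p) as p approaches 1.
import math

def horizon_at_success_rate(p_step: float, task_success: float = 0.5) -> float:
    """Longest task length H such that p_step**H >= task_success."""
    return math.log(task_success) / math.log(p_step)

for p_step in (0.90, 0.99, 0.999, 0.9999):
    h = horizon_at_success_rate(p_step)
    print(f"step accuracy {p_step:.4%} -> ~{h:.0f} steps at 50% task success")

# step accuracy 90.0000% -> ~7 steps at 50% task success
# step accuracy 99.0000% -> ~69 steps at 50% task success
# step accuracy 99.9000% -> ~693 steps at 50% task success
# step accuracy 99.9900% -> ~6931 steps at 50% task success
```

Even at this crude level, raising per-step accuracy from 99% to 99.9% multiplies the reachable horizon by roughly ten, which is the intuition behind treating slow single-step gains as anything but diminishing returns.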
Academician Zhang Hongjiang: Agents Will Replace Enterprise Processes and Reshape How Human Organizations Are Structured
Sina Tech· 2025-09-11 02:34
Core Insights
- The emergence of DeepSeek R1 has significantly reduced the cost of inference models while maintaining performance close to the best available models, indicating potential for increased demand as costs decrease [1]
- The launch of ChatGPT marked a pivotal moment: by March this year its daily active users were nearing 30% of search engine usage, highlighting the integration of large models into daily life [1]
- The rapid improvement in model performance and reduction in usage costs are expected to continue, driving the development of large models and their impact on various industries [1]
- The concept of agents is evolving, with their planning capabilities growing exponentially, suggesting a new phase of AI development referred to as Moore's Law 3.0, in which agent capabilities double every seven months [1]
- AI is transitioning from assistant to partner, indicating a shift in the relationship between humans and machines that will alter organizational structures and employment in the future [2]
Domestic and Overseas AI Giants Bet Big, Startups Go All In: Who Can Ride "Memory" to Become the Next "DeepSeek"?
36Kr· 2025-09-07 09:07
Core Insights
- The concept of "memory" in AI is emerging as a crucial factor for the next wave of advancements, allowing models to learn continuously and adapt without forgetting previous knowledge [2][6][22]
- Major players in the AI industry are increasingly focusing on integrating memory capabilities into their models, with a range of approaches being explored [4][24][30]

Industry Developments
- Companies such as Anthropic, Google, and OpenAI have recently announced memory features in their AI systems, enabling more natural and coherent interactions by recalling past conversations [4][6][31]
- The introduction of memory capabilities is a response to the limitations of current models, which rely heavily on short-term context and cannot retain long-term knowledge [3][19][22]

Technical Approaches
- Different technical routes for implementing memory are being explored, including parameterized memory, context memory, and external databases (a minimal sketch of the external-database route follows this summary) [24][26][29]
- Parameterized memory aims to let models distinguish which information should be retained as memory, enhancing their reasoning capabilities [24][25]
- Context memory injects the necessary information into the prompt before inference, while external databases store information outside the model for retrieval at decision time [26][27]

Competitive Landscape
- The AI market is witnessing a race to establish memory capabilities, with established firms and startups alike vying for dominance [30][33]
- Companies are adopting different business models around memory: larger firms focus on user retention through personalized experiences, while startups aim to build a decentralized memory platform [32][33]

Future Outlook
- The timeline for widespread, effective memory capabilities is estimated at one to two years for practical applications and three to five years for resolving governance and privacy issues [34][35]
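To make the routes above more tangible, here is a minimal sketch of the external-database route combined with context memory: facts live outside the model and are retrieved and prepended to the prompt before inference. The names (`MemoryStore`, `build_prompt`) and the word-overlap retrieval are illustrative assumptions, not any vendor's actual API; real systems typically retrieve with vector embeddings.

```python
# Hypothetical external-memory sketch: facts are stored outside the model's
# weights and context window, then recalled and injected into the prompt.
from dataclasses import dataclass, field

@dataclass
class MemoryStore:
    entries: list[str] = field(default_factory=list)

    def remember(self, fact: str) -> None:
        """Persist a fact outside the model."""
        self.entries.append(fact)

    def recall(self, query: str, k: int = 2) -> list[str]:
        """Return the k stored facts sharing the most words with the query."""
        q = set(query.lower().split())
        scored = sorted(self.entries,
                        key=lambda e: len(q & set(e.lower().split())),
                        reverse=True)
        return scored[:k]

def build_prompt(store: MemoryStore, user_message: str) -> str:
    """Context-memory step: inject retrieved facts before the model reasons."""
    memories = "\n".join(f"- {m}" for m in store.recall(user_message))
    return f"Relevant memories:\n{memories}\n\nUser: {user_message}"

store = MemoryStore()
store.remember("The user prefers answers in Chinese.")
store.remember("The user is building a retrieval pipeline for customer support.")
print(build_prompt(store, "How should I design my customer support retrieval pipeline?"))
```

Swapping the toy word-overlap scorer for embedding search changes only `recall`; the prompt-injection step stays the same, which is why the context-memory and external-database routes are often combined in practice.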
Domestic and Overseas AI Giants Bet Big, Startups Go All In: Who Can Ride "Memory" to Become the Next "DeepSeek"?
机器之心· 2025-09-07 05:12
Core Viewpoint
- The article discusses the emerging importance of "memory" in AI models, suggesting that the ability to possess human-like memory will be a key factor in the next wave of AI advancements [2][6][35].

Group 1: Importance of Memory in AI
- The concept of "memory" is evolving from short-term to long-term or lifelong memory, allowing AI to learn continuously and adapt to new tasks without forgetting previous knowledge [3][7].
- Recent developments in AI memory capabilities have been led by major players such as Anthropic, Google, ByteDance, and OpenAI, all of which have introduced memory features in their AI systems [4][6][35].
- The demand for memory is driven by both technical and application needs, as AI models are increasingly expected to act as long-term partners rather than mere tools [20][21][23].

Group 2: Current Trends and Developments
- AI companies are exploring different approaches to implementing memory, including parameterized memory, context memory, and external databases [26][28][30].
- The industry is seeing a surge of interest and investment in memory-related research, with many companies racing to develop and integrate these capabilities into their products [6][35].
- Competition among AI firms is intensifying, and breakthroughs in memory capabilities could redefine the market landscape, much like past pivotal moments in AI development [35][36].

Group 3: Future Outlook
- Basic memory functionality is estimated to become widespread and effective within one to two years, while addressing governance and privacy issues may take three to five years [36][37].
- The future of AI memory capabilities remains uncertain, with many players vying for dominance, meaning any company could emerge as the leader in this space [38].
Hands-On with Alibaba's Trillion-Parameter Model: Has the Open-Source Route Proven Itself?
TMTPost· 2025-09-06 11:32
Core Insights
- Alibaba has launched its largest model to date, Qwen3-Max-Preview, with over 1 trillion parameters; it surpasses Claude in programming capabilities, demonstrating the continued effectiveness of the Scaling Law [1][4][17]
- The "model + cloud" strategy has created the shortest path from technology development to commercialization, a key factor in Qwen's success as a latecomer [1][19]
- The core challenge of Alibaba's open-source model lies in balancing openness with profitability, requiring continuous technological breakthroughs and proof of commercial viability [1][20]

Model Performance
- Qwen3-Max-Preview has outperformed competitors on a range of benchmarks, including SuperGPQA, AIME2025, LiveCodeBench V6, Arena-Hard V2, and LiveBench [2]
- Its programming capabilities in particular have improved significantly, surprising many users [4][15]

Development Strategy
- Alibaba's approach to model development has been characterized by the rapid open-sourcing of multiple model versions, from 7 billion to 1 trillion parameters, fostering a strong developer community [16][17]
- The company has made substantial investments in computing infrastructure and AI engineering, which were crucial for training large models like Qwen3-Max-Preview [17][18]

Cloud Integration
- Alibaba Cloud plays a vital role in supporting Qwen's development by providing stable, efficient computing infrastructure that reduces the engineering burden on development teams [18]
- The MaaS (Model-as-a-Service) strategy lets Qwen reach various industries quickly, since businesses can use Qwen's API without building from scratch [18][19]

Challenges Ahead
- Open-sourcing presents both opportunities and challenges, as it may make it harder to maintain a significant technological edge over competitors [20]
- Retaining top AI talent is critical for Alibaba, as the departure of key personnel could hurt team morale and project continuity [21][22]

Conclusion
- Overall, Alibaba's Qwen is a leading force in the global AI model landscape, leveraging a clear open-source-plus-self-research strategy supported by Alibaba Cloud's ecosystem [22]
- The release of the trillion-parameter model highlights the company's commitment to the Scaling Law, but the sustainability of its business model and its ability to retain talent will be crucial to future success [22]
They Proposed the Scaling Law Back in 1993
量子位· 2025-09-02 06:17
Core Viewpoint
- The article highlights that the concept behind the Scaling Law was proposed 32 years ago at Bell Labs, long before recent AI advancements, underscoring the historical significance of this line of machine learning research [1][6].

Group 1: Historical Context
- The paper titled "Learning Curves: Asymptotic Values and Rate of Convergence" introduced a method for predicting how training error and test error converge to the same asymptotic error value as training-set size increases, following a power-law form [4][6].
- The authors of the 1993 paper included notable figures such as Vladimir Vapnik and Corinna Cortes, both of whom went on to make major contributions to machine learning [6][25].

Group 2: Methodology and Findings
- The research aimed to save computational resources when training classifiers by predicting their performance on larger datasets from results on smaller training sets (illustrated in the sketch after this summary) [8][10].
- The study found that as training-set size grows, training and test error converge to a common asymptotic value, denoted 'a', with the gap closing as a power law whose exponent typically falls between 0.5 and 1 [10][16].
- The proposed method allows a classifier's performance on larger datasets to be estimated without fully training on them, conserving computational resources [10][14].

Group 3: Implications and Applications
- The predictions were highly accurate for linear classifiers, demonstrating the method's potential for optimizing resource allocation when training models [15][24].
- The research also showed that the harder the task, the higher the asymptotic error and the slower the convergence, indicating a relationship between task difficulty and learning efficiency [22].
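As a hedged illustration of how such a prediction can work in practice (not the paper's original code or data), the sketch below fits the power-law learning curve E(n) ≈ a + b·n^(−α) to synthetic test-error measurements at small training-set sizes and extrapolates to a much larger set; all numbers are made up.

```python
# Illustrative learning-curve extrapolation: fit E(n) ~ a + b * n**(-alpha) on
# cheap small-scale runs, then predict the error at a much larger dataset size
# before paying for the full training run.
import numpy as np
from scipy.optimize import curve_fit

def learning_curve(n, a, b, alpha):
    """Test error as a function of training-set size n."""
    return a + b * n ** (-alpha)

# Synthetic test-error measurements at small training-set sizes.
sizes = np.array([100, 200, 400, 800, 1600])
errors = np.array([0.210, 0.172, 0.148, 0.131, 0.119])

(a, b, alpha), _ = curve_fit(learning_curve, sizes, errors, p0=(0.1, 1.0, 0.5))
print(f"asymptotic error a ~ {a:.3f}, convergence exponent alpha ~ {alpha:.2f}")
print(f"predicted error at n=50000: {learning_curve(50_000, a, b, alpha):.3f}")
```

The fitted a plays the role of the asymptotic error: once it is estimated from small-scale runs, one can judge whether collecting and training on far more data is worth the compute.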
In Depth | Anthropic CEO: AI's Potential Is Enormous, but Disorderly Expansion Is the Real Risk, and I Will Steer It onto the Right Track
Z Potentials· 2025-08-28 03:51
Core Insights
- The article profiles the rapid growth and potential of Anthropic, a leading AI company focused on developing safe and reliable AI systems with human welfare at their core. The company's annual recurring revenue now exceeds $4 billion, making it one of the fastest-growing enterprises in history [12][24].

Group 1: Company Structure and Trust
- Anthropic was founded by seven co-founders, an arrangement often viewed skeptically by outsiders; however, the long-standing trust and familiarity among the founders have allowed the company to maintain cohesion and its core values through rapid expansion [11][10].
- The sibling co-founders, Dario and Daniela Amodei, complement each other in strategic execution and operational management, allowing each to focus on their strengths [9][10].

Group 2: AI Applications and Market Potential
- The fastest-growing application of AI is programming, driven by the close relationship between developers and AI model creators, which accelerates adoption [10][12].
- AI's potential extends well beyond programming, with applications in customer service, biology, and pharmaceuticals showcasing its versatility across sectors [13][14].

Group 3: Business Model and Growth Expectations
- Anthropic positions itself as a platform company, focusing on broad enterprise services rather than solely vertical-specific products, which gives it a better read on user needs and market demand [15][16].
- The company has grown exponentially, with revenue consistently exceeding its initial projections, indicating strong market demand for AI solutions [24][25].

Group 4: Investment and Financial Dynamics
- The financial model of AI companies involves significant upfront investment in model training, with high returns expected over time; this cyclical investment pattern is common in venture capital, where initial losses precede profitability [34][35].
- Current capital expenditures may obscure underlying profitability: individual models can be profitable when analyzed on their own [43][44].

Group 5: Talent and Competitive Advantage
- Competition for AI talent is intense, but Anthropic maintains a high employee retention rate thanks to its strong mission and commitment to its values [51][53].
- Its approach to protecting knowledge combines complex engineering capability with a culture that balances openness against necessary information-security measures [48][49].

Group 6: Future of AI and Market Structure
- The future AI market is expected to consist of a few dominant players capable of building cutting-edge models, with room for new entrants targeting specific use cases [33].
- The article suggests AI's growth trajectory may continue to extend, and that AI companies could become some of the largest enterprises in the world [25][24].
OpenAI's Biggest Mistake Ever: Letting This MIT Standout Go, a "Three-Dynasty Veteran" of American AI and a Real-Life Wei Xiaobao
36Kr· 2025-08-21 00:39
Group 1
- The core argument of the article is that the scale of today's AI infrastructure build-out is unprecedented, surpassing both the Apollo program and the Manhattan Project [1][7]
- Investment in AGI computing power is growing explosively, increasing by as much as three times per year [2]
- Tom Brown, co-founder of Anthropic, is profiled as a key figure in the field, having gone from a self-taught background to leading work on general artificial intelligence [3][4]

Group 2
- Anthropic's Claude has become a preferred choice for developers worldwide, marking a significant achievement in AI infrastructure [7]
- The article traces Tom Brown's path from entrepreneurship to AI research, including his time at OpenAI and the founding of Anthropic [9][10]
- It also discusses the scaling law's impact on AI development, noting that increased computational power has led to significant advances in intelligence [31][32]

Group 3
- On the competitive landscape, Anthropic's Claude is gaining market share, particularly in programming applications, with developer preference shifting toward Claude over competitors such as ChatGPT [37][40]
- The success of Claude Code is attributed to its unexpected emergence as a superior product, driven by a user-centered development approach [41][42]
- Tom Brown's advice to young engineers emphasizes pursuing meaningful projects over traditional career paths, advocating risk-taking and intrinsic motivation [46][49]
GPT-5 Churns Out "Spaghetti Code": 14 Prompts Reveal Seven Years of IQ Evolution from GPT-1 to GPT-5
36Kr· 2025-08-19 08:56
Group 1
- The core viewpoint is that GPT-5 has been released but has drawn criticism for not meeting expectations relative to its predecessor, GPT-4, despite years of advances in AI capabilities [1][3][5].
- A comparison of performance metrics between GPT-4 and GPT-5 suggests the Scaling Law has not hit a wall, indicating ongoing improvement in AI models [3][5].
- The evolution of the GPT family from GPT-1 to GPT-5 over seven years shows significant advances in AI capability, with a series of prompts demonstrating the models' growing sophistication [5][7][8].

Group 2
- The article gives examples of how each GPT version improved at generating creative content such as poetry, with GPT-5 producing more coherent and human-like responses than earlier versions [19][20][40].
- On technical tasks, GPT-5 shows a marked improvement in writing Python code, moving from nonsensical outputs in earlier versions to complex, even humorous code [53][54].
- GPT-5's ability to explain complex concepts, such as integration by parts in mathematics, has also improved significantly, making it more effective as a teaching tool than its predecessors [57][64][69].

Group 3
- GPT-5 can now provide structured, detailed plans for tasks such as building a running habit, showing its potential as a personal coach or advisor [125][126][127].
- The transition from GPT-1 to GPT-5 reflects a shift from random or irrelevant responses to logical, structured, and contextually relevant answers to user queries [70][75][90].
- GPT-5's responses are characterized by a more professional tone and more comprehensive information, indicating its progress in handling complex inquiries compared with earlier models [75][90].
Li Jianzhong: Research and Reflections on Human-Machine Interaction and the Agent Ecosystem in the AI Era
AI科技大本营· 2025-08-18 09:50
Core Insights
- The article discusses the transformative impact of large models on the AI industry, emphasizing the shift from isolated applications to a more integrated model of human-machine interaction, termed "accompanying interaction" [1][5][60].

Group 1: Paradigm Shifts in AI
- The shift from training-centric models to reasoning models has significantly enhanced AI's capabilities, particularly through reinforcement learning, which allows AI to generate synthetic data and innovate beyond existing human knowledge [9][11][13].
- The emergence of "Agentic Models" marks a shift in which AI evolves from merely providing suggestions to actively performing tasks for users [16][18].

Group 2: Application Development Transformation
- "Vibe Coding" has emerged as a new programming paradigm that lets non-professionals build software in natural language, in contrast with traditional programming methods [19][22].
- The concept of "Malleable Software" suggests that future software will let users customize and personalize applications extensively, leading to a more democratized software development landscape [24][26].

Group 3: Human-Machine Interaction Evolution
- The future of human-machine interaction is predicted to be dominated by natural-language interfaces, moving away from traditional graphical user interfaces (GUIs) [36][41].
- The interaction paradigm is expected to evolve so that AI agents seamlessly compose various services, eliminating the need for users to switch between isolated applications [45][48].

Group 4: Intelligent Agent Ecosystem
- Intelligent agents are advancing along five capabilities, namely planning, tool use, collaboration, memory, and action, which together redefine the internet from an "information network" into an "action network" (a minimal agent-loop sketch follows this summary) [66][68].
- Protocols such as MCP (Model Context Protocol) and A2A (Agent to Agent) facilitate interaction between agents and traditional software, strengthening the overall ecosystem [70].
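As a schematic of what an "action network" agent loop looks like, the sketch below is an illustrative assumption rather than anything from the talk or from the MCP/A2A specifications: the model-side planning step is stubbed out with hard-coded rules, and the two tools are placeholders.

```python
# Hypothetical agent loop: plan, choose a tool, observe the result, repeat.
# A real agent would get plan_next_action from an LLM, not hard-coded rules.
from typing import Callable

# Tool registry: plain functions the agent is allowed to call (both are stubs).
TOOLS: dict[str, Callable[[str], str]] = {
    "search": lambda q: f"(stub) top result for '{q}'",
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}}, {})),
}

def plan_next_action(goal: str, history: list[str]) -> tuple[str, str]:
    """Stand-in for the model's reasoning step: decide which tool to call next."""
    if not history:
        return "search", goal
    return "calculator", "3 * 7"

def run_agent(goal: str, max_steps: int = 3) -> list[str]:
    history: list[str] = []
    for _ in range(max_steps):
        tool, arg = plan_next_action(goal, history)
        observation = TOOLS[tool](arg)
        history.append(f"{tool}({arg!r}) -> {observation}")
    return history

for step in run_agent("price of 3 items at 7 yuan each"):
    print(step)
```

Protocols like MCP standardize how such a tool registry and its observations are exchanged between the model and external services, which is precisely the part this toy version hard-codes.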