Scaling Law
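CICC: Embodied Intelligence Turns Data-Driven; High-Value Information Becomes the Core of Competition
Zhitong Finance· 2025-11-17 01:37
Hierarchical control is the foundational architectural paradigm, made practical for engineering through a two-level structure; the VLA paradigm (built on VLMs) strengthens generalization and interaction and is an active research direction. World models provide physical constraints through environment modeling and future prediction, and remain at a research-led stage. CICC believes that in the short term hierarchical architectures remain mainstream because of their engineering controllability, that VLA shows potential in complex tasks and human-robot interaction, and that world models are seen as the long-term direction because of their cross-device transfer capability. Embodied intelligence data: high-value information becomes the core of competition. Robot data spans multiple modalities, and the industry is looking for paths to low-cost data acquisition and high-efficiency data use. 1) Acquisition: routes include real-robot data, video (first-person/third-person), and simulation. 2) Security: data security is a bottom line that cannot be ignored; humanoid robot makers face challenges in permission isolation, data encryption systems, cross-border transfer policy, and more. 3) Application: the traditional data strategy is a "homogeneous closed loop", in which a policy can only be reproduced on hardware of the same type. Heterogeneous training uses a modular Transformer architecture to share algorithmic models across robot embodiments. Analysis of hot topics in embodied intelligence: Zhitong Finance APP has learned that CICC set out these views in a newly published research report. The embodied intelligence "brain" is moving from "route divergence" toward "integrated deployment" ...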
China Once Had Its Own "OpenAI"
Huxiu APP· 2025-11-16 09:08
Core Insights - The article discusses the evolution and strategic direction of the Beijing Academy of Artificial Intelligence (BAAI, also known as Zhiyuan Research Institute), emphasizing its commitment to non-profit AI research, in contrast with the commercialization seen at companies like OpenAI [5][8][14]. Group 1: Zhiyuan's Strategic Direction - Zhiyuan initially considered establishing a commercial subsidiary similar to OpenAI but ultimately decided to remain a non-profit research organization [5]. - The institute has incubated several startups, such as Zhipu AI and Moonshot AI, each valued at around 30 billion RMB, showing its role as a supporting force in the AI ecosystem [5][8]. - The new research direction proposed by Wang Zhongyuan, "Wujie," focuses on multi-modal models, distinguishing it from the previous "Wudao" series, which centered on large language models [6][8]. Group 2: Multi-Modal Models and Scaling Law - The recent release of the Emu3.5 world model is seen as a significant step towards a "Scaling Law" in multi-modal AI, although it is still considered a preliminary stage [7][25]. - Emu3.5's architecture allows it to learn from multi-modal data, which has improved performance in tasks like image-text editing and indicates a potential path towards more human-like intelligence [23][24]. - The current model has around 300 billion parameters, comparable to GPT-3.5, but achieving a true "Scaling Law" will require significantly more data and computational resources [25][28]. Group 3: Research Philosophy and Talent Attraction - Zhiyuan's non-profit model has proven sustainable in China's AI landscape, attracting young researchers who prioritize long-term scientific value over immediate financial rewards [12][14]. - The institute encourages its researchers to pursue entrepreneurial ventures while providing academic and resource support, fostering a culture of innovation without direct commercialization [15][18]. 
- The emphasis on open-source research and collaboration is central to Zhiyuan's mission, aiming to lead in AI innovation while maintaining a commitment to societal benefits [18][19].
Embodiment-Agnostic: Generalist's 270,000 Hours Are About to Upend Real-Robot Data Collection Farms
36Kr· 2025-11-14 00:17
Core Insights - The key turning point in the data race is no longer a debate over data solutions but a return to the "first principles" of data collection, focusing on reusable, scalable, and evolvable data streams [1][24] - Generalist AI's announcement of its GEN-0 embodied foundation model, trained on 270,000 hours of human operation video data, marks a significant validation of the Scaling Law in robotics, akin to a "ChatGPT moment" for embodied intelligence [1][24] Data Collection Challenges - The traditional teleoperation data collection model faces a hard efficiency bottleneck: it accumulates data linearly and cannot meet the exponential data demands implied by the Scaling Law [3][4] - Real-robot teleoperation data collection is limited by physical-world constraints, so its linear growth falls short of what model performance improvement requires [3][4] - The complexity of deploying, debugging, and maintaining physical hardware makes the data collection system rigid and cumbersome, hindering rapid scaling [4][12] Embodied Robotics Value Proposition - The core value of embodied robots lies in real-world applications that meet essential needs, are sustainable, and reach economies of scale [5][6] - Current applications often represent superficial "scene slices" rather than comprehensive industrial solutions, emphasizing the need for robots to become collaborative partners in human labor [5][6] Precision Interaction Capabilities - Embodied robots must not only perform tasks but also understand the underlying logic of actions, which requires a deep comprehension of physical interactions and environmental variables [6][8] - The lack of suitable training data for various embodiments is a significant obstacle to developing robots capable of nuanced physical interactions [8][9] Data Pyramid Structure - The industry recognizes a "data pyramid" structure, with the base consisting of vast amounts of internet data and human operation videos, the middle layer comprising synthetic data, and the apex being high-value real-robot teleoperation data [10][11] Generalist AI's Breakthrough - Generalist AI's use of 270,000 hours of human operation video data has validated the existence of the Scaling Law in robotics, demonstrating the potential for scalable data collection through its UMI (Universal Manipulation Interface) solution [12][24] - The UMI approach allows data collection devices to be deployed flexibly across various environments, enabling true scalability [12][24] Simulation Data Potential - Synthetic data shows promise for scalability and economic efficiency, as it can quickly generate diverse training data in virtual environments without physical setups [14][16] - The commercial value of synthetic data has been demonstrated through successful applications, indicating its potential to bridge the gap between virtual and real-world robotics applications [17][24] Industry Trends and Future Directions - The industry is at a critical stage of data development, emphasizing the need to acquire high-quality training data efficiently to meet the demands of embodied robotics [18][24] - Companies that continue to focus on traditional data collection methods are likely to struggle in the competitive landscape defined by the Scaling Law [24][25]
2026 A-Share Strategy Outlook: In the "Xiaodeng" Era, the Bull Run Continues
Guoxin Securities· 2025-11-13 12:03
Group 1 - The current bull market is in its second phase, transitioning from emotional drivers to fundamental factors, with a focus on technology as the main theme [1][11][19] - The bull market is characterized by a significant structural differentiation between "small-cap assets" and "large-cap assets," with "small-cap stocks" outperforming [30][21] - The technology sector is expected to lead the market, with specific attention on AI applications, robotics, smart driving, and AI programming [2][57][63] Group 2 - The report highlights that the bull market's main line is technology, with significant contributions from major tech companies, particularly in AI and related fields [2][57] - Historical bull markets have shown that the main line often correlates with industry cycles, where sectors with high revenue growth tend to outperform [58][60] - The report emphasizes the importance of fundamental recovery, with expectations of improved profitability and contract liabilities for listed companies [19][21] Group 3 - The report indicates that the market's valuation structure is healthy, with no signs of overheating, as evidenced by the current PB ratios being lower than in previous bull markets [21][25] - The differentiation in performance between "old economy stocks" and "new economy stocks" is notable, with "old economy stocks" lagging significantly behind [30][31] - The report suggests that the ongoing "deposit migration" trend may lead to increased investment in higher-yielding assets, further supporting market growth [35][39] Group 4 - The report outlines key policy directions for 2026, focusing on high-quality development, technological self-reliance, and comprehensive reform to support economic growth [17][18] - The anticipated political volatility in the U.S. and potential interest rate cuts by the Federal Reserve are expected to drive capital flows into emerging market assets, including Chinese stocks [46][47] - The report notes that the AI industry is expected to see substantial growth, with projections indicating a market size of over $2.6 trillion by 2030, driven by advancements in technology and increased investment [63][68]
2026 A-Share Strategy Outlook: In the "Xiaodeng" Era, the Bull Run Continues
Guoxin Securities· 2025-11-13 09:23
Group 1 - The current bull market is in its second phase, transitioning from emotional drivers to fundamental ones, with a focus on technology as the main theme [1][11][19] - The bull market is characterized by a significant structural differentiation between "small-cap" and "large-cap" assets, with "small-cap" stocks outperforming [30][21] - The technology sector is expected to lead the market, with specific attention on AI applications, robotics, smart driving, and AI in life sciences [2][57][68] Group 2 - The report highlights that the bull market's main line is technology, with significant contributions from major tech companies, particularly in AI and semiconductor sectors [2][63] - Historical bull markets have shown that the main line often correlates with industry cycles, where sectors with high revenue growth tend to outperform [58][60] - The report emphasizes the importance of understanding the differentiation between "old economy" and "new economy" stocks, with a recommendation to maintain exposure to dividend-paying assets amidst a backdrop of financial asset scarcity [2][30][10] Group 3 - The report discusses the impact of macroeconomic policies, including fiscal and monetary measures, on market performance, particularly in relation to the "14th Five-Year Plan" and its focus on high-quality development and technological self-reliance [17][18] - The analysis indicates that the market's valuation structure is healthier compared to previous bull markets, with a lower percentage of stocks trading at high price-to-book ratios [21][25] - The report notes that the trend of "deposit migration" is ongoing, with a shift in funds towards higher-yielding assets as traditional deposit rates decline [35][39]
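Universe-Scale Compression: The Boundary of the Scaling Law, Platonic Representations Converging Where Matter Meets Information, Solving the P vs. NP Problem, the Simulation Hypothesis...
AI Tech Camp (AI科技大本营)· 2025-11-13 05:59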
Core Viewpoint - The article discusses the successful implementation of scientific multitask learning at a cosmic scale through the BigBang-Proton project, proposing the concept of Universe Compression, which aims to pre-train models using the entirety of the universe as a unified entity [1][7]. Group 1: Scientific Multitask Learning - Scientific multitask learning is essential for achieving Universe Compression, as it allows for the integration of highly heterogeneous datasets across various disciplines, which traditional models struggle to converge [2][4]. - The BigBang-Proton project demonstrates that with the right representation and architecture, diverse scientific data can converge, indicating the potential for transfer learning across scales and structures [2][4]. Group 2: Scaling Law and Platonic Representation - The Scaling Law observed in language models can extend beyond language to encompass physical realities, suggesting that the limits of these models may align with the fundamental laws of the universe [5][6]. - The Platonic Representation Hypothesis posits that AI models trained on diverse datasets tend to converge on a statistical representation of reality, which aligns with the findings from the BigBang-Proton project [6][7]. Group 3: Universe Compression Plan - The proposed Universe Compression plan involves creating a unified spacetime framework that integrates all scientific knowledge and experimental data across scales, structures, and disciplines [25][26]. - This approach aims to reveal the underlying homogeneity of structures in the universe, facilitating deep analogies across various scientific fields [26]. Group 4: Next Steps and Hypotheses - The company proposes a second hypothesis that suggests reconstructing any physical structure in the universe through next-word prediction, enhancing the model's ability to simulate complex physical systems [28]. 
- This hypothesis aims to integrate embodied intelligence capabilities, improving generalization in complex mechanical systems like aircraft and vehicles [28].
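"Zijing Zhikang" Raises Nearly 100 Million Yuan in Angel Round to Accelerate Development and Deployment of Its AI Hospital System | Early Look at Early-Stage
36Kr· 2025-11-11 00:10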
Core Insights - "Zijing Zhikang" has completed nearly 100 million yuan in angel round financing, led by Xinglian Capital, with the funds primarily allocated for the development and iteration of the Zijing AI Hospital system [2] - The company aims to leverage advanced large model AI technology to create a virtual medical world system, enhancing smart healthcare applications in the real world [2] - The Zijing AI Hospital's core logic involves simulating real hospital facilities and processes, particularly by creating highly human-like, diverse AI patients to meet initial training data needs [2] Data and Training Challenges - High-quality data and case studies are essential for training AI doctors, but challenges such as data silos and difficulty in data acquisition persist in the real medical world [3] - Zijing Zhikang's core technology team is addressing the cold start problem by synthesizing some case data using AI, creating "evolvable intelligent agents based on simulations" [3] - The AI hospital has constructed over 500,000 AI patients covering various countries, age groups, and disease types, serving as a significant supplement for training AI doctors [3] AI Doctor Evolution - The AI doctors are designed to possess self-evolution capabilities, with a specific memory and reflection algorithm to accumulate "experience" during consultations [5] - The evolution of AI doctors is expected to be faster than that of human doctors, with experimental results indicating that the AI's capability evolution curve aligns with Scaling Law [5] - Zijing Zhikang has developed 42 AI doctors that achieved over 96% accuracy on the MedQA dataset, surpassing the average level of human doctors [5] Product Development and Features - The AI system includes three interfaces: a patient app, a doctor workstation, and a hospital system, facilitating full-cycle health management [5] - Patients can register online, engage in intelligent pre-consultation, and generate structured medical records, while doctors can access these records to save time during consultations [5] - The system is designed to manage health data over time, providing health advice and allowing patients to utilize AI for health consultations and report interpretations [5] Future Plans and Regulatory Alignment - The Zijing AI Hospital system is set to launch publicly by the end of 2025, with initial internal testing already conducted in various departments at Tsinghua University Hospital [6] - The system's development aligns with recent government initiatives aimed at promoting and regulating the application of AI in healthcare, potentially enhancing service capabilities and efficiency in grassroots medical settings [6]
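The Largest and Most Diverse Real-World Manipulation Dataset Ever! The Scaling Law for Embodied Intelligence Has Arrived
Heart of Embodied Intelligence (具身智能之心)· 2025-11-09 14:08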
Core Insights - The article discusses the introduction of GEN-0, a new type of embodied foundation model designed for multimodal training on high-fidelity physical interaction data, which aims to enhance robotic intelligence through real-world data [5][9]. Group 1: Model Characteristics - GEN-0 has been developed to capture human-level reflexes and physical common sense, and its core feature, "harmonic reasoning," allows thinking and action to be trained seamlessly together [5]. - The model has passed a critical threshold of 7 billion parameters, at which a phase transition appears: smaller models stagnate while larger models continue to improve [6][11]. - GEN-0 demonstrates a strong scaling law, meaning that more pre-training data and compute predictably improve the model's performance across multiple tasks [6][11]. Group 2: Data Utilization - The model is pre-trained on over 270,000 hours of real-world heterogeneous manipulation data, with the dataset expanding at a rate of over 10,000 hours per week [22]. - The data comes from diverse operational scenarios across thousands of households, warehouses, and workplaces, aiming to cover all conceivable manipulation tasks [24]. Group 3: Implications for Robotics - GEN-0 signifies a new era in embodied foundation models, where capabilities grow predictably with real physical interaction data rather than relying solely on text, images, or simulated data [9]. - The findings highlight that smaller models struggle to process complex sensory-motor data during pre-training, while models with over 7 billion parameters can internalize large-scale pre-training data and quickly adapt to downstream tasks with minimal fine-tuning [15][11].
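To make the "predictable improvement" claim concrete, the sketch below fits a power law of the form loss = a * hours^(-b) by linear regression in log-log space and extrapolates it. This is only an illustration: the functional form, the function name `fit_power_law`, and every number in the data are assumptions made for the example, not values reported for GEN-0.

```python
import math

def fit_power_law(xs, ys):
    """Least-squares fit of y = a * x**(-b), done as a straight line in log-log space."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((x - mean_x) ** 2 for x in log_x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_x, log_y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # log y = log a - b * log x, so a = exp(intercept) and b = -slope
    return math.exp(intercept), -slope

# Hypothetical, synthetic measurements (NOT from the article):
# pre-training hours vs. validation loss, generated from a = 2.0, b = 0.25.
hours = [1_000, 10_000, 100_000]
loss = [2.0 * h ** -0.25 for h in hours]

a, b = fit_power_law(hours, loss)
# Extrapolate the fitted curve to a larger dataset size.
predicted_loss = a * 270_000 ** (-b)
```

If measured losses fall close to the fitted line on a log-log plot, performance at a larger data budget can be estimated before the data is collected, which is the practical sense in which a scaling law is "predictable".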
BigBang-Proton: An Autoregressive Foundation Model Unifying Language, Science, and the Material World
36Kr· 2025-11-06 10:58
Core Insights - The article discusses the advancements made by the company 超越对称 (Super Symmetry) with their new model BigBang-Proton, which integrates various scientific disciplines and challenges existing AGI approaches [1][2][4]. Group 1: BigBang-Proton Model Innovations - BigBang-Proton successfully unifies multiple scientific problems across different scales, from micro-particles to macro-earth systems, using a next-word prediction paradigm [1]. - The model introduces three fundamental innovations: Binary Patch Encoding, a theory-experiment learning paradigm, and Monte Carlo Attention, which enhances its ability to handle complex scientific tasks [9][12][16]. - The model's pre-training is designed to extend to the entire universe, proposing a concept called "Universe Compression" to consolidate vast amounts of information into a single foundation [5]. Group 2: Performance and Comparisons - BigBang-Proton demonstrates superior performance in arithmetic operations, achieving 100% accuracy in 50-digit addition, significantly outperforming other models like DeepSeek-R1 and ChatGPT-o1 [31][36]. - In particle jet classification tasks, BigBang-Proton achieved an accuracy of 51.29%, competing closely with specialized models, while mainstream LLMs performed poorly [42][44]. - The model also excels in predicting water quality and genomic sequences, achieving competitive results against state-of-the-art models in these domains [59][62]. Group 3: Theoretical and Practical Implications - The introduction of Binary Patch Encoding addresses the limitations of traditional tokenizers, allowing for better numerical analysis and integration of scientific data [11][13]. - The theory-experiment learning paradigm bridges the gap between theoretical knowledge and experimental data, enhancing the model's applicability in real-world scientific research [12][15]. 
- The advancements made by BigBang-Proton could significantly impact fields reliant on numerical calculations, such as science, engineering, and finance, by resolving long-standing issues related to arithmetic logic [37].
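The summary does not describe how Binary Patch Encoding works internally. As a rough intuition for why a byte-level scheme can help with numbers, the sketch below encodes text as raw UTF-8 bytes and groups them into fixed-size patches, so that every digit is always represented the same way instead of being merged into arbitrary subword tokens. The function name `to_byte_patches`, the patch size, and the zero padding are all assumptions for illustration; this is not the BigBang-Proton implementation.

```python
def to_byte_patches(text, patch_size=4, pad=0):
    """Encode text as UTF-8 bytes and split them into fixed-size patches.

    Illustrative assumption only: the real Binary Patch Encoding may
    choose patch boundaries and padding differently.
    """
    data = text.encode("utf-8")
    remainder = len(data) % patch_size
    if remainder:
        # Pad the final patch so that all patches have the same length.
        data += bytes([pad]) * (patch_size - remainder)
    return [list(data[i:i + patch_size]) for i in range(0, len(data), patch_size)]

# Each ASCII digit maps to exactly one byte (for example "1" -> 49).
patches = to_byte_patches("123+45")
```

Here `patches` is `[[49, 50, 51, 43], [52, 53, 0, 0]]`: the digits keep their positions byte by byte, which is the property a subword tokenizer does not guarantee.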
Embodied Intelligence Steps into the Scaling Law Era! A 10B+ Foundation Model and 270,000 Hours of Real-World Data
Jiqizhixin (机器之心)· 2025-11-05 06:30
Core Viewpoint - The article discusses the breakthrough achieved by the AI robotics startup Generalist with the introduction of a new embodied foundational model, GEN-0, which is designed for multimodal training on high-fidelity physical interaction data, aiming to enhance robotic intelligence through scalable data and computational power [2][5]. Group 1: GEN-0 Model Features - GEN-0 is built to capture human-level reflexes and physical common sense, with a parameter count exceeding 10 billion [3][4]. - A core feature of GEN-0 is "Harmonic Reasoning," allowing the model to seamlessly think and act simultaneously, which is crucial for real-world physical systems [5]. - The model has demonstrated strong scaling laws, indicating that increased pre-training data and computational power can predictably enhance performance across various tasks [6][10]. Group 2: Data and Training Insights - Generalist has pre-trained GEN-0 on over 270,000 hours of diverse real-world operational data, with the dataset growing at a rate of 10,000 hours per week [23][24]. - The company emphasizes that the quality and diversity of data are more critical than sheer quantity, leading to models with different characteristics based on the data mix used [33]. - The scaling experiments revealed that smaller models exhibit "ossification," while larger models continue to improve, highlighting the importance of model size in absorbing complex sensory-motor data [10][11]. Group 3: Applications and Future Directions - GEN-0 has been successfully tested on various robotic platforms, including humanoid robots with different degrees of freedom [6]. - The company is building the largest and most diverse real-world operational dataset to expand GEN-0's capabilities, covering a wide range of tasks across different environments [28]. - Generalist aims to create a robust infrastructure to support the extensive data collection and processing required for training large-scale robotic models [31].
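The dataset figures quoted above (more than 270,000 hours, growing by about 10,000 hours per week) allow a simple back-of-the-envelope projection. The sketch below assumes the weekly rate stays constant, which the article does not claim, and the 1,000,000-hour target is a hypothetical milestone chosen only for the example.

```python
import math

def weeks_to_reach(target_hours, current_hours=270_000, weekly_growth=10_000):
    """Weeks needed to reach target_hours, assuming constant linear growth."""
    if target_hours <= current_hours:
        return 0
    return math.ceil((target_hours - current_hours) / weekly_growth)

# Hypothetical milestone: one million hours of data.
weeks = weeks_to_reach(1_000_000)
```

Under these assumptions `weeks` is 73, or roughly a year and five months; a collection rate that itself grows over time would shorten this.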