scaling law

Overview of Large Model Development
2025-07-28 01:42
Summary of Key Points from Conference Call Records

Industry Overview
- The conference call discusses the development of large model technology, indicating a shift from research to application, with 2025 seen as a critical turning point for the industry [1][2]
- The global landscape shows the U.S. leading in computing power while China excels in efficiency [1][5]

Core Insights and Arguments
- The capital market's attitude towards AI investment has shifted from funding research to demanding certainty and stability; pessimism about domestic large models is expected to be corrected, creating potential upside [1][6]
- The accuracy of large models has improved thanks to real-time data integration and better retrieval-augmented generation techniques, and synthetic data is expected to surpass human-accumulated data by 2028-2030 [3][16][17]
- Context window lengths have increased significantly, allowing models to process longer text and improving overall performance and accuracy [9]
- Agents and collective intelligence are advancing rapidly, with agents able to complete complex tasks more efficiently than typical interns, indicating strong commercial potential [12][14]

Important but Overlooked Content
- The scaling law's effectiveness was validated by GPT-4.5, underscoring the importance of deep reasoning and the significant impact of reasoning time on model performance (a standard statement of the law is sketched after this summary) [1][5][8]
- Low-precision training techniques have reduced computing costs while facing challenges such as gradient loss, with models like DeepSeek R1 achieving large-scale training at FP8 precision [19]
- AI application revenue growth is notable, with sectors like AI search and programming expanding rapidly and users showing a stronger willingness to pay for AI applications than for traditional ones [25][26]
- Collective intelligence in finance has shown advantages through collaboration among agents, delivering higher return rates in stock trading than single models [15]

Conclusion
- Large model technology is at a pivotal moment, with significant advances in efficiency, accuracy, and commercial viability; the AI sector is poised for explosive growth and investment opportunities [1][27]
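For reference, the "scaling law" invoked throughout these notes is usually stated in the compute-optimal form from the Chinchilla line of work. The call itself gives no formula, so the expression below is the standard published one rather than the speakers' own:

```latex
% Chinchilla-style scaling law (Hoffmann et al., 2022): expected pretraining loss
% as a function of parameter count N and training tokens D.
% E, A, B, \alpha, \beta are empirically fitted constants.
L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}
% With training compute C \approx 6ND, minimizing L under a fixed C budget grows
% N and D in roughly equal proportion -- which is why the notes treat both more
% parameters and more (eventually synthetic) data as levers.
```

Test-time reasoning compute, which the call stresses separately, is an additional axis not captured by this pretraining formula.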
Professor Xiao Yanghua: How far is embodied intelligence from "emergence"?
36Ke· 2025-06-27 11:30
Group 1
- The development of artificial intelligence (AI) has two clear trajectories: one represented by AIGC (Artificial Intelligence Generated Content) and the other by embodied intelligence [3][6]
- AIGC is considered a technological revolution due to its foundational nature, its ability to significantly enhance productivity, and its profound impact on societal structures [10][11]
- Embodied intelligence aims to replicate human sensory and action capabilities, but its impact on productivity is seen as limited compared to cognitive intelligence [11][13]

Group 2
- The current stage of AI development emphasizes the quality of data and training strategies over sheer data volume and computational power [3][15]
- The scaling law, which highlights the importance of large datasets and computational resources, is crucial for both AIGC and embodied intelligence [14][15]
- The industry faces challenges in gathering sufficient high-quality data for embodied intelligence, which is currently lacking compared to language models [20][21]

Group 3
- The future of embodied intelligence relies on its ability to understand and interact with human emotions, making emotional intelligence a core requirement for consumer applications [5][28]
- The development of embodied AI is hindered by the complexity of accurately modeling human experiences and environmental interactions [30][32]
- There is a need for innovative data acquisition strategies, such as combining real, synthetic, and simulated data, to overcome current limitations in embodied intelligence training (a minimal data-mixing sketch follows this list) [22][23]
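The last bullet of Group 3 calls for combining real, synthetic, and simulated data. Purely as an illustration (this is not Professor Xiao's pipeline; the source weights and trajectory names below are made up), a weighted mixture sampler is one minimal way such a combination is often set up:

```python
import random

# Hypothetical illustration (not from the interview): a minimal weighted sampler
# that mixes real teleoperation data, synthetic data, and simulator rollouts --
# the three data sources the article says must be combined for embodied training.
# Weights and trajectory names are made up.
SOURCES = {
    "real_teleop": {"weight": 0.2, "pool": ["real_traj_001", "real_traj_002"]},
    "synthetic":   {"weight": 0.3, "pool": ["synth_traj_001", "synth_traj_002"]},
    "simulation":  {"weight": 0.5, "pool": ["sim_traj_001", "sim_traj_002"]},
}

def sample_batch(batch_size: int) -> list:
    """Draw a training batch whose composition follows the source weights."""
    names = list(SOURCES)
    weights = [SOURCES[n]["weight"] for n in names]
    batch = []
    for _ in range(batch_size):
        source = random.choices(names, weights=weights, k=1)[0]
        batch.append(random.choice(SOURCES[source]["pool"]))
    return batch

if __name__ == "__main__":
    print(sample_batch(8))
```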
CITIC Securities: System-level computing power is expected to be the next stop in AI development; attention recommended on related companies in the domestic supply chain
智通财经网· 2025-06-26 00:29
Core Viewpoint
- The report from CITIC Securities indicates that the demand for AI large model training and inference is continuously growing, with system-level computing expected to become the next generation of AI computing infrastructure [1]

Group 1: System-Level Computing
- System-level computing is anticipated to become the next generation of AI computing infrastructure, driven by the need for generality in foundational infrastructure to address future model developments [1]
- The scaling law is rapidly evolving in the post-training and online inference stages, with innovations in model architecture enhancing training capabilities [1]
- Hardware deployment aimed at higher throughput and lower latency in inference is becoming critical, with a shift towards cluster-based inference models [1]

Group 2: Technical Aspects
- The development of single-chip computing capabilities is outpacing advancements in communication technology, making communication efficiency a key factor for cluster performance [3]
- Two primary methods for building large clusters are identified: scale-up (increasing resources per node) and scale-out (increasing the number of nodes), with scale-up seen as a significant future direction (a back-of-the-envelope sketch of the communication bottleneck follows this summary) [3]
- Notable examples include NVIDIA's NVL72 system and Huawei's CloudMatrix384 super node, which provide insights into industry development [3]

Group 3: Industry Dynamics
- The semiconductor industry typically uses mergers and acquisitions for technology integration and market expansion, with leading companies often pursuing these strategies to strengthen their market position [4]
- NVIDIA's acquisition of Mellanox exemplifies this strategy, extending its NVLink technology with RDMA networking for large-scale computing [4]
- AMD's acquisition of ZT Systems has strengthened its system architecture design capabilities and data center solution delivery experience, contributing to the core of its AI solutions [4][5]
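To make the communication point in Group 2 concrete, here is a back-of-the-envelope model. It is not from the CITIC report, and every number in it is an illustrative assumption; it only shows why interconnect bandwidth, rather than single-chip FLOPS, tends to cap the efficiency of a 72-GPU group of the kind the NVL72 example refers to:

```python
# Hypothetical back-of-the-envelope model (not from the CITIC report): it only
# illustrates why interconnect bandwidth, rather than single-chip FLOPS, tends
# to cap cluster efficiency. All numbers below are illustrative assumptions.

def step_time(compute_s: float, grad_bytes: float, bw_bytes_per_s: float, n_gpus: int) -> float:
    """Per-step time: local compute plus a ring all-reduce of the gradients.

    A ring all-reduce moves roughly 2 * (n - 1) / n * grad_bytes per GPU.
    """
    comm_s = 2 * (n_gpus - 1) / n_gpus * grad_bytes / bw_bytes_per_s
    return compute_s + comm_s

def scaling_efficiency(compute_s: float, grad_bytes: float, bw: float, n_gpus: int) -> float:
    """Fraction of the ideal linear speedup actually delivered."""
    return compute_s / step_time(compute_s, grad_bytes, bw, n_gpus)

if __name__ == "__main__":
    grad_bytes = 70e9 * 2        # assume ~70B parameters of FP16 gradients (~140 GB)
    compute_s = 1.0              # assume 1 s of pure compute per step (made up)
    for name, bw in [("scale-out, ~50 GB/s per-GPU network", 50e9),
                     ("scale-up, ~900 GB/s NVLink-class fabric", 900e9)]:
        eff = scaling_efficiency(compute_s, grad_bytes, bw, n_gpus=72)
        print(f"{name}: {eff:.0%} scaling efficiency")
```

Under these made-up numbers, the slow-fabric case spends most of each step in the all-reduce, which is the basic argument for scale-up designs that keep more accelerators behind a fast fabric.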
How did the world become "the East rises, the West falls"? On the secondary market and the DeepSeek + Manus craze | 42章经
42章经· 2025-03-30 14:25
The "East rises, West falls" narrative

曲凯: I recently came to the U.S. again and found that the market is changing remarkably fast; people here have suddenly started talking about a so-called "East rises, West falls" narrative.

莫傑麟: Yes, the secondary market has been playing out that script since January this year, but the groundwork for the narrative was already being laid in 2024.

In 2024 the U.S. macro environment and economic data were all fairly strong. On one hand, the U.S. took AI very seriously and stayed firmly in the lead on all frontier innovation; on the other hand, the strength of the dollar kept attracting global investment.

But after Trump took office this year, things changed.

Trump made sweeping adjustments to tariffs and fiscal spending. After that round of aggressive deleveraging moves, the market's focus shifted from AI to macro issues, and there is now far more uncertainty about the future.

And because U.S. equities had been climbing for years, investor expectations were already stretched. So everyone is now extremely risk-averse, and the stock market swings violently.

This year's China is almost a mirror image of the U.S.

Domestic share prices had actually been recovering since 2024, though not obviously so; it took this year's DeepSeek wave to fully ignite them.

At bottom, it is because expectations for China's tech sector and macro environment had previously been set far too low.

曲凯: Right. I think "East rises, West falls" is essentially a return to fair valuation; people really did underestimate domestic AI before, and DeepSeek is a typical example of that. ...
Software, Telecom & Education: Observations and reflections on AI companionship and AI applications, and a review of DeepSeek's impact
2025-03-11 01:47
Summary of Conference Call Notes

Industry or Company Involved
- The discussion revolves around the AI industry, specifically the developments and models from DeepSeek (rendered "Deep Sick" in the original transcript) [1][2][3]

Core Points and Arguments
1. **Model Series Overview**: DeepSeek has released several models, notably V3 and R1, which are considered high-performance and cost-effective. The V3 model is highlighted for its engineering optimization and performance [1][2]
2. **Comparison with Competitors**: The V3 model is compared to OpenAI's GPT-4, suggesting that it operates at a similar level of capability. The discussion emphasizes the importance of responsible AI development [2][3]
3. **Scaling Laws**: The concept of "shifting on the curve" is introduced, indicating that as models evolve they can achieve similar performance with fewer parameters, leading to cost reductions over time (illustrated with a toy curve after this summary) [3][4]
4. **R1 Model Characteristics**: The R1 model is designed for long reasoning tasks and is capable of handling complex queries. It gained significant user engagement, reaching nearly 30 million monthly active users shortly after its release [5][6]
5. **User Demographics**: Only 30% of R1's users are from China, indicating a strong international presence and appeal [6]
6. **Innovative Training Approach**: The R1 model employs an outcome reward model (ORM) for training, which differs from traditional supervised fine-tuning methods and allows for more flexible learning [7][8]
7. **Consumer Applications**: DeepSeek's AI search capabilities are highlighted as a rapidly growing application area, with the potential to provide reliable answers to user queries [10][11]
8. **Market Impact**: The success of DeepSeek is seen as a catalyst for innovation in the AI sector, with implications for various industries, including healthcare and legal services [12][21]
9. **Resource Requirements**: The discussion notes the significant computational resources required to support the models, with estimates suggesting the need for thousands of high-performance GPUs [19][20]
10. **Future Outlook**: The potential for new applications and the overall positive sentiment towards the AI industry are emphasized, despite the presence of market bubbles [23]

Other Important but Possibly Overlooked Content
1. **Training Costs**: The narrative around the cost of developing AI models is nuanced, with claims that the reported costs may not fully capture the total investment required for development [16][17]
2. **Externalities of Open Source**: The open-source nature of DeepSeek's models is seen as beneficial for fostering innovation and entrepreneurship within China [22][23]
3. **Market Dynamics**: The call highlights the competitive landscape, noting that while some companies may struggle, others are likely to emerge successfully from the current market conditions [23]

This summary encapsulates the key insights from the conference call, providing an overview of the current state and future potential of the AI industry as represented by DeepSeek.
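Point 3's "shifting on the curve" can be pictured with a toy power-law fit. The constants below are merely Chinchilla-flavoured placeholders, not DeepSeek's actual numbers; the point is only that lowering the curve lets a smaller model reach the loss a larger one needed before, which is where the claimed cost reduction comes from:

```python
# Hypothetical illustration of "shifting on the curve" (not from the call):
# if efficiency gains lower the loss achievable at a given model size, the
# same target loss can be reached with far fewer parameters, so cost falls.
def loss(n_params: float, a: float = 406.4, alpha: float = 0.34, floor: float = 1.69) -> float:
    """Toy single-variable power law L(N) = floor + a / N**alpha.

    Constants are Chinchilla-flavoured placeholders, used purely for illustration.
    """
    return floor + a / n_params ** alpha

def params_needed(target_loss: float, a: float, alpha: float = 0.34, floor: float = 1.69) -> float:
    """Invert the toy curve: parameters needed to reach a given loss."""
    return (a / (target_loss - floor)) ** (1 / alpha)

if __name__ == "__main__":
    target = loss(70e9)          # loss a 70B-parameter model reaches on the old curve
    better_a = 406.4 * 0.8       # assume a 20% efficiency gain shifts the curve down
    print(f"old curve: {70e9:.1e} params for loss {target:.3f}")
    print(f"new curve: {params_needed(target, better_a):.1e} params for the same loss")
```

With these made-up constants, a roughly 36B-parameter model on the shifted curve matches the 70B model's loss on the old one.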
A chat about physical AI and robots
雪球· 2025-03-09 04:55
Core Viewpoint
- The article discusses the underlying logic behind the rise of robotics, emphasizing that the three key elements of AGI (Artificial General Intelligence) are computing power, algorithms, and data, with current robotics representing the data aspect [2][3]

Group 1: Development of Robotics
- The development of large models faced challenges last year due to the exhaustion of available data on the internet, leading to a need for new data sources [3]
- Robotics can be viewed as a core component of AIDC (Artificial Intelligence Data Center), similar to GPUs and other capital expenditures in AI models [4]
- The anticipated deployment of 1 million robots globally by 2027-2028 could represent a capital expenditure of 500 billion to 1 trillion (the implied per-unit arithmetic is shown after this summary) [4]

Group 2: Market Dynamics and Investment Opportunities
- The current market perception of robotics is skewed, with many believing that robots are far from being able to serve humans, while they are actually crucial for data collection in AI development [4]
- The article suggests that the robotics sector is currently dominated by a small number of institutional investors, indicating a potential for significant growth if the sector gains broader acceptance [5]
- The ongoing "bull market" is attributed to a shift of global capital from US stocks to emerging markets, particularly Hong Kong and A-shares, which are closely following the trends in technology sectors [8]

Group 3: Challenges and Risks
- There are several risks identified in the robotics sector, including the significant decline in major players' stock prices and the skepticism surrounding new entrants in the market [5]
- The article highlights the contradiction between strong expectations for AI implementation and the actual challenges faced in achieving these goals [7]
- Concerns are raised about the reliance on foreign capital and the potential volatility in the A-share market if foreign investors withdraw [8]
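The capex figure above implies a simple per-unit number, reproduced here for clarity; the article does not state a currency, so the figures are treated as a generic unit:

```python
# Rough arithmetic behind the capex claim above; the article does not state a
# currency, so the figures are treated as a generic unit.
robots = 1_000_000                     # anticipated global deployment by 2027-2028
capex_low, capex_high = 500e9, 1e12    # 500 billion to 1 trillion in total spend
per_unit_low = capex_low / robots      # 500,000 per robot
per_unit_high = capex_high / robots    # 1,000,000 per robot
print(f"implied cost per robot: {per_unit_low:,.0f} to {per_unit_high:,.0f}")
```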
How bad was the AI market in the first half of the year? Why are institutions making so few deals? | 42章经
42章经· 2024-07-21 13:50
Q1. How has the temperature of the AI market changed this year?

A: In the first half of this year, the primary market as a whole (especially in AI) has been the worst seen in a decade.

The AI race started heating up around March of last year, and many founders who jumped in got funded, but after September things took a sharp turn for the worse (we also wrote about the reasons in last year's year-end review). This year the market has cooled even further: from January to May, no more than 30 AI companies received investment from mainstream institutions, and a fair share of those were follow-on rounds.

That figure is roughly one hundredth of what it was ten years ago; in other words, raising money has become a hundred times harder across the market. At numbers like these, whatever choices and efforts most people make are simply ineffective.

Q2. Why are institutions making so few deals? Do they need more time to track and learn?

A: Having spoken with so many institutions, we can very much understand why they are not pulling the trigger:

1) Can the teams telling big stories actually build it? Can AI do it? What about competition from the big players? What if the foundation model vendors do it themselves?

2) For the companies that are already making money and have data, are they AI enough? Do they look too much like an ordinary business? Is the market too small?

3) How will AI actually develop going forward? Do large models even work at all?

Put bluntly, there is simply too much uncertainty right now. If institutions stay inside their old evaluation framework, they will inevitably be unable to invest.

So I think the problem is not that people don't understand this well enough ...