Large Model Training
Trained Without Any Human Language, Large Models Actually Get Stronger?
机器之心· 2026-03-14 06:33
Core Viewpoint - The article explores the hypothesis that language may not be the only pathway to intelligence, suggesting that training language models on non-language synthetic data could yield better performance than traditional methods [1][6].
Group 1: Research Findings
- A new training paradigm called "pre-pre-training" is proposed, in which models are first trained on synthetic data generated by Neural Cellular Automata (NCA) before being fine-tuned on natural language [7][6].
- This approach has been shown to improve language modeling performance by up to 6%, accelerate training convergence by 40%, and enhance reasoning capabilities in downstream tasks [2][38].
- Models trained with NCA data outperformed those trained on natural text, even when the latter had significantly larger datasets [22][27].
Group 2: Data Characteristics
- NCA data has rich spatiotemporal structure and statistical properties similar to natural language, while being controllable and cheap to generate [8][10].
- Each NCA sequence corresponds to a unique latent rule, so the model must infer that rule from context, which is fundamental to developing reasoning abilities [12][39].
Group 3: Implications for Training
- The study indicates that attention mechanisms are crucial for transferring learned capabilities, while MLP layers encode more domain-specific knowledge [34].
- The complexity of NCA data can be tailored to match specific tasks, allowing for customized training approaches [42][44].
- The long-term vision is to develop models that acquire reasoning capabilities through synthetic data before learning semantics from carefully selected natural language corpora [45][46].
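To make the idea of "one latent rule per sequence" concrete, here is a minimal sketch of how NCA-style token sequences could be generated. It is not the researchers' implementation: the grid size, number of cell states, the tiny random MLP used as the update rule, the wrap-around neighborhood, and the names `make_rule`, `step`, and `generate_sequence` are all assumptions chosen for illustration.

```python
import math
import random


def make_rule(num_states, hidden, rng):
    """Sample a tiny random MLP; its weights are the latent rule of one sequence."""
    in_dim = 9 * num_states  # one-hot encoding of a 3x3 neighborhood
    w1 = [[rng.gauss(0, 1) for _ in range(in_dim)] for _ in range(hidden)]
    w2 = [[rng.gauss(0, 1) for _ in range(hidden)] for _ in range(num_states)]
    return w1, w2


def step(grid, rule, num_states):
    """Apply the rule to every cell, using wrap-around (toroidal) neighbors."""
    w1, w2 = rule
    size = len(grid)
    new_grid = [[0] * size for _ in range(size)]
    for r in range(size):
        for c in range(size):
            # Indices of the active entries in the one-hot neighborhood vector.
            active = []
            k = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    state = grid[(r + dr) % size][(c + dc) % size]
                    active.append(k * num_states + state)
                    k += 1
            hidden = [math.tanh(sum(row[i] for i in active)) for row in w1]
            logits = [sum(w * h for w, h in zip(row, hidden)) for row in w2]
            # Deterministic update: pick the highest-scoring next state.
            new_grid[r][c] = max(range(num_states), key=logits.__getitem__)
    return new_grid


def generate_sequence(seed, size=6, num_states=4, steps=8, hidden=8):
    """Flatten successive grid states into one token sequence (assumed format)."""
    rng = random.Random(seed)
    rule = make_rule(num_states, hidden, rng)
    grid = [[rng.randrange(num_states) for _ in range(size)] for _ in range(size)]
    tokens = []
    for _ in range(steps):
        tokens.extend(value for row in grid for value in row)
        grid = step(grid, rule, num_states)
    return tokens


tokens = generate_sequence(seed=0)
print(len(tokens), tokens[:12])
```

Each seed yields a different rule, so a model reading such a sequence can only predict later grid states by inferring the rule from the earlier ones. Raising `num_states` or `hidden` is one hypothetical way to adjust the complexity of the data.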
Storage Chip Prices Rise, Smartphone Makers Adjust Prices; Average Selling Price of New Phones May Rise 15% to 25%
新华网财经· 2026-03-13 13:29
Core Viewpoint - The rising prices of storage chips are forcing smartphone manufacturers to adjust their product prices, marking a significant shift in industry dynamics and potentially leading to a restructuring of the market landscape [2][3][4].
Group 1: Price Adjustments and Market Impact
- OPPO announced a price adjustment for some of its released products starting March 16, 2026, due to rising costs of key smartphone components, including high-speed storage hardware [2].
- Multiple smartphone brands are expected to follow suit with price increases, marking the largest and most significant price adjustment in the smartphone industry in the past five years [2].
- Counterpoint Research predicts that the average selling price of new smartphones in the Chinese market will increase by 15% to 25% compared to models from 2025 [2].
Group 2: Storage Chip Price Trends
- As of January 2026, the prices of the major storage chips, DRAM and NAND flash, reached their highest levels since 2016, with some DDR4 8Gb chips seeing a price increase of 369% from their 2025 lows [3].
- The ongoing demand for AI server computing power is expected to keep the global storage chip market in a supply-demand imbalance, leading to continued price increases that will affect consumer electronics [3].
- Storage chips are becoming a larger portion of smartphone manufacturing costs, with their share of BOM cost for devices like the iPhone 17 Pro Max rising from 8% in 2020 to over 10% in 2025 [3].
Group 3: Challenges for Mid-Range and Budget Smartphones
- The price increase of storage chips is particularly challenging for the budget smartphone market, with some manufacturers like Meizu facing difficulties in maintaining product commercialization due to soaring memory prices [4].
- Analysts suggest that mid-range and budget smartphone manufacturers may respond to rising storage costs by reducing specifications or scaling back production, as storage now accounts for nearly 30% of their overall costs [4].
- The current market dynamics are accelerating a reshaping of the industry, with leading manufacturers leveraging supply chain advantages to better manage costs and maintain resilience [4].
Group 4: Opportunities for Domestic Manufacturers
- This situation presents a window for domestic smartphone manufacturers to upgrade their business models by focusing on in-house chip development, system-level power optimization, and value-added software services [5].
- If the mid-range market contracts excessively, it could lead to a service gap for foundational user groups, prompting manufacturers to explore differentiated storage configurations and cloud computing solutions to alleviate cost pressures while maintaining user experience [5].
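Hong Kong Stock Movers | Guangdong-Hong Kong Bay Holdings (01396) Up More Than 4%; Subsidiary Tiandun Data Recently Received a Major Strategic Investment from Futian State-Owned Capital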
智通财经网· 2026-02-04 03:45
Group 1
- Guangdong-Hong Kong Bay Holdings (01396) has risen more than 4%, currently trading at HKD 6.8 with a turnover of HKD 10.6023 million [1].
- The company plans to issue a total of 20.311 million subscription shares at a discount of approximately 8.40%, aiming to raise about HKD 121.6 million for general working capital [1].
- Approximately 90% of the net proceeds from the subscription will be allocated to potential AI computing power cloud service projects, while about 10% will be used for daily operating expenses [1].
Group 2
- The company recently announced that a subsidiary of the Shenzhen Futian District State-owned Assets Supervision and Administration Commission has invested RMB 800 million in the company's subsidiary, Tiandun Data, acquiring a 40% stake [1].
- The investment will specifically support the company's computing power network layout in the Guangdong-Hong Kong-Macao Greater Bay Area [1].
- Tiandun Data is recognized as one of the top-tier intelligent computing operators in China, with rare capabilities and practical experience in building and operating high-performance computing clusters that meet the training needs of large models with hundreds of billions of parameters [1].
SuperX's First Global Supply Center Officially Begins Production, Secures First Batch of US$910 Million in AI Server Orders
Quan Jing Wang· 2026-01-30 12:39
Core Viewpoint - SuperX AI Technology Limited has officially launched its first global supply center in Japan, with an annual production capacity of 20,000 AI servers, marking a significant step towards large-scale commercial production and enhancing its market presence in AI infrastructure solutions [1][3].
Group 1: Supply Center Operations
- The new supply center focuses on three core objectives: ensuring high manufacturing quality, achieving production scale, and establishing a global export hub [3].
- The center leverages Japan's stringent quality standards to guarantee the reliability of AI servers, while also allowing for flexible capacity adjustments based on global order growth [3].
- It simplifies and optimizes the product delivery process to international markets, enhancing order response efficiency for global customers [3].
Group 2: Market Demand and Orders
- Global demand for AI data center infrastructure has surged, with a projected 28.3% year-on-year growth in AI server shipments in 2026 [5].
- SuperX has secured approximately $910 million in AI server procurement orders as of January 2026, with additional memorandums of understanding (MOUs) signed for the potential procurement of 5,000 AI servers, which could amount to $2.1 billion [5][6].
- The company's modular AI factory solution addresses market pain points by integrating computing, cooling, and power systems for rapid deployment [6].
Group 3: Strategic Partnerships and Technology
- SuperX has formed strategic partnerships to address power supply and cooling challenges in AI infrastructure, creating a comprehensive technology framework that includes computing, cooling, and power supply [4].
- The company has introduced advanced AI computing platforms, such as the SuperX GB300 NVL72, which achieves 1.8 exaFLOPS of FP4 computing power, and the XN9160-B300 AI server, which offers a 50% performance improvement over previous models [6].
Group 4: Customer Support and Services
- To support its global operations, SuperX has established a standardized technical service system, providing 24/7 customer support and expert consultation [8].
- The company combines its global technical team with local spare parts networks to offer customized project implementation services and tiered on-site maintenance solutions [8].
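Spun Off from Weisheng Holdings, Digital Energy Company Weiyuan Energy Heads to Hong Kong for IPO, Raising Funds to Push into AI Data Centers and Globalization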
Sou Hu Cai Jing· 2026-01-28 10:40
Core Viewpoint - Weiyuan Energy Technology Co., Ltd. has officially submitted its listing application to the Hong Kong Stock Exchange, having been spun off from Weisheng Holdings [1].
Group 1: Business Overview
- Weiyuan Energy focuses on the integration of digital technology and the energy sector, providing comprehensive solutions for data center, smart power grid, and new energy storage scenarios [2].
- The company operates in three core segments: smart power grids, data centers, and new energy storage [2].
- The smart power grid segment has been the main revenue contributor, accounting for 62.9% of total revenue in the first nine months of 2025 [2][3].
- The data center segment has seen significant growth, with its revenue share increasing from 8.4% in 2023 to 22.1% in the first nine months of 2025, driven by demand for AI data centers [2][3].
Group 2: Financial Performance
- Weiyuan Energy reported revenues of 2.485 billion yuan in 2023, 2.903 billion yuan in 2024, and 1.967 billion yuan in the first nine months of 2025 [3].
- The gross profit margin improved from 23.5% in 2023 to 26.5% in 2024, reflecting enhanced profitability [4].
- Net profit rose from 105 million yuan in 2023 to 200 million yuan in 2024. Based on the reported revenue and net profit figures, the net profit margin works out to 4.2% in 2023 and about 6.9% in 2024; the quoted 9.2% corresponds to the first nine months of 2025 [6].
Group 3: Market Trends and Opportunities
- The global data center infrastructure market grew from approximately $25.6 billion in 2020 to about $39.5 billion in 2024, a compound annual growth rate (CAGR) of 11.5% [5].
- The market is expected to reach around $90 billion by 2029, with a CAGR of 17.9% from 2024 to 2029, providing significant growth opportunities for Weiyuan Energy [5].
Group 4: Client Base and Global Expansion
- Weiyuan Energy has established a diverse client base, including state-owned and private power companies, data center operators, and large industrial enterprises, with key clients such as State Grid and Southern Power Grid [9].
- The company has initiated a globalization strategy, establishing sales and service centers in Malaysia, Australia, Brazil, Turkey, and Mexico, with overseas revenue accounting for 14.4% of the total in 2024 [9].
- The company plans to use the proceeds from its IPO to enhance production capacity, improve R&D capabilities, expand its global marketing network, and optimize its financial position [9].
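IPO News | Digital Energy Solutions Provider Weiyuan Energy Files with the Hong Kong Stock Exchange, Focusing on Smart Distribution Networks, Data Centers, and New Energy Storage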
Zhi Tong Cai Jing· 2026-01-27 09:33
Company Overview
- Weiyuan Energy Technology Co., Ltd. has submitted its listing application to the Hong Kong Stock Exchange, with China International Capital Corporation as its sole sponsor. The company focuses on digital energy solutions, particularly in smart distribution networks, data centers, and new energy storage [1][5].
- In the smart distribution network sector, Weiyuan Energy provides a range of products and solutions, including smart switchgear and efficient transformers, aimed at reliable and efficient power distribution [5].
- In the data center sector, the company offers power distribution cabins, IT cabins, and HVDC systems to ensure stable power infrastructure for data centers [5].
- The new energy storage segment includes innovative storage systems and charging solutions, promoting efficient utilization of renewable energy [5].
Financial Performance
- For the fiscal year ended December 31, 2023, Weiyuan Energy reported revenue of approximately RMB 2.484 billion and gross profit of RMB 583.888 million, a gross margin of 23.5% [9][12].
- Revenues for 2024 and the first nine months of 2025 were RMB 2.903 billion and RMB 1.967 billion, respectively, with corresponding gross profits of RMB 767.903 million and RMB 519.401 million and gross margins of 26.5% and 26.4% [9][10][12].
- Net profit was approximately RMB 105.375 million for fiscal year 2023, RMB 200.279 million for 2024, and RMB 180.886 million for the first nine months of 2025 [10].
Industry Insights
- The global data center critical digital infrastructure market expanded from approximately USD 25.6 billion in 2020 to approximately USD 39.5 billion in 2024, a compound annual growth rate (CAGR) of 11.5% [13].
- The market is projected to reach around USD 90 billion by 2029, driven by the ongoing expansion of data centers and increasing demands for power capacity and energy efficiency, with a CAGR of 17.9% from 2024 to 2029 [13].
- The smart distribution equipment market in China is expected to grow to approximately RMB 247.1 billion by 2029, with a CAGR of about 18.4% from 2024 to 2029, supported by investments in infrastructure [16].
- Global new energy storage capacity surged from 18.6 GW in 2020 to 170.0 GW in 2024, a CAGR of approximately 73.9%, and is expected to exceed 789.0 GW by 2029 [17].
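The growth rates quoted for this filing can be reproduced from the endpoint values with the standard compound annual growth rate formula, CAGR = (end / start)^(1 / years) - 1. The sketch below is only a consistency check on the published figures; the helper name `cagr` and the table layout are illustrative, and the 35.9% storage growth rate for 2024 to 2029 is taken from the companion summary of the same filing.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values that are `years` apart."""
    return (end_value / start_value) ** (1 / years) - 1


# (label, start value, end value, years, rate quoted in the summaries)
quoted = [
    ("Data center infrastructure 2020-2024 (USD bn)", 25.6, 39.5, 4, 0.115),
    ("Data center infrastructure 2024-2029 (USD bn)", 39.5, 90.0, 5, 0.179),
    ("New energy storage capacity 2020-2024 (GW)", 18.6, 170.0, 4, 0.739),
    ("New energy storage capacity 2024-2029 (GW)", 170.0, 789.0, 5, 0.359),
]

for label, start, end, years, stated in quoted:
    computed = cagr(start, end, years)
    print(f"{label}: computed {computed:.1%}, quoted {stated:.1%}")
```

All four computed rates agree with the quoted ones to within rounding, which suggests the endpoint values and periods are reported consistently.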
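Digital Energy Solutions Provider Weiyuan Energy Files with the Hong Kong Stock Exchange, Focusing on Smart Distribution Networks, Data Centers, and New Energy Storage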
Zhi Tong Cai Jing· 2026-01-27 09:28
Core Viewpoint - Weiyuan Energy Technology Co., Ltd. has submitted its listing application to the Hong Kong Stock Exchange, with CICC as its sole sponsor. The company focuses on digital energy solutions, particularly in smart distribution networks, data centers, and new energy storage [1][4].
Company Overview
- Weiyuan Energy specializes in smart distribution networks, data centers, and new energy storage solutions, providing products such as smart switchgear, efficient transformers, and high-voltage direct current (HVDC) systems [4].
- The company has established a diverse customer base across various sectors, including state-owned and private power companies, data center operators, and large industrial enterprises [4].
- Weiyuan Energy has a global sales network with service centers in Malaysia, Australia, Brazil, Turkey, and Mexico, enhancing its overseas operational capabilities [4].
Financial Information
- For the fiscal year ended December 31, 2023, Weiyuan Energy reported revenue of approximately RMB 2.484 billion and gross profit of RMB 583.888 million, a gross margin of 23.5% [6][7][10].
- Revenues for 2024 and the first nine months of 2025 were RMB 2.903 billion and RMB 1.967 billion, respectively, with corresponding gross profits of RMB 767.903 million and RMB 519.401 million [6][7][10].
- Net profit was approximately RMB 105.375 million for fiscal year 2023, RMB 200.279 million for 2024, and RMB 180.886 million for the first nine months of 2025 [8][10].
Industry Overview
- The global data center critical digital infrastructure market expanded from approximately USD 25.6 billion in 2020 to approximately USD 39.5 billion in 2024, a compound annual growth rate (CAGR) of 11.5% [11].
- The smart distribution equipment market in China is projected to reach approximately RMB 247.1 billion by 2029, with a CAGR of about 18.4% from 2024 to 2029 [14].
- Global new energy storage capacity, which stood at 18.6 GW in 2020, is expected to exceed 789 GW by 2029, with a CAGR of approximately 35.9% over the 2024 to 2029 period [15].
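As a worked check on the financial figures above, gross and net margins can be derived directly from the reported revenue and profit numbers. This is a sketch over the values quoted in the summary (in RMB million, with revenue rounded to the nearest million); the dictionary layout and the name `margins` are illustrative.

```python
# Reported figures in RMB million, as quoted in the summary.
periods = {
    "2023": {"revenue": 2484.0, "gross_profit": 583.888, "net_profit": 105.375},
    "2024": {"revenue": 2903.0, "gross_profit": 767.903, "net_profit": 200.279},
    "9M 2025": {"revenue": 1967.0, "gross_profit": 519.401, "net_profit": 180.886},
}

# Margin = profit divided by revenue for the same period.
margins = {
    name: {
        "gross": figures["gross_profit"] / figures["revenue"],
        "net": figures["net_profit"] / figures["revenue"],
    }
    for name, figures in periods.items()
}

for name, m in margins.items():
    print(f"{name}: gross margin {m['gross']:.1%}, net margin {m['net']:.1%}")
```

The gross margins come out at roughly 23.5%, 26.5%, and 26.4%, and the net margins at roughly 4.2%, 6.9%, and 9.2%, so both profitability measures improved over the reporting periods.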
Moore Threads: 2025 Revenue Expected to Grow 230.70% to 246.67% Year on Year; S5000 Has...
Xin Lang Cai Jing· 2026-01-21 12:19
Core Viewpoint - Moore Threads expects its annual revenue for 2025 to be between 1.45 billion and 1.52 billion yuan, representing growth of 230.70% to 246.67% compared to 2024 [1].
Group 1: Financial Projections
- The net loss attributable to the parent company for 2025 is expected to be between 1.04 billion and 1.15 billion yuan, a narrowing of 29.59% to 36.32% compared to the previous year [1].
Group 2: Product Development
- The company has launched its flagship all-in-one GPU computing card, the MTT S5000, which has achieved market-leading performance and has entered mass production [1].
- A large-scale cluster built on this product has been completed and is now operational, efficiently supporting training for models with hundreds of billions to trillions of parameters, with computational efficiency on par with advanced foreign GPU clusters of the same scale [1].
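The forecast ranges can be cross-checked by backing out the 2024 baseline each endpoint implies. The sketch below assumes that the lower revenue bound pairs with the lower growth rate and that the smaller loss pairs with the larger narrowing; the helper `implied_base` is illustrative, and the resulting baselines are derived values, not figures stated in the article.

```python
def implied_base(current, change):
    """Prior-year value implied by a current value and its fractional change."""
    return current / (1 + change)


# Revenue in billion yuan: growth of 230.70% to 246.67% over 2024.
revenue_base_low = implied_base(1.45, 2.3070)
revenue_base_high = implied_base(1.52, 2.4667)

# Net loss in billion yuan: a narrowing is a negative change in the loss.
loss_base_small = implied_base(1.04, -0.3632)  # smaller loss, larger narrowing
loss_base_large = implied_base(1.15, -0.2959)  # larger loss, smaller narrowing

print(f"Implied 2024 revenue: {revenue_base_low:.4f} / {revenue_base_high:.4f} bn yuan")
print(f"Implied 2024 net loss: {loss_base_small:.4f} / {loss_base_large:.4f} bn yuan")
```

Under this pairing both ends of each range point to the same baseline, roughly 0.44 billion yuan of revenue and a net loss of roughly 1.63 billion yuan for 2024, so the quoted percentages are internally consistent.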
First Bombshell of the New Year! DeepSeek Proposes the mHC Architecture to Crack Large Model Training Challenges
Sou Hu Cai Jing· 2026-01-07 09:13
Core Insights - DeepSeek has introduced a new architecture called mHC aimed at addressing stability issues in large-scale model training while maintaining performance improvements [1][11].
Group 1: Problem Identification
- Large models face a dilemma in training stability, where traditional single-channel connections lead to information congestion as model size increases [3][5].
- Previous solutions, like the hyper-connection approach, improved efficiency but introduced new issues such as uncontrolled information amplification or suppression, leading to gradient explosion and training failures [5][7][9].
Group 2: mHC Architecture
- The mHC architecture incorporates an intelligent scheduling system for multi-channel connections, utilizing the Sinkhorn-Knopp algorithm to maintain energy conservation during information transmission [11][13].
- Additional design features include non-negative constraints on input-output mappings to prevent useful signal loss due to coefficient cancellation [15].
Group 3: Infrastructure Optimization
- DeepSeek has optimized its infrastructure by merging multiple computation steps into a single operator, reducing memory read/write cycles and employing recomputation strategies to lower memory usage [16][18].
- These optimizations have resulted in significant stability improvements with minimal increases in training time, even at an expansion factor of 4 [18].
Group 4: Performance Validation
- Testing on various model sizes, particularly a 27 billion parameter model, demonstrated that mHC effectively resolved training instability issues, achieving lower loss values compared to traditional baseline models [21][22].
- The performance advantages of mHC were consistent across different model sizes, indicating its practical value for both small and large models [24].
Group 5: Industry Implications
- The introduction of mHC suggests a shift in the industry towards refined architectural designs rather than merely increasing parameters and computational power, potentially lowering entry barriers for smaller companies in the large-scale model domain [26][29].
- This pragmatic technological innovation is expected to facilitate the deployment of AI technologies, making it easier for more enterprises to engage in large-scale model development [29].
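The Sinkhorn-Knopp algorithm mentioned under the mHC architecture can be illustrated with a short sketch. It alternately normalizes the rows and columns of a positive matrix until both sum to 1 (a doubly stochastic matrix), so mixing several streams with it neither amplifies nor shrinks their total. This is a generic illustration, not DeepSeek's implementation: the 4x4 size (echoing the expansion factor of 4), the exponential used to make the entries positive, the example logits, and the names `sinkhorn_knopp` and `mix` are all assumptions.

```python
import math


def sinkhorn_knopp(matrix, iterations=50):
    """Alternately normalize rows and columns of a strictly positive square matrix."""
    m = [row[:] for row in matrix]
    n = len(m)
    for _ in range(iterations):
        # Scale each row so it sums to 1.
        for i in range(n):
            row_sum = sum(m[i])
            m[i] = [value / row_sum for value in m[i]]
        # Scale each column so it sums to 1.
        for j in range(n):
            col_sum = sum(m[i][j] for i in range(n))
            for i in range(n):
                m[i][j] /= col_sum
    return m


def mix(matrix, streams):
    """Mix scalar stream values with the matrix (one output per row)."""
    return [sum(w * x for w, x in zip(row, streams)) for row in matrix]


# Hypothetical unconstrained mixing weights for 4 parallel streams.
logits = [
    [0.5, -1.0, 0.2, 0.0],
    [1.0, 0.3, -0.5, 0.1],
    [-0.2, 0.8, 0.4, -1.0],
    [0.0, -0.3, 0.9, 0.6],
]
positive = [[math.exp(v) for v in row] for row in logits]  # enforce positivity
mixing = sinkhorn_knopp(positive)

streams = [1.0, -2.0, 0.5, 3.0]
mixed = mix(mixing, streams)
print(sum(streams), sum(mixed))
```

Because every column of the normalized matrix sums to 1, the total signal across streams is unchanged by the mixing step, and because every row sums to 1 with non-negative entries, each output is a weighted average of the inputs. That is one plain reading of the "conservation" property described above.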
iFlytek: iFlytek Spark's Training Efficiency Benchmarked Against the A100 Reaches Over 85%-95% After Optimization
Xin Lang Cai Jing· 2026-01-06 14:31
Core Viewpoint - The company has invested heavily in optimizing the training and inference cost efficiency of its large models under limited computing resources, achieving substantial improvements in performance metrics compared to industry standards [1].
Group 1: Technological Advancements
- Since May 2023, the company has collaborated with Huawei to overcome various technical challenges, including high-speed interconnection, overlapping communication with computation, and optimization of training and inference efficiency [1].
- The training efficiency of general large models and deep inference models has improved from an initial 30%-50% to over 85%-95% when benchmarked against NVIDIA's A100 [1].
Group 2: Breakthroughs in Domestic Computing Power
- In 2025, the company achieved significant breakthroughs in two areas: raising the training efficiency of long chain-of-thought reinforcement learning from 30% to over 84% against the A800 benchmark, and improving the end-to-end training efficiency of MoE models from 30% to 93% [1].
- These advancements represent a leap from zero to one in the domestic computing power sector, indicating strong potential for further reductions in training cost [1].