Large Model Training (大模型训练)
Wan Gang: Achieving L3 and L4 autonomous driving requires the support of smart roads and cloud computing platforms
People's Finance News, September 27 — At the 2025 World New Energy Vehicle Congress on September 27, Wan Gang, President of the China Association for Science and Technology and President of the World New Energy Vehicle Congress, said: "Achieving L3 (conditional automation) and L4 (high automation) autonomous driving requires the support of smart roads and cloud computing platforms. Specifically, the situations a vehicle encounters while driving, and how it handles them, must be uploaded to a cloud platform for large-model training; the upgraded capabilities are then fed back to the vehicle, forming a closed loop from vehicle to cloud. Only then can autonomous driving capability genuinely improve." (Source: Securities Times) ...
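The vehicle-to-cloud closed loop Wan Gang describes can be pictured as a simple event pipeline: the car uploads driving events, the cloud accumulates them for model training, and an upgraded model version flows back to the vehicle. The sketch below is purely illustrative; every class and method name is an assumption, not a real automotive or cloud API.

```python
# Illustrative sketch of the car-to-cloud closed loop described above.
# All names (DrivingEvent, CloudPlatform, Vehicle) are hypothetical.

from dataclasses import dataclass, field


@dataclass
class DrivingEvent:
    scenario: str      # situation encountered, e.g. "cut-in on highway"
    action_taken: str  # how the vehicle handled it


@dataclass
class CloudPlatform:
    events: list = field(default_factory=list)
    model_version: int = 1

    def ingest(self, event: DrivingEvent) -> None:
        """Vehicle-side uploads land here."""
        self.events.append(event)

    def retrain(self, batch_size: int = 2) -> bool:
        """Stand-in for large-model training on accumulated events."""
        if len(self.events) >= batch_size:
            self.model_version += 1
            self.events.clear()
            return True
        return False


@dataclass
class Vehicle:
    model_version: int = 1

    def drive(self, cloud: CloudPlatform, scenario: str) -> None:
        # Upload the encountered situation and how it was handled.
        cloud.ingest(DrivingEvent(scenario, action_taken="handled"))
        if cloud.retrain():
            # Closed loop: upgraded capability flows back to the car.
            self.model_version = cloud.model_version


cloud = CloudPlatform()
car = Vehicle()
car.drive(cloud, "cut-in on highway")
car.drive(cloud, "pedestrian crossing at night")
print(car.model_version)  # 2 after one retraining cycle
```

The point of the sketch is the loop shape, not the components: data only improves the deployed model once the round trip (upload, retrain, push back) completes.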
Tencent files trademark application for large-model training library WeChat-YATT
Qi Cha Cha· 2025-09-24 06:28
According to the Qichacha app, Tencent Technology (Shenzhen) Co., Ltd. recently applied to register the "WeChat-YATT" trademark, under international classes covering scientific instruments and design research; the application is currently pending. Public information shows that WeChat-YATT is a software library open-sourced by Tencent's WeChat team that focuses on large-model training. ...
Results are out! NeurIPS 2025 paper roundup (autonomous driving / large models / embodied AI / RL, and more)
自动驾驶之心· 2025-09-22 23:34
Core Insights
- The article discusses the recent announcements from NeurIPS 2025, focusing on advancements in autonomous driving, visual perception reasoning, large model training, embodied intelligence, reinforcement learning, video understanding, and code generation [1].

Autonomous Driving
- The article highlights research papers related to autonomous driving, including "FutureSightDrive" and "AutoVLA," which explore visual reasoning and end-to-end driving models [2][4].
- A collection of papers and code from institutions such as Alibaba, UCLA, and Tsinghua University showcases the latest developments in the field [6][7][13].

Visual Perception Reasoning
- "SURDS" benchmarks spatial understanding and reasoning in driving scenarios using vision-language models [11].
- "OmniSegmentor" is a flexible multi-modal learning framework for semantic segmentation [16].

Large Model Training
- Advancements include papers on scaling offline reinforcement learning and fine-tuning techniques [40][42].
- Adaptive methods are emphasized as important for improving model performance across applications [44].

Embodied Intelligence
- Highlighted research includes "Self-Improving Embodied Foundation Models" and "ForceVLA," which enhance models for contact-rich manipulation [46][48].

Video Understanding
- "PixFoundation 2.0" investigates the use of motion in visual grounding [28][29].

Code Generation
- Developments include "Fast and Fluent Diffusion Language Models" and "Step-By-Step Coding for Improving Mathematical Olympiad Performance" [60].
Still, I have to say it: individuals and small teams should stay away from large-model training!
自动驾驶之心· 2025-09-20 16:03
Core Viewpoint
- The article emphasizes that businesses, particularly small teams, should utilize open-source large language models (LLMs) with retrieval-augmented generation (RAG) rather than fine-tune models without sufficient original data [2][6].

Group 1: Model Utilization Strategies
- For small teams, deploying open-source LLMs combined with RAG can cover 99% of needs without fine-tuning [2].
- Where open-source models perform poorly in niche areas, businesses should first explore RAG and in-context learning before considering fine-tuning specialized models [3].
- More complex tasks should be assigned to higher-tier models (e.g., the o1 series for critical tasks and the 4o series for moderately complex tasks) [3].

Group 2: Domestic and Cost-Effective Models
- Domestic large models such as DeepSeek, Doubao, and Qwen are highlighted as alternatives to paid models [4].
- Open-source models or cost-effective closed-source models are encouraged for general tasks [5].

Group 3: AI Agent and RAG Technologies
- The article introduces the concept of Agentic AI, arguing that if existing solutions do not work, training a model is unlikely to help either [6].
- Demand is rising for talent skilled in RAG and AI Agent technologies, which are becoming core competencies for AI practitioners [8].

Group 4: Community and Learning Resources
- The article promotes a community platform called "大模型之心Tech," which aims to provide a comprehensive space for learning and sharing knowledge about large models [10].
- It outlines learning pathways for RAG, AI Agents, and multi-modal large-model training, catering to different levels of expertise [10][14].
- The community also offers job recommendations and industry opportunities, connecting job seekers and companies [13][11].
Storage, computing power's "good brother," steps up: construction of advanced storage-capacity centers accelerates
Core Insights
- The rapid development of advanced computing capabilities is accompanied by a significant push to optimize data storage solutions, highlighting the importance of data as a strategic resource for economic growth [1][4].

Group 1: Data Storage Growth
- China's data storage capacity is projected to grow at a rate exceeding 20% from 2022 to 2024, reaching a total of 1580 EB by the end of 2024, with an annual increase of 380 EB, a 32% year-on-year gain [3].
- The storage mix is evolving: flash's share of external storage rose from 25% in 2023 to 28% in 2024, marking a shift from capacity-driven to performance-oriented systems [3][5].
- Demand for large-scale data storage is driven by the need for low latency and high throughput, as well as the growing volume of unstructured data [5][8].

Group 2: Industry Applications and Trends
- Manufacturing, internet, and finance are rapidly adopting flash storage solutions, with their market share exceeding 45%; education and healthcare are also optimizing their storage structures [3][4].
- Large-model training has created a surge in storage demand, requiring the collection and processing of vast amounts of multi-modal data [4][6].

Group 3: Strategic Recommendations
- Recommendations include establishing a unified national plan for advanced storage centers, optimizing the distribution of data storage resources, and enhancing data governance frameworks [6][7].
- Integrating AI data lake storage technology is suggested to unify multi-source data collection and improve data quality through advanced governance tools [7][8].
- Emphasis is placed on building a secure data circulation space and implementing internal storage security mechanisms to protect data throughout its lifecycle [8][9].

Group 4: Industry Experience and Implementation
- Companies such as Huawei are leading initiatives to build data storage centers in urban areas and create data lakes for enterprises, facilitating the aggregation and management of diverse data types [9][10].
- Collaborative projects aim to enhance data flow and security across sectors including automotive and finance, building a trustworthy data circulation space [9].
Demand in China's AI computing power market: how cloud vendors split investment between training and inference
傅里叶的猫· 2025-08-24 12:31
Core Viewpoint
- China's AI training market is entering a competitive phase dominated by major companies, with market activity heavily reliant on large orders from these firms [2][3].

Group 1: AI Training Market Analysis
- Tencent has sufficient training chip reserves, faces no shortage concerns, and focuses on using the best available models from various suppliers [2].
- The training market is currently dominated by NVIDIA, with over 60% of training card demand driven by Alibaba, followed by ByteDance and Tencent [3].
- The "Six Little Dragons" are withdrawing from training resources, which weighs on the overall training market, as these companies are still in the early stages of commercialization [3].

Group 2: Competition Among Major Players
- Competition between Alibaba and ByteDance is intensifying as both strive to excel in large-model training, creating a zero-sum dynamic [3].
- Demand for training resources is concentrated among the majors; Tencent continues to invest in next-generation models despite the competitive landscape [3].

Group 3: Market Trends and Future Outlook
- Demand for inference computing power has not grown as much as expected, despite earlier optimism this year [4].
- Growth of AI applications such as Yuanbao has begun to slow, with modest gains in monthly active users and a significant drop in monthly downloads [4].
- Second-hand A100 and H100 training devices flowing into the domestic market are expected to push prices down significantly, affecting the compliant-card market [4][5].

Group 4: Investment Allocation Among Companies
- Alibaba allocates approximately 80% of its budget to training and 20% to inference, while ByteDance maintains a balanced 50:50 ratio [5][6].
- Tencent's split is approximately 20% training and 80% inference, a product-oriented approach that has not yet yielded positive revenue [5][6].
Training efficiency up 25%, costs down 23%! Shanghai Qi Zhi Institute and Suanzhi Weilai (算秩未来) jointly release MegatronApp: a system toolkit built for trillion-parameter large-model training
AI前线· 2025-07-28 06:47
Core Insights
- The article covers the launch of MegatronApp, an open-source toolchain designed to enhance the training efficiency of large models using the Megatron-LM framework, delivering a 25% increase in training efficiency and a 23% reduction in training costs [2][38][40].

Group 1: MegatronApp Overview
- MegatronApp is the first open-source enhancement toolchain in China built specifically around Megatron-LM, focusing on high availability, adaptability, efficiency, and observability [3].
- The toolchain comprises four modules (MegaScan, MegaDPP, MegaFBD, and MegaScope), each targeting a specific challenge in large-model training [4].

Group 2: Efficiency Improvements
- MegaScan improves training efficiency by 25% through precise identification of slow nodes and intelligent scheduling, while cutting training costs by 23% [5][38].
- MegaDPP halves network bandwidth requirements and improves GPU and network synchronization through dynamic pipeline scheduling [17][20].
- MegaFBD raises single-GPU efficiency by 18.7% by decoupling forward and backward computation and optimizing resource allocation [21][24].

Group 3: User Experience and Monitoring
- MegaScan provides real-time monitoring of GPU performance, enabling quick identification of issues that hinder training efficiency [9][15].
- MegaScope is a lightweight, interactive visualization tool that lets users monitor training and intervene as needed, with low performance overhead [28][37].

Group 4: Cost Savings and Practical Implications
- The improvements translate into significant cost savings for large-model training, where even a 1% efficiency gain can save tens of thousands of dollars [40].
- The tool is positioned as a foundational system for stable large-model training rather than a mere enhancement, underscoring its importance in practical applications [41].
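The slow-node identification attributed to MegaScan can be pictured as comparing each rank's step time against the group median and flagging outliers. This is a hedged sketch of that general straggler-detection idea, not MegatronApp's actual implementation; the tolerance factor and timing data are assumptions.

```python
# Sketch of straggler (slow-node) detection of the kind MegaScan performs:
# flag ranks whose training step time deviates far from the group median.
# Threshold and data are illustrative; the real tool correlates GPU events
# across the whole cluster.

from statistics import median


def find_slow_ranks(step_times: dict[int, float],
                    tolerance: float = 1.3) -> list[int]:
    """Return ranks whose step time exceeds tolerance x the median."""
    med = median(step_times.values())
    return [rank for rank, t in sorted(step_times.items())
            if t > tolerance * med]


# Per-rank step durations in seconds for one training iteration.
times = {0: 1.02, 1: 0.98, 2: 1.01, 3: 1.65}  # rank 3 lags the others
print(find_slow_ranks(times))  # [3]
```

Because synchronous data parallelism proceeds at the pace of the slowest rank, catching a single straggler like this is exactly where cluster-wide percentage gains come from.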
Cashing out 1.4 billion yuan in quick succession: is Jensen Huang in a hurry to "get off the bus"?
36Kr· 2025-07-23 12:01
Core Viewpoint
- Jensen Huang, CEO of NVIDIA, is perceived as a businessman who prioritizes profit, as evidenced by his recent stock sales despite claiming he has enough wealth [1][9].

Stock Sales and Financial Impact
- On July 18, Huang sold 75,000 NVIDIA shares for approximately $12.94 million (about 92.67 million RMB) [2].
- Over the past two months, Huang has sold NVIDIA shares nearly 20 times, cashing out a total of 1.435 billion RMB [3][5].
- In July alone, Huang has sold 900,000 shares, worth around $150 million [6].

Market Performance and Competitive Position
- NVIDIA's stock price has surged on the global expansion of generative AI and high demand for its GPUs, with a 92% share of the discrete graphics card market as of Q1 2025 [8].
- The company's market capitalization briefly surpassed $4 trillion, making it the first company to reach this milestone [3].

Investor Sentiment and Market Dynamics
- Huang's continuous stock sales have unsettled investors, shifting his image from "AI godfather" to "cash-out king" [4].
- Analysts have begun warning of risks tied to NVIDIA's high valuation, suggesting the stock may be in an overbought state [12].

Global Challenges and Strategic Moves
- Despite its technological strengths, NVIDIA faces challenges from geopolitical tensions and regulatory scrutiny, particularly in the U.S. and EU [10][11].
- Huang's recent travels to Latin America, Europe, and other regions highlight the company's efforts to navigate these complex international relations [10].
Big Data ETF (159739) rises over 1%; H20 chip sales to China resume, a tailwind for large-model training
Xin Lang Cai Jing· 2025-07-16 02:31
Group 1
- The core viewpoint highlights the strong performance of the CSI Cloud Computing and Big Data Theme Index, with significant gains in constituent stocks such as Xinyisheng and Yuntian Lifei, indicating a positive trend in the cloud computing and big data sectors [1][2].
- As of July 15, 2025, the Big Data ETF has gained a cumulative 5.99% over the past week, ranking in the top 20% among comparable funds and reflecting strong investor interest in this sector [1][2].
- NVIDIA founder Jensen Huang announced that the U.S. has approved NVIDIA to sell H20 chips to China, which is expected to benefit cloud computing services and large-model training, as major internet companies are actively purchasing these chips [1].

Group 2
- China Galaxy Securities reports continuous growth in overseas token demand, suggesting a positive feedback loop between AI computing power and applications, and recommends focusing on domestic NVIDIA-chain companies [2].
- The Big Data ETF closely tracks the CSI Cloud Computing and Big Data Theme Index, which comprises 50 listed companies in cloud computing services, big data services, and related hardware, reflecting the overall performance of these sectors [2].
- As of June 30, 2025, the top ten weighted stocks in the index account for 51.84% of it, indicating a concentration of investment in key players such as iFlytek and Zhongji Xuchuang [2].
Breaking down the largest new IPO financing project on the STAR Market this year: Moore Threads' first steps toward commercialization
Hua Er Jie Jian Wen· 2025-07-03 13:09
Core Viewpoint
- The race for the title of "first domestic GPU stock" has begun, with major players Moore Threads and Muxi Integrated Circuit both advancing toward IPOs, signaling a significant move to capitalize the domestic GPU market [1][8].

Group 1: Company Overview
- Moore Threads is the most notable of the "four little dragons" of domestic GPUs, with a core team drawn primarily from NVIDIA [2].
- Its MTT S80 graphics card delivers single-precision floating-point performance close to NVIDIA's RTX 3060, and its self-built GPU computing cluster outperforms comparable foreign counterparts [2][12].

Group 2: Financial Performance
- In 2024, Moore Threads' revenue reached 438 million yuan, up more than 200% year-on-year [3].
- Despite the revenue growth, the company posted a net loss of 1.492 billion yuan, driven by R&D expenses of 1.359 billion yuan, though the loss narrowed by about 10% year-on-year [4].

Group 3: Fundraising and Investment Plans
- Moore Threads plans to raise 8 billion yuan to develop AI training and inference chips, graphics chips, and AI SoC chips, the largest fundraising among new IPO projects on the STAR Market this year [5][6].

Group 4: Product Strategy and Market Position
- The product lineup spans AI computing, professional graphics acceleration, desktop graphics acceleration, and intelligent SoCs, serving government, enterprise, and individual consumer needs [9].
- AI computing products generated 336 million yuan in revenue in 2024, over 70% of total revenue, benefiting from rapidly growing demand for large-model training and inference deployment [11][12].

Group 5: Competitive Landscape
- Moore Threads' 2024 revenue was only about 60% of Muxi Integrated Circuit's, indicating a competitive challenge [18].
- The company is shifting its strategy toward professional graphics acceleration and AI computing products, as its consumer-grade products have struggled in a competitive market [20][21].

Group 6: Future Outlook
- Management anticipates that Moore Threads could achieve profitability as early as 2027, with 440 million yuan in sales contracts already in progress [23][24].