Open Source and Openness

Li Meng, Former Vice Minister of the Ministry of Science and Technology: Engineering Innovation Is Becoming a More Important Route to Disruptive Innovation
Di Yi Cai Jing Zi Xun · 2025-06-27 10:25
Core Insights
- DeepSeek has achieved a breakthrough in developing large models at lower cost while maintaining equivalent performance, prompting industry discussions on the efficiency revolution in large models [1]
- Engineering innovation is seen as a crucial driver of disruptive innovation, with DeepSeek exemplifying the potential of engineering advancements in enhancing large model development [1][3]
- The future of artificial intelligence will increasingly depend on the synergy between software and hardware, particularly in fields like humanoid robotics and advanced autonomous driving [1]

Group 1
- The historical context of engineering innovation is highlighted, questioning why significant innovations often arise in specific locations, such as the steam engine revolution occurring in Manchester rather than London [3]
- The interplay between theoretical breakthroughs and engineering optimizations is expected to drive future disruptive innovations, with both "0 to 1" and "1 to 100" processes being significant [3]
- The efficiency revolution in large models is driven by a combination of architecture, strategy, and optimal software-hardware collaboration, indicating a shift from a single-dimensional to a multi-faceted understanding of innovation [3][4]

Group 2
- DeepSeek's approach to developing large models emphasizes low computing power and cost while achieving performance equivalence, marking a shift in industry competition logic in which efficiency is paramount for disruptive innovation [4]
- The pursuit of energy efficiency is becoming increasingly important; without both high performance and energy efficiency, disruptive innovation may not occur [4]
- Open-source initiatives are identified as essential for supporting the ecosystem of disruptive innovation [4]

Group 3
- While focusing on disruptive innovation, it is crucial to consider potential disruptive harms, as current large model technologies exhibit incomplete explainability [5]
- The governance of advanced AI technologies is becoming more urgent, especially as the reasoning capabilities of large models increase, raising concerns about their compliance with instructions [5]
From Open-Source Co-Building to a Thriving Ecosystem: MindSpore Supports Day-0 Migration and One-Click Deployment
Cailian Press · 2025-06-12 10:59
Core Viewpoint
- The article emphasizes the rapid development of large models and the need for efficient migration and deployment solutions in the AI ecosystem, particularly through MindSpore, which aims to facilitate seamless integration and performance optimization for developers [1][2]

Group 1: Migration Challenges
- The first challenge is fast migration: enabling zero-cost migration of third-party framework models while ensuring complete alignment in model accuracy. MindSpore achieves this through a threefold compatibility approach, allowing zero-code migration of mainstream models and improving training performance by 5% while maintaining distributed parallel strategies [4]
- The second challenge is rapid deployment: automating the entire training-to-inference process so that large model deployment is as simple as executing a single command [2]

Group 2: Training and Inference Solutions
- MindSpore supports Day-0 migration for training, providing imperceptible "intelligent translation" across frameworks. It uses tools like MindSpeed/Megatron for seamless PyTorch model migration, achieving near-zero migration loss for popular models [4]
- For inference deployment, the vLLM-MindSpore plugin allows HuggingFace models to be deployed in under 30 minutes, with an 80% reduction in weight loading time for large models [5][6]

Group 3: Open Source and Community Engagement
- Since going open source on March 28, 2020, MindSpore has fostered a vibrant developer community, with over 1.2 million downloads and contributions from more than 46,000 developers across 2,400 cities [7]
- The company promotes a collaborative ecosystem through community governance, providing free computing resources and knowledge sharing across 20+ technical special interest groups (SIGs) [8]
ByteDance Seed Open-Sources Its First Code Model, Achieving Multiple Same-Scale SOTA Results and Proposing a Paradigm of Using Small Models to Curate Data
QbitAI · 2025-05-11 04:20
Core Viewpoint
- ByteDance's Seed team has released Seed-Coder, an 8-billion-parameter code generation model that surpasses Qwen3 and achieves multiple state-of-the-art (SOTA) results across benchmarks [1][7]

Model Overview
- Seed-Coder comes in three versions: Base, Instruct, and Reasoning [6]
- The model has a context length of 32K, was trained on 6 trillion tokens, and is released under the permissive MIT open-source license [10]

Data Management and Processing
- Seed-Coder employs a "model-centric" data processing approach, using the model itself to curate training data [12]
- The data filtering process involves several stages, including deduplication using SHA-256 and MinHash, which reduced the original data volume by approximately 98% [15][16]
- A scoring model trained on over 220,000 code documents filters out low-quality code files, yielding a corpus that supports 89 programming languages and contains around 1 trillion unique tokens [19]

Data Sources
- Seed-Coder collected 74 million commit records from 140,000 high-quality GitHub repositories, with selection criteria including at least 100 stars, 10 forks, 100 commits, and 100 days of maintenance activity [21]
- The model also extracts data from web archives, identifying two types of raw data (HTML pages with clear code tags and those without) and employing both exact and approximate deduplication techniques [27][28]

Pre-training Phases
- Pre-training is divided into two phases: conventional pre-training on file-level code and code-related web data, followed by continued pre-training that incorporates all data categories along with high-quality datasets to enhance performance [34][35]

Model Variants and Innovations
- Two special variants of Seed-Coder have been developed to further expand its utility [36]
- ByteDance has also launched other models, including a video generation model (Seaweed) and a reasoning model (Seed-Thinking-v1.5), emphasizing cost-effectiveness and performance improvements [39][40]

Strategic Direction
- ByteDance's Seed team is focusing on open-source initiatives and lowering barriers to access, with ongoing adjustments within its AI Lab to explore foundational research in AGI [44]
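The two-stage deduplication mentioned above (exact matching via SHA-256 hashes, then approximate matching via MinHash) can be sketched roughly as follows. This is an illustrative reconstruction, not Seed-Coder's actual pipeline: the shingle size, number of hash functions, similarity threshold, and the use of keyed BLAKE2b as the hash family are all assumptions made for the example.

```python
import hashlib

def exact_dedup(docs):
    """Stage 1: drop byte-identical files via SHA-256 content hashes."""
    seen, unique = set(), []
    for doc in docs:
        digest = hashlib.sha256(doc.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique

def shingles(text, k=5):
    """Character k-gram shingle set of a document."""
    return {text[i:i + k] for i in range(max(1, len(text) - k + 1))}

def minhash_signature(shingle_set, num_perm=64):
    """One minimum per keyed hash function; keyed BLAKE2b stands in for
    a family of random hash functions (assumed, for determinism)."""
    sig = []
    for i in range(num_perm):
        key = i.to_bytes(4, "big")
        sig.append(min(
            int.from_bytes(
                hashlib.blake2b(s.encode("utf-8"),
                                digest_size=8, key=key).digest(),
                "big")
            for s in shingle_set))
    return sig

def estimated_jaccard(sig_a, sig_b):
    """Fraction of matching signature slots estimates Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

def near_dedup(docs, threshold=0.7, num_perm=64):
    """Stage 2: keep a document only if it is not too similar
    to any document already kept."""
    kept, sigs = [], []
    for doc in docs:
        sig = minhash_signature(shingles(doc), num_perm)
        if all(estimated_jaccard(sig, s) < threshold for s in sigs):
            kept.append(doc)
            sigs.append(sig)
    return kept

# Toy corpus: an original file, an exact copy, a near-duplicate, a distinct file.
base = (
    "import math\n"
    "def area(r):\n"
    "    return math.pi * r * r\n"
    "def circumference(r):\n"
    "    return 2 * math.pi * r\n"
)
near = base + "# end\n"    # same file plus a trailing comment
other = "class Stack:\n    def __init__(self):\n        self.items = []\n"

corpus = [base, base, near, other]
unique = exact_dedup(corpus)   # exact copy of `base` is removed
final = near_dedup(unique)     # `near` collapses into `base`
```

At corpus scale, the pairwise signature comparison in `near_dedup` would typically be replaced by locality-sensitive hashing (banding the signatures) so that candidate pairs are found without an O(n²) scan.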
Under the Shadow of "Supply Cutoffs": A Breakthrough Moment for Domestic Operating Systems
Guan Cha Zhe Wang · 2025-05-08 14:22
Core Viewpoint
- The key to a breakthrough for domestic operating systems lies not in forced imitation driven by "replacement anxiety," but in identifying advantageous scenarios based on "demand insight" and breaking technical barriers through open-source collaboration, thereby building irreplaceability in niche markets [1][3][4]

Group 1: Market Context
- Domestic operating systems have made significant strides, particularly in the consumer electronics sector, with Huawei's new phones exclusively running "pure HarmonyOS" [3]
- The global IoT device count is expected to exceed 64 billion by 2025, with China's market size surpassing 4.5 trillion yuan, accounting for over 30% [3]
- The market is fragmented, with various giants positioning themselves differently: open-source HarmonyOS aims for "all-scenario unification," Xiaomi's Vela targets smart homes, and Alibaba focuses on the industrial internet [3][4]

Group 2: Technological Insights
- RT-Thread, a domestic open-source embedded operating system, has achieved an installed base of over 2.5 billion units, making it the largest domestic embedded OS by installation volume [6][16]
- RT-Thread's choice of a microkernel and lightweight design was driven by market demand, particularly from small chips that cannot run Linux or Android [6][14]
- The operating system's success is attributed to its ability to meet specific needs in fragmented scenarios, such as real-time response requirements in satellites and automotive applications [7][14]

Group 3: Competitive Landscape
- The rise of domestic operating systems faces challenges, including weak developer ecosystems and difficulties in adapting to various chips and toolchains [7][19]
- RT-Thread differentiates itself as a neutral, open third-party platform, collaborating with various manufacturers and drawing on a solid understanding of the domestic market [21][31]
- Competitors include major players such as Huawei and Xiaomi, with RT-Thread focusing on foundational operating systems for vehicles and industrial applications [31][36]

Group 4: Future Outlook
- The future of operating systems may involve either creating a self-contained ecosystem like Apple's or adhering to an open-source model [7][25]
- Ongoing geopolitical tensions and the push for domestic alternatives are accelerating the development of domestic operating systems [7][24]
- RT-Thread is well positioned to adapt to the RISC-V architecture, an open instruction-set standard that is gaining traction, enhancing its competitive edge [37][42]