Open-Source Community
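From "Involution" to "Co-opetition": In the Era of Large Models, Can Open-Source Communities Lead Domestic Operating Systems to a "Scenario Breakthrough"?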
Ge Long Hui· 2025-11-18 12:23
Core Insights
- The article discusses the transformative impact of AI on traditional computing systems, focusing on the evolution of Anolis OS and the broader domestic software industry in China [2][11]
- It highlights the shift from stable replacement of operating systems to co-evolution with AI, emphasizing the need to redefine the operating system for the AI era [2][11]
Group 1: AI's Impact on Operating Systems
- AI is reshaping the definition of operating systems from mere resource managers to active "transmission devices" that efficiently organize and schedule heterogeneous resources [5][11]
- New operating systems must support complex applications driven by AI models, which requires advanced memory and tool-use capabilities [4][6]
- The need to manage diverse computing resources, including GPUs and AI chips, presents new technical challenges for operating systems [4][6]
Group 2: Domestic Operating Systems' Challenges and Opportunities
- Domestic operating systems such as Anolis and openEuler still trail top international systems in global competitiveness, particularly in unified ecosystem representation [6][7]
- However, because they cannot rely on a single dominant supplier of computing power, domestic systems have built up distinctive experience in supporting diverse computing environments [7][8]
- The complexity of the scenarios Chinese enterprises face gives domestic operating systems a natural advantage in handling intricate systems [7][8]
Group 3: Advantages of Open Source and Community Collaboration
- The deep integration of open source with domestic operating systems strengthens collaborative innovation among manufacturers [7][10]
- Sustainable commercial investment is crucial to the long-term viability of open-source communities, ensuring continuous iteration and development [7][10]
- The growth of the OpenAnolis community from 100 to 1,000 partners illustrates the strong demand for collaborative solutions across the domestic industry [8][10]
Group 4: Competitive and Cooperative Dynamics
- Competition among domestic operating systems is a complex interplay of cooperation and competition rather than a simple replacement model [8][10]
- Multiple communities are needed to address the diverse needs of the domestic industry, developing in parallel without hindering one another [8][10]
- The focus should be on "systemic prosperity" rather than the dominance of a single player, fostering a collaborative ecosystem [10][11]
Group 5: Future Directions and Strategic Focus
- The path for domestic operating systems involves using open models to drive hardware and OS standards, facilitating a shift away from hardware dependency [10][11]
- The ongoing evolution of computing paradigms and the need for high-level cooperation among communities will define the future of domestic operating systems [11][12]
- The article concludes that the journey of domestic operating systems is a continuous process of conflict, cooperation, and evolution, positioning them as active participants in the AI-driven transformation [11][12]
Top Large-Model Minds Gather at a Hardcore Open-Source Meetup: The SGLang Community Holds Its First Meetup in China
Ji Qi Zhi Xin· 2025-10-28 06:29
Core Insights
- The PyTorch Conference 2025 showcased the vibrant community and significant developments in deep learning, particularly highlighting SGLang's contributions and potential in the industry [1][3][4].
SGLang Overview
- SGLang, an open-source high-performance inference engine for large language models and vision-language models, originated from RadixAttention and is incubated by the non-profit organization LMSYS. It offers low-latency, high-throughput inference across environments ranging from single GPUs to large distributed clusters [7][8].
Community Engagement
- The first Meetup in Beijing, co-hosted by SGLang, Meituan, and Amazon Web Services, attracted numerous contributors, developers, and scholars, indicating a strong community presence and development potential [4][8].
Technical Developments
- The Meetup featured technical discussions on SGLang's architecture, including advances in KV Cache, Piecewise CUDA Graph, and speculative decoding, aimed at improving efficiency and compatibility [21][22].
- SGLang's quantization strategies were also discussed, with a focus on widening the range of applications and optimizing model performance [34][35].
Application and Practice
- Several industry applications of SGLang were presented, including its integration with Baidu's ERNIE 4.5 model for large-scale deployment and optimization in search scenarios [41][42].
- The use of SGLang in WeChat's search function was highlighted, emphasizing the need for high throughput and low latency in the user experience [44].
Future Directions
- The SGLang roadmap includes further integration with a range of hardware and software solutions, aiming to improve stability and compatibility across platforms [22][35].
- The SpecForge framework, developed by the SGLang team, aims to accelerate large language model inference and has been adopted by major companies such as Meituan and NVIDIA [57][58].
Today's Hot Take: DeepSeek-OCR Has Crushed Every Architecture
Zi Dong Jia Shi Zhi Xin· 2025-10-27 00:03
Core Viewpoint
- DeepSeek has introduced a new model, DeepSeek-OCR, which significantly reduces the number of tokens needed to store and process information by using images, rather than text tokens alone, as memory carriers [3][6][12].
Group 1: Model Capabilities
- DeepSeek-OCR can store nearly the same amount of information using only one-tenth of the tokens that traditional models need [40][41].
- In tests, DeepSeek-OCR used only 100 visual tokens to surpass GOT-OCR 2.0, which requires 256 tokens, and fewer than 800 visual tokens to outperform MinerU 2.0, which typically requires over 6,000 tokens [13][14].
- The model supports several resolutions and compression modes, allowing it to adapt to documents of different complexity, for example by using only 64 visual tokens for simple documents [18][21].
Group 2: Data Collection and Utilization
- DeepSeek-OCR can capture previously uncollected data from two-dimensional information, such as graphs and images in academic papers, which traditional models could not interpret [32][33].
- The model can generate over 200,000 pages of training data in a day on an A100 GPU, indicating its efficiency in data collection [35].
Group 3: Resource Efficiency
- By using images as memory, DeepSeek-OCR reduces the computational load, cutting token usage significantly without sacrificing performance [40][41].
- The model maintains 96.5% accuracy while using only one-tenth of the original token count, demonstrating its effectiveness in resource management [41][42].
Group 4: Open Source and Community Contributions
- DeepSeek-OCR builds on a range of open-source resources, including Huawei's Wukong dataset and Meta's SAM for image feature extraction [51][53].
- Integrating multiple open-source models has enabled DeepSeek to create an AI capable of "thinking in images," showcasing the power of community-driven innovation [53].
DeepSeek's Newly Open-Sourced Model Is a Bit Uncanny
Chuang Ye Bang· 2025-10-25 10:14
Core Viewpoint
- DeepSeek has introduced a new model, DeepSeek-OCR, which stores information in images instead of relying solely on text tokens, significantly improving data compression and model efficiency [5][11][26].
Group 1: Model Functionality
- DeepSeek-OCR can convert large amounts of text into images that serve as a memory carrier for AI, allowing more efficient data storage [9][14].
- The model achieves better results with fewer visual tokens than traditional models, and therefore with lower resource consumption [11][26].
- In tests, DeepSeek-OCR used only 100 visual tokens to outperform GOT-OCR 2.0, which required 256 tokens, and fewer than 800 visual tokens to outperform MinerU 2.0, which required over 6,000 tokens [11][14].
Group 2: Data Collection and Utilization
- The model can capture previously uncollected data from two-dimensional information, such as graphs and images in academic papers, which traditional models could not interpret [22][24].
- DeepSeek-OCR can generate over 200,000 pages of training data in a day on an A100 GPU, indicating its potential to enrich the training datasets of future models [24].
- The model's ability to remember the position of images and the surrounding text allows a more comprehensive understanding of the data [18][22].
Group 3: Resource Efficiency
- By using image-based memory, DeepSeek-OCR can cut the number of tokens required to one-tenth of the original while maintaining a high accuracy of 96.5% [26][27].
- The model's design allows token usage to be adjusted dynamically according to document complexity, optimizing resource allocation [14][15].
- The research indicates that even at 20-fold compression the model retains around 60% accuracy, showcasing its robustness [27].
Group 4: Open Source Collaboration
- DeepSeek-OCR is an open-source project that integrates contributions from open-source communities worldwide, using datasets and models from companies such as Huawei, Baidu, Meta, and OpenAI [32][34].
- This collaborative effort has produced a model capable of "thinking in images," highlighting the importance of community-driven innovation in AI development [34].
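The compression figures quoted in this summary lend themselves to a quick back-of-the-envelope calculation. The sketch below is illustrative only: the function names, the `REPORTED_ACCURACY` table, and the 1,000-token page are assumptions introduced here and are not part of DeepSeek-OCR's code or API. Only the 10x / 96.5% and 20x / roughly 60% pairs come from the summary.

```python
import math

# Accuracy figures as quoted in the summary (approximate, not re-measured here).
# Keys are compression ratios: text tokens represented per visual token.
REPORTED_ACCURACY = {10: 0.965, 20: 0.60}


def compression_ratio(text_tokens: int, visual_tokens: int) -> float:
    """Return how many text tokens each visual token stands in for."""
    if visual_tokens <= 0:
        raise ValueError("visual_tokens must be positive")
    return text_tokens / visual_tokens


def visual_tokens_needed(text_tokens: int, ratio: float) -> int:
    """Return the visual-token budget for a target ratio, rounded up."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return math.ceil(text_tokens / ratio)


if __name__ == "__main__":
    page_text_tokens = 1000  # hypothetical page size, chosen for illustration
    for ratio, accuracy in REPORTED_ACCURACY.items():
        budget = visual_tokens_needed(page_text_tokens, ratio)
        print(f"{ratio}x: {budget} visual tokens, reported accuracy ~{accuracy:.1%}")
```

For the hypothetical 1,000-token page, a 10x ratio corresponds to 100 visual tokens and a 20x ratio to 50, with the lower reported accuracy as the trade-off.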
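Global Innovation Index 2025 Released, China Enters the Global Top Ten for the First Time: China's Innovation Presents a New Picture to the World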
Ren Min Ri Bao· 2025-10-01 01:53
Group 1: Global Innovation Index and Rankings
- China rose to 10th place in the 2025 Global Innovation Index, entering the top ten for the first time and leading the 36 upper-middle-income economies, having risen 25 places since 2013 [1]
- On innovation input, China ranks 19th globally, up 4 places from the previous year, while on innovation output it ranks 5th, up 2 places [3]
Group 2: Investment in R&D
- In 2024, China's total R&D expenditure exceeded 3.6 trillion yuan, an 8.3% increase over the previous year, with R&D investment intensity rising steadily and basic research funding growing rapidly [2]
- China has the world's largest R&D workforce, 26 of the world's top 100 technology innovation clusters, and over 460,000 high-tech enterprises [2]
Group 3: Innovation Output and Intellectual Property
- China ranks first globally on several intellectual property metrics, including design patent applications per unit of GDP, utility model patent applications, and trademark applications [2]
- The efficiency of technology transfer has improved significantly, with the development cycle for consumer products such as drones and mobile cameras reduced from years to months or even weeks [3]
Group 4: AI and International Cooperation
- China is actively promoting AI technology and has launched initiatives such as the "AI+" international cooperation initiative to strengthen collaboration and benefit sectors worldwide [4][5]
- The "Artificial Intelligence Global Governance Action Plan" aims to promote inclusive and equitable development of AI through effective international cooperation [5]
Group 5: Advancements in Key Technologies
- China is making significant strides in core technologies, particularly AI, with over 1,500 large models developed, many of them open-source and competitive with international standards [7]
- China's biotechnology sector is undergoing a structural transformation, with over 1,250 innovative drugs in the R&D phase, meeting advanced global standards [7]
Group 6: Recommendations for Enhancing Innovation
- Experts suggest breaking down disciplinary boundaries to foster collaboration among the natural sciences, engineering, and the social sciences, and integrating technology more closely with social ethics and cultural contexts [9]
- Recommendations include strengthening the innovation ecosystem, increasing investment in basic research, and establishing a unified framework for AI technology assessment and governance [10]
From Passive Repair to Active Immunity: In Search of a Smart Prescription for Automotive Software Faults
Zhong Guo Qi Che Bao Wang· 2025-09-22 09:18
Core Viewpoint
- The automotive industry faces significant challenges from software faults as vehicles become "software-defined," which calls for clear boundaries on intelligent driving functions and safety measures to prevent exaggerated claims [2][3][4].
Group 1: Software Challenges and Industry Response
- The complexity of automotive software has increased dramatically, with code volumes exceeding 1 billion lines, far surpassing traditional systems like Windows 10 and raising the risk of system failures and security breaches [2].
- Software faults have become a major concern, with consumer complaints about issues such as malfunctioning intelligent assistance systems and software failures in electric vehicles [4].
- The industry consensus is to adopt modular architectures that reduce coupling and to integrate safety design throughout the development process, both of which are essential for addressing software faults [5][6].
Group 2: Talent Development and Organizational Structure
- Demand is growing for professionals skilled in automotive software, which calls for a new talent cultivation model that combines automotive engineering with software expertise [7][8].
- Companies are moving to agile organizational structures to respond faster and handle user feedback better, which is crucial for rapid software development [10].
Group 3: Ecosystem and Collaboration
- Establishing an open-source community is vital for collaborative innovation in automotive software, since it can reduce development costs and accelerate technology iteration [11].
- Creating an industry-level software vulnerability database for real-time information sharing is essential for improving software security and reducing faults [12].
Group 4: Future Directions and Technological Evolution
- The shift toward centralized computing platforms in vehicles is expected to transform automotive software architecture, allowing easier updates and better communication between software modules [14].
- The integration of advanced AI technologies is expected to improve software reliability and enable self-repair capabilities, marking a significant evolution in automotive software systems [15][16].
Tencent Cloud: Fully Adapted to Mainstream Domestic Chips
Cai Lian She· 2025-09-16 03:02
Group 1
- The article highlights Tencent's commitment to adapting its cloud services to mainstream domestic chips and to actively participating in the open-source community [1]
- Tencent Cloud's long-term strategic investment focuses on optimizing hardware-software collaboration, using a heterogeneous computing platform to provide cost-effective AI computing power [2]
Bund Conference Observations: China's "Little Tigers" Sketch a New Picture of Technology
Huan Qiu Wang· 2025-09-11 10:23
Group 1
- The article highlights the emergence of a new generation of young innovators in China, referred to as the "Tech Tigers," whose average age is under 30 and who are reshaping the technology landscape [1][11]
- The 2025 Inclusion Bund Conference in Shanghai serves as a platform for these young researchers, developers, and entrepreneurs, featuring events such as the AI Innovation Competition and technology exhibitions [1][11]
- The AI Innovation Competition attracted nearly 20,000 participants, more than half of them born after 2000, showing how deeply young people are involved in technological advancement [1][11]
Group 2
- Young researchers such as Lian Hui of the Hefei Institute of Physical Science are making strides in clean energy through controlled nuclear fusion technology, which has implications for AI computing and industrial applications [2]
- Zhang Fan, a professor at the University of Electronic Science and Technology, is recognized for his work in digital medicine, which significantly reduces MRI imaging time and can save critical time for patients [2]
- Cheng Haonan, a researcher born after 1995, developed a platform to combat deepfake technology, demonstrating the innovative spirit of young researchers in addressing contemporary challenges [3]
Group 3
- Young entrepreneurs such as Wu Chenglin and Zhu Zheqing lead AI startups focused on innovative applications of AI technology, emphasizing a shift from traditional business models to more dynamic, technology-driven solutions [9][10]
- The article emphasizes the importance of open-source communities in fostering collaboration and innovation among young engineers, as seen in the contributions of figures such as Fan Wendong and Xiang Jinyu [6][9]
- The narrative illustrates a broader cultural shift among young innovators, who focus not only on technological advances but also on redefining the creative process and democratizing art through AI [10][11]
Amazon AWS in Controversy After Deleting a User's 10 Years of Data; Insider Says It Was an Administrator's Mistake
Xi Niu Cai Jing· 2025-08-11 07:56
Core Points
- Amazon Web Services (AWS) faced significant backlash after deleting the account and all data of a senior open-source developer without warning, leading to accusations of "digital destruction" [2][3]
- The developer, Abdelkader Boudih (known as Seuros), reported that AWS requested identity verification, which he completed, but his account was still deleted because of alleged issues with a third-party payer [2]
- AWS attributed the account deletion to problems with a third-party payer who had been covering the monthly testing fees, but Seuros argues that this was not the real issue [3]
Summary by Sections
Account Deletion Incident
- AWS deleted Seuros' account on July 23 after rejecting his identity verification documents, even though he had provided valid identification and billing information [2]
- The account contained critical data, including the code for more than 20 RubyGems projects, teaching materials, unpublished manuscripts, and other important resources [2]
AWS's Justification
- AWS claimed the deletion was due to issues with a third-party payer who had been responsible for payments but became unreachable after the FTX collapse [2]
- AWS refused to revert to Seuros' original credit card, citing "privacy compliance," and placed the responsibility for the account termination on the user [2]
Internal Insights and Developer Response
- Seuros speculated that the real reason for the deletion was an internal error during a verification process targeting "inactive" accounts, which mistakenly removed an active account [3]
- An AWS insider suggested that the deletion resulted from a mistake made by AWS administrators during that verification process [3]
- In response to the data loss, Seuros is developing a free AWS migration and data cleaning tool to help developers and small teams move their data to other cloud platforms [3]
- Seuros manages clients with over $400,000 in monthly AWS spending, and several clients have agreed to migrate following the incident [3]
GitHub's 1 Billionth Repository Becomes a Legend! Officially Congratulated, Programmers Worldwide in Stitches: We Love This Unorthodox Style!
Cheng Xu Yuan De Na Xie Shi· 2025-06-16 02:30
Core Insights
- GitHub has reached a significant milestone with the creation of its 1 billionth code repository, humorously named "shit" [4][6]
- The repository was created by Aasish Pokhrel and gained rapid popularity, accumulating 2,800 stars within a short period [11]
- This event highlights the vibrancy and humor within the open-source community, showcasing its ability to engage developers globally [11]
Summary by Sections
**Milestone Achievement**
- GitHub officially celebrated the creation of its 1 billionth repository, marking a major milestone in its history [4][7]
**Repository Details**
- The repository "shit" was created in just 32 hours and quickly gained traction, receiving 2,800 stars as of the article's publication [11]
**Community Reaction**
- The GitHub community reacted enthusiastically, with many users commenting and celebrating the repository's humorous name and its significance [11]