Former Open-Source Leader Admits to Distilling Alibaba's Qwen as the World Enters China's AI Moment
36Kr · 2025-12-11 11:39
A landmark event has emerged in global open-source AI. On December 10, a Bloomberg report shook Silicon Valley: Meta, once the world's dominant open-source player, has been quietly using Alibaba's open-source Qwen model for distillation training in its latest AI model project, codenamed "Avocado." After the news broke, Alibaba's stock rose more than 4% in pre-market trading.

Just a year ago, Meta founder Mark Zuckerberg was still publicly calling for an open-source ecosystem built around American models. Now, facing sluggish growth for its own Llama series and the strong rise of models from the East, Meta's choice looks more like a pragmatic pivot.

The shift of power in the open-source world had long been foreshadowed. In August 2024, the number of Qwen-derived models surpassed that of long-time leader Llama for the first time; in October 2025, Qwen's global downloads completed the overtake as well. Nvidia CEO Jensen Huang captured the trend in a recent speech: "Open source has become extremely important, and China is showing leadership in this area." When yesterday's rule-maker starts drawing nourishment from the one-time follower, a profound industry shift is underway.

A quiet defection: the open-source throne changes hands

According to Bloomberg, the TBD Lab team, which Zuckerberg follows closely, distilled multiple open-source models while training "Avocado," including Google's Gemma, OpenAI's gpt-oss, and Alibaba's Qwen. Behind this technical choice is Meta's own open-source strategy facing ...
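The report describes "distilling" Qwen and other open models. In broad terms, knowledge distillation trains a student model to imitate a teacher's full output distribution rather than only hard labels. A minimal, illustrative sketch of the soft-label KL loss in plain Python — the logits and temperature here are made-up values, and this is in no way Meta's actual pipeline:

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw logits into a probability distribution at a given temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)                      # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's.

    The student is trained to minimize this, i.e. to imitate the teacher's
    output distribution, not just its top-1 answer.
    """
    p = softmax(teacher_logits, temperature)   # teacher "soft labels"
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# A student whose logits match the teacher's incurs zero loss;
# a mismatched student incurs a positive loss.
teacher = [4.0, 1.0, 0.5]
aligned = distillation_loss(teacher, [4.0, 1.0, 0.5])
mismatch = distillation_loss(teacher, [0.5, 1.0, 4.0])
```

The higher temperature softens both distributions so the student also learns the teacher's relative preferences among wrong answers, which is where much of the distilled signal lives.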
Chinese Model Vendors Open Up an "Open-Source Battlefield" as Top-Level Policy Adds Fuel to the Fire
Di Yi Cai Jing · 2025-08-29 06:58
Core Insights
- Open source is not merely a technical means but is becoming a key mechanism driving the AI ecosystem and industry implementation [1][5]

Group 1: Government Initiatives
- The State Council has issued opinions to deepen the implementation of the "AI+" action, emphasizing the enhancement of foundational model capabilities and the promotion of a thriving open-source ecosystem [2]
- A robust open-source community is seen as essential for gathering global developers and forming a vast network for technical exchange and innovation, enhancing the influence of Chinese models on the world stage [2]

Group 2: Industry Performance
- Chinese model vendors face a dual scenario of "open source for a leapfrog advantage" and "technical breakthroughs facing bottlenecks," achieving significant rankings on international lists while still confronting challenges in commercializing their innovations [2][4]
- Chinese open-source models, such as those from Alibaba and DeepSeek, are now comparable in performance to top proprietary models, marking a significant advancement in the global AI landscape [4]

Group 3: Competitive Landscape
- The competitive landscape is characterized by a dichotomy in which leading firms either choose not to open-source or open-source only non-core models, using proprietary models to maintain competitive advantages [8]
- Despite the advantages of open-source models, such as customization and cost savings, a performance gap of 9 to 12 months remains relative to leading proprietary models, influencing enterprise preferences [8]

Group 4: Future Outlook
- Open source is expected to evolve beyond a technical tool into a critical mechanism for driving the AI ecosystem and industry implementation, facilitating innovation and collaboration [5][9]
- The relationship between open-source and proprietary models is seen as complementary, with open source potentially leading breakthroughs in industry customization and multi-modal innovation [9]

Group 5: Commercialization Challenges
- Chinese open-source projects struggle to commercialize effectively compared with their overseas counterparts, primarily because of different market environments and user habits [9]
- The path forward lies in globalization: establishing a developer base in international communities can enhance customers' willingness to pay and brand recognition [9]
Musk Open-Sources Grok 2.5: Chinese Companies Are xAI's Biggest Rivals
36Kr · 2025-08-24 23:25
Core Points
- xAI has officially open-sourced Grok 2.5, with Grok 3 expected to be released in six months [1]
- Elon Musk had previously indicated that it was time to open-source Grok, which was initially expected to happen the following week [3]
- Despite the delay, sentiment remains positive: many believe it is better late than never [4]

Summary by Sections

Open Source Release
- Grok can now be downloaded from HuggingFace, consisting of 42 files with a total size of approximately 500GB [5]
- The official recommendation is to use SGLang to run Grok 2, with specific steps provided for downloading and setting up the model [6][7]

Technical Specifications
- The model requires 8 GPUs, each with over 40GB of memory, to operate effectively [8][17]
- The initial setup involves downloading the weight files and launching the inference server with SGLang [8]

Performance Metrics
- Grok 2 has shown competitive performance on academic benchmarks, including GPQA, MMLU, and MATH, with scores that rival leading models [13]
- In the LMSYS ranking, Grok 2 surpassed Claude and GPT-4 in overall Elo score [10]

Community Feedback
- Reactions to the open-source release are mixed, particularly over the lack of clarity on the model's parameter count, speculated to be 269 billion for the MoE model [15]
- The open-source license has also drawn criticism for not aligning with more common licenses like MIT or Apache 2.0 used by other models [15]

Additional Features
- Alongside the open-source release, new features have been added to the Grok app, focusing on AI video generation [19]
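The 8-GPU, 40GB-per-card requirement is easy to sanity-check with back-of-the-envelope arithmetic. A hedged sketch, assuming the community-speculated 269-billion-parameter count and bf16 storage (2 bytes per parameter) — neither of which xAI has confirmed:

```python
def checkpoint_size_gb(n_params, bytes_per_param=2.0):
    """Approximate weight size in GB; bf16 stores 2 bytes per parameter."""
    return n_params * bytes_per_param / 1e9

def per_gpu_weight_gb(n_params, tp_degree, bytes_per_param=2.0):
    """Weight memory per GPU when sharded via tensor parallelism of degree tp_degree."""
    return checkpoint_size_gb(n_params, bytes_per_param) / tp_degree

N_PARAMS = 269e9  # speculated MoE total; not confirmed by xAI
total_gb = checkpoint_size_gb(N_PARAMS)              # checkpoint size estimate
per_gpu = per_gpu_weight_gb(N_PARAMS, tp_degree=8)   # weights per GPU at TP=8
```

Under these assumptions the checkpoint comes to roughly 538 GB, close to the reported ~500 GB download, and sharding across 8 GPUs leaves about 67 GB of weights per card, which is why each GPU needs well over 40 GB before counting KV cache and activations.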
Musk Open-Sources Grok-2, Says "Chinese Companies Will Be the Strongest Rivals"
Mei Ri Jing Ji Xin Wen · 2025-08-24 11:08
Core Insights
- Tesla CEO Elon Musk announced the open-sourcing of xAI's best model, billed as Grok-2.5 but actually Grok-2, and stated that Grok-3 will be open-sourced in approximately six months [1]
- Musk expressed confidence that xAI will soon surpass all companies except Google, and eventually surpass Google as well [1]
- Musk highlighted that Chinese companies will be the strongest competitors because of their greater access to electricity and strong capabilities in hardware development [1]
Breaking: Musk Open-Sources Grok 2.5; Chinese Companies Are xAI's Biggest Rivals
Sou Hu Cai Jing · 2025-08-24 01:29
Core Viewpoint
- Elon Musk's xAI has officially open-sourced Grok 2.5, with Grok 3 expected in six months, generating significant attention in the AI community [1][2]

Group 1: Open Source Release
- Grok 2.5 is now available for download on HuggingFace, consisting of 42 files totaling approximately 500GB [3][4]
- The model requires a minimum of 8 GPUs, each with over 40GB of memory, to operate effectively [4][10]

Group 2: Model Performance
- Grok 2 has surpassed Claude and GPT-4 in overall Elo score on the LMSYS leaderboard, indicating competitive performance [4]
- On academic benchmarks, Grok 2 has shown strong results in graduate-level scientific knowledge (GPQA), general knowledge (MMLU, MMLU-Pro), and mathematics (MATH) [7][8]

Group 3: Community Feedback
- While the open-sourcing of Grok has been positively received, there are criticisms regarding the lack of clarity on model parameters and the open-source licensing terms [9]
- The community speculates the model has around 269 billion parameters in a MoE configuration, but xAI has not explicitly confirmed this [9]
Three Post-90s Founders and a 70 Billion Yuan Valuation
Chuang Ye Jia · 2025-08-11 10:09
Core Viewpoint
- Mistral AI, founded by three young graduates, is raising $1 billion in a new funding round at a valuation of $10 billion, a nearly 50-fold increase in just two years [4][8]

Group 1: Company Overview
- Mistral AI was established by three post-90s founders who previously worked at top AI companies and returned to France to seize the AI opportunity [8]
- The company's first open-source model, Mistral 7B, outperformed competitors on several benchmarks and quickly gained attention in the developer community [8][9]
- Mistral aims to lead the generative AI wave through open-source initiatives, in contrast with closed models from competitors like OpenAI [8][9]

Group 2: Funding and Valuation
- Mistral AI completed a record seed round of $113 million shortly after its founding, reaching a valuation of over $260 million [12]
- By the end of 2023, Mistral raised $415 million in Series A funding led by a16z, increasing its valuation to $2 billion [13]
- The valuation jumped to $6 billion after a $640 million Series B round, with major investors including Microsoft and Nvidia [14]
- Mistral is currently negotiating a $1 billion funding round that could lift its valuation to approximately $10 billion [14]

Group 3: Competitive Landscape
- The AI landscape is becoming increasingly competitive, with DeepSeek emerging as a significant player and prompting Mistral to accelerate its product development and commercialization efforts [9]
- Mistral has launched several products, including the chatbot Le Chat, which achieved high download rates in France but struggled internationally [9]
- The company is actively pursuing partnerships with industry giants like Nvidia to strengthen its market position [9]

Group 4: Young Entrepreneurs in AI
- The AI sector is seeing a surge of young founders, with several companies started by post-90s entrepreneurs achieving significant funding and rapid growth [16][17]
- Companies like Perplexity and Genesis AI have also reached remarkable valuations, highlighting the trend of young innovators in the AI space [16][17]
- This new generation of entrepreneurs is characterized by a global perspective and technical expertise, positioning them well to capitalize on AI opportunities [18]
OpenAI Open-Sources Again After Six Years: Two Reasoning Models at o4-mini Level That Run on Phones and Laptops
36Kr · 2025-08-06 03:23
Core Insights
- OpenAI has released two long-anticipated open-source models, gpt-oss-120b and gpt-oss-20b, both using an MoE architecture for efficient deployment [1][2][3]
- gpt-oss-120b can run efficiently on a single 80GB GPU, while gpt-oss-20b requires only 16GB of memory for edge devices, giving AI applications local model options [1][2][3]
- Both models show competitive benchmark performance: gpt-oss-120b performs similarly to OpenAI's o4-mini, and gpt-oss-20b is comparable to o3-mini [1][2][3]

Model Specifications
- gpt-oss-120b has 117 billion total parameters and activates 5.1 billion parameters per token; gpt-oss-20b has 21 billion total parameters with 3.6 billion active per token [29][30]
- Both models support context lengths of up to 128k tokens and use advanced attention mechanisms to improve efficiency [29][30]

Performance and Compatibility
- gpt-oss-120b has achieved a record inference speed of over 3,000 tokens per second, and gpt-oss-20b can run on mobile devices, although some experts question the feasibility of that claim [10][45][22]
- At least 14 deployment platforms, including Azure and Hugging Face, have already integrated support for these models, indicating strong industry adoption [9][10]

Community and Industry Response
- While many users celebrate the release, there are concerns about the lack of transparency around the training process and data sources, limiting the open-source community's ability to fully leverage the models [9][27][29]
- OpenAI's decision to open-source these models is seen as a response to earlier criticism of its openness and may encourage more developers and companies to adopt these technologies [47]
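The single-GPU and edge-device claims can be sanity-checked against the stated parameter counts. A rough sketch assuming roughly 4-bit quantized weights — an assumption, since the summary does not state the storage precision — and ignoring KV cache and activation memory:

```python
def weight_footprint_gb(total_params, bits_per_param):
    """Approximate memory for model weights alone, ignoring KV cache and activations."""
    return total_params * bits_per_param / 8 / 1e9

# Parameter counts from the article; ~4 bits/param is an assumption
# about quantized weight storage, not a confirmed spec.
gpt_oss_120b = weight_footprint_gb(117e9, bits_per_param=4)
gpt_oss_20b = weight_footprint_gb(21e9, bits_per_param=4)
```

Under these assumptions the 117B model's weights occupy about 58.5 GB, fitting on a single 80 GB GPU, and the 21B model's about 10.5 GB, consistent with the 16 GB figure for edge devices. Note that because these are MoE models, only 5.1B and 3.6B parameters are active per token, which is what keeps inference fast despite the large totals.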
Zuckerberg Formally Bids Farewell to "Open Source by Default"; Netizens: Only China's DeepSeek, Tongyi, and Mistral Are Still Holding the Line
AI前线 · 2025-08-02 05:33
Core Viewpoint
- Meta is shifting its AI model release strategy to better promote the development of "personal superintelligence," emphasizing careful management of associated risks and selective open-sourcing of content [3][5][11]

Group 1: Shift in Open-Source Strategy
- Mark Zuckerberg's recent statements mark a significant change in Meta's approach to open-source AI, from "radical open-source advocate" to a more cautious stance on which models to release [6][8]
- The company previously viewed its Llama open-source model series as a key competitive advantage against rivals like OpenAI and Google DeepMind, but that view is evolving [5][9]
- Meta is unlikely to open-source its most advanced models in the future, which could raise expectations for companies that remain committed to open-source AI, particularly in China [10][11]

Group 2: Investment and Development Focus
- Meta has committed $14.3 billion to invest in Scale AI and restructured its AI department into "Meta Superintelligence Labs," signaling a strong focus on closed-source development [11][12]
- The company is reallocating resources from testing the latest Llama model toward developing a closed-source model, a strategic pivot in its approach to AI commercialization [12][14]
- Meta's primary revenue source remains internet advertising, which lets it approach AI development differently from competitors that rely on selling access to AI models [11]

Group 3: Future of Personal Superintelligence
- Zuckerberg envisions "personal superintelligence" as a way for individuals to achieve their personal goals through AI, with plans to integrate the concept into products like augmented reality glasses and virtual reality headsets [14]
- The company aims to create personal devices that understand users' contexts, positioning them as individuals' primary computing tools [14]
Beyond DeepSeek: China's Open-Source "Army Group" Reshapes the Global AI Ecosystem
Guan Cha Zhe Wang · 2025-04-27 12:57
Core Insights
- China's open-source AI ecosystem is rapidly evolving, showcasing technological confidence and creating a path for global collaboration, in contrast with the closed-source approach prevalent in the U.S. [1][6][8]

Group 1: Open-Source Development in China
- DeepSeek and other foundational models like Alibaba's Qwen are driving the advancement of China's open-source capabilities, leading to the emergence of smaller, more powerful vertical models from various SMEs [1][4]
- The launch of models like Kunlun Wanwei's Skywork-OR1 shows that even companies with limited funding can achieve state-of-the-art (SOTA) performance by building on existing open-source models [4][5]
- The rapid iteration of large models in China, such as Alibaba's Qwen2.5-VL and the multi-modal models from Jiepu, indicates a thriving open-source ecosystem [5][6]

Group 2: Comparison with U.S. AI Strategy
- The U.S. AI industry remains predominantly closed-source, driven by major tech companies and venture capitalists seeking high returns, fostering a monopolistic environment [6][8]
- OpenAI's shift to a closed-source model, particularly after its partnership with Microsoft, highlights the commercial motivations behind the strategy [7][8]
- In contrast, China's top-down approach emphasizes open-source development as a means of enhancing technological equity and industry collaboration [8][9]

Group 3: Economic and Social Implications
- The Chinese government has actively supported open-source initiatives, recognizing their potential to lower technological barriers and promote economic integration [8][9]
- Investments in open-source projects, such as the Z Fund's commitment to supporting AI open-source communities, reflect a broader strategy of fostering innovation [9][10]
- China's open-source movement is not only about providing free products but about enabling developers to build on existing technologies, thereby accelerating progress [5][10]

Group 4: Practical Applications and Success Stories
- Open-source models are being successfully deployed in industrial applications such as predictive maintenance in manufacturing and environmental conservation efforts [13][14]
- Companies like Baosteel and Zhongmei Kegong are using open-source models to improve operational efficiency and reduce costs [13][14]
- The collaborative nature of open-source development allows broader participation in AI projects, benefiting both commercial and non-profit sectors [14][15]

Group 5: Future Outlook
- China's open-source AI landscape is transitioning from "technological following" to "ecosystem leadership," reshaping the global AI landscape [18][20]
- The focus is shifting from mere parameter competition to the deep integration of AI technology with the real economy, a strategic evolution for the industry [18][20]
OpenAI's 75 Internal Emails: A Silicon Valley Startup Lesson
晚点LatePost · 2024-12-24 12:53
A full 30,000-character translation — what was disclosed goes beyond infighting.

By He Qianming. Translation and proofreading by GPT-4o, Claude-3.5-Sonnet, and He Qianming. Edited by Huang Junjie.

"The inside of a startup is all car crashes. You just see some of them in the media and not others." That is how Paul Graham, Sam Altman's mentor and founder of the Silicon Valley incubator YC, once summed up the hundreds of startups he had seen.

Thanks to the feud and lawsuits between Altman and Tesla CEO Elon Musk, we can now see what OpenAI really looked like in its early years. The two sides have released several batches totaling 75 internal emails and text messages, spanning from OpenAI's founding preparations in 2015 to the creation of its for-profit entity in 2019, showing how a group of Silicon Valley luminaries and brilliant AI researchers came together over an ideal, and then fought over power as OpenAI grew.

These internal records, running to more than 30,000 characters, also read like a startup course produced by OpenAI, covering how the world's largest AI startup told its story and assembled an elite team in its early days, through every aspect of compensation design and equity allocation; from competing with Google for talent at lower pay to negotiating the Microsoft partnership, and even a cryptocurrency fundraising plan that was once considered. We can also see how chief scientist Ilya Sutskever wrote his biweekly reports, ...