Open-Source Models
First in Six Years! OpenAI's New Models Ship with Open Weights; Altman Calls Them "the World's Best Open Models"
Hua Er Jie Jian Wen· 2025-08-05 20:05
Core Insights
- OpenAI has released two open-weight language models, gpt-oss-120b and gpt-oss-20b, its first open-weight release since 2019 and a response to competition from Meta, Mistral AI, and DeepSeek [1][2][12]
Model Specifications
- Both models are positioned as low-cost options: gpt-oss-20b can run on a laptop with 16GB of memory, while gpt-oss-120b requires approximately 80GB [2][5]
- gpt-oss-120b has 117 billion total parameters and activates 5.1 billion per token, while gpt-oss-20b has 21 billion parameters and activates 3.6 billion per token [5][6]
Performance Evaluation
- gpt-oss-120b performs comparably to OpenAI's o4-mini on core reasoning benchmarks, while gpt-oss-20b matches or exceeds o3-mini [7][8]
- Both models use advanced pre-training and post-training techniques, with a focus on efficiency and practical deployment across environments [5][11]
Safety Measures
- OpenAI has implemented extensive safety measures to prevent malicious use of the models, filtering harmful data during pre-training and conducting specialized fine-tuning for its safety assessments [11]
- The company works with independent expert groups to evaluate the potential safety risks associated with the models [11]
Market Impact
- The release is seen as a strategic shift for OpenAI, which had previously focused on proprietary API services and is now responding to competitive pressure in the open-weight model space [12][15]
- OpenAI has partnered with major cloud service providers such as Amazon to offer the models, making them more accessible to developers and researchers [3][11]
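The parameter counts quoted above imply that only a small share of each model's weights is used for any single token. The short Python sketch below works through that arithmetic. The class and function names (`SparseModel`, `active_share_percent`) are hypothetical and exist only for illustration; the figures are the ones reported in this summary, not independently measured values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SparseModel:
    """Hypothetical record for a sparsely activated model (illustration only)."""
    name: str
    total_params_b: float   # total parameters, in billions (as quoted above)
    active_params_b: float  # parameters activated per token, in billions

    @property
    def active_fraction(self) -> float:
        # Share of the total parameters that participate in one token's forward pass.
        return self.active_params_b / self.total_params_b


# Figures taken from the summary above.
MODELS = [
    SparseModel("gpt-oss-120b", 117.0, 5.1),
    SparseModel("gpt-oss-20b", 21.0, 3.6),
]


def active_share_percent(models):
    """Map each model name to its per-token active share, in percent (1 decimal)."""
    return {m.name: round(100 * m.active_fraction, 1) for m in models}


if __name__ == "__main__":
    for name, pct in active_share_percent(MODELS).items():
        print(f"{name}: about {pct}% of parameters active per token")
```

On these numbers, the larger model activates a smaller share of its weights per token than the smaller one, which is one reason total parameter count alone says little about per-token compute.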
China's AI Is Hot on the Heels of the US
Nikkei Chinese Edition (日经中文网)· 2025-08-05 02:43
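Alibaba Group put its "open-source" AI on prominent display (July 26, Shanghai). The number of AI models that have completed registration in China rose 40% in half a year, and a Stanford University research report notes that "China's models are gradually catching up with those of the United States." Because most Chinese AI models are open source, they are also widely used in AI development in Japan. The United States is trying to contain China's AI development, but experts say that "whether China's rise can be contained remains unknown"...
China's generative AI (artificial intelligence) is closing in fast on the United States. The number of AI models that have completed registration in China rose 40% in half a year, and the performance gap with US companies is narrowing. The number of companies taking part in the World Artificial Intelligence Conference, which opened in Shanghai on July 26, was up 60% from last year, and Alibaba Group and others showcased their latest technology. While the United States is trying to close the door on talent exchange, China is seeking a world-leading position in the field.
The World Artificial Intelligence Conference, held since 2018, drew about 800 companies this year, roughly 300 more than in 2024. More than 40 AI models and more than 60 robots were on display, competing on performance and innovation.
According to Reuters, Chinese Premier Li Qiang said in his speech at the opening ceremony that global AI governance remains fragmented. Without naming the United States, he said that AI could become an exclusive game for a small number of countries and companies. He also raised issues such as chip supply shortages and restrictions on talent exchange, and called for the creation of an AI cooperation organization.
On generative AI, China has proposed, alongside regulation, ...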
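A Conversation with PPIO's Yao Xin: The AI Large-Model Race Is Growing More Cutthroat, but a Viable Path to Profit Has Yet to Be Found
Tai Mei Ti APP· 2025-08-05 02:23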
Core Insights
- PPIO, co-founded by CEO Yao Xin, focuses on AI cloud computing services amid growing demand for GPU computing power and AI inference driven by technologies such as ChatGPT and DeepSeek [3][4]
- The company has optimized the DeepSeek-R1 model, improving throughput more than tenfold and cutting operating costs by up to 90% [4]
- PPIO is recognized as the largest independent edge cloud service provider in China, with a 4.1% market share, and operates the largest computing network in the country [4][5]
Company Developments
- PPIO has submitted its IPO application to the Hong Kong Stock Exchange, and investor interest has increased since the filing [5]
- The company launched China's first Agentic AI infrastructure service platform, which includes a sandbox for agents and supports rapid integration of various AI models [5][6]
- PPIO aims to build a comprehensive infrastructure service for developers and enterprises, focused on agent-based applications [5][6]
Market Position and Strategy
- PPIO was one of the earliest participants in the distributed cloud computing market to offer AI cloud services; its daily token consumption rose from 27.1 billion in December 2024 to 200 billion by June 2025 [5]
- The company stresses the importance of open-source models to the development of the AI industry, in contrast to the trend of U.S. companies moving toward closed-source models [6][10]
- Yao Xin believes the future of AI will require a shift toward distributed computing, particularly edge and on-device computing, as the industry moves away from centralized models [7][28]
Industry Insights
- The AI infrastructure market is characterized by low margins and large scale, and PPIO is positioning itself to benefit from growing demand for distributed computing solutions [6][18]
- The company sees significant opportunities in the domestic GPU market, particularly as demand for inference capacity increases [20]
- Yao Xin highlights the need for tight integration of hardware and software to drive advances in AI technology, emphasizing the importance of end-to-end capabilities [20][22]
Mid-Year Large-Model Report: Anthropic's Market Share Overtakes OpenAI's, Enterprise Adoption of Open-Source Models Declines
Founder Park· 2025-08-04 13:38
Core Insights
- Foundation models are not only the core engine of generative AI but are also shaping the future of computing [2]
- Model API spending rose from $3.5 billion to $8.4 billion, indicating a shift in focus from model training to model inference [2]
- The emergence of code generation as the first large-scale application of AI marks a pivotal development in the industry [2]
Group 1: Market Dynamics
- Anthropic has surpassed OpenAI in enterprise usage, with a 32% market share against OpenAI's 25%; OpenAI's share has halved from two years ago [9][12]
- The release of Claude 3.5 Sonnet in June 2024 started Anthropic's rise, which subsequent releases accelerated [12]
- Code generation has become AI's killer app, with Claude capturing 42% of that market against OpenAI's 21% [13]
Group 2: Trends in Model Adoption
- Enterprise adoption of open-source models has declined from 19% to 13%, with Meta's Llama series still leading [17]
- Despite continuous progress, open-source models lag closed-source models by 9 to 12 months in performance [17][20]
- Developers prioritize performance over cost when selecting models, with 66% choosing to upgrade within their existing supplier's ecosystem [24][27]
Group 3: Shift in AI Spending
- AI spending is moving from model training to inference: 74% of model developers say most of their workloads are now inference-driven, up from 48% a year ago [31]
What Does an Investment Research Team with an "Open-Source Spirit" Look Like?
Dian Shi Tou Zi (点拾投资)· 2025-08-01 07:03
Core Viewpoint
- The article argues that 2025 is likely to be the true beginning of the artificial intelligence (AI) era, highlighted by the rapid adoption of open-source models such as DeepSeek's V3 and R1, which reached over 100 million users in just seven days [1][2].
Group 1: Importance of Open Source in AI
- Open-source models are seen as crucial for democratizing technology, allowing individuals to customize and commercialize AI solutions [1][2].
- The debate between open-source and closed-source models has gained traction, with industry leaders acknowledging the historical missteps of closed-source approaches [1][2].
- The concept of "technological democratization" is highlighted, suggesting that open-source models can inspire new economic paradigms through systemic technological change [1][2].
Group 2: Investment Opportunities in AI
- The article outlines three levels of economic impact from AI: enhancing productivity, creating new markets and business models, and optimizing resource allocation [6][7][8].
- Five key investment opportunities are identified: AI infrastructure, embodied intelligence, vertical deepening of generative AI, the explosion of AI agent ecosystems, and AI smart terminals [9].
Group 3: Insights from the World Artificial Intelligence Conference
- The World Artificial Intelligence Conference showcased over 3,000 AI products, reflecting China's capabilities in hard technology [5].
- The conference served as a platform for discussing the predictions made in the "China Technology - Dare! 2025 Noan Fund Technology Investment Report" [5].
Group 4: Team Structure and Investment Strategy
- The Noan Fund's technology team has developed a robust investment strategy, focusing on the intersections between technology sectors to capture significant opportunities [20][22].
- The team emphasizes a culture of open communication and collaboration to minimize alpha loss and improve research efficiency [20][22].
Group 5: Historical Context and Future Outlook
- The article discusses the evolution of China's technology sector, particularly in semiconductor capabilities, and how the Noan Fund positioned itself during challenging times [12][14].
- The team believes the current AI revolution is akin to past technological waves, with hardware innovation as a critical driver [23][24].
Group 6: Commitment to Long-term Investment
- The Noan Fund is committed to being a provider of "patient capital," supporting the growth of China's technology sector through sustained investment [28][27].
- The article concludes that the mission of financial institutions is to facilitate industrial development and optimize resource allocation in society [28][29].
Foundation Models, the Second Half: Open Source, Talent, Model Evaluation. What Is the Key Question Today?
Founder Park· 2025-07-31 14:57
Core Insights
- Competition in large models has become a contest between Chinese and American AI, with Chinese models potentially setting new open-source standards [3][6][10]
- The rapid development of Chinese models such as GLM-4.5, Kimi K2, and Qwen 3 indicates a significant shift in the open-source AI landscape [6][10]
- Effective evaluation metrics for models matter, because they can significantly influence the discourse in the AI community [5][24][25]
Group 1
- The emergence of Chinese models as potential open-source standards could reshape the global AI landscape, particularly for developing countries [6][10]
- China's engineering culture is well suited to rapidly implementing validated models, which may lead to a competitive advantage [8][10]
- The talent gap between institutions is not as pronounced as perceived; efficiency in resource allocation often determines model quality [5][16]
Group 2
- The focus on talent acquisition by companies such as Meta may not address underlying problems of internal talent utilization and recognition [15][18]
- The chaotic nature of many AI labs can hinder progress, but some organizations manage to produce significant results despite it [20][22]
- AI evaluation will likely shift toward metrics that effectively measure model capabilities in real-world applications [23][24]
Group 3
- The challenges of reinforcement learning (RL) and model evaluation are highlighted, along with the need for better benchmarks to assess model performance [23][26]
- Creating effective evaluation criteria is becoming more complex, as traditional methods may not suffice for advanced models [34][36]
- Long-term progress in AI may be limited by the need for better measurement tools and methodologies rather than by intellectual advances alone [37][38]
A Tale of Three Cities in Open-Source Models
Hu Xiu· 2025-07-30 01:58
Core Insights
- The article discusses the competitive landscape of AI in China, focusing on the launch of new open-source models such as Zhipu's GLM-4.5 and the ongoing rivalry among Beijing, Shanghai, and Hangzhou in the AI sector [1][19]
- The emergence of open-source models is seen as a response to the U.S. AI action plan, with China aiming to accelerate the global deployment of open-source AI [1][16]
Group 1: Open-Source Model Developments
- Zhipu has released the GLM-4.5 model, which has 355 billion total parameters and 32 billion active parameters and shows strong performance [11]
- Alibaba has introduced several models, including Qwen3-Coder with 480 billion total parameters, priced at one-third of its competitor Claude 4, indicating a strong push in the open-source domain [3][5]
- Moonshot AI's K2 model uses a self-critique reward mechanism to improve its handling of complex tasks, a notable innovation in the field [10]
Group 2: Competitive Dynamics
- Competition among AI startups in Shanghai and Beijing has intensified, with companies such as MiniMax and Moonshot AI rapidly updating their models to keep pace with market demand [6][9]
- The article highlights the "flywheel effect" set off by DeepSeek, which has led to price wars and more performance testing among open-source models [2]
- The collaboration and competition among these cities is likened to a "three-city drama," underscoring the regional rivalry in AI development [1][19]
Group 3: Strategic Implications
- For companies such as DeepSeek, the open-source approach is a cultural choice aimed at attracting top talent and contributing to global AI innovation [14]
- Alibaba's strategy is consistent with its identity as a cloud computing company, putting technology first rather than taking a purely commercial approach [13]
- The article suggests that China's open-source ecosystem could drive rapid innovation and improvement, potentially surpassing proprietary models from the U.S. [17][19]
Alibaba's Tongyi Models Mark an "Anniversary": Four Models Open-Sourced in One Week
Nan Fang Du Shi Bao· 2025-07-29 12:23
Core Insights
- Alibaba has made significant progress in open-sourcing its AI models, with the recent release of the fully open-sourced Tongyi Qianwen and the introduction of the video generation model Tongyi Wanxiang Wan2.2, which incorporates cinematic elements and allows extensive user customization [1][3].
Group 1: Open Source Progress
- Alibaba's Tongyi Qianwen is now open-sourced across all model sizes and modalities, breaking down barriers between open-source and closed-source models [1].
- In the past week, Alibaba has released four major models, including Tongyi Wanxiang Wan2.2, which can generate 5-second high-definition videos with over 60 customizable parameters [3].
Group 2: Model Performance and Recognition
- The latest version of Tongyi Qianwen has been recognized by Artificial Analysis as the "most intelligent non-thinking foundational model," while its reasoning model has matched top closed-source models such as Gemini 2.5 Pro and o4-mini [3].
- The AI programming model Qwen3-Coder has surpassed leading closed-source models such as GPT-4.1 and Claude 4 and reached the top position on the global open-source community Hugging Face [3].
Group 3: Market Impact and Community Engagement
- Alibaba's open-source initiatives have sparked a new wave of AI development in China, with Tongyi Qianwen API call volume exceeding 100 billion tokens within three days, ahead of other top models [4].
- Tongyi Qianwen has been downloaded more than 400 million times and has over 140,000 derivative models, making it the largest open-source model family globally, ahead of Meta's Llama series [5].
Overtaking OpenAI and Meta, Alibaba's Qwen Climbs to Fourth Worldwide in API Call Volume
Jing Ji Guan Cha Wang· 2025-07-29 12:18
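July 28: OpenRouter, a well-known third-party aggregation platform for large-model APIs, has published its latest rankings, with China's DeepSeek and Alibaba's Tongyi Qianwen both in the global top five. Alibaba's Tongyi Qianwen ranks fourth with a 10.4% market share, ahead of OpenAI's 4.7%.
Chinese open-source models, represented by Tongyi Qianwen, are leading the AI transformation at the pace of "weekly iterations." Last week Alibaba open-sourced three large models in succession, which took three global open-source first places in the mainstream categories of foundation models, coding models, and reasoning models, with performance on par with Claude 4, GPT-4.1, o4-mini, and Gemini 2.5 Pro, currently the strongest closed-source models in those categories. On July 27, the Tongyi team also disclosed the training recipe behind the "Alibaba AI triple release": GSPO, a new reinforcement learning algorithm, which set off another round of discussion in technical circles.
OpenRouter reportedly brings together the world's top models, closed-source and open-source alike, all served through APIs, and its API call-volume rankings are generally regarded as one of the most important bellwethers of the global large-model market. Open-source models are rapidly displacing closed-source ones: according to an OpenRouter post, 9 of the 10 fastest-growing large models at present are open source; among them, Qwen3-Coder ranks first with nearly 50 billion tokens in call volume, and Tongyi Qianwen models take the top three spots and five of the top ten. In the past week, ...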
Four Releases in One Week: Alibaba's AI Moves at Hurricane Speed
36Kr· 2025-07-29 08:48
Core Insights
- Alibaba has rapidly advanced its AI capabilities by releasing multiple open-source models, including the Qwen3 series and Tongyi Wanxiang, redefining the performance standards of open-source models [1][2][3][4][17]
- The company has achieved significant breakthroughs across foundation models, programming models, and reasoning models, with Qwen3-235B-A22B-Instruct recognized as the "most intelligent non-thinking foundational model" globally [1][4][6][19]
- Alibaba's open-source strategy aims to democratize technology and foster a global developer ecosystem, which is expected to reshape the competitive landscape of AI technology [12][15][19][21]
Model Performance and Features
- Qwen3-235B-A22B-Instruct can be deployed on just four H20 GPUs, uses one-third of the memory of similar models, and delivers a 1.8x improvement in inference speed [4][6]
- Qwen3-Coder has surpassed top proprietary models such as GPT-4.1 and Claude 4 and offers significant advantages in programming, including a 256K-token context that can be extended to 1 million tokens [10][12]
- Tongyi Wanxiang Wan2.2 includes three video generation models built on a mixture-of-experts (MoE) architecture, with 27 billion parameters and roughly 50% lower computational resource consumption [7][10]
Market Impact and Ecosystem Development
- Alibaba's open-source models have attracted significant attention from the global developer community, with Qwen3-Coder quickly becoming the top model on Hugging Face [10][12]
- The company aims to lower costs for developers and small businesses, giving them access to top-tier models without high licensing fees and promoting a more inclusive AI ecosystem [15][19]
- Alibaba's strategy of trading openness for ecosystem is designed to establish technical standards and to commercialize through cloud computing and enterprise services, creating a virtuous cycle between model open-sourcing and ecosystem growth [15][19]
Competitive Positioning
- Alibaba's rapid advances have made it a formidable player in the global AI landscape, with its models now competing with those of leaders such as OpenAI [17][21]
- Its Tongyi Qianwen model has reached a 10.4% market share, ahead of OpenAI's 4.7%, indicating a significant shift in the competitive dynamics of AI technology [19][21]
- With over 400 million downloads and more than 140,000 derivative models, Tongyi Qianwen has become the most widely used open-source model family globally, indicating strong adoption across industries [21]