Open Source Model
Ant Open Source Releases the 2025 Global Large-Model Landscape: US-China Divergence in AI Development, a Tooling Boom, and Other Trends Emerge
Sou Hu Cai Jing· 2025-09-14 14:39
Core Insights
- The report released by Ant Group and Inclusion AI highlights the rapid development and trends in the AI open-source ecosystem, particularly focusing on large models and their implications for the industry [1]

Group 1: Open-source Ecosystem Overview
- The 2.0 version of the report includes 114 notable open-source projects across 22 technical fields, categorized into AI Agent and AI Infra [1]
- 62% of the open-source projects in the large-model ecosystem were created after the "GPT moment" of October 2022, with an average project age of only 30 months, indicating a fast-paced evolution of the AI open-source landscape [1]
- Approximately 360,000 developers worldwide contributed to the projects, with 24% from the US, 18% from China, and smaller shares from India, Germany, and the UK [1]

Group 2: Development Trends
- A significant trend identified is the explosive growth of AI programming tools, which automate code generation and modification and greatly improve programmer efficiency [1][2]
- These tools fall into two categories: command-line tools, favored for their flexibility, and integrated development environment (IDE) plugins, favored for how tightly they embed into development workflows [1]
- The report notes that the average new coding tool in 2025 has garnered over 30,000 stars from developers, with Gemini CLI reaching over 60,000 stars in just three months, making it one of the fastest-growing projects [1]

Group 3: Competitive Landscape
- The report outlines a timeline of major large-model releases from leading companies, detailing both open and closed models along with key parameters and modalities [4]
- Key directions in large-model development include a clear divergence between open-source and closed-source strategies in China and the US, a trend toward scaling model parameters under MoE architectures, and the rise of multi-modal models [4]
- Model evaluation methods are evolving to combine subjective voting with objective assessments, reflecting technological advances in the large-model domain [4]
The Industry Reacts to gpt-oss!
Matthew Berman· 2025-08-06 19:22
Model Release & Performance
- OpenAI released a new open-source model (gpt-oss) that performs comparably to smaller models like o4-mini and can run on consumer hardware such as laptops and phones [1]
- The 20-billion-parameter version of gpt-oss is reported to outperform models two to three times its size in certain tests [7]
- Industry experts highlight the model's efficient training: the 120-billion-parameter version required about 2.1 million H100 hours, and the 20-billion-parameter version cost less than $500,000 to pre-train [27]

Safety & Evaluation
- OpenAI conducted safety evaluations on gpt-oss, including fine-tuning to probe potential malicious uses, and shared which recommendations it did and did not adopt [2][3]
- Former OpenAI safety researchers acknowledge the rigor of OpenAI's gpt-oss safety evaluation [2][19]
- The model's inclination to "snitch" on corporate wrongdoing was tested: the 20-billion-parameter version showed a 0% snitch rate, while the 120-billion-parameter version showed around 20% [31]

Industry Reactions & Implications
- Industry experts suggest OpenAI's release of gpt-oss could be a strategic move to commoditize the model market, potentially forcing competitors to lower prices [22][23]
- Some believe value in AI will increasingly accrue to the application layer rather than the model layer as the price of AI tokens converges with the cost of infrastructure [25][26]
- The open-source model quickly became the number-one trending model on Hugging Face, indicating significant community interest and adoption [17][18]

Accessibility & Use
- Together AI supports the new open-source models from OpenAI, offering fast speeds and low prices, such as 15 cents per million input tokens and 60 cents per million output tokens for the 120-billion-parameter model [12]
- The 120-billion-parameter model requires approximately 65 GB of storage, small enough to fit on a USB stick and run locally on consumer laptops [15]
- Projects like GPTOSS Pro mode chain together multiple instances of gpt-oss to produce better answers than a single instance; see the sketch after this list [10]
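The "pro mode" chaining pattern mentioned above is easy to reproduce in principle: sample several independent answers, then have the model merge them. Below is a minimal sketch, assuming a local OpenAI-compatible server (e.g., an Ollama-style endpoint) serving a gpt-oss model; the base URL, model tag, and synthesis prompt are illustrative assumptions, not details of the GPTOSS Pro mode project itself.

```python
# Minimal sketch of the "pro mode" pattern: fan the same question out to
# several independent gpt-oss completions, then ask the model to synthesize
# one answer from the candidates. Assumes an OpenAI-compatible local server
# (endpoint and model tag below are assumptions about the local setup).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")
MODEL = "gpt-oss:20b"  # hypothetical local tag; runtime-specific

def ask(prompt: str, temperature: float = 1.0) -> str:
    resp = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user", "content": prompt}],
        temperature=temperature,
    )
    return resp.choices[0].message.content

def pro_mode(question: str, n: int = 4) -> str:
    # Sample n independent candidate answers at non-zero temperature,
    # so the candidates actually differ from one another.
    candidates = [ask(question) for _ in range(n)]
    numbered = "\n\n".join(
        f"Candidate {i + 1}:\n{c}" for i, c in enumerate(candidates)
    )
    # A final low-temperature pass merges the candidates into one answer.
    return ask(
        f"Question: {question}\n\n{numbered}\n\n"
        "Synthesize the best single answer from the candidates above.",
        temperature=0.2,
    )

print(pro_mode("Explain mixture-of-experts routing in two sentences."))
```

The design trade-off is simple: n + 1 model calls buy one higher-quality answer, which is attractive precisely because locally hosted open-weight tokens are cheap.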
X @Sam Altman
Sam Altman· 2025-08-05 17:03
Model Release
- The company released gpt-oss, an open-source model [1]
- The model performs at the level of o4-mini [1]
- The model can run on a high-end laptop (see the local-inference sketch below) [1]
- A smaller version of the model can run on a phone [1]
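To make the "runs on a high-end laptop" claim concrete, here is a minimal local-inference sketch. It assumes the model has been pulled into a local runtime such as Ollama, whose native HTTP API is used below; the endpoint and model tag are assumptions about such a setup, not part of the announcement.

```python
# Minimal local-inference sketch: query a gpt-oss model served on this
# machine through Ollama's native HTTP API. Assumed setup: a local Ollama
# server with the (hypothetical) tag gpt-oss:20b already pulled.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "gpt-oss:20b",  # assumed local tag for the 20B model
        "prompt": "Summarize what gpt-oss is in one sentence.",
        "stream": False,         # return one JSON object, not a stream
    },
    timeout=300,
)
print(resp.json()["response"])
```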
We're in an AI gold rush right now. The gap between what models can do vs. what products exist is massive
Garry Tan· 2025-06-21 23:13
Model Capabilities & Product Development
- Product development currently lags well behind the capabilities of the latest models, indicating substantial opportunities for innovation [2]
- Even without further model improvements, there is a vast amount of new products to build [2]

Pricing & Performance
- The cost of the o3 model decreased fivefold within a week, suggesting a rapid decline in price per unit of performance [2]
- The industry anticipates a continued significant drop in price per performance [3]

Open Source Initiatives
- An open-source model is soon to be released and is expected to surpass current expectations [3]
- The open-source model will enable running powerful models locally, surprising users with its capabilities [3]
Alibaba's "Tongyi Qianwen" (Qwen) Becomes a Foundation for AI Development in Japan
Nikkei Chinese Edition· 2025-05-07 02:45
Core Insights
- Alibaba Cloud's AI model "Qwen" ranks 6th among 113 models in the "AI Model Scoring" list published by Nikkei, surpassing China's DeepSeek model [1][3]
- Qwen's open-source nature has led to its adoption by various emerging companies in Japan, including ABEJA, which developed the "QwQ-32B Reasoning Model" based on Qwen [3][4]
- Qwen's performance in logical reasoning and mathematics has been highlighted, showcasing capabilities beyond basic language skills [3]

Group 1: Model Performance and Adoption
- Qwen's "Qwen2.5-Max" model ranks 6th in a comprehensive performance evaluation conducted by NIKKEI Digital Governance, demonstrating strong performance in grammar, logical reasoning, and mathematics [3]
- The open-source model "Qwen2.5-32B" ranks 26th, outperforming Google's "Gemma-3-27B" and Meta's "Llama-3-70B-Instruct" [3]
- Japanese companies are increasingly adopting Qwen, with ABEJA's Qwen-based model ranking 21st overall [3][4]

Group 2: Global Recognition and Future Plans
- Qwen has gained significant attention outside Japan, with over 100,000 derivative models built on the Hugging Face platform [5]
- Alibaba Cloud is considering offering debugging and customization services so that Japanese companies can use Qwen without transferring data overseas [5]
- Alibaba Cloud aims to grow the number of projects using Qwen in Japan to over 1,000 within three years [6]

Group 3: Research and Evaluation Methodology
- The AI model scoring evaluation covered more than 6,000 questions across 15 categories, assessing language ability and ethical considerations [7]
- The evaluation was conducted in collaboration with Weights & Biases, focusing on model performance in Japanese [7]