Core Insights
- OpenAI has released two open-source AI reasoning models, GPT-oss-120b and GPT-oss-20b, which it positions as roughly comparable in capability to its existing o4-mini model [1][2]
- The release marks OpenAI's return to open-source language models after six years and is aimed at winning over both developers and policymakers [2][3]

Model Performance
- On the Codeforces competitive programming benchmark, GPT-oss-120b and GPT-oss-20b scored 2622 and 2516, respectively, outperforming DeepSeek's R1 model but falling slightly short of OpenAI's own o3 and o4-mini models [2]
- On Humanity's Last Exam (HLE), the two models scored 19% and 17.3%, respectively, surpassing the leading open models from DeepSeek and Qwen but still trailing o3 [3]
- The GPT-oss models "hallucinate" far more often than OpenAI's recent proprietary reasoning models: on OpenAI's PersonQA benchmark their hallucination rates were 49% (GPT-oss-120b) and 53% (GPT-oss-20b), compared with 16% for o1 and 36% for o4-mini [3]

Model Training Methodology
- The GPT-oss models use a "Mixture-of-Experts" (MoE) architecture, which activates only a subset of the model's parameters for each token to improve efficiency [5]
- Although GPT-oss-120b has 117 billion parameters in total, it activates only about 5.1 billion of them per token; both models were also trained with high-compute reinforcement learning [5]
- For now, the models support only text input and output and have no multimodal capabilities [5]

Licensing and Data Transparency
- GPT-oss-120b and GPT-oss-20b are released under the Apache 2.0 license, which allows commercial use without paying OpenAI or obtaining its permission [5]
- OpenAI has chosen not to disclose the sources of its training data, a decision influenced by ongoing copyright litigation in the AI sector [6]

Competitive Landscape
- OpenAI faces growing competition from Chinese AI labs such as DeepSeek and Alibaba's Tongyi (Qwen), which have released leading open-source models [2]
- Industry attention is now turning to upcoming models from DeepSeek and Meta's Superintelligence Lab, a sign of how quickly the competitive environment is evolving [6]
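
To make the Mixture-of-Experts idea under Model Training Methodology more concrete, here is a minimal, generic sketch of top-k expert routing in Python. It is not OpenAI's implementation: the expert count, the value of k, the toy scalar "experts", and the function names (`route_top_k`, `moe_layer`) are all hypothetical and chosen only for illustration. The point is simply that a router scores every expert, but only the few highest-scoring ones actually run for a given token.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are a softmax over the selected scores only, so they sum to 1.
    """
    ranked = sorted(
        range(len(router_scores)),
        key=lambda i: router_scores[i],
        reverse=True,
    )
    chosen = ranked[:k]
    weights = softmax([router_scores[i] for i in chosen])
    return list(zip(chosen, weights))


def moe_layer(x, experts, router_scores, k):
    """Combine the outputs of only the k selected experts for one token."""
    routes = route_top_k(router_scores, k)
    output = sum(weight * experts[idx](x) for idx, weight in routes)
    return output, [idx for idx, _ in routes]


# Toy setup (hypothetical values): 8 experts, 2 active per token.
NUM_EXPERTS = 8
TOP_K = 2

# Each "expert" is just a scalar function here; real experts are
# feed-forward sub-networks with their own parameters.
experts = [lambda x, scale=i + 1: scale * x for i in range(NUM_EXPERTS)]

# Stand-in for the router's per-expert scores for a single token.
rng = random.Random(0)
router_scores = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]

output, used = moe_layer(1.0, experts, router_scores, TOP_K)
active_fraction = TOP_K / NUM_EXPERTS
print(f"experts used: {used}, active fraction: {active_fraction:.0%}")
```

In this toy configuration only 2 of 8 experts run per token, so most expert parameters sit idle on any single forward pass. That is the general mechanism by which a model's per-token active parameter count can be much smaller than its total parameter count.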
Open Source for the First Time in Six Years: OpenAI Releases Two o4-mini-Class Reasoning Models
 Jin Shi Shu Ju·2025-08-06 03:47
