OpenAI's Weak Spot Determines the Future of Large Model Companies
36Kr · 2025-09-03 07:12
Core Insights
- The article emphasizes that "cost control of computing power" is fundamental to the development and commercialization of large models, with the Scaling Law remaining the guiding principle for enhancing model capabilities [1][19].
- OpenAI's introduction of the "routing" feature with GPT-5 aims to match user queries with appropriate models to improve user experience and computational efficiency, despite facing criticism for not meeting user expectations [1][3][4].

Group 1: Cost Control and Model Efficiency
- DeepSeek has reduced the inference and training costs of its models to less than 10% of those of comparable models, contributing to its popularity in the open-source community [1].
- The MoE architecture has gained traction post-GPT-4, becoming the default choice for many large model developers because of its effectiveness in lowering inference costs [1].
- OpenAI's routing feature is designed to identify simpler queries that can be handled by less resource-intensive models, potentially reducing computational costs by 8% if 10% of queries are matched correctly [10][23].

Group 2: Challenges and User Experience
- OpenAI's push for the routing feature was driven by the need to help users select the most suitable model from more than five options, especially users unfamiliar with large models [6][8].
- The routing feature's failure to align user expectations with model capabilities has been a significant factor in the criticism of GPT-5 [3][4].
- Routing efficiency is crucial, as the per-query cost difference between reasoning and non-reasoning models can be as high as 5-6x, with complex queries consuming thousands of tokens [8][10].

Group 3: Infrastructure and Market Expansion
- OpenAI is expanding its infrastructure with a plan, announced in July 2025, to add 4.5 GW of data center capacity in collaboration with Oracle [19].
- The company is also exploring partnerships in India to establish a data center with at least 1 GW of capacity, aiming to connect local user growth with computational resources [20].
- The "AI cost paradox" is driving demand for efficient routing: even as token prices fall, total computational demand continues to rise [19][23].
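The cited 8% figure is consistent with the cited 5-6x cost gap: diverting 10% of queries to a model roughly 5x cheaper saves about 8% of total compute. A back-of-the-envelope sketch (all numbers are illustrative, taken from the figures above):

```python
# Back-of-the-envelope estimate of compute savings from query routing.
# Illustrative figures from the article: a reasoning model costs roughly
# 5x more per query than a lightweight model, and routing correctly
# diverts about 10% of traffic to the lightweight model.

heavy_cost = 5.0        # relative cost per query, reasoning model
light_cost = 1.0        # relative cost per query, lightweight model
routed_fraction = 0.10  # share of queries diverted to the light model

# Baseline: every query is served by the heavy model.
baseline = heavy_cost

# With routing: 90% of queries stay heavy, 10% go light.
with_routing = (1 - routed_fraction) * heavy_cost + routed_fraction * light_cost

savings = 1 - with_routing / baseline
print(f"compute savings: {savings:.0%}")  # → compute savings: 8%
```

With a 6x gap instead of 5x, the same 10% of routed traffic saves about 8.3%, so the claim holds across the cited range.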