狙击Gemini 3！OpenAI发布GPT-5.1-Codex-Max

Core Insights - The article discusses the competitive landscape of AI programming models, highlighting the release of OpenAI's new model, GPT-5.1-Codex-Max, which aims to outperform Gemini 3 and other models in the market [1][34]. Model Performance - GPT-5.1-Codex-Max has achieved a new state-of-the-art (SOTA) in METR, indicating its ability to complete software engineering tasks with a 50% success rate in a time frame that previously required human intervention of 2 hours and 42 minutes, now reduced by 25 minutes compared to its predecessor [11][12]. - The new model demonstrates improved efficiency in task execution, particularly in software engineering tasks such as PR creation and code review, and is the first OpenAI model capable of operating in a Windows environment [16][18]. Long-Running Tasks - GPT-5.1-Codex-Max can operate independently for over 24 hours, processing millions of tokens continuously, which is a significant advancement for handling long-duration tasks without losing context [25][21]. - The model's ability to compress dialogue when approaching context window limits allows it to maintain coherence over extended tasks, making it suitable for analyzing lengthy documents without information loss [22][27]. Competitive Landscape - The article notes that other AI models, such as Claude, are also evolving, with Claude Code being faster in execution compared to OpenAI's offerings [32][31]. - The rapid advancements in AI programming models indicate a highly competitive environment, with multiple companies releasing new versions and features in quick succession [34][13]. Additional Releases - OpenAI has also introduced GPT-5.1 Pro, which reportedly excels in instruction following, although details are limited [36][38].