OpenAI's Most Powerful Coding Model Arrives: Works 24 Hours Straight and Handles Millions of Tokens in One Task

Core Insights
- OpenAI has released its latest coding model, GPT-5.1-Codex-Max, designed for complex tasks in software engineering, research, and mathematics [2]
- The model introduces a compaction technique that lets it work through millions of tokens in a single task while staying coherent across multiple context windows [2][3]
- GPT-5.1-Codex-Max outperforms its predecessor, GPT-5.1-Codex, on programming benchmarks and is the first OpenAI model trained for programming in a Windows environment [3]

Performance and Efficiency
- At medium reasoning effort, the model uses approximately 30% fewer tokens than its predecessor while achieving higher accuracy [5]
- OpenAI expects this token efficiency to translate into cost savings for developers [5]
- GPT-5.1-Codex-Max can work independently for hours; in OpenAI's evaluations it worked continuously on the same task for up to 24 hours, iterating until it delivered a successful result [3]

Features and Applications
- GPT-5.1-Codex-Max is now available in Codex through the CLI, IDE extensions, cloud, and code review, with API access to follow [6]
- Applications built by the model include a browser-based CartPole reinforcement learning sandbox and a solar system gravity simulator, both of which let users visualize and interact with complex concepts [8][10]
- In the CartPole sandbox, users can train the agent in real time and watch its decision-making through a neural network visualization [8]

User Experience and Comparisons
- Users report that GPT-5.1-Codex-Max produces more detailed and realistic outputs than previous models [10][12]
- User feedback also indicates that the model is more proactive and efficient at problem-solving than GPT-5.1-Pro [12]

Security and Safety
- OpenAI acknowledges that security challenges grow as model capabilities increase; it describes GPT-5.1-Codex-Max as its most secure model to date, though the model has not yet reached the highest cybersecurity level [14]
- The model runs in a tightly isolated sandbox that restricts file writes and network access to mitigate risks such as prompt injection [14]

Future Implications
- The evolution of coding models like GPT-5.1-Codex-Max signals a shift toward "agentification," in which models move beyond simple code generation and autonomously complete project-level tasks [15]
- This transition may change software development from "writing code" to "describing requirements and reviewing results," with intelligent agents taking on more of the implementation and iteration work [15]
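
The compaction idea mentioned under Core Insights can be illustrated with a small sketch. This is not OpenAI's implementation, which the summary above does not describe; it only shows the general pattern of folding older history into a short summary once a context budget is exceeded. All names here (`CompactingContext`, `summarize`, `count_tokens`) are hypothetical, the token count is a crude word count, and the summarizer is a placeholder for what would presumably be a model-written summary.

```python
from dataclasses import dataclass, field


def count_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def summarize(entries: list[str], max_words: int = 12) -> str:
    """Placeholder summarizer (assumption): keep only the first few words of the history.

    A real agent would likely ask the model to write this summary instead.
    """
    words = [w for e in entries for w in e.split() if w != "SUMMARY:"]
    return "SUMMARY: " + " ".join(words[:max_words])


@dataclass
class CompactingContext:
    budget: int                # hypothetical token budget for one context window
    keep_recent: int = 2       # newest entries that are never compacted
    entries: list[str] = field(default_factory=list)
    compactions: int = 0

    def total_tokens(self) -> int:
        return sum(count_tokens(e) for e in self.entries)

    def add(self, text: str) -> None:
        self.entries.append(text)
        # Compact only when over budget and there is older history to fold away.
        if self.total_tokens() > self.budget and len(self.entries) > self.keep_recent:
            old = self.entries[:-self.keep_recent]
            recent = self.entries[-self.keep_recent:]
            self.entries = [summarize(old)] + recent
            self.compactions += 1


if __name__ == "__main__":
    ctx = CompactingContext(budget=40)
    for i in range(10):
        ctx.add(f"step {i}: " + "word " * 10)
    print(ctx.compactions, ctx.total_tokens(), ctx.entries[0])
```

The design choice worth noting is that the most recent entries are kept verbatim while everything older is replaced by a bounded summary, so the working context stays under the budget no matter how long the task runs. The summary here is deliberately lossy; how much detail a real system preserves is outside what this article reports.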