Workflow
Codex CLI
icon
Search documents
“AI提高了我的生产力,但我更累了”
3 6 Ke· 2026-02-09 23:40
Core Insights - The article discusses the phenomenon of "AI fatigue," where developers feel overwhelmed despite the productivity gains from AI tools [1][27] - It highlights the paradox of increased efficiency leading to greater workloads and stress for developers [10][27] Group 1: Developer Experience - Developers, even experienced ones, report feeling more exhausted and less efficient due to the demands of AI tools [2][4] - The shift in roles from creators to evaluators has led to increased mental fatigue, as developers now spend more time reviewing AI-generated outputs [11][30] Group 2: Workload and Efficiency - AI can complete tasks that previously took a full day in just one hour, but this results in developers being assigned more tasks, leading to fragmented work and higher cognitive load [7][8] - The need for constant context switching and decision-making increases the overall stress and reduces the sense of accomplishment [10][11] Group 3: AI Tool Evolution - The rapid evolution of AI tools creates a continuous learning curve, with new models and protocols emerging weekly, adding to the pressure on developers to keep up [15][16] - This constant need to adapt leads to anxiety and distracts from solving actual problems [17][30] Group 4: Cognitive Impact - Over-reliance on AI can lead to a decline in critical thinking skills, as developers may become dependent on AI for problem-solving [23][30] - The comparison culture fostered by social media exacerbates feelings of inadequacy among developers, as they see others achieving results quickly with AI [25][27] Group 5: Recommendations for Sustainable Work - The author suggests setting time limits for AI tasks, distinguishing between thinking and execution time, and accepting that AI outputs do not need to be perfect [28][30] - Emphasizing the importance of focusing on foundational skills rather than chasing every new tool can help mitigate fatigue [28][30]
AI编程真面目:完整项目通过率仅27% | 上交大新基准
量子位· 2026-02-09 08:00
ProjDevBench团队 投稿 量子位 | 公众号 QbitAI AI编程是一项非常有实用价值的能力,但网络上不时也能看到程序员抱怨AI"听不懂人话"、"难以找到根本问题",更有直接建议"每次生成代码 不要超过5行"的经验分享。 而近期又有很多AI工具声称可以从零快速构建完整代码项目。 所以AI编程智能体真的能从零构建完整软件项目吗?近日一多校联合研究团队针对这一问题进行了探索。 上海交通大学、上海创智学院、加州大学默塞德分校、 北京理工大学(按论文作者顺序) 联合发布 ProjDevBench ——首个通过OJ细粒度 反馈评估AI编程智能体端到端项目开发能力的基准测试,要求智能体仅凭自然语言需求文档,从零开始构建完整、可运行的软件仓库。 当任务从"补全现有代码"变为"从零构建"时,性能出现断崖式下跌。 结果令人深思: 所有智能体总体提交AC率仅27.38% 。 该研究得出的结论摘要: 为什么需要端到端项目开发基准 现有基准测试如HumanEval、MBPP聚焦于函数级代码生成,SWE-bench关注issue修复,但真实软件工程需要的远不止这些。当开发者使 用Cursor或GitHub Copilot进 ...
OpenAI启动Codex发布月
Xin Lang Cai Jing· 2026-01-24 08:15
Core Viewpoint - OpenAI is set to launch multiple Codex-related products within the next month, with the first product debuting next week, marking a significant advancement in AI programming assistance [1] Group 1: Product Development - The Codex system is described as a highly mature intelligent programming assistance ecosystem, evolving beyond a mere API interface to become an "AI software engineer" that integrates models, tools, and workflows [1] - The first product related to Codex will be released next week, indicating a rapid development cycle and commitment to innovation [1] Group 2: Safety and Security - Sam Altman emphasized the importance of AI programming safety, advocating for the swift resolution of security vulnerabilities as the most beneficial approach for the world [1] - The focus on safety highlights the company's proactive stance in addressing potential risks associated with AI technologies [1] Group 3: Technical Insights - OpenAI's website published a technical blog detailing the core logic of its cross-platform local software agent, Codex CLI, which operates on an "Agent Loop" [1] - The "Agent Loop" is responsible for orchestrating interactions between users, models, and tools, converting user instructions into token inputs for the model, executing reasoning, and invoking tools until a final response is generated [1]
挑战Claude Code?OpenAI Codex发布月将至,今先揭秘智能体循环
机器之心· 2026-01-24 04:09
Core Insights - OpenAI's CEO Sam Altman announced an upcoming series of exciting releases related to Codex, particularly emphasizing cybersecurity [1] - OpenAI released a technical blog titled "Unrolling the Codex Agent Loop," which details the core architecture of Codex CLI and its functionalities [3][4] Group 1: Codex Overview - Codex CLI is a cross-platform local software agent developed by OpenAI that can generate high-quality software changes [7] - OpenAI has accumulated significant experience in building world-class software agents since the initial release of CLI in April [8] Group 2: Agent Loop Mechanism - The agent loop is the core logic of Codex CLI, coordinating interactions between users, models, and tools for executing software tasks [10] - The agent loop consists of several steps: input acquisition, inference, decoding, decision-making, execution, and retry until a final response is generated [16][17] Group 3: Model Inference and API Interaction - Codex CLI operates model inference by sending HTTP requests to the Responses API, which drives the agent loop [22][23] - The Responses API endpoints used by Codex CLI are configurable, allowing integration with various implementations [24][25] Group 4: Prompt Construction - The initial prompt for the Responses API is constructed based on user inputs and various roles, including system, developer, user, and assistant [28][30] - Codex appends user messages to the input after constructing the initial prompt, facilitating the start of the dialogue [33] Group 5: Performance Considerations - The JSON payload sent to the Responses API can grow quadratically during conversations, but Codex currently avoids using the previous_response_id parameter to maintain statelessness [51] - Prompt caching is crucial for efficiency, allowing Codex to reuse previous inference results and reduce computational costs [52][53] Group 6: Context Management - Codex employs a strategy of compressing dialogue once the token count exceeds a certain threshold, replacing the input with a smaller representation to continue the conversation [58][59] - The Responses API has evolved to support a compact endpoint for more efficient dialogue compression [58] Group 7: Future Directions - The blog introduces the Codex agent loop and discusses practical considerations for developers building agent loops on top of the Responses API [61] - Future articles will delve deeper into the architecture of CLI, exploring tool invocation implementations and the sandbox model of Codex [63]
Using OpenAI Codex CLI with GPT-5-Codex
OpenAI· 2025-10-14 18:19
Product Updates & Features - Codex CLI has been improved with updates to GP5 and GP5 codecs, enhancing agentic coding capabilities [1] - Codex CLI can be easily installed with npm or brew and logged in with a Chat GPT account [2] - The new GPT5 Codex model is highly effective for various coding tasks [3] - The model switcher feature allows users to select different models for different tasks, optimizing reasoning level [5] - Codex offers sandboxing features with three modes: read-only, auto (default), and full access, controlling file access and modifications [6][7] - Codex resume allows users to continue from previous sessions [8] - Codex can be used for deploying applications and debugging by analyzing logs and disparate data sources [9][10] Deployment & Usage - Codex can deploy applications on platforms like Versal, utilizing code search to fetch the latest documentation [11] - The demonstration showcased Codex CLI modifying a game to implement a multiplayer feature in real-time [12][13] - Codex CLI supports a wide variety of languages, frameworks, and projects [14] Accessibility & Availability - Codex CLI is designed to be an AI teammate accessible wherever users work, directly in the terminal [15]
多个编码智能体同时使用会不会混乱?海外开发者热议
机器之心· 2025-10-06 04:00
Core Insights - The rapid advancement of AI programming tools is transforming the coding landscape, with models like GPT-5 and Gemini 2.5 enabling a degree of automation in development tasks [1][2] - The adoption of AI coding agents has become a norm not only for programmers but also for professionals in product and design roles, leading to an increasing proportion of AI-generated code [3] - Despite the benefits, challenges remain regarding code quality and analysis efficiency, prompting developers to explore the use of multiple AI agents in parallel [3][5] Summary by Sections - **Parallel Coding Agent Lifestyle**: Simon Willison initially had reservations about using multiple AI agents due to concerns over code review bottlenecks. However, he has since embraced this approach, finding it manageable to run multiple small tasks without overwhelming cognitive load [5][6] - **Task Categories for Parallel Agents**: - **Research Tasks**: AI agents can assist in answering questions or providing suggestions without modifying core project code, facilitating rapid prototyping and validation of concepts [7][9] - **System Mechanism Recall**: Modern AI models can quickly provide detailed, actionable answers about system functionalities, aiding in understanding complex codebases [10][11] - **Small Maintenance Tasks**: Low-risk code modifications, such as addressing deprecation warnings, can be delegated to AI agents, allowing developers to focus on primary tasks [13][14] - **Precisely Specified Work**: Reviewing code generated from detailed specifications is less burdensome, as the focus shifts to verifying compliance with established requirements [15] - **Current Usage Patterns**: Willison's primary tools include Claude Code, Codex CLI, and Codex Cloud, among others. He often runs multiple instances in different terminal windows, executing tasks in a YOLO (You Only Live Once) manner for manageable risks [16][19] - **Developer Community Response**: The blog post has garnered significant attention, resonating with current pain points in coding workflows. Many developers are experimenting with parallel AI agents, with some reporting that a substantial portion of their coding work is AI-assisted [21][22] - **Concerns and Discussions**: While some developers express apprehension about the unpredictability of AI-generated code, others, including Willison, advocate for the benefits of parallel agent usage, particularly for non-code-committing research tasks [26][29]
X @Sam Altman
Sam Altman· 2025-09-19 01:36
Product Update - Codex CLI工具推出,用于本地代码更改的快速审查 [1] - 未来几天将大幅扩展Codex CLI的功能 [1] - 鼓励用户向Daniel反馈使用体验 [1]
连续干7小时“不累”,OpenAI最强编程模型GPT-5-Codex来了
3 6 Ke· 2025-09-16 02:07
Core Insights - OpenAI has released GPT-5-Codex, an optimized version of GPT-5 specifically for software engineering, enhancing its programming capabilities [1][2] - The model can dynamically adjust its thinking time based on task complexity, allowing it to work independently on large tasks for over 7 hours [1][4] - GPT-5-Codex has shown improved accuracy in benchmark tests compared to GPT-5, with a reported accuracy of 74.5% in software engineering tasks [4][5] Group 1: Model Features and Performance - GPT-5-Codex is designed for complex engineering tasks, including project construction, feature addition, debugging, and code review [4] - The model's accuracy in code refactoring tasks is 51.3%, significantly higher than GPT-5's 33.9% [5] - In code reviews, GPT-5-Codex has a lower error comment rate of 4.4% compared to GPT-5's 13.7%, and a higher rate of high-impact comments at 52.4% [9][10] Group 2: Developer Tools and Integration - GPT-5-Codex is integrated into various developer tools, including Codex CLI and IDE extensions, allowing seamless transitions between local and cloud environments [2][16] - The Codex CLI has been updated to allow developers to share images and track progress on complex tasks, enhancing collaboration [14] - The IDE extension enables developers to use Codex within popular code editors, streamlining the coding process and maintaining context [16][17] Group 3: Competitive Landscape - The AI programming tool market is becoming increasingly competitive, with products like OpenAI Codex, Claude Code, and GitHub Copilot vying for dominance [21] - OpenAI's recent upgrades to Codex demonstrate its commitment to enhancing automation and collaboration in programming tasks, reflecting the intensifying competition in the sector [21]
GPT-5编程专用版发布,独立连续编程7小时,简单任务提速10倍,VS Code就能用
3 6 Ke· 2025-09-16 02:01
Core Insights - OpenAI has launched the GPT-5-Codex specialized model, which supports independent continuous programming for up to 7 hours [1][2] - The new model features a dynamic routing mechanism that allows real-time adjustments during task execution, enhancing its ability to handle complex tasks [2][5] - GPT-5-Codex has shown a nearly 20% improvement in success rates for code refactoring tasks compared to the original GPT-5 [5] Performance Enhancements - The model exhibits "true dynamic thinking" capabilities, significantly reducing output token counts for simple tasks by 93.7%, resulting in a 10-fold speed increase [8] - For complex tasks, it takes twice as long for reasoning, editing, and testing, with output token counts increasing by 102.2% [8] - The error comment rate during code review has decreased from 13.7% to 4.4%, while the proportion of high-impact comments has risen from 39.4% to 52.4% [11] Ecosystem Upgrades - OpenAI has restructured the entire Codex product system, introducing features like image input support and a task tracking to-do list for complex tasks [14] - The new IDE extensions integrate Codex into popular editors like VS Code and Cursor, allowing seamless cloud and local task management [14] - Performance improvements in cloud infrastructure have reduced median completion times for tasks by 90% [15] Market Positioning - The timing of this upgrade coincides with a decline in user subscriptions for Claude Code, positioning OpenAI to capture market share in AI programming [16] - There is a suggestion for Microsoft to upgrade its Copilot, indicating competitive pressures in the AI programming space [18]
刚刚,OpenAI发布GPT-5-Codex:可独立工作超7小时,还能审查、重构大型项目
机器之心· 2025-09-16 00:22
Core Viewpoint - OpenAI has launched GPT-5-Codex, a model optimized for programming tasks, enhancing software engineering capabilities and code review processes [1][3][4]. Group 1: GPT-5-Codex Features - GPT-5-Codex is designed for real software engineering tasks, capable of quick responses in interactive sessions and independently handling complex tasks [1][8]. - The model has been integrated into all Codex use cases, including Codex CLI, IDE extensions, web, mobile, and GitHub code reviews [3][4]. - OpenAI CEO Sam Altman reported that within two and a half hours of launch, GPT-5-Codex accounted for approximately 40% of Codex traffic, with expectations to become the main traffic source [3]. Group 2: Performance Improvements - GPT-5-Codex shows superior performance in software engineering benchmarks, outperforming GPT-5 in accuracy on SWE-bench Verified and Code refactoring tasks [8][10]. - The model can dynamically adjust its thinking time based on task complexity, allowing it to work independently for over 7 hours on complex tasks [11][12]. - In user interactions, GPT-5-Codex consumes 93.7% fewer tokens in the least complex requests compared to GPT-5, while investing more time in complex tasks [12]. Group 3: Code Review Capabilities - GPT-5-Codex has been specifically trained for code review, capable of identifying critical vulnerabilities and providing focused feedback [14][27]. - The model has been evaluated using recent commits from popular open-source projects, demonstrating a higher accuracy in review comments compared to human engineers [14]. Group 4: Integration and Usability - Codex CLI and IDE plugins have been redesigned for better integration into developers' workflows, allowing for seamless context switching between local and cloud tasks [19][20]. - The new GitHub integration enables users to assign tasks to Codex without leaving their editing environment, enhancing productivity [23][24]. - Codex can now process images and screenshots, improving its ability to understand design specifications and UI issues [23][25]. Group 5: Security and Safety Measures - OpenAI has implemented safety measures to protect against potential misuse of Codex, including sandbox environments and permission requests for risky operations [28][34]. - Developers are encouraged to review Codex's outputs before deployment, as it serves as an additional reviewer rather than a complete replacement for human oversight [29][30]. Group 6: Pricing and Availability - GPT-5-Codex is included in various ChatGPT subscription plans, such as Plus, Pro, Business, Edu, and Enterprise [32][36]. - OpenAI plans to make GPT-5-Codex available through API soon, with additional purchasing options for Business and Enterprise plans [36].