Workflow
OpenAI Codex
icon
Search documents
OpenAI Codex桌面版深夜突袭,一人指挥Agent军团,程序员彻底告别996
3 6 Ke· 2026-02-03 07:54
太带劲了!抢先Claude 5,OpenAI深夜祭出了一个编码杀器——Codex。它可以让一人指挥多Agent并行协作,自带Skills,编码从此进入自动化时代。 Claude 5的脚步声越来越近,奥特曼终于坐不住了。 就在刚刚,OpenAI毫无预警地抛出「王炸」——Codex正式进化为独立的桌面App。 Codex的进化令人毛骨悚然,它不仅生成代码,还学会了利用代码作为「Skills」来操控电脑。 比如想要解决项目中的Comment,直接调用安装好的Skills,Codex立刻就把问题破解了。 这不仅仅是一个写代码的窗口,更是一个能同时指挥千军万马(多个Agent)的「全能指挥部」。 Codex定位非常明确:要做Agent的「指挥中心」 具体来说,Codex可以做到以下几点: 多任务并行切换,毫不费力:同时调用多个AI智能体开展工作,并通过「工作树」(worktrees)实现变更隔离,互不干扰; 创建并调用Skills:将工具和开发规范封装成可复用的能力; 设置自动化流程:通过后台定时工作流,把那些重复性的琐事统统交给Codex处理。 假设想要为相册里的照片添加「拖拽」功能,选择「工作树」,即可让AI在同一 ...
2026年1月的李想对AI进步速度的预期与Anthropic的CEO类似
理想TOP2· 2026-02-01 11:30
本文架构: 核心观点就是标题,李想对AI进步速度预期与 Anthropic CEO Dario Amodei很类似。 李想认为现在的人形机器人很像2025年2月的Manus,机器人速度可能比软件发展慢一点,但会超所有人预期。 Dario认为最快 1-2 年内就会有超强AI到 来,尽管也可能显著更久。 据TOP2观察,言语上倾向说理想首先应该把车造好的人,普遍对AI进步速度没有这个预期。李想现在重心完全放在具身智能上了,对造好具身智能的汽 车非常上心。 花长篇幅介绍为什么 Dario Amodei的观点值得深度参考,核心锚点主要是 Anthropic近期取得的成果水准,以及 Dario比Ilya早三年发现应该解除 Sam Altman的CEO职务。 TOP2并不是想表达,因为李想与 Dario 是这么想,他们的观点就一定是对的。TOP2认为不应该盲从李想/ Dario或任何人的观点,应该从第一性上充分批 判性思考,但他们的观点本身 值得深入参考。他们的观点是什么,为什么这么想都值得深入求证。 本文核心是指出李想的想法是什么,李想在AI进步速度的预期与AI核心领袖之一大方向很接近。这种接近不是轻飘飘的一句话,往小 ...
2026 年的 Coding 时刻是 Excel
3 6 Ke· 2026-01-27 01:30
近期 Claude Code 推出的 Excel 功能非常惊艳,我们认为 Excel 可能成为继 Coding 之后,下一个迎来"aha moment"、并快速爆发的高价值领域。 本文是 Altiemeter 合伙人 Freda Duan 对 Coding 和 Excel 这两个 AI 垂直领域的深度解读,原文发布于她的 Substack Robonomics。 简单来说,正如 Coding 凭借庞大的市场规模、向相邻场景自然延展的能力以及以产品驱动的 GTM 模式,迅速崛起为最强势的 AI 应用之一,Excel 也具 备同样的条件: Coding 已经证明了这条路径下的爆发力,而 Excel 很可能是体量更大的下一站。 Intro Coding 以超出所有人预期的表现,成为至今为止最强势的 AI 垂直应用之一。它同时具备三种罕见的特质:极其庞大的 TAM、自然延展到相邻使用场景 的切入口,以及以产品驱动为核心的 GTM 模式,几乎不需要传统的销售和市场推广。 具备这种组合特征的垂直领域非常少,Excel 是其中之一。它的 TAM 甚至更大,从某种角度看,软件行业的很大一部分都可以被视为一层层叠加在 Exce ...
腾讯研究院AI速递 20260126
腾讯研究院· 2026-01-25 16:01
Group 1 - OpenAI CEO Altman announced the release of significant Codex-related content starting next week, with a technical blog revealing the core architecture of Codex CLI, specifically the intelligent agent loop [1] - The intelligent agent loop coordinates user instructions, model inference, and local tool execution through the Responses API, employing a "consistent prompt prefix" strategy to trigger cache optimization [1] - Codex supports zero data retention configurations to ensure privacy and utilizes automatic compression technology to manage context windows, with further details on tool invocation and sandbox models to be introduced later [1] Group 2 - Google DeepMind released D4RT, which unifies 3D reconstruction, camera tracking, and dynamic object capture into a single "query" action, achieving speeds 18 to 300 times faster than existing state-of-the-art methods [2] - The core innovation is a unified spatiotemporal query interface, where AI first globally "reads" videos to generate scene representations and then searches for 3D trajectories, depth, and poses of any pixel on demand [2] - This technology is significant for embodied intelligence, autonomous driving, and AR, although training still requires a 1 billion parameter model and 64 TPUs [2] Group 3 - Claude Code upgraded its internal "Todos" to "Tasks," enabling multi-session or sub-agent collaboration on long-term complex projects across multiple context windows [3] - Tasks are stored in a file system for easy collaboration among multiple sessions, with updates in one session broadcasting to all sessions handling the same task list [3] - The new feature is compatible with Opus 4.5, enhancing autonomous operation capabilities, allowing users to enable multiple sessions to collaborate on the same task list through environment variables [3] Group 4 - Baidu's Wenxin 5.0 officially launched with a parameter count of 2.4 trillion, utilizing native multimodal unified modeling technology to support understanding and generation of text, images, audio, and video [4] - It has topped the LMArena text and visual understanding leaderboard five times, entering the global first tier, with language and multimodal understanding capabilities leading internationally [4] - Practical tests show the model excels in complex emotional understanding, subtext analysis, and creative writing tasks, earning the title of "strongest liberal arts student" [4] Group 5 - The open-source project Clawdbot has gained popularity in Silicon Valley, capable of running on Mac mini, serving as both a local AI agent and chat gateway, allowing conversations via WhatsApp, iMessage, etc. [5] - Clawdbot addresses the memory limitations of large models, capable of recalling conversations from two weeks ago, proactively sending emails, reminders, and executing tasks on the computer [5] - The project has received 9.2k stars on GitHub, with a minimum monthly cost of approximately $25, though it requires some technical knowledge for deployment, and users report it can automate business management and code writing, replacing paid services like Zapier [5] Group 6 - Turing Award winner LeCun announced that AMI Labs' core direction is "world models," aiming to build intelligent systems that understand the real world, possess persistent memory, and have reasoning and planning capabilities [6] - This approach argues that merely predicting the next token does not lead to true understanding of reality, necessitating predictions and reasoning at a higher representational level to filter out unpredictable noise [6] - AMI Labs is reportedly seeking financing at a valuation of $3.5 billion, targeting applications in industrial control, robotics, and healthcare, where reliability is crucial [6] Group 7 - Anthropic launched the Claude in Excel plugin, available for Pro, Max, Team, and Enterprise users, based on the Opus 4.5 model, which can be installed and activated via Microsoft Marketplace [7] - The plugin can search the internet and automatically fill in spreadsheets, supporting formula reading, debugging errors, zero-based modeling, and pivot table creation, compatible with .xlsx and .xlsm formats [7] - Currently, it does not support conditional formatting, macros, or VBA, and the company warns of prompt injection risks, advising users to only use files from trusted sources, with high-risk functions triggering confirmation prompts [7] Group 8 - Claude Code's creator Boris Cherny provided a detailed tutorial on using Cowork, emphasizing its role as an "executor" rather than a chat tool, capable of directly manipulating documents, browsers, and various tools [8] - He reiterated that the core workflow involves running multiple tasks in parallel while overseeing Claude instances, starting with "planning mode" for communication until satisfaction is achieved, then switching to "auto-accept edits" mode for execution [8] - Cherny highlighted the importance of Claude.md as a team compounding knowledge base, where any mistakes made by Claude should be documented, and methods for validating Claude's outputs can significantly enhance quality [8] Group 9 - Google Cloud AI Director Addy Osmani warned that programmers who only write prompts will be eliminated by 2026, stating that AI can handle 70% of preliminary work, but the remaining 30% requires experienced engineers [9] - A Stack Overflow survey indicated that developer trust in AI accuracy dropped from 40% to 29%, with 73% of respondents encountering issues with code comprehension due to "ambient coding" [9] - By 2026, the true core competency will be transforming vague problems into clear execution intentions, designing appropriate contextual structures, and distinguishing what is truly important [9] Group 10 - At the Davos Forum, tech giants shared notable insights, with Musk predicting that AI will surpass human intelligence by the end of 2026 and be smarter than the collective intelligence of humanity by 2030, with Tesla set to launch the humanoid robot Optimus next year [10] - Microsoft CEO Nadella warned that if AI only consumes resources without improving outcomes, society will lose tolerance, while Huang Renxun stated that embodied intelligence represents a "once-in-a-generation opportunity" [10] - DeepMind CEO Hassabis believes AGI will still require 5-10 years, while Anthropic CEO Dario claimed that models are just 6-12 months away from being able to complete software development end-to-end [10]
吴恩达年终总结:2025年或将被铭记为「AI工业时代的黎明」
Hua Er Jie Jian Wen· 2025-12-31 03:10
Group 1: Core Insights - 2025 is anticipated to mark the dawn of the AI industrial era, with significant advancements in model performance and infrastructure development driving GDP growth in the U.S. [1] - The integration of technology into daily life is expected to solidify transformative changes in the upcoming year [2] Group 2: Capital Expenditure and Energy Challenges - Major tech companies, including OpenAI, Microsoft, Amazon, Meta, and Alphabet, have announced substantial infrastructure investment plans, with data center construction costs estimated at $50 billion per gigawatt [3] - OpenAI's "Stargate" project involves a $500 billion investment to build 20 gigawatts of capacity globally, while Microsoft plans to spend $80 billion on global data centers by 2025 [3] - Bain & Co. estimates that AI annual revenue must reach $2 trillion by 2030 to support such large-scale construction, exceeding the total profits of major tech companies in 2024 [3] - Insufficient grid capacity has led to some data centers in Silicon Valley being underutilized, and concerns over debt levels have caused Blue Owl Capital to withdraw from financing negotiations for Oracle and OpenAI [3] Group 3: Talent Market Transformation - The shift of AI from academic interest to revolutionary technology has led to skyrocketing salaries for top talent, with Meta offering compensation packages worth up to $300 million [4] - Mark Zuckerberg has personally engaged in talent acquisition, successfully recruiting key researchers from OpenAI and other companies [4] Group 4: Advancements in AI Models - 2025 is viewed as the year of widespread application of reasoning models, with OpenAI's o1 model and DeepSeek-R1 demonstrating enhanced reasoning capabilities through reinforcement learning [6] - The OpenAI o4-mini achieved a 17.7% accuracy rate in a multimodal understanding test, driving the emergence of "Agentic Coding" tools capable of handling complex software development tasks [7] - Coding agents based on the latest large models completed over 80% of tasks in SWE-Bench benchmark tests, despite some limitations in complex logic and increased inference costs [8]
吴恩达年终总结:2025是AI工业时代的黎明
具身智能之心· 2025-12-31 00:50
Core Insights - 2025 is marked as a pivotal year in the AI industry, characterized by rapid advancements and significant developments in AI technologies and infrastructure [10][14][30] - The competition for AI talent has intensified, with leading companies offering unprecedented salaries to attract top professionals [23][27] - The emergence of reasoning models and programming agents has transformed software development, lowering barriers to entry and enabling more individuals to participate in AI innovation [37][40] Group 1: AI Industry Developments - The year 2025 is described as the dawn of the AI industrial era, with major advancements in AI capabilities and infrastructure [14][30] - AI companies are projected to spend over $300 billion in capital expenditures, primarily on building new data centers to support AI tasks [30][32] - By 2030, the costs associated with building sufficient computing power for AI needs could reach $5.2 trillion, indicating a massive investment trend [30] Group 2: Talent Acquisition and Market Dynamics - AI firms are engaged in a fierce talent war, with salaries reaching levels comparable to professional sports stars, as companies like Meta offer up to hundreds of millions in compensation [23][27] - OpenAI, Meta, and other tech giants are implementing strategies to retain talent, including higher stock compensation and accelerated vesting schedules [27][30] - The influx of capital and talent into the AI sector is contributing to economic growth, with evidence suggesting that the majority of GDP growth in the U.S. in early 2025 is driven by data center and AI investments [30] Group 3: Technological Advancements - The introduction of reasoning models has significantly improved the performance of large language models (LLMs), enhancing their capabilities in various tasks [21][22][24] - Programming agents have become a competitive battleground among AI giants, with advancements allowing them to complete over 80% of programming tasks [31][34] - The development of new benchmarks and evaluation methods for programming agents reflects the evolving landscape of AI capabilities [34]
吴恩达年终总结:2025年或将被铭记为“AI工业时代的黎明”
华尔街见闻· 2025-12-30 12:45
Core Insights - The year 2025 is anticipated to mark the dawn of the AI industrial era, characterized by unprecedented advancements in model performance and infrastructure investments that will significantly contribute to GDP growth in the U.S. [1][2] Group 1: Capital Expenditure and Energy Challenges - Major tech companies, including OpenAI, Microsoft, Amazon, Meta, and Alphabet, have announced substantial infrastructure investment plans, with each gigawatt of data center capacity costing approximately $50 billion. OpenAI's "Stargate" project, in collaboration with partners, involves a $500 billion investment to build 20 gigawatts of capacity globally [3]. - Microsoft is projected to spend $80 billion on global data centers in 2025 and has signed a 20-year agreement to restart the Three Mile Island nuclear reactor in Pennsylvania by 2028 to ensure a stable power supply [3]. - Bain & Co. estimates that to support this scale of construction, AI annual revenue must reach $2 trillion by 2030, exceeding the total profits of major tech companies in 2024 [3]. - Insufficient grid capacity has led to some data centers in Silicon Valley being underutilized, and concerns over debt levels have caused Blue Owl Capital to withdraw from negotiations to finance a $10 billion data center for Oracle and OpenAI [3]. Group 2: Talent Market Transformation - Meta has disrupted traditional compensation structures by offering lucrative packages, including cash bonuses and substantial equity, to researchers from OpenAI, Google, and Anthropic, with some four-year contracts valued at up to $300 million [5]. - Mark Zuckerberg has personally engaged in the talent acquisition battle, successfully recruiting key researchers from OpenAI [5]. - In response, OpenAI has introduced aggressive stock option vesting schedules and retention bonuses of up to $1.5 million for new employees [6]. Group 3: Proliferation of Reasoning Models and Agentic Coding - 2025 is viewed as the year of widespread application of reasoning models, with advancements such as OpenAI's o1 model and DeepSeek-R1 demonstrating enhanced reasoning capabilities through reinforcement learning [8]. - The integration of tools has led to significant improvements in model performance, with OpenAI's o4-mini achieving a 17.7% accuracy rate in a multimodal understanding test, driving the rise of "Agentic Coding" [10]. - By the end of 2025, tools like Claude Code, Google Gemini CLI, and OpenAI Codex are expected to handle complex software development tasks through intelligent workflows [10]. - Despite some limitations in reasoning models identified by research from Apple and Anthropic, the trend of utilizing AI for code generation and cost reduction in development remains strong [11].
吴恩达年终总结:2025年或将被铭记为AI工业时代的黎明
Hua Er Jie Jian Wen· 2025-12-30 10:27
Core Insights - 2025 marks the dawn of the AI industrial era, with AI investments becoming a core driver of U.S. GDP growth and global annual capital expenditures surpassing $300 billion [1][4][20] - Major tech companies are launching massive infrastructure projects, with investments reaching trillions and energy supply becoming a critical constraint [1][5][19] - The emergence of reasoning models and agentic coding has significantly enhanced AI capabilities, allowing for independent handling of complex software development tasks [1][7][21] Group 1: AI Industrial Era - 2025 is recognized as the beginning of the AI industrial era, with advancements in model performance and infrastructure development driving U.S. GDP growth [4][10] - AI investments are projected to exceed $3 trillion, with major companies like OpenAI, Microsoft, and Amazon leading the charge [1][5][19] - The integration of AI into daily life is expected to solidify these changes further in the coming years [4][10] Group 2: Infrastructure Investments - Tech giants are announcing staggering infrastructure investment plans, with each gigawatt of data center capacity costing approximately $50 billion [5][19] - OpenAI's "Stargate" project involves a $500 billion investment to build 20 gigawatts of capacity globally [5][19] - Microsoft plans to spend $80 billion on global data centers in 2025 and has signed a 20-year agreement to restart the Three Mile Island nuclear reactor for power supply [5][19] Group 3: Talent Market Transformation - Top talent in AI is now commanding salaries comparable to sports stars, with Meta offering up to $300 million for four-year contracts [2][6][14] - Meta's aggressive recruitment strategy has led to the hiring of key researchers from OpenAI and Google, significantly raising the market value of AI talent [6][15][18] - OpenAI has responded by offering competitive stock options and retention bonuses to attract and retain talent [6][17] Group 4: Advancements in AI Models - 2025 is seen as the year of widespread application of reasoning models, with OpenAI's o1 and DeepSeek-R1 showcasing enhanced multi-step reasoning capabilities [7][11] - AI models are now able to perform complex tasks in mathematics, science, and programming with improved accuracy, as demonstrated by OpenAI's o4-mini achieving a 17.7% accuracy rate in multi-modal understanding tests [7][11] - The rise of agentic coding has enabled AI agents to independently manage software development tasks, significantly increasing coding efficiency [7][21][25]
AI Coding 生死局:Spec 正在蚕食人类编码,Agent 造轮子拖垮效率,Token成本失控后上下文工程成胜负手
3 6 Ke· 2025-12-30 09:21
Core Insights - The evolution of AI Coding is leading to a new role for programmers, focusing on defining rules rather than just writing code, as the complexity of software engineering increases [1] - The rise of Spec-driven development is reshaping the AI Coding landscape, with a shift from traditional coding practices to a more structured approach that emphasizes the importance of context and specifications [8][9] Group 1: AI Coding Evolution - AI Coding has transitioned from a human-led paradigm, where tools like Copilot and Cursor assist in code completion, to an Agent-driven model that takes over tasks from requirement analysis to code generation [2][3] - The limitations of the completion paradigm are becoming apparent, as it requires significant developer attention and has a narrow scope compared to the broader capabilities of Agents [3] - The integration of IDE, CLI, and Cloud capabilities in programming tools reflects the need for a comprehensive task delivery system across different environments [4] Group 2: Spec-Driven Development - The concept of "Spec" has evolved, with various interpretations ranging from better prompts to detailed product requirement documents, highlighting the need for clear guidance in AI Coding [8][10] - Spec is seen as a critical component in providing stable context for Agents, ensuring they understand what needs to be built and the constraints involved [9][12] - The challenge lies in standardizing Spec across different contexts, as its effectiveness depends on the application scenario and the balance between flexibility and rigor [11][12] Group 3: Context Engineering - Context is increasingly recognized as a vital element in AI Coding, with many teams noting that the lack of context, rather than specifications, is a significant barrier to effective AI code generation [9][10] - The development of "living contracts" for Spec emphasizes the need for dynamic, iterative documentation that evolves alongside the coding process, rather than static documents [14] - The focus on context management is crucial, as it directly impacts the efficiency and cost of AI coding, with a need to maximize cache hit rates and minimize redundant computations [22][23] Group 4: Token Economics - The cost structure of using AI tools is shifting, with Token consumption becoming a critical factor in pricing and operational strategies for platforms [18][19] - The transition from simple question-answer interactions to complex Agent tasks has increased the overall Token costs, as multiple interactions and tool calls are required to complete tasks [20][21] - Effective context management is essential to control Token costs, as it determines how information is organized and reused throughout the coding process [26][27]
吴恩达年终总结:2025是AI工业时代的黎明
机器之心· 2025-12-30 06:57
Core Insights - 2025 is marked as a pivotal year in the AI industry, characterized by intense competition among AI giants, a talent war, and significant advancements in AI infrastructure and capabilities [6][10][13]. Group 1: AI Development and Learning - The rapid advancement in AI has created unprecedented opportunities for software development, with a notable shortage of skilled AI engineers [6][22]. - Structured learning is essential for aspiring AI developers to avoid redundant efforts and to understand existing solutions in the industry [7][8]. - Practical experience is crucial; hands-on project work enhances understanding and sparks new ideas in AI development [8][14]. Group 2: AI Infrastructure and Investment - The AI industry has seen capital expenditures surpassing $300 billion in 2025, primarily for building new data centers to handle AI tasks [26]. - Major companies are planning extensive infrastructure projects, with projected costs reaching up to $5.2 trillion by 2030 to meet anticipated demand for AI capabilities [26][31]. - Companies like OpenAI, Meta, Microsoft, and Amazon are investing heavily in data center capacities, with OpenAI planning to build 20 gigawatts of data center capacity globally [31]. Group 3: Talent Acquisition and Market Dynamics - A fierce competition for top AI talent has led to unprecedented salary offers, with some companies offering compensation packages comparable to professional sports stars [22][26]. - Meta's aggressive recruitment strategy has included significant financial incentives to attract talent from competitors, reflecting the high market value of AI professionals [22][27]. - Despite concerns about an AI bubble, investments in AI infrastructure are contributing to economic growth, particularly in the U.S. [29]. Group 4: Advancements in AI Models - The introduction of reasoning models has significantly improved the performance of large language models (LLMs), enhancing their capabilities in various tasks [20][21]. - AI agents are increasingly capable of automating complex coding tasks, with reports indicating that many companies are now relying on AI-generated code for senior-level tasks [33][39]. - The evolution of programming agents has led to a competitive landscape among AI companies, with advancements in code generation capabilities becoming a focal point [30][39].