Context Engineering
Apple developer reveals Claude did 95% of the development; the app is already live
量子位· 2025-07-07 09:35
Wen Le | QbitAI (WeChat official account: QbitAI). An Apple developer has revealed that his app was built with AI, and 95% of it by Claude. The developer recently released Context, a native macOS app for debugging MCP servers, an application constructed almost entirely with Claude Code. The author, indragiek, has been building software for the Mac since 2008. This time, his goal was to use Apple's SwiftUI framework to create a developer tool that feels smooth and genuinely useful on macOS. What set this project apart is that Claude Code carried 95% of the workload. indragiek claims: "Of this 20,000-line project, I estimate I hand-wrote fewer than 1,000 lines of code." "Engineer" Claude is doing well for itself, working for Apple now (doge). Jokes aside, let's see how this developer actually uses Claude. An Apple developer on "taming" Claude: as a seasoned engineer, Indragie, like many of his peers, keeps a list of unfinished side projects. He could build prototypes, but the final 20% of the work needed to ship would consume enormous time and effort, so projects stalled. As a result, he had not managed to successfully release a single ...
Karpathy's latest idea, "bacterial programming": good code should have three traits of bacteria
量子位· 2025-07-07 04:02
Core Viewpoint - The article discusses Andrej Karpathy's new concept of "Bacterial Code," which emphasizes small, modular, self-contained code blocks that are easy to copy and paste, inspired by the evolutionary strategies of bacteria [1][5][6].

Group 1: Concept of Bacterial Code
- Bacterial Code has three main characteristics: small code blocks, modularity, and self-containment, allowing for easy replication [1][6][12].
- The idea is that open-source communities can thrive through "horizontal gene transfer," similar to how bacteria share genetic material [2][12].
- Karpathy's insights are derived from the survival strategies of bacteria, which have evolved to colonize diverse environments through efficient genetic coding [7][8].

Group 2: Principles of Bacterial Code
- The first principle is "smallness": every line of code consumes energy, which creates a natural self-optimization mechanism [8][11].
- The second principle is "modularity": code should be organized into interchangeable modules, akin to bacterial operons, promoting high cohesion and low coupling [11][12].
- The third principle is "self-containment": code snippets should be independent, not reliant on complex configuration or external libraries [13][14].

Group 3: Limitations and Future Directions
- While Bacterial Code is effective for rapid prototyping, it is not suited to building complex systems, which require more intricate structures akin to eukaryotic genomes [15][16].
- Karpathy suggests a hybrid approach that draws on the strengths of both bacterial and eukaryotic coding strategies [16].

Group 4: Evolution of Software Development
- Karpathy has previously introduced concepts like Software 3.0, which represents a shift toward programming language models with natural language [18][25].
- He notes that software has undergone significant transformations in recent years, moving from traditional coding to model training and now to natural-language programming [19][23][31].
- The future of software development will involve collaboration between humans and large models, leading to semi-autonomous applications [28][30].

Group 5: Context Engineering
- Context Engineering is highlighted as a crucial skill for effectively utilizing large language models (LLMs), requiring a careful balance of information to optimize performance [36][39].
- This discipline involves understanding LLM behavior and integrating elements such as task descriptions and multimodal data [40][41].
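The three principles above lend themselves to a concrete illustration. The following is a hypothetical example written in the "bacterial" style, not taken from Karpathy's post: small, organized as one interchangeable unit, and self-contained with no external dependencies, so it can be "horizontally transferred" into another codebase by copy-paste.

```python
# "Bacterial" code sketch: small, modular, self-contained.
# No imports, no config, no external libraries -- the whole
# function can be copied into any project as-is.

def chunk_text(text: str, max_len: int = 100) -> list[str]:
    """Split text into chunks of at most max_len characters,
    breaking on whitespace where possible (an oversized single
    word becomes its own chunk)."""
    words, chunks, current = text.split(), [], ""
    for word in words:
        candidate = (current + " " + word).strip()
        if len(candidate) <= max_len:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks
```

Because the snippet depends on nothing outside itself, copying it costs nothing, which is exactly the replication property the article attributes to bacterial genes.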
Tencent Research Institute AI Digest 20250707
腾讯研究院· 2025-07-06 14:05
Group 1
- Grok 4 achieved a score of 45% on "Humanity's Last Exam" (HLE), surpassing Gemini 2.5 Pro and Claude 4 Opus and sparking discussion [1]
- Elon Musk stated that Grok 4 is built on "first principles" reasoning, analyzing problems from fundamental axioms [1]
- Grok 4 is expected to bring enhanced coding capabilities and may be released in two versions, Grok 4 and Grok 4 Code, anticipated after July 4 [1]

Group 2
- Gemini CLI has been updated to support audio and video input, significantly expanding its multimodal interaction capabilities, whereas it previously processed only text, images, and PDF files [2]
- The update enhances Markdown functionality, adds table rendering and file-import features, and integrates the VSCodium and Neovim editors to improve the development experience [2]
- The technology stack has been upgraded to Ink 6 and React 19, introducing new themes and privacy-management features and optimizing the history-compression algorithm for better performance and stability [2]

Group 3
- Kunlun Wanwei launched the new Skywork-Reward-V2 series of reward models, topping the rankings on seven mainstream reward-model evaluations, with parameter scales ranging from 600 million to 8 billion [3]
- The models employ a "human-machine collaboration, two-stage iteration" data-selection pipeline, filtering 26 million high-quality samples from 40 million and balancing data quality against scale [3]
- The smaller models prove "small but powerful": a 1.7-billion-parameter model performs close to a 70-billion-parameter one, indicating that high-quality data can offset limits of parameter scale [3]

Group 4
- The German company TNG has open-sourced the DeepSeek-TNG-R1T2-Chimera model, built from three major DeepSeek parent models using an innovative AoE architecture [4]
- The Chimera version improves inference efficiency by 200% over the R1-0528 version while significantly reducing inference costs, and it outperforms standard R1 models across multiple mainstream tests [5]
- The AoE architecture exploits MoE's fine-grained structure to construct capability-specific sub-models from the parent model in linear time, optimizing performance through weight interpolation and selective merging [5]

Group 5
- Shortcut has become the "first Excel agent to surpass humans," solving Excel World Championship problems in 10 minutes, ten times faster than humans, with over 80% accuracy [6]
- The tool offers near-perfect Excel compatibility, handling complex financial modeling, data analysis, and visualization, and can even create pixel-art images [6]
- Currently in early preview, it lets users sign in with a Google account for three free trials, though it has limitations in formatting, long-dialogue performance, and handling complex data [6]

Group 6
- Shanghai AI Lab, in collaboration with multiple organizations, launched the Sekai high-quality video dataset project, covering over 5,000 hours of first-person video from 750+ cities across 101 countries [7]
- The dataset is split into a real-world part, Sekai-Real, and a virtual-scene part, Sekai-Game, with multi-dimensional labels such as text descriptions, locations, and weather, plus a curated 300-hour high-quality subset, Sekai-Real-HQ [7]
- An interactive video world-exploration model, Yume, was trained on the Sekai data; it supports mouse-and-keyboard-controlled video generation, aiding research in world generation, video understanding, and prediction [7]

Group 7
- ChatGPT identified a long-standing medical issue as the MTHFR A1298C gene mutation, generating discussion on Reddit and being called an "AlphaGo moment" for the medical field [8]
- Microsoft's medical AI system MAI-DxO achieved an accuracy rate of 85% in diagnosing complex cases from NEJM, more than four times the rate of experienced doctors, at a lower cost [8]
- Medical AI is evolving into a comprehensive solution spanning search to diagnosis, potentially transforming healthcare models and reducing ineffective medical expenditures [8]

Group 8
- "Context engineering" has gained popularity in Silicon Valley, supported by figures like Karpathy, and is seen as a key factor in the success of AI agents, supplanting prompt engineering [9]
- Unlike prompt engineering, which focuses on a single text, context engineering emphasizes providing LLMs with a complete system: instructions, history, long-term memory, retrieved information, and available tools [9]
- Context engineering is both a science and an art, focused on supplying the right information and tools for the task; many agent failures trace back to context rather than the model, underscoring the importance of timely information delivery [9]

Group 9
- Generative AI is reshaping market research, turning it from a lagging, one-off input into a continuous, dynamic competitive advantage, with the $140 billion spent on traditional research shifting toward AI software [10]
- AI-native companies are using "generative agent" technology to create "virtual societies" that simulate real user behavior without recruiting human samples, fundamentally lowering costs and enabling real-time research [10]
- Successful market-research AI does not require 100% accuracy; CMOs find that 70% accuracy combined with faster speed and real-time updates offers more commercial value than traditional methods, favoring rapid market entry and deep integration over perfect accuracy [10]

Group 10
- The core challenge of enterprise AI product startups lies in moving from impressive demos to practical products, coping with unpredictable user behavior and messy data in real environments [11]
- AI companies are growing far faster than traditional SaaS firms, with top AI companies posting annual growth rates above tenfold, driven by changes in enterprise purchasing behavior and AI's direct replacement of headcount budgets [11]
- Establishing lasting competitive barriers is crucial, achievable by becoming the system of record (SoR) for data, creating workflow lock-in, pursuing deep vertical integration, and cementing customer relationships [11]
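Group 4 above mentions weight interpolation only in passing. As a rough sketch of the basic operation, not TNG's actual AoE recipe (the function name, the `alpha` parameter, and the plain-float "checkpoints" are illustrative assumptions), linearly interpolating two parent checkpoints looks like this:

```python
# Illustrative sketch: linear weight interpolation between two parent
# checkpoints, the primitive behind selective-merging approaches.
# This is NOT TNG's published method; names and shapes are assumed.

def interpolate_weights(parent_a: dict, parent_b: dict,
                        alpha: float) -> dict:
    """Return a child state dict: alpha * A + (1 - alpha) * B,
    tensor by tensor. Runs in time linear in the number of weights."""
    assert parent_a.keys() == parent_b.keys(), "checkpoints must match"
    return {
        name: alpha * parent_a[name] + (1 - alpha) * parent_b[name]
        for name in parent_a
    }
```

Selective merging, as described, would apply this per-tensor with different (possibly zero or one) mixing weights for different expert sub-modules.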
Karpathy: I'm not trying to coin a new term, "context engineering" just matters that much for agents
Founder Park· 2025-07-04 13:10
Core Viewpoint - The concept of "Context Engineering" has gained traction in the AI industry, emphasizing that the effectiveness of AI applications relies more on the quality of the context provided than on the prompts used to query the AI [1][3].

Group 1: Definition and Importance of Context Engineering
- Context Engineering is defined as the discipline of designing and constructing dynamic systems that provide appropriate information and tools to large language models (LLMs) at the right time and in the right format [19].
- The quality of context provided to an AI agent is crucial to its effectiveness, mattering more than the complexity of the code or framework used [24].
- A well-constructed context can significantly enhance agent performance, as demonstrated by examples where rich context leads to more relevant and useful responses [25].

Group 2: Components of Context Engineering
- Context Engineering encompasses various elements, including prompt engineering, current state or dialogue history, long-term memory, and retrieval-augmented generation (RAG) [15][11].
- Prompts, prompt engineering, and context engineering are distinguished: prompts are the immediate instructions given to the AI, while context engineering is a broader system that dynamically generates context based on task requirements [15][19].

Group 3: Strategies for Implementing Context Engineering
- Four common strategies are identified: writing context, selecting context, compressing context, and isolating context [26].
- Writing context involves saving information outside the context window to assist the agent, such as maintaining a calendar or email history [28][29].
- Selecting context refers to pulling necessary information into the context window, which can include filtering relevant memories or examples [36][38].
- Compressing context focuses on retaining only the essential tokens needed for task execution, often through summarization techniques [43][44].
- Isolating context involves distributing context across multiple agents or environments to manage it effectively, sharpening task focus and reducing token consumption [47][50].
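Of the four strategies, compression is the easiest to sketch concretely. A minimal, assumption-laden illustration (the `summarize` stub stands in for an LLM summarization call, which the article names as a technique but does not specify an API for): keep recent turns verbatim and collapse older ones into a summary placeholder.

```python
# Minimal sketch of the "compress context" strategy: recent turns
# stay verbatim; older turns collapse into a summary. The summarize()
# stub is a placeholder assumption for a real LLM call.

def summarize(turns: list[str]) -> str:
    # A real system would call an LLM here to produce a summary.
    return f"[summary of {len(turns)} earlier turns]"

def compress_context(history: list[str], keep_last: int = 4) -> list[str]:
    """Retain only the tokens the task needs: the last keep_last
    turns verbatim, everything older replaced by one summary line."""
    if len(history) <= keep_last:
        return list(history)
    older, recent = history[:-keep_last], history[-keep_last:]
    return [summarize(older)] + recent
```

The same shape generalizes to the other strategies: "writing" would persist `older` to external storage, and "selecting" would pull pieces of it back in on demand.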
Trending: prompts are no longer the focus in AI, the new hot topic is Context Engineering
机器之心· 2025-07-03 08:01
Core Viewpoint - The article emphasizes the importance of "Context Engineering" as a systematic approach to optimizing the input provided to large language models (LLMs) for better output generation [3][11].

Summary by Sections

Introduction to Context Engineering
- The article highlights the recent popularity of "Context Engineering," with notable endorsements from figures like Andrej Karpathy and trending status on platforms like Hacker News and Zhihu [1][2].

Understanding LLMs
- LLMs should not be anthropomorphized; they are intelligent text generators without beliefs or intentions [4].
- LLMs function as general, uncertain functions that generate new text based on the provided context [5][6][7].
- They are stateless, so all relevant background information must be supplied with each input to maintain context [8].

Focus of Context Engineering
- The focus is on optimizing the input rather than altering the model itself, aiming to construct the most effective input text to guide the model's output [9].

Context Engineering vs. Prompt Engineering
- Context Engineering is a more systematic approach than the previously popular "Prompt Engineering," which relied on finding a perfect command [10][11].
- The goal is an automated system that prepares comprehensive input for the model, rather than issuing isolated commands [13][17].

Core Elements of Context Engineering
- Context Engineering involves building a "super input" toolbox, drawing on techniques like retrieval-augmented generation (RAG) and intelligent agents [15][19].
- The primary objective is to deliver the most effective information, in the appropriate format, at the right time [16].

Practical Methodology
- Using LLMs is likened to scientific experimentation, requiring systematic testing rather than guesswork [23].
- The methodology consists of two main steps: planning backward from the end goal, then constructing forward from the beginning [24][25].
- The final output should be clearly defined, and the necessary input information identified, to assemble a "raw material package" for the system [26].

Implementation Steps
- The article outlines a rigorous build-and-test process, verifying that each component functions correctly before final assembly [30].
- Specific testing phases include verifying data interfaces, search functionality, and the assembly of the final input [30].

Additional Resources
- For more detailed practice, the article references LangChain's latest blog and video, which cover the mainstream methods of Context Engineering [29].
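The "super input" assembly the article describes can be sketched end to end. This is a toy illustration under stated assumptions: the section headings, the word-overlap retrieval, and the corpus shape are all hypothetical, standing in for a real RAG pipeline and prompt template.

```python
# Hedged sketch of assembling a "super input": instructions, retrieved
# documents, and dialogue history combined into one model input.
# The retrieval scoring and section layout are illustrative assumptions.

def retrieve(query: str, corpus: dict[str, str], k: int = 2) -> list[str]:
    """Toy retrieval: rank documents by word overlap with the query."""
    q = set(query.lower().split())
    scored = sorted(corpus.items(),
                    key=lambda kv: len(q & set(kv[1].lower().split())),
                    reverse=True)
    return [doc for _, doc in scored[:k]]

def build_input(instructions: str, query: str,
                corpus: dict[str, str], history: list[str]) -> str:
    """Assemble the final input text from its components."""
    parts = [f"## Instructions\n{instructions}"]
    docs = retrieve(query, corpus)
    if docs:
        parts.append("## Retrieved context\n" + "\n".join(docs))
    if history:
        parts.append("## History\n" + "\n".join(history))
    parts.append(f"## Task\n{query}")
    return "\n\n".join(parts)
```

The "plan backward" step fixes what `build_input` must emit; the "construct forward" step then builds and tests `retrieve` and the assembly separately, matching the testing phases listed above.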
Context is everything! Industry debate: should prompt engineering be renamed?
歸藏的AI工具箱· 2025-06-26 11:40
Core Viewpoint - The article discusses the emerging concept of "context engineering" in AI, suggesting it is a more accurate term than "prompt engineering" for the skills needed to utilize large language models (LLMs) effectively [1][2].

Group 1: Importance of Context Engineering
- Context engineering is essential for optimizing AI agent performance, as insufficient context can lead to inconsistent actions among sub-agents and hinder accurate instruction-following [4][5].
- LLM performance can decline when the context is too long or contains irrelevant information, which also increases costs and delays [4][5].
- Instruction adherence is crucial for agents; even top models show a significant drop in accuracy during multi-turn conversations, highlighting the need to optimize context length and accuracy [4][5].

Group 2: Strategies for Optimizing Context Engineering
- Context engineering encompasses three common strategies: compression, persistence, and isolation [5][6].
- Compression aims to retain only the most valuable tokens in each interaction, with methods like context summarization being critical [6][7].
- Persistence involves creating systems for storing, saving, and retrieving context over time, considering storage methods, saving strategies, and retrieval processes [9][10].
- Isolation focuses on managing context across different agents or environments, using structured runtime state to control what the LLM sees in each interaction [16][18].

Group 3: Practical Experiences and Recommendations
- The article emphasizes building robust context-management systems for AI agents, balancing performance, cost, and accuracy [24].
- It suggests that memory systems stay simple and track specific agent preferences over time, and that parallelizable tasks be considered for multi-agent architectures [26].
- A token-tracking mechanism is highlighted as foundational for any context engineering work [23].
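Since the article calls token tracking foundational without specifying a design, here is one minimal sketch under explicit assumptions: real systems would use the model's own tokenizer, and the whitespace word count below is only a crude stand-in.

```python
# Minimal token-tracking sketch. The whitespace count is an assumed
# proxy for a real tokenizer; budget and threshold values are arbitrary.

class TokenTracker:
    """Track approximate token usage against a context budget and
    flag when the budget is nearly exhausted (a trigger point for
    compression or persistence strategies)."""

    def __init__(self, budget: int):
        self.budget = budget
        self.used = 0

    def count(self, text: str) -> int:
        return len(text.split())  # crude stand-in for a tokenizer

    def record(self, text: str) -> None:
        self.used += self.count(text)

    @property
    def remaining(self) -> int:
        return max(self.budget - self.used, 0)

    def near_limit(self, threshold: float = 0.9) -> bool:
        return self.used >= self.budget * threshold
```

A tracker like this is what lets an agent decide, per interaction, whether to summarize (compression), offload to storage (persistence), or spawn a fresh context (isolation).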
After prompt engineering and RAG, LangChain: context engineering is taking off!
机器之心· 2025-06-25 04:06
Core Viewpoint - Context engineering is emerging as a crucial skill for AI engineers, shifting the focus from traditional prompt engineering to providing structured, dynamic context so that large language models (LLMs) can perform tasks effectively [3][7][15].

Group 1: Definition and Importance of Context Engineering
- Context engineering involves constructing dynamic systems that provide accurate information and tools in the right format, enabling LLMs to complete tasks effectively [9][10].
- Its significance lies in addressing common failures in AI systems, which often stem from inadequate context or incorrect information being provided to the model [12][15].
- Unlike prompt engineering, which focuses on crafting clever prompts, context engineering emphasizes delivering complete, structured context to enhance model performance [17][19].

Group 2: Components of Effective Context Engineering
- Effective context engineering requires accurate information, as models cannot infer context they are not explicitly given [12][19].
- The format of the context is critical; how information is communicated to the LLM can significantly impact its responses [13][19].
- Tools must be appropriately utilized to access external information, and the returned data should be formatted in a way the LLM can easily understand [20].

Group 3: Transition from Prompt Engineering to Context Engineering
- The transition from prompt engineering to context engineering is driven by the increasing complexity of applications, highlighting the need for a more comprehensive approach to context provision [16][17].
- Prompt engineering can be viewed as a subset of context engineering, with the focus shifting from a single input prompt to managing and formatting dynamic data [17][18].
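The formatting point in Group 2 is easy to make concrete. As a small hedged sketch (the field names and layout are hypothetical; the article argues only that format choice matters, not for this particular one), compare dumping raw tool output into the context against rendering it as a compact digest:

```python
# Sketch of "format matters": render tool results as a short readable
# list instead of raw JSON. Field names (title, snippet) are assumed.

def format_for_llm(results: list[dict]) -> str:
    """Turn search-tool results into a numbered, compact digest
    that is easier for an LLM to use than a raw JSON dump."""
    lines = []
    for i, r in enumerate(results, 1):
        lines.append(f"{i}. {r['title']} -- {r['snippet']}")
    return "\n".join(lines)
```

The same data reaches the model either way; the digest simply spends fewer tokens and puts the salient fields where the model will attend to them.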
Must-read: Devin vs. Anthropic on methodologies for building multi-agent systems
歸藏的AI工具箱· 2025-06-15 08:02
Core Viewpoint - The article discusses the advantages and challenges of multi-agent systems, comparing the perspectives of Anthropic and Cognition on the construction and effectiveness of such systems [2][7].

Group 1: Multi-Agent System Overview
- Multi-agent systems consist of multiple agents (large language models) working collaboratively, where a main agent coordinates the process and delegates tasks to specialized sub-agents [4][29].
- The typical workflow involves breaking down a task, launching sub-agents to handle the pieces, and finally merging the results [6][30].

Group 2: Issues with Multi-Agent Systems
- Cognition highlights the fragility of multi-agent architectures, where sub-agents may misunderstand tasks, producing inconsistent results that are difficult to integrate [10].
- Anthropic acknowledges these challenges but applies constraints and mitigations, such as restricting multi-agent systems to suitable domains like research tasks rather than coding tasks [8][12].

Group 3: Solutions Proposed by Anthropic
- Anthropic employs a coordinator-worker model, using detailed prompt engineering to clarify each sub-agent's task and responsibilities, thereby minimizing misunderstandings [16].
- Advanced context-management techniques are introduced, including memory mechanisms and file systems, to address context-window limitations and information loss [8][16].

Group 4: Performance and Efficiency
- Anthropic's multi-agent research system shows a 90.2% performance improvement on breadth-first queries compared to single-agent systems [14].
- The system can cut research time by up to 90% by launching multiple sub-agents, and their tool use, in parallel [17][34].

Group 5: Token Consumption and Economic Viability
- Multi-agent systems consume tokens at a much higher rate, approximately 15 times more than chat interactions, so the task's value must justify the increased cost [28][17].
- The architecture makes token usage effective by distributing work among agents with independent context windows, enhancing parallel reasoning capabilities [28].

Group 6: Challenges in Implementation
- The transition from prototype to reliable production systems faces significant engineering challenges because errors compound in agent systems [38].
- Current synchronous execution of sub-agents creates bottlenecks in information flow; asynchronous execution is planned to enhance parallelism while managing coordination and error-propagation challenges [39][38].
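The coordinator-worker shape described above can be sketched in a few lines. This is an illustration, not Anthropic's implementation: the `run_subagent` stub stands in for an LLM call with its own context window, and the thread pool stands in for whatever parallel execution layer a real system uses.

```python
# Illustrative coordinator-worker sketch: a lead agent delegates
# subtasks to parallel sub-agents (each a stand-in for an LLM call
# with an independent context window) and merges the results.

from concurrent.futures import ThreadPoolExecutor

def run_subagent(subtask: str) -> str:
    # Placeholder assumption for a real LLM call.
    return f"findings for: {subtask}"

def coordinator(query: str, subtasks: list[str]) -> str:
    """Delegate subtasks to parallel sub-agents, then merge
    their results into one report."""
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(run_subagent, subtasks))
    merged = "\n".join(f"- {r}" for r in results)
    return f"Report on '{query}':\n{merged}"
```

Each worker's context holds only its own subtask, which is the independent-context-window property Group 5 credits for effective token use; the synchronous `pool.map` join is exactly the bottleneck Group 6 says asynchronous execution would relax.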