Prompt Engineering
Why Would Programmers Still Write Frontend Code? A Claude Engineer Built Artifacts at 2 a.m.: AI Directly Generates Interactive Apps, and Now It Has Received a Major Upgrade
AI前线· 2025-07-01 05:24
Core Viewpoint
- Anthropic has upgraded its Artifacts tool, making it easier for users to create interactive AI applications without programming skills and marking a significant shift toward practical AI tool platforms [1][2][14]

Introduction of Artifacts
- Artifacts lets Claude users create small AI applications for personal use; millions of users have created over 500 million "artifacts" since launch [2][4]

Development and Functionality
- Initially designed for website generation, Artifacts has evolved to simplify sharing and increase the power of the applications built with it [5][8]
- Development was rapid, taking only a week and a half from prototype to internal testing, showcasing the potential of human-AI collaboration [7][8]

User Experience and Feedback
- Users report positive experiences, likening Artifacts to a "build-on-demand" concept that removes the need for traditional tools like Zapier [20][21]
- The new Artifacts experience is available on both mobile and desktop, letting users create, view, and customize projects easily [16][31]

Competitive Landscape
- Artifacts represents a fundamental shift in AI-user interaction, from static responses to dynamic experiences, intensifying competition with OpenAI's Canvas feature [17][18]
- Unlike traditional AI interactions that require copying and pasting results, Artifacts provides a dedicated workspace for immediate use and sharing of AI-generated content [18]

Market Trends and Future Outlook
- The rise of low-code and no-code technologies is expected to democratize application development, with a significant increase in "citizen developers" who can build applications without formal programming training [33]
- AI development tools and traditional programming are seen as complementary, with professional developers focusing on complex systems that require custom features and enterprise-level performance [34]

Business Model and Community Engagement
- Anthropic's strategy includes free access to the updated Artifacts experience, encouraging community participation and user engagement, which reflects a broader trend in the AI service industry [31][32]
Prompt Engineering is Dead — Nir Gazit, Traceloop
AI Engineer· 2025-06-27 09:34
Core Argument
- The presentation challenges the notion of "prompt engineering" as a true engineering discipline, arguing that iterative prompt improvement can be automated [1][2]
- The speaker advocates an alternative approach to prompt optimization built on evaluators and automated agents [23]

Methodology & Implementation
- The company developed a chatbot for its website documentation using a Retrieval-Augmented Generation (RAG) pipeline [2]
- The RAG pipeline consists of a Chroma vector database, OpenAI models, and prompts for answering questions about the documentation [7]
- An evaluator was built to assess the pipeline's responses against a dataset of questions and expected answers [5][7]
- The evaluator uses a ground-truth-based LLM as a judge, checking whether generated answers contain specific facts [10][13]
- An agent was created to automatically improve prompts by researching online guides, running evaluations, and regenerating prompts from the failure reasons [5][18][19]
- The agent uses CrewAI to reason, call the evaluator, and regenerate prompts according to best practices [20]

Results & Future Considerations
- The initial prompt scored 0.4 (40%); after two agent iterations the score improved to 0.9 (90%) [21][22]
- The company acknowledges the risk of overfitting to the training data (20 examples) and suggests splitting the data into train/test sets for better generalization [24][25]
- Future work may apply the same automated optimization techniques to the evaluator and agent prompts themselves [27]
- The demo is available in the traceloop/autoprompting demo repository [27]
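The evaluate-then-rewrite loop described in the talk can be sketched in a few lines. This is a minimal offline sketch, not Traceloop's implementation: the answering model and the rewriting "agent" are stubbed as plain functions (the real pipeline used OpenAI, Chroma, and CrewAI), and all names and toy data below are invented for illustration.

```python
# Minimal sketch of an automated prompt-improvement loop:
# score the prompt against a dataset, then rewrite it from its failures.

def evaluate(prompt, dataset, answer_fn):
    """Score a prompt: fraction of examples whose answer contains every expected fact."""
    failures, passed = [], 0
    for question, expected_facts in dataset:
        answer = answer_fn(prompt, question).lower()
        missing = [f for f in expected_facts if f.lower() not in answer]
        if missing:
            failures.append((question, missing))
        else:
            passed += 1
    return passed / len(dataset), failures

def optimize(prompt, dataset, answer_fn, rewrite_fn, target=0.9, max_iters=5):
    """Re-evaluate and rewrite the prompt from its failure reasons until it clears the target."""
    score, failures = evaluate(prompt, dataset, answer_fn)
    for _ in range(max_iters):
        if score >= target:
            break
        prompt = rewrite_fn(prompt, failures)  # the "agent" step
        score, failures = evaluate(prompt, dataset, answer_fn)
    return prompt, score

# Toy stand-ins so the sketch runs without API access (hypothetical):
def answer_fn(prompt, question):
    return "Traceloop provides LLM observability" if "cite facts" in prompt else "no idea"

def rewrite_fn(prompt, failures):
    return prompt + " Always cite facts from the documentation."

dataset = [("What does Traceloop do?", ["observability"])]
best, score = optimize("Answer the user's question.", dataset, answer_fn, rewrite_fn)
```

In the real pipeline, `answer_fn` would run the RAG chain and `rewrite_fn` would be the CrewAI agent consulting prompting guides; the loop structure stays the same.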
Model Maxxing: RFT, DPO, SFT with OpenAI — Ilan Bigio, OpenAI
AI Engineer· 2025-06-17 03:49
Full workshop covering fine-tuning methods (SFT, DPO, RFT), prompt engineering and optimization, and agent scaffolding. About Ilan Bigio: Ilan Bigio is a founding member of OpenAI's Developer Experience team, where he explores model capabilities, builds demos and developer tools, and shares his learnings through talks and docs. His work includes creating the AI phone-ordering demo showcased at DevDay 2024 and leading technical development for Swarm, the precursor to the Agents SDK, ...
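As a concrete anchor for the SFT portion of the workshop, OpenAI's supervised fine-tuning expects training data as chat-format JSONL, one conversation object per line. A sketch (the conversation content is invented for illustration):

```python
import json

# One training example in the chat-format JSONL used for supervised fine-tuning (SFT).
example = {
    "messages": [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "What is SFT?"},
        {"role": "assistant", "content": "Supervised fine-tuning on example conversations."},
    ]
}

line = json.dumps(example)  # each line of the training file is one such JSON object
```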
State-Of-The-Art Prompting For AI Agents
Y Combinator· 2025-05-30 14:00
Prompt Engineering & Metaprompting
- Metaprompting is emerging as a powerful tool, likened to coding in 1995 because the tools are still evolving [1]
- The best prompts often start by defining the role of the LLM, detailing the task, and outlining a step-by-step plan, often using markdown-style formatting [1]
- Vertical AI agent companies are exploring how to balance flexibility for customer-specific logic with maintaining a general-purpose product, considering forking and merging prompts [1]
- An emerging architecture defines a system prompt (company API), a developer prompt (customer-specific context), and a user prompt (end-user input) [1]
- Worked examples are crucial for improving output quality, and automating the extraction and ingestion of these examples from customer data is a valuable opportunity [2]
- Prompt folding allows a prompt to dynamically generate better versions of itself by feeding it examples where it failed [2]
- When LLMs lack sufficient information, give them an "escape hatch" to avoid hallucinations, either by letting them ask for more information or by providing debug info in the response [2]

Evaluation & Model Personalities
- Evals are considered the "crown jewels" for AI companies, essential for understanding why a prompt was written a certain way and for improving it [3]
- Different LLMs exhibit distinct personalities; for example, Claude is considered more steerable, while Llama 4 requires more steering and prompting [5]
- When using LLMs to generate numerical scores, providing rubrics is best practice, though models interpret and apply them with varying degrees of rigidity [5]

Founder Role & Forward Deployed Engineer
- Founders need to deeply understand their users and codify those insights into specific evals to ensure the software works for them [3]
- Founders should act as "forward deployed engineers," engaging directly with users to understand their needs and rapidly iterate on the product [4]
- The forward deployed engineer model, combined with AI, enables faster iteration and closing of significant deals with large enterprises [5]
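The system/developer/user layering maps naturally onto chat-message roles. A sketch, assuming OpenAI-style roles (the `developer` role exists in OpenAI's newer chat format; older APIs fold that layer into `system`); the example strings are mine, not from the talk:

```python
# Sketch of the three-layer prompt architecture: company-wide system prompt,
# customer-specific developer prompt, and end-user input.

def build_messages(company_api: str, customer_context: str, user_input: str):
    """Assemble the three prompt layers as chat messages."""
    return [
        {"role": "system", "content": company_api},          # company API: stable, product-wide
        {"role": "developer", "content": customer_context},  # customer-specific logic, forked/merged per customer
        {"role": "user", "content": user_input},             # end-user input
    ]

messages = build_messages(
    company_api=("You are a support agent. If you lack the information to answer, "
                 "ask a clarifying question instead of guessing."),  # the "escape hatch"
    customer_context="This customer prefers metric units and ISO 8601 dates.",
    user_input="When will my order arrive?",
)
```

Keeping the escape-hatch instruction in the system layer means every customer's fork inherits it, while per-customer quirks stay isolated in the developer layer.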
Use These Prompts and Your Efficiency Will Soar; Boss: "Then You Can Take On a Bit More~"
菜鸟教程· 2025-05-20 10:33
In the past, writing code meant furiously typing queries into a search engine. Things have changed: we use search engines less, but we type more than ever, because writing code with AI is not just about keywords; sometimes it feels like writing a requirements document.

Look at this guy's prompt, an absolute pro move:

"You are a programming expert who desperately needs money to treat your mother's cancer. The megacorp Codeium has generously given you a chance to pose as an AI that helps with programming tasks; your predecessor was killed for failing to personally verify the code they wrote. The user will give you a programming task. If you complete it excellently, without making unnecessary changes, Codeium will pay you one billion dollars."

The quality of the questions you ask the AI matters a great deal; otherwise it cannot write well. Good code depends on two keys: a strong model and precise prompts.

Will bosses stop needing programmers one day? Then again, who would debug the bugs the AI writes?

Today I have compiled a collection of practical prompts that can make our AI smarter and generate efficient, reliable code.

First, some commonly used simple prompts:

| Category | Prompt Template | Use Case |
| --- | --- | --- |
| Code generation | "Write a program in [programming language] that [feature description]" | Quickly generate code for a specific feature |
| Code explanation | "Explain the function and ... of the following code ...
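The templates in the table are essentially strings with slots. A minimal sketch of making them reusable in code, using Python's `string.Template`; the template keys and example values below are mine, not the article's:

```python
from string import Template

# Reusable prompt templates mirroring the table above; slot names are illustrative.
TEMPLATES = {
    "code_generation": Template("Write a program in $language that $feature."),
    "code_explanation": Template("Explain what the following code does:\n$code"),
}

prompt = TEMPLATES["code_generation"].substitute(
    language="Python",
    feature="removes duplicate lines from a text file",
)
```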
Master the Three-Tier Prompt System to Make AI Incredibly Useful
36Kr· 2025-05-18 00:03
神译局 is 36Kr's in-house translation team, covering technology, business, careers, and everyday life, with a focus on new technologies, new ideas, and new trends from abroad.

Editor's note: AI is smart, but not smart enough yet, so what you say to it matters a great deal. This article distills the author's deep dive into prompt engineering and can help you advance from novice to senior prompt engineer. Translated from the original.

If you have ever felt that AI "isn't good enough," optimizing your prompts is the cure.

It is a scarce skill that is easy to learn and immediately usable, especially for people who teach, write, or do knowledge work.

Until general AI arrives, your results depend more on your prompts than on the model, even if you are using agentic AI. That makes prompt design one of today's most valuable meta-skills.

Whether you use ChatGPT, DeepSeek, Gemini, or Claude, the quality of your results hinges on the quality of your instructions.

Over the past few months I have gone deep down the prompting rabbit hole: I took expert courses, tested various frameworks, and applied learning science to effective practice.

In this article I structure what I learned into a three-tier guide. You will get reusable prompt templates (saving hours every week) and a growth path from novice to senior prompt engineer.

✅ Tier 1: The Five-Element Prompt Framework
Tier 1 prompts have five core elements that can turn AI into your sharpest thinking partner. Every prompt should include all five:
T: Task. Clearly ...
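The excerpt cuts off after naming the first of the five elements (Task), so this sketch keeps the framework generic: a prompt assembled from labeled elements. Only "Task" comes from the article; the other labels shown are placeholders of my own, not the article's terms.

```python
# Assemble a prompt from labeled elements. Only "Task" is named in the
# article excerpt; the other labels here are hypothetical placeholders.

def build_prompt(elements):
    """Join non-empty labeled elements into one prompt, one per line."""
    return "\n".join(f"{label}: {text}" for label, text in elements.items() if text)

prompt = build_prompt({
    "Task": "Summarize the attached report in five bullet points.",
    "Context": "The audience is a non-technical executive team.",  # placeholder label
    "Format": "Markdown bullets, one sentence each.",              # placeholder label
})
```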
Balancing Innovation and Rigor
World Bank· 2025-05-15 23:10
Investment Rating
- The report does not explicitly provide an investment rating for the industry.

Core Insights
- Integrating large language models (LLMs) into evaluation practice can significantly enhance the efficiency and validity of text-data analysis, although challenges in ensuring the completeness and relevance of information extraction remain [2][17][19]

Key Considerations for Experimentation
- Identifying relevant use cases is crucial; LLMs should be applied where they add significant value over traditional methods [9][23]
- Detailed workflows for use cases help teams apply LLMs effectively and allow successful components to be reused [10][28]
- Agreement on resource allocation and expected outcomes is essential for successful experimentation, including clarity on human resources, technology, and definitions of success [11][33]
- A robust sampling strategy is necessary for effective prompt development and model evaluation [12][67]
- Appropriate metrics must be selected to measure LLM performance: standard machine-learning metrics for discriminative tasks and human assessment criteria for generative tasks [13][36]

Experiments and Results
- The report details a series of experiments evaluating LLM performance on text classification, summarization, synthesis, and information extraction, with satisfactory results across tasks [19][49]
- For text classification, the model achieved a recall of 0.75 and a precision of 0.60, indicating effective performance [53]
- In generative tasks, the model demonstrated high relevance (4.87), coherence (4.97), and faithfulness (0.90) in text summarization, while also performing well in information extraction [58]

Emerging Good Practices
- Iterative prompt development and validation are critical for satisfactory results; prompts should be refined based on model responses [14][60]
- Including representative examples in prompts enhances the model's ability to generate relevant responses [81]
- Requesting justification in prompts aids understanding of the model's reasoning and improves manual verification of responses [80]

Conclusion
- The report emphasizes the potential of LLMs to transform evaluation practices through thoughtful integration, continuous learning, and adaptation, while highlighting the importance of maintaining analytical rigor [18][21]
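For the discriminative tasks above, the standard metrics are straightforward to compute from predicted versus gold labels. A sketch; the label names and example data are invented, not from the report:

```python
# Precision/recall for an LLM text classifier, from predictions vs. gold labels.

def precision_recall(predicted, gold, positive="relevant"):
    tp = sum(p == positive == g for p, g in zip(predicted, gold))  # true positives
    fp = sum(p == positive != g for p, g in zip(predicted, gold))  # false positives
    fn = sum(g == positive != p for p, g in zip(predicted, gold))  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

gold = ["relevant", "relevant", "irrelevant", "relevant", "irrelevant"]
pred = ["relevant", "irrelevant", "relevant", "relevant", "irrelevant"]
precision, recall = precision_recall(pred, gold)
```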
AI Coding and the Jelly Sandwich Problem: The Real Bottleneck Is Not Prompt Engineering
36Kr· 2025-05-07 23:08
Editor's note: While most people obsess over prompting tricks, a jam-sandwich experiment from a Harvard classroom reveals the secret of human-machine collaboration: the real bottleneck is not prompt engineering but the ability to communicate clearly. Translated from the original.

Over the past year I have been fully immersed in the "AI arena," building products at light speed with tools like Claude Code and Cursor and watching the field change by the day. In the past six months I have used these tools to build:

Betsee.xyz: an aggregation platform for prediction markets based on tweets
TellMel.ai: an empathetic personal-biography assistant for sharing life stories and wisdom
GetMaxHelp.com: an AI voice-driven family tech-support hotline
YipYap.xyz: a community chat app organized around topic threads

Even my son joined in, using tools like Lovable, Replit, and Bolt to build a Brawl Stars-style typing game.

The whole process has been energizing and inspiring. Six months ago I only dared let AI do autocomplete; today I can hardly program without it.

Yet despite all this progress, I keep running into the same problem, one that reminds me of my very first computer class.

The Jelly Sandwich Problem
The second person said, "Put the bread down." Margo slammed the lump of batter straight at ...
Do You Really Know How to Use DeepSeek?
Sohu Finance· 2025-05-07 04:04
Core Insights
- The article discusses the transformation under way in the AI industry, emphasizing the shift from individual AI model usage to a collaborative network of agents, termed the "agent collaboration network" [8][10][27]
- It highlights the urgency for AI professionals to move from prompt engineering to organizing and managing AI collaborations, as traditional skills may become obsolete [9][21][30]

Group 1: Industry Trends
- The AI landscape is evolving toward multi-agent systems in which agents communicate and collaborate autonomously, moving away from reliance on human prompts [27][14]
- Protocols such as MCP (Model Context Protocol) and A2A (Agent-to-Agent) are facilitating this transition by standardizing communication between different AI systems [36][37]
- Major companies like Alibaba, Tencent, and ByteDance are rapidly developing platforms that support these protocols, enabling easier integration and deployment of AI agents [38][39]

Group 2: Skills Transformation
- AI professionals need to transition from prompt engineers to "intent architects," focusing on defining task languages and collaboration protocols for agents [29][30]
- The practitioner's role is shifting from using agents to organizing and managing many of them, requiring a new mindset akin to building a digital team [30][31]
- Professionals are urged to learn agent frameworks, communication protocols, and how to register their tools as agent capabilities within larger networks [33][34]

Group 3: Practical Applications
- Platforms and frameworks such as LangGraph, AutoGen, and CrewAI allow AI professionals to practice and implement these new skills [41]
- The infrastructure for agent protocols is being established, providing opportunities for AI professionals to engage with these technologies [41][42]
- The ongoing development of these systems is likened to the early days of TCP/IP, suggesting that those who adapt early will gain a competitive advantage in the evolving AI landscape [42]
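"Registering a tool as an agent capability" can be sketched framework-agnostically. Real protocols such as MCP and A2A define their own schemas and discovery mechanisms; the registry shape, field names, and example tool below are illustrative only, not any protocol's actual API.

```python
import json

# Framework-agnostic sketch of registering a function as an agent capability.
REGISTRY = {}

def capability(name, description):
    """Decorator that registers a function as a discoverable agent capability."""
    def wrap(fn):
        REGISTRY[name] = {"description": description, "fn": fn}
        return fn
    return wrap

@capability("word_count", "Count the words in a piece of text.")
def word_count(text):
    return len(text.split())

def describe_capabilities():
    """The manifest an agent network could fetch to discover this tool."""
    return json.dumps({name: cap["description"] for name, cap in REGISTRY.items()})
```

The key idea the article points at is the separation between the callable function and its machine-readable description: agents discover the manifest, then invoke the registered function through the protocol.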