Vibes won't cut it — Chris Kelly, Augment Code
AI Engineer· 2025-08-03 04:32
AI Coding Impact on Software Engineering
- The speaker believes predictions of massive software engineering job losses due to AI coding are likely wrong, not because AI coding is unimportant, but because those making the predictions have not worked on production systems recently [2]
- AI generating 30% of the code in a very large codebase may be less impactful than it sounds, because existing architectural constraints limit what that code can change [3]
- Software engineers will still be needed to fix, examine, and understand the nuances of code in complex systems, even with AI assistance [6]
- The speaker draws a parallel to the DevOps transformation: AI will abstract work rather than eliminate jobs, much as tractors changed farming without ending it [7]

Production Software Considerations
- Production code requires "four nines" availability while handling thousands of users and gigabytes of data, which "vibe coding" (shipping AI-generated code without examining it) cannot achieve [10]
- Code is an artifact of software development, not the job itself; the job is making decisions about software architecture and dependencies [11]
- The best code is no code, since every line of code introduces maintenance and debugging burdens [12]
- AI's text-generation ability does not equate to the decision-making required for architectural choices such as monolith vs. microservices [15]
- Changing software safely, while preserving functionality, security, and data integrity, is the core job of a software engineer [18]

AI Adoption and Best Practices
- Professional software engineers are adopting AI more slowly than they adopted previous technological shifts [20]
- The speaker suggests documenting standards, practices, and reproducible environments to make AI code generation more effective [22][23]
- Code review is a critical skill, especially with AI-generated code, but current code review tools are inadequate [27][28]
- The speaker advises distrusting AI's human-like communication, as it may generate text that does not accurately reflect its actions [32]
- The speaker recommends a "create, refine" loop for AI-assisted coding: create a plan, have the AI generate code, then refine it [35][36][37]
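The "four nines" requirement mentioned above translates into a concrete downtime budget. A quick stdlib-only Python sketch of that arithmetic (the availability definitions are the standard ones, not figures from the talk):

```python
# Downtime budget implied by an availability target ("the nines").
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600

def downtime_minutes_per_year(availability: float) -> float:
    """Minutes of allowed downtime per year at a given availability."""
    return MINUTES_PER_YEAR * (1 - availability)

for label, availability in [("two nines", 0.99),
                            ("three nines", 0.999),
                            ("four nines", 0.9999)]:
    print(f"{label}: {downtime_minutes_per_year(availability):.1f} min/year")
```

Four nines leaves roughly 52.6 minutes of downtime per year, which is why unexamined AI-generated code is a hard sell for production systems.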
Building Agents at Cloud Scale — Antje Barth, AWS
AI Engineer· 2025-08-02 18:15
Let's explore practical strategies for building and scaling agents in production. Discover how to move from local MCP implementations to cloud-scale architectures and how engineering teams leverage these patterns to develop sophisticated agent systems. Expect a mix of demos, use case discussions, and a glimpse into the future of agentic services! About Antje Barth Antje Barth is a Principal Developer Advocate at AWS, based in San Francisco. She frequently speaks at AI engineering conferences, events, and me ...
State of Startups and AI 2025 - Sarah Guo, Conviction
AI Engineer· 2025-08-02 16:45
Event Information
- The content originates from the AI Engineer World's Fair in San Francisco [1]
- Industry professionals can stay informed about future events and content by subscribing to the newsletter [1]
Useful General Intelligence — Danielle Perszyk, Amazon AGI
AI Engineer· 2025-08-02 13:15
AI Agent Challenges & Solutions
- AI agents currently struggle with basic computer tasks like clicking, typing, and scrolling [1]
- Amazon AGI SF Lab aims to build general-purpose AI agents capable of performing any computer task a human can [1]
- The lab proposes a new approach to agents called Useful General Intelligence [1]

Amazon AGI SF Lab's Approach
- The lab is focused on solving the challenges of computer use for AI agents [1]
- Developers can access the lab's technology in its early developmental stages [1]
- Nova Act, the lab's agentic model and SDK, is being used by developers to build real workflows [1]

Personnel & Context
- Danielle Perszyk, a cognitive scientist at Amazon AGI SF Lab, presented this information at the AI Engineer World's Fair in San Francisco [1]
- She previously worked at Google and Adept [1]
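The click/type/scroll tasks agents struggle with follow a common observe-act loop. A hypothetical Python sketch of that loop (this is not the Nova Act API; the `Action` type, the canned policy, and the goal string are all illustrative stand-ins):

```python
# Hypothetical computer-use agent loop: the model proposes primitive UI
# actions (click / type / scroll) until it declares the task done.
from dataclasses import dataclass, field

@dataclass
class Action:
    kind: str          # "click" | "type" | "scroll" | "done"
    target: str = ""   # element description, if any
    text: str = ""     # text to type, if any

def propose_action(screenshot: str, goal: str, step: int) -> Action:
    """Stand-in for a model call; a real agent would send the screenshot
    and goal to a vision-language model and parse its chosen action."""
    script = [Action("click", target="search box"),
              Action("type", text=goal),
              Action("done")]
    return script[min(step, len(script) - 1)]

def run_agent(goal: str, max_steps: int = 10) -> list[Action]:
    history = []
    for step in range(max_steps):
        action = propose_action(screenshot="<pixels>", goal=goal, step=step)
        history.append(action)
        if action.kind == "done":
            break
        # a real harness would execute the click/type/scroll on the OS here
    return history

trace = run_agent("weather in San Francisco")
print([a.kind for a in trace])  # ['click', 'type', 'done']
```

The hard part, per the talk, is not the loop itself but making each proposed action reliable across arbitrary real-world interfaces.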
The 2025 AI Engineering Report — Barr Yaron, Amplify
AI Engineer· 2025-08-01 22:51
AI Engineering Landscape
- The AI engineering community is broad, technical, and growing, and the "AI Engineer" title is expected to gain more ground [5]
- Many seasoned software developers are AI newcomers: nearly half of those with 10+ years of experience have worked with AI for three years or less [7]

LLM Usage and Customization
- Over half of respondents use LLMs for both internal and external use cases, with OpenAI models dominating external, customer-facing applications [8]
- LLM users leverage them across multiple use cases: 94% use them for at least two and 82% for at least three [9]
- Retrieval-Augmented Generation (RAG) is the most popular customization method, used by 70% of respondents [10]
- Among those who fine-tune, parameter-efficient methods like LoRA/Q-LoRA are strongly preferred, mentioned by 40% [12]

Model and Prompt Management
- Over 50% of respondents update their models at least monthly, and 17% do so weekly [14]
- 70% of respondents update prompts at least monthly, and 10% do so daily [14]
- A significant 31% of respondents lack any system for managing their prompts [15]

Multimodal AI and Agents
- Image, video, and audio usage lag text usage significantly, indicating a "multimodal production gap" [16][17]
- Audio has the highest adoption intent among non-users, with 37% planning to eventually adopt it [18]
- While 80% of respondents say LLMs are working well, fewer than 20% say the same about agents [20]

Monitoring and Evaluation
- Most respondents use multiple methods to monitor their AI systems: 60% use standard observability and over 50% rely on offline evaluation [22]
- Human review remains the most popular method for evaluating model and system accuracy and quality [23]
- 65% of respondents use a dedicated vector database [24]

Industry Outlook
- The mean guess for the percentage of the US Gen Z population that will have AI girlfriends/boyfriends is 26% [27]
- Evaluation is the number one most painful part of AI engineering today [28]
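RAG's popularity in the survey comes down to a simple pattern: retrieve the most relevant documents, then ground the prompt in them. A minimal stdlib-only Python sketch (the word-overlap scoring, prompt format, and sample documents are illustrative assumptions, not from the report):

```python
# Minimal retrieval-augmented generation (RAG) sketch: rank documents by
# word-overlap similarity with the query, then build a grounded prompt.
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    q = Counter(query.lower().split())
    ranked = sorted(docs, key=lambda d: cosine(q, Counter(d.lower().split())),
                    reverse=True)
    return ranked[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = ["The H100 is an NVIDIA datacenter GPU.",
        "RAG retrieves documents before generation.",
        "LoRA fine-tunes a small set of adapter weights."]
print(build_prompt("what does RAG retrieve", docs))
```

Production systems replace the word-overlap scoring with embedding vectors in a vector database, which 65% of the survey's respondents already use.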
Agents vs Workflows: Why Not Both? — Sam Bhagwat, Mastra.ai
AI Engineer· 2025-08-01 16:00
AI Agents and Workflows Debate
- The industry is engaged in a debate over the roles and effectiveness of AI agents versus workflows, sparked by differing viewpoints from Anthropic and OpenAI [1][2][3][4]
- The industry should avoid dogmatic approaches, recognizing that there is no single "right" way to develop AI systems [5][6][7][8]
- Overly complex APIs (such as those leaning heavily on graph theory) deserve caution, since they can hinder readability and team collaboration [9][10][11][12][13]

Design Patterns for AI Systems
- The industry needs a commonly accepted vocabulary and glossary for agentic patterns and agentic workflow patterns [14][15]
- Agents can be viewed as turn-based systems, while workflows are akin to rules engines managing dependencies [16][17][18]
- Workflows are gaining popularity in AI engineering because of the need to trace and manage non-determinism, which matters more in AI than in traditional software engineering [19][20]
- Balancing power and control is a key trade-off in designing AI systems; starting with powerful models and adding control where needed is a viable strategy [21][22]

Composition and Implementation
- Agents and workflows compose in both directions: agents can be steps in workflows, and workflows can be tools for agents [23][24][25]
- The agent supervisor model has an orchestrator agent calling other agents as tools [25]
- Dynamic tool injection, giving an agent only the limited, relevant set of tools for a specific task, can improve performance [26]
- Nested workflows, where a workflow is a step within another workflow, are also valuable [26][27]
- Practical experience and community knowledge currently matter more than theoretical correctness in this rapidly evolving field [28][29]
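The two-way composition described above can be sketched in a few lines: a workflow is a fixed step sequence, an agent is a turn-based loop over tools, and each can wrap the other. A minimal Python sketch (the interfaces and toy policy are illustrative, not from Mastra.ai or any specific framework):

```python
# Agent/workflow composition sketch: a workflow is a deterministic pipeline
# of steps; an agent is a turn-based loop that picks tools. Each can serve
# as a step or tool inside the other.
from typing import Callable

Step = Callable[[str], str]

def workflow(steps: list[Step]) -> Step:
    """Deterministic pipeline: run steps in order, threading the payload."""
    def run(payload: str) -> str:
        for step in steps:
            payload = step(payload)
        return payload
    return run

def agent(tools: dict[str, Step], max_turns: int = 3) -> Step:
    """Turn-based loop with a toy policy that tries each tool once.
    A real agent would let an LLM choose the tool and arguments per turn."""
    def run(payload: str) -> str:
        for name in list(tools)[:max_turns]:
            payload = tools[name](payload)
        return payload
    return run

clean = lambda s: s.strip().lower()
shout = lambda s: s + "!"

# A workflow used as a tool inside an agent, and that agent used as a
# step inside an outer workflow:
inner = workflow([clean, shout])
pipeline = workflow([agent({"normalize": inner}), shout])
print(pipeline("  Hello "))  # hello!!
```

The supervisor pattern from the talk is the same idea one level up: the orchestrator's tool dictionary contains other agents rather than plain functions.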
Why We Don’t Need More Data Centers - Dr. Jasper Zhang, Hyperbolic
AI Engineer· 2025-08-01 15:00
Market Trend & Problem Statement
- AI will merge with everything in the future, and demand for GPUs and data centers is exploding [4]
- By 2030, four times as many data centers will need to be built, four times faster than today [5]
- In the US alone, the data center supply gap will exceed 15 gigawatts by 2030 [8]
- Enterprise and company GPUs sit idle 80% of the time [9]
- Building data centers is challenging: costs are high (the first Stargate data center cost over $1 billion) and grid interconnection is slow (up to a 7-year wait to connect a 100 MW facility to the grid in Northern Virginia) [6][7]
- GPUs and data centers consume 4% of total US electricity, with poor environmental sustainability and large CO2 emissions [8]

Proposed Solution & Hyperbolic's Approach
- The industry needs a GPU marketplace or aggregation layer that pools different data centers and GPU providers to solve GPU users' problems [10]
- Hyperbolic is building a global orchestration layer called HyperDOS (Hyperbolic Distributed Operating System), similar to Kubernetes, that lets any cluster join the network after installing the software [11]
- Users can rent GPUs in multiple ways: spot instances, on-demand, long-term reservations, or managed models [11]
- On Hyperbolic's GPU marketplace an H100 costs $0.99 per hour, versus $11 for Google's on-demand GPUs [13]
- A unified distribution channel can drive prices down substantially [13][14]
- Hyperbolic is building a unified platform so startups and companies no longer need to vet individual data centers; they simply pick the highest-rated or best-priced one, with GPU performance benchmarked for them [16]

Benefits & Cost Savings
- A GPU marketplace can save 50% to 75% of costs [13]
- With Hyperbolic, costs can drop from $43.8 million to $6.9 million, a 6x saving [19]
- With more compute for the same budget, model quality improves; productivity can rise 6x at the same spend [20]
- Selling idle GPU time to others helps them obtain cheaper GPUs [20]

Future Vision
- The GPU marketplace will grow into an all-in-one platform for different AI workloads, including AI inference (online and offline) and training jobs [21]
- Rather than focusing only on building data centers, which consume enormous energy and land, the industry should better reuse and recycle idle compute [21]
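The cost claims above are straightforward rate arithmetic. A short Python sketch checking the quoted figures against each other (only the dollar amounts come from the talk; the derived ratios are computed here):

```python
# Sanity-check the GPU marketplace cost figures quoted in the talk.
hourly_marketplace = 0.99   # $/hr, H100 on Hyperbolic's marketplace
hourly_on_demand = 11.00    # $/hr, Google's on-demand GPU price quoted

price_ratio = hourly_on_demand / hourly_marketplace
print(f"hourly price ratio: {price_ratio:.1f}x")

baseline, hyperbolic = 43.8e6, 6.9e6            # $ figures per the talk
savings_multiple = baseline / hyperbolic
savings_fraction = 1 - hyperbolic / baseline
print(f"quoted savings multiple: {savings_multiple:.1f}x")
print(f"quoted savings fraction: {savings_fraction:.0%}")
```

The raw hourly ratio (about 11x) exceeds the quoted end-to-end savings (about 6x, or 84%), consistent with real workloads carrying costs beyond the GPU-hour rate alone.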
Flipping the Inference Stack — Robert Wachen, Etched
AI Engineer· 2025-08-01 14:30
Scalability Challenges in AI Inference
- Current AI inference systems rely on brute-force scaling, adding more GPUs per user, leading to unsustainable compute demands and spiraling costs [1]
- Real-time use cases are bottlenecked by per-user latency and cost [1]

Proposed Solution
- Rethinking hardware is the only way to unlock real-time AI at scale [1]

Key Argument
- The current approach to inference is not scalable [1]
Infrastructure for the Singularity — Jesse Han, Morph
AI Engineer· 2025-08-01 14:30
AI Agent Transition
- AI agents are transitioning from experimental tools to practical coworkers [1]
- This transition demands new infrastructure for RL training, test-time scaling, and deployment [1]

Morph Labs' Innovation
- Morph Labs developed Infinibranch to address the infrastructure needs of AI agents [1]
- Morph Labs is building the infrastructure for the singularity [1]
- Infinibranch enables scaling train-time and test-time search for agentic reasoning models [1]

Leadership
- Jesse Han is the Founder and CEO of Morph Labs [1]
- He previously worked at OpenAI on test-time compute scaling, GPT-4, and reasoning [1]
Hacking the Inference Pareto Frontier - Kyle Kranen, NVIDIA
AI Engineer· 2025-08-01 13:45
Challenges in LLM Inference
- LLM inference systems face challenges of latency, cost, and output quality, which affect user experience, profitability, and applicability [1]
- The trade-offs between cost, throughput, latency, and quality define a Pareto frontier that limits where LLM systems can succeed [1]

NVIDIA Dynamo and Inference Techniques
- NVIDIA Dynamo, a datacenter-scale distributed inference framework, aims to push out the Pareto frontier of inference systems [1]
- Techniques employed include disaggregation (separating LLM generation phases), speculation (predicting multiple tokens per cycle), KV routing, storage, and manipulation (avoiding redundant work), and pipelining improvements for agents (accelerating workflows) [1]

Key Inference Optimization Strategies
- Disaggregation enhances efficiency by separating the phases of LLM generation [1]
- Speculation predicts multiple tokens per cycle to improve throughput [1]
- KV routing, storage, and manipulation prevent redoing work, optimizing resource utilization [1]
- Pipelining improvements for agents accelerate workflows by leveraging agent information [1]
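The speculation technique listed above (speculative decoding) has a simple core: a cheap draft model proposes several tokens, and the expensive target model verifies them in one pass, keeping the longest agreeing prefix. A toy deterministic Python sketch (real systems sample and use an acceptance rule that preserves the target distribution; the canned token lists are illustrative):

```python
# Speculative decoding sketch: draft proposes k tokens, target verifies
# them in order, and the longest agreeing prefix is accepted, so multiple
# tokens can land per expensive target-model step.
def draft_model(prefix: list[str], k: int) -> list[str]:
    canned = ["the", "cat", "sat", "on", "a", "mat"]   # cheap, imperfect
    i = len(prefix)
    return canned[i:i + k]

def target_model(prefix: list[str]) -> str:
    canned = ["the", "cat", "sat", "on", "the", "mat"]  # ground truth here
    return canned[len(prefix)] if len(prefix) < len(canned) else "<eos>"

def speculative_step(prefix: list[str], k: int = 4) -> list[str]:
    proposal = draft_model(prefix, k)
    accepted = []
    for tok in proposal:                      # verify draft tokens in order
        if target_model(prefix + accepted) == tok:
            accepted.append(tok)
        else:
            break                             # first disagreement stops us
    # always emit one target token past the accepted prefix
    accepted.append(target_model(prefix + accepted))
    return prefix + accepted

print(speculative_step([]))  # five tokens from one verification pass
```

Here all four draft tokens agree with the target, so one step emits five tokens instead of one; when the draft diverges (as it would at "a" vs. "the"), the loop stops and the target's correction is emitted instead.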