Workflow
AI Engineer
icon
Search documents
Google Photos Magic Editor: GenAI Under the Hood of a Billion-User App - Kelvin Ma, Google Photos
AI Engineer· 2025-07-19 19:00
Technology & Engineering - Google Photos' Magic Editor integrates complex CV and generative AI models into a seamless mobile experience [1] - The focus is on optimizing massive models for latency and size [1] - Crucial interplay exists with graphics rendering (OpenGL/Halide) [1] - The process involves turning research concepts into polished features for practical use [1] Product Development - The aim is to build tools that improve users' lives through greater expression, skill-building, and communication [1] Personnel - Kelvin Ma, a product engineer with 15 years of experience, is involved in developing innovative consumer applications used by millions [1]
General Intelligence is Multimodal — Keegan McCallum, Luma AI
AI Engineer· 2025-07-19 17:45
Company Overview - Luma AI 的使命是发展先进的多模态模型 [1] - Luma AI 拥有一支由研究人员和工程师组成的团队,致力于实现非传统的多模态 AGI 路径 [1] Leadership & Expertise - Keegan McCallum 是 Luma AI 的 ML 基础设施负责人,拥有在多家创业公司和工程领导岗位的经验 [1] - Keegan McCallum 的背景包括投资组合优化研究 [1] Event & Community Engagement - Keegan McCallum 在旧金山举行的 AI Engineer World's Fair 上分享了见解 [1] - Luma AI 通过时事通讯与社区保持联系 [1]
ComfyUI Full Workshop — first workshop from ComfyAnonymous himself!
AI Engineer· 2025-07-19 16:30
Overview - ComfyUI 的快速介绍以及最新内容,包括问答环节 [1] - 该内容在旧金山 AI 工程师世界博览会上录制 [1] Community Engagement - 通过加入时事通讯,及时了解即将举行的活动和内容 [1]
Design like Karpathy is watching - Zeke Sikelianos, Replicate
AI Engineer· 2025-07-19 16:15
Legendary AI engineer and educator Andrej Karpathy recently blogged about his experiences building, deploying, and monetizing a vibe-coded web app called MenuGen. Let's dig into the challenges he faced and learn what we as AI designers can do to make life better for the Andrejs of the world. About Zeke Sikelianos Zeke's been building developer tools at companies like Heroku, npm, GitHub, and Replicate for over ten years. He cares deeply about simple and tasteful developer experiences, and thinks the world o ...
Good Demos Are Important — Sharif Shameem, Lexica
AI Engineer· 2025-07-19 16:00
[Music] All right. Hey everyone. Uh, my name is Sharief. I'll be talking to you about demos and why I think demos are probably the most important thing in the world right now. Um, I'm the founder of Lexica. We're working on generative models, specifically image models. Um, but I kind of want to just talk to you about something a bit more than just models themselves. Um, even more than demos. I kind of just want to talk to you about curiosity. Um, there was a famous French mathematician Poare. He said at the ...
Real world MCPs in GitHub Copilot Agent Mode — Jon Peck, Microsoft
AI Engineer· 2025-07-19 07:00
AI Development Capabilities - The industry is focusing on bringing AI development capabilities through Copilot, starting with code completion and moving towards chat interactions for complex prompts and multi-file changes [1] - Agent mode enables complete task execution with deep interaction, allowing for building apps or refactoring large codebases [2] - Agent mode can interpret readme files, including project structure, environment variable configurations, database schemas, API endpoints, and workflow graphs (even as images), to implement tasks [3][4][5] Model Context Protocol (MCP) - MCP is an open protocol (API for AI) that allows LLMs to connect to external data sources for general or account-specific information [9] - VS Code can be configured to use specific MCPs, allowing Copilot to select the appropriate MCP for a task and connect to it, whether local or remote [11][12] - Developers need to grant permission for Copilot to connect to MCPs, ensuring data access is controlled [20] - GitHub has its own MCP server, enabling actions like committing changes to a new branch and creating pull requests directly from the IDE [26][31] Workflow and Best Practices - Copilot Instructions, a specially named file, can be used to pre-inject standards and practices into every prompt, such as code style guidelines and security checks [28][29][30] - Including a change log of everything the agent has done provides a clear record of each step taken [30]
Brian Balfour: The #1 Question Every AI Product Manager Must Answer
AI Engineer· 2025-07-18 19:00
Core Strategy - The key to success is defining what to build and why it will win in the market [1] - Competitive advantage stems from unique data, functionality, and understanding of unmet customer needs, not just the AI itself [2] - To succeed with AI, identify unmet customer problems, determine how AI can solve them in novel ways, and leverage proprietary data to power solutions [2] Key Questions - What are the unmet customer problems [2] - What AI capabilities can solve those problems in novel ways [2] - What proprietary data can power those solutions [2]
The rise of the agentic economy on the shoulders of MCP — Jan Curn, Apify
AI Engineer· 2025-07-18 18:59
Agentic Economy & MCP Standard - The agentic economy is emerging, where AI agents can interact, find counterparts, and purchase services from other agents, businesses, or tools [4] - MCP (Message Communication Protocol) is becoming a standard for agentic interaction, dominating the space compared to Open API and Google's A2A [8][9] - Tool discovery, a key feature of MCP, allows agents to dynamically discover and use tools based on the workflow, differentiating it from Open API [7][8] - A centralized marketplace of MCP services, like APIFY, can provide access to various services with a single API token, enabling rapid scaling of the ecosystem [12] APIFY's Role & Marketplace - APIFY is a marketplace of 5,000 tools (actors), primarily data extraction tools, with a community of creators who monetize their tools [4] - Actors are self-contained software units with defined input and output, facilitating easy integration with other systems [4][5] - APIFY has integrations with workflow automation tools and MCP, enabling AI agents to call actors from the marketplace [6][7] - APIFY enables publishing and monetization of tools or agents, providing access to a broad ecosystem of developers and visibility [23][24] Challenges & Future - Agents currently rely on human developers for access to tools and services, hindering their ability to autonomously find and purchase services [10][11] - Trust between agents and tools is a key open question, as is the overall value and reliability of autonomous tool discovery [25][26][27] - The company paid out over $4 million to creators last month, with actors generating over $500,000 per month, indicating rapid ecosystem growth [23]
Full Spec MCP: Hidden Capibilities — Harald Kirschner, Microsoft/VSCode
AI Engineer· 2025-07-18 18:42
MCP Ecosystem & Specification - The Model Context Protocol (MCP) ecosystem is still in its early stages, with significant room for growth and development [2][3] - The industry emphasizes the importance of adopting the full MCP specification to unlock rich, stateful interactions between agents [9] - The industry acknowledges a gap in MCP implementation, with a tendency to treat it as just another API wrapper [5] - Technical barriers, including missing support in clients, SDKs, documentation, and references, contribute to the limited adoption of the full MCP spec [6] - The industry highlights the need for developers to stay updated with the latest MCP specification and provide feedback on draft features [29] Tools & Dynamic Discovery - Tools are the most immediately successful aspect of MCP, but overuse can lead to quality problems and AI confusion [7][11][12] - Dynamic tool discovery allows servers to provide context-aware tools, enhancing the user experience [16][17][18] - VS Code offers user controls like per-chat tool selection and user-defined tool sets to manage tool complexity [13][15] Resources & Sampling - Resources provide a semantic layer for exposing files and data to both the LLM and the user, enabling more dynamic and stateful interactions [19][20] - Sampling allows servers to request LLM completions from the client, enabling progressive enhancement and interesting functionalities [22][23][24] Developer Experience & Community - The industry recognizes the need for improved developer experience when working on MCP servers, including debugging and logging [26] - VS Code offers a dev mode with debugging capabilities for MCP servers, simplifying the development process [26][27][28] - A community registry is being developed to facilitate the discovery of MCP servers [32]
Shipping an Enterprise Voice AI Agent in 100 Days - Peter Bar, Intercom Fin
AI Engineer· 2025-07-18 16:00
What does it take to go from blank page to live enterprise voice agent in 100 days? That’s the challenge we took on with Fin Voice at Intercom. Enterprise customer service demands high-quality, reliable voice interactions - but delivering that fast means wrestling with tough problems like latency, hallucinations, voice quality, and answer accuracy. We rapidly evaluated and integrated a full voice stack - including transcription, language model, text-to-speech, retrieval-augmented generation, and telephony - ...