AI Engineer

Building Effective Voice Agents — Toki Sherbakov + Anoop Kotha, OpenAI
AI Engineer· 2025-07-20 16:30
Overview - The document discusses building production voice applications [1] - It shares learnings from working with customers in the voice application domain [1] Authorship - The content is associated with tokisherbakov (Twitter handle) and akotha7 (LinkedIn profile) [1]
What every AI engineer needs to know about GPUs — Charles Frye, Modal
AI Engineer· 2025-07-20 07:00
So what I wanted to talk about today was what every AI engineer needs to know about GPUs. So far, in the last couple of years, most of the things that people have built as AI applications, people who are AI engineers, they've been building on top of model APIs. So they use the OpenAI API, the Anthropic API, the DeepSeek API, and they build an application on top of that. And that goes back to the initial diagram that swyx put out, the "Rise of the AI Engineer" thing ...
Brian Balfour: How Granola Beat Giants Like Zoom & Otter in the AI Note-Taking War
AI Engineer· 2025-07-20 07:00
So, let's take all of this theory and let's put it into practice. Let's talk about a product, Granola. Just by a raise of hands, how many people have either tried or used Granola today? What they realized is that there's actually a whole other set of customer needs that have been unmet, which is: I don't want you to take all of my notes. I just want you to help me take better notes, empower me around this specific task and user. And that's what they built the product around. Now, because the realization is t ...
Robots as professional Chefs - Nikhil Abraham, CloudChef
AI Engineer· 2025-07-20 07:00
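Company Overview - CloudChef is reimagining how cooking is done using embodied AI [1] - CloudChef is building robots that let commercial kitchens cook high-quality meals while addressing the demand for skilled chefs [1] - CloudChef's robots are already working full-time in several leading commercial kitchens [1] Technology and Innovation - CloudChef has turned a bimanual robot into a professional chef that can work in a new kitchen and learn new recipes from a single demonstration [1] Leadership and Background - CloudChef's CEO is Nikhil Abraham, an IIT Bombay alumnus and co-founder of Rephrase AI (acquired by Adobe) [1]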
A Taxonomy for Next-gen Reasoning — Nathan Lambert, Allen Institute (AI2) & Interconnects.ai
AI Engineer· 2025-07-19 21:15
Model Reasoning and Applications - Reasoning unlocks new language model applications, exemplified by improved information retrieval [1] - Reasoning models are enhancing applications like website analysis and code assistance, making them more steerable and user-friendly [1] - Reasoning models are pushing the limits of task completion, requiring ongoing effort to determine what models need to continue progress [1] Planning and Training - Planning is a new frontier for language models, requiring a shift in training approaches beyond just reasoning skills [1][2] - The industry needs to develop research plans to train reasoning models that can work autonomously and have meaningful planning capabilities [1] - Calibration is crucial for products, as models tend to overthink, requiring better management of output tokens relative to problem difficulty [1] - Strategy and abstraction are key subsets of planning, enabling models to choose how to break down problems and utilize tools effectively [1] Reinforcement Learning and Compute - Reinforcement learning with verifiable rewards is a core technique, where language models generate completions and receive feedback to update weights [2] - Parallel compute enhances model robustness and exploration, but doesn't solve every problem, indicating a need for balanced approaches [3] - The industry is moving towards considering post-training as a significant portion of compute, potentially reaching parity with pre-training in GPU hours [3]
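The "verifiable rewards" idea in the summary above can be illustrated with a minimal sketch: the model samples several completions for one prompt, and each completion is scored by a programmatic check against a known reference answer. The `Answer:` marker convention and the function names here are assumptions made for illustration, not something taken from the talk, and the weight update that would consume these scores is omitted.

```python
import re
from typing import List, Optional


def extract_final_answer(completion: str) -> Optional[str]:
    """Return the text after the last 'Answer:' marker, or None if absent.

    The 'Answer:' convention is a hypothetical output format chosen for
    this sketch.
    """
    matches = re.findall(r"Answer:\s*(.+)", completion)
    return matches[-1].strip() if matches else None


def verifiable_reward(completion: str, reference: str) -> float:
    """Score 1.0 when the extracted answer exactly matches the reference."""
    answer = extract_final_answer(completion)
    if answer is not None and answer == reference.strip():
        return 1.0
    return 0.0


def score_group(completions: List[str], reference: str) -> List[float]:
    """Score every sampled completion for one prompt.

    An RL trainer would use these scores as the feedback signal when
    updating the model's weights; that step is not shown here.
    """
    return [verifiable_reward(c, reference) for c in completions]
```

Exact string matching is the simplest possible verifier; real setups typically normalize answers or run unit tests instead.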
How to Train Your Agent: Building Reliable Agents with RL — Kyle Corbitt, OpenPipe
AI Engineer· 2025-07-19 21:12
Core Idea - The presentation discusses a case study on building ART-E, an open-source natural language assistant that answers questions from email inboxes, using reinforcement learning [1][2][3] - The speaker shares lessons learned, what worked and what didn't, and how they built an agent that worked well with reinforcement learning [2] Development Process & Strategy - The speaker recommends getting the best possible performance from prompted models before doing any training, including reinforcement learning, in order to work out bugs in the environment and potentially avoid training altogether [7][8][9] - The company surpassed prompted-model baselines with reinforcement learning, cutting errors by 60% relative to the best prompted model (o3 reached 90% accuracy, while the RL model reached 96%) [10][15] - Training the ART-E model cost approximately $80 in GPU time and one week of an experienced engineer's time [23][24] Key Metrics & Optimization - The company benchmarked cost, accuracy, and latency, finding that the trained model (Qwen 2.5 14B) was significantly cheaper than o3 ($55 per 1,000 searches) and o4-mini ($8 per 1,000 searches) [16][17] - The company improved latency by moving to a smaller model, training the model to take fewer turns, and considering speculative decoding [19][20][21] - The company tuned the reward function to give extra credit for fewer turns and to discourage hallucination, resulting in a significantly lower hallucination rate than prompted models [45][46][49][50] Challenges & Solutions - The two hard problems in using RL are building a realistic environment and getting the reward function right [26][27][28] - The company created a realistic environment using the Enron email dataset, which contains 500,000 emails [33][34][35] - The company designed the reward function by having Gemini 2.5 Pro generate questions and answers from batches of emails, creating a verified dataset for the agent to learn from [37][38][39] - The company emphasizes watching out for reward hacking, where the model exploits the reward function without actually solving the problem, and suggests modifying the reward function to penalize such behavior [51][53][61]
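Two of the points in the summary above lend themselves to a short sketch: the arithmetic behind the "60% reduction in errors" figure (90% to 96% accuracy), and a reward that gives extra credit for fewer turns while discouraging hallucination. The function names, the bonus and penalty weights, and the turn budget below are all hypothetical values chosen for illustration; the summary does not state the reward function actually used for ART-E.

```python
def relative_error_reduction(baseline_accuracy: float, new_accuracy: float) -> float:
    """Fraction of the baseline's errors that the new model eliminates."""
    baseline_error = 1.0 - baseline_accuracy
    new_error = 1.0 - new_accuracy
    return (baseline_error - new_error) / baseline_error


def shaped_reward(
    correct: bool,
    hallucinated: bool,
    turns_used: int,
    max_turns: int = 10,               # hypothetical turn budget
    turn_bonus: float = 0.1,           # hypothetical weight for unused turns
    hallucination_penalty: float = 1.0,  # hypothetical penalty size
) -> float:
    """Illustrative reward: correctness first, small bonus for brevity.

    Assumption for this sketch: 'hallucinated' means the agent gave a
    confident answer that is wrong, while an incorrect but
    non-hallucinated episode means it declined to answer.
    """
    if hallucinated:
        # A made-up answer scores worse than admitting ignorance.
        return -hallucination_penalty
    if not correct:
        return 0.0
    # Extra credit scales with the share of the turn budget left unused.
    unused_fraction = max(0, max_turns - turns_used) / max_turns
    return 1.0 + turn_bonus * unused_fraction


# 10% error down to 4% error removes 6 of every 10 baseline errors.
reduction = relative_error_reduction(0.90, 0.96)
```

Keeping the turn bonus small relative to the correctness reward is one way to limit the reward hacking mentioned above, since the agent cannot profit from finishing quickly with a wrong answer.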
OpenThoughts: Data Recipes for Reasoning Models — Ryan Marten, Bespoke Labs
AI Engineer· 2025-07-19 21:10
I'm Ryan. I'm a founding engineer at Bespoke Labs. And today I'm going to talk to you about Open Thoughts, which is our project to create the best open-source reasoning data sets. And I'll be switching tack a little bit from our earlier discussions on reasoning and RL and focus on the reasoning part and you'll see why. So just so we're on the same page, we've talked a lot about reasoning, but what's actually going on here? So I like this graph from JSON which shows this incredible performance that's ...
Google Photos Magic Editor: GenAI Under the Hood of a Billion-User App - Kelvin Ma, Google Photos
AI Engineer· 2025-07-19 19:00
Technology & Engineering - Google Photos' Magic Editor integrates complex CV and generative AI models into a seamless mobile experience [1] - The focus is on optimizing massive models for latency and size [1] - Crucial interplay exists with graphics rendering (OpenGL/Halide) [1] - The process involves turning research concepts into polished features for practical use [1] Product Development - The aim is to build tools that improve users' lives through greater expression, skill-building, and communication [1] Personnel - Kelvin Ma, a product engineer with 15 years of experience, is involved in developing innovative consumer applications used by millions [1]
General Intelligence is Multimodal — Keegan McCallum, Luma AI
AI Engineer· 2025-07-19 17:45
Talking about Luma AI, our mission, and how our ML infrastructure enables SOTA multimodal model development. About Keegan McCallum: I'm Keegan McCallum, the Head of ML infrastructure at Luma AI. I began my career in research focusing on portfolio optimization. Since then I've founded two startups, led engineering at two others, and have landed at Luma AI working on an unconventional multimodal path to AGI among a cracked team of researchers and engineers. When I'm not working, I'm usually out in the woods hik ...
ComfyUI Full Workshop — first workshop from ComfyAnonymous himself!
AI Engineer· 2025-07-19 16:30
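Overview - A quick introduction to ComfyUI and what's new, including a Q&A session [1] - Recorded at the AI Engineer World's Fair in San Francisco [1] Community Engagement - Join the newsletter to stay up to date on upcoming events and content [1]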