AI Engineer

What Is a Humanoid Foundation Model? An Introduction to GR00T N1 - Annika & Aastha
AI Engineer· 2025-07-28 16:29
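Market Trends & Industry Dynamics
- A McKinsey report finds that in the world's 30 most developed economies there are more open jobs than people able to fill them; over the past decade, job growth has outpaced population growth by 420% [2][3]
- Physical AI is essential to solving problems in industries such as leisure, hospitality, healthcare, construction, transportation, and manufacturing, which chatbots like ChatGPT cannot address on their own [3][4]
- NVIDIA's Project GR00T is its strategy for bringing humanoid robots and other forms of robotics into the world, spanning compute infrastructure, software, and the required research [11]
Robotics Foundation Model & Technology
- NVIDIA's GR00T N1 robotics foundation model is open source and highly customizable; a key feature is that it is cross-embodiment, and the model has 2 billion parameters [1][12]
- The robot data strategy draws on three sources: a small amount of expensive real-world data (robots performing real tasks), a large amount of unstructured internet video data (humans solving tasks), and theoretically unlimited synthetic data [14][16][17][18]
- Project GR00T's data solution includes a data pyramid strategy that emphasizes augmenting and multiplying high-quality data through simulation and world foundation models [13][18][19]
- GR00T N1 introduces a dual-system architecture: System 1 executes tasks quickly (120 Hz), while System 2 plans complex tasks slowly, an idea inspired by Daniel Kahneman's "Thinking, Fast and Slow" [23][24][25]
- GR00T N1 uses diffusion Transformer blocks together with a vision encoder, a VLM (vision language model), and a text tokenizer to process image and text inputs, and an action decoder produces action vectors usable by a specific robot [27][28][29][30]
- The two main approaches to robot learning are imitation learning (copying human experts) and reinforcement learning (maximizing reward through trial and error); GR00T N1 combines both [32][33][36]
Deployment & Compute Infrastructure
- The physical AI lifecycle consists of generating data, consuming data, and deployment, which NVIDIA calls the "three-computer problem" because each stage has different compute characteristics: simulation (OVX Omniverse), training (DGX), and edge deployment (AGX) [9][10]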
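The dual-system split in the summary above can be pictured as a loop running at two rates. The sketch below is an illustration only and is not GR00T N1 code: the 120 Hz figure comes from the summary, while the 10 Hz planner rate, the function names `slow_planner` and `fast_controller`, and the string "plans" are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class LoopTrace:
    """Counts what each system did during a simulated run."""
    plans: int = 0
    actions: int = 0
    log: list = field(default_factory=list)


def slow_planner(step: int) -> str:
    # Hypothetical stand-in for System 2: returns a new high-level plan.
    return f"plan-{step}"


def fast_controller(plan: str, tick: int) -> tuple:
    # Hypothetical stand-in for System 1: emits one action per tick,
    # conditioned on whatever plan is currently active.
    return (plan, tick)


def run_dual_system(seconds: int, fast_hz: int = 120, slow_hz: int = 10) -> LoopTrace:
    # slow_hz=10 is an assumed value; the summary only states the 120 Hz fast rate.
    if fast_hz % slow_hz != 0:
        raise ValueError("fast_hz must be a multiple of slow_hz in this sketch")
    ticks_per_plan = fast_hz // slow_hz
    trace = LoopTrace()
    plan = None
    for tick in range(seconds * fast_hz):
        # The planner refreshes only every `ticks_per_plan` ticks.
        if tick % ticks_per_plan == 0:
            plan = slow_planner(trace.plans)
            trace.plans += 1
        # The controller acts on every tick using the latest plan.
        trace.log.append(fast_controller(plan, tick))
        trace.actions += 1
    return trace


if __name__ == "__main__":
    trace = run_dual_system(seconds=1)
    print(trace.plans, trace.actions)
```

The point of the design is that the fast loop never waits for the planner: it keeps acting on the most recent plan until a new one arrives.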
Real-time Experiments with an AI Co-Scientist - Stefania Druga, fmr. Google Deepmind
AI Engineer· 2025-07-28 16:29
My name is Stefania. I'm so glad you made it until the yeah uh last day of the conference and came to the robotics track. So, we're going to start with a live demo uh and then we'll switch to the presentation just like to to kind of like swap things around. So, I'm going to try to connect the microscope over here. Uh and let's see the other camera and some sensors. So, my talk is about real time uh science co-scientist. So, think about pair programmers. How many of you use any form of copilot for codi ...
Scaling AI Agents Without Breaking Reliability — Preeti Somal, Temporal
AI Engineer· 2025-07-28 15:15
Temporal's Value Proposition for Agentic AI
- Temporal positions itself as a solution for building reliable and scalable agentic AI applications, addressing the inherent unreliability of LLMs and the complexity of distributed systems [2][3][7]
- The company's mission is to let developers outsource reliability and scalability so they can focus on business logic [7][8]
- Temporal offers language-idiomatic SDKs and handles plumbing code, ensuring reliable process execution and providing guardrails [8][9]
- Customers are already running agents on Temporal at scale in production and report gains in agility and development speed [12][13]
Technical Architecture and Impact
- Temporal introduces a workflow abstraction to simplify complex interactions between LLMs, chat history databases, and tools [15][17]
- Using Temporal can accelerate development, with case studies showing feature delivery velocity improvements of over 6x [18]
- Temporal Cloud handles the heavy lifting of reliability and scalability, while the agent's code runs in the user's environment [28][29]
- Temporal is an open-source product and offers a code exchange for developers to explore and use [31]
Customer Success and Adoption
- Dust and other companies are building agents on top of Temporal [11]
- Gorgias, a customer service provider for brands like Reebok and Timbuk2, uses AI agents built on Temporal [11][12]
- A consumer application customer is scaling with events without needing to write its own scaling logic [19]
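To make the reliability claim above concrete, here is a toy illustration of the general idea of durable steps: each step is retried on failure, and completed results are recorded so that re-running the workflow skips work already done. This is plain Python and deliberately does not use Temporal's SDK; `StepHistory`, `make_flaky_llm`, and `agent_workflow` are hypothetical names invented for the sketch, and the in-memory dictionary stands in for a persisted history.

```python
class StepHistory:
    """Records results of completed steps so a re-run can skip them."""

    def __init__(self):
        self.results = {}

    def run(self, name, fn, max_attempts=5):
        # A step that already completed is never executed again.
        if name in self.results:
            return self.results[name]
        last_error = None
        for _ in range(max_attempts):
            try:
                result = fn()
            except RuntimeError as err:
                last_error = err  # transient failure: try again
                continue
            self.results[name] = result
            return result
        raise RuntimeError(
            f"step {name!r} failed after {max_attempts} attempts"
        ) from last_error


def make_flaky_llm(failures_before_success):
    # Hypothetical LLM stub that fails a fixed number of times, then succeeds.
    state = {"calls": 0}

    def call():
        state["calls"] += 1
        if state["calls"] <= failures_before_success:
            raise RuntimeError("simulated LLM timeout")
        return "draft reply"

    return call, state


def agent_workflow(history, llm_call, tool_call):
    # The workflow body reads as straight-line business logic;
    # retries and bookkeeping live in StepHistory.
    draft = history.run("llm_draft", llm_call)
    lookup = history.run("tool_lookup", tool_call)
    return f"{draft} | {lookup}"


if __name__ == "__main__":
    history = StepHistory()
    llm, state = make_flaky_llm(failures_before_success=2)
    print(agent_workflow(history, llm, lambda: "tool ok"))
    print("LLM calls made:", state["calls"])
```

A real system would persist the history outside the process so that it survives crashes; the sketch only shows why the workflow code itself can stay free of retry logic.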
Government Agents: AI Agents vs Tough Regulations — Mark Myshatyn, Los Alamos National Laboratory
AI Engineer· 2025-07-28 04:15
https://www.linkedin.com/in/markmyshatyn/ ...
Ship Agents that Ship: A Hands-On Workshop - Kyle Penfound, Jeremy Adams, Dagger
AI Engineer· 2025-07-27 22:30
Coding agents are transforming how software gets built, tested, and deployed, but engineering teams face a critical challenge: how to embrace this automation wave without sacrificing trust, control, or reliability. In this 80-minute workshop, you’ll go beyond toy demos and build production-minded AI agents using Dagger, the programmable delivery engine designed for real CI/CD and AI-native workflows. Whether you're debugging failures, triaging pull requests, generating tests, or shipping features, you'll le ...
The AI Engineer’s Guide to Raising VC — Dani Grant (Jam), Chelcie Taylor (Notable)
AI Engineer· 2025-07-27 18:00
A no-fluff, all-tactics discussion. More AI engineers should build startups; the world needs more software. But there’s a way to raise VC, and it’s hard to do if you’ve never seen it done. We are going to walk through the exact playbook to raise your first round of funding. We will show you real pitch decks, real cold emails and real term sheets so that when you go out to raise your first round of funding, you are set up to do it. Every AI Engineer should be equipped to start their own company and this session mak ...
Strategies for LLM Evals (GuideLLM, lm-eval-harness, OpenAI Evals Workshop) — Taylor Jordan Smith
AI Engineer· 2025-07-27 16:15
LLM Evaluation Challenges
- Traditional benchmarks often fail to reflect real-world LLM performance, reliability, and user satisfaction [1]
- Evaluating reasoning quality, agent consistency, MCP integration, and user-focused outcomes requires going beyond standard benchmarks [1]
- Benchmarks and leaderboards rarely reflect the realities of production AI [1]
Evaluation Strategies & Frameworks
- The industry needs tangible evaluation strategies using open-source frameworks like GuideLLM and lm-eval-harness [1]
- Custom eval suites tailored to specific use cases are crucial for accurate assessment [1]
- Integrating human-in-the-loop feedback is essential for better user-aligned outcomes [1]
Key Evaluation Areas
- Evaluating reasoning skills, consistency, and reliability in agentic AI applications is critical [1]
- Validating MCP (Model Context Protocol) and agent interactions with practical reliability tests is necessary [1]
- Agent reliability checks should reflect production conditions [1]
Deployment Considerations
- Robust evaluation is critical for confidently deploying LLMs in real-world applications like chatbots, copilots, or autonomous AI agents [1]
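As one way to picture a custom eval suite that checks both correctness and consistency, the sketch below runs each case several times against a model callable and reports a pass rate and an agreement score. It does not use GuideLLM, lm-eval-harness, or OpenAI Evals; `EvalCase`, `run_eval_suite`, and the stub model are hypothetical names chosen for the example, and the consistency measure (share of runs that produced the most common output) is an assumed, simplistic choice.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    name: str
    prompt: str
    check: Callable[[str], bool]  # use-case-specific pass/fail rule


def run_eval_suite(model: Callable[[str], str], cases: list, repeats: int = 3) -> list:
    """Run every case `repeats` times and summarize the results."""
    report = []
    for case in cases:
        outputs = [model(case.prompt) for _ in range(repeats)]
        passed = sum(case.check(out) for out in outputs)
        # Consistency: fraction of runs that agree with the most common output.
        most_common_count = Counter(outputs).most_common(1)[0][1]
        report.append({
            "name": case.name,
            "pass_rate": passed / repeats,
            "consistency": most_common_count / repeats,
        })
    return report


if __name__ == "__main__":
    # Stub model for demonstration; a real suite would call an LLM endpoint.
    def stub_model(prompt: str) -> str:
        return "4" if "2+2" in prompt else "unsure"

    cases = [
        EvalCase("arithmetic", "What is 2+2?", lambda out: out.strip() == "4"),
        EvalCase("refund-policy", "Can I return shoes?", lambda out: "30 days" in out),
    ]
    for row in run_eval_suite(stub_model, cases):
        print(row)
```

Human-in-the-loop feedback could enter such a harness as additional `check` functions derived from reviewer labels, though that is beyond this sketch.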
Why you should care about AI interpretability - Mark Bissell, Goodfire AI
AI Engineer· 2025-07-27 15:30
The goal of mechanistic interpretability is to reverse engineer neural networks. Having direct, programmable access to the internal neurons of models unlocks new ways for developers and users to interact with AI — from more precise steering to guardrails to novel user interfaces. While interpretability has long been an interesting research topic, it is now finding real-world use cases, making it an important tool for AI engineers. About Mark Bissell Mark Bissell is an applied researcher at Goodfire AI worki ...
Introduction to LLM serving with SGLang - Philip Kiely and Yineng Zhang, Baseten
AI Engineer· 2025-07-26 17:45
SGLang Overview
- SGLang is an open-source, high-performance serving framework for large language models (LLMs) and vision language models (VLMs) [5]
- SGLang offers day-zero support for new models from labs like Qwen and DeepSeek, and has a strong open-source community [7]
- The project has grown rapidly, from a research paper in December 2023 to nearly 15,000 GitHub stars in 18 months [9]
Usage and Adoption
- Baseten uses SGLang as part of its inference stack for various models [8]
- SGLang is also used by xAI for its Grok models, as well as by inference providers, cloud providers, research labs, universities, and product companies like Cursor [8]
Performance Optimization
- SGLang's performance can be tuned using flags and configuration options, such as CUDA graph settings [20]
- EAGLE-3, a speculative decoding algorithm, can be used to improve performance by increasing the token acceptance rate [28][42][43]
- The default CUDA graph max batch size on L4 GPUs is eight, but it can be adjusted to improve performance [31][36]
Community and Contribution
- The SGLang community is active and welcomes contributions [7][54]
- Developers can get involved by starring the project on GitHub, filing issues, joining the Slack channel, and contributing to the codebase [9][54][55]
- The codebase includes the SGLang runtime, a domain-specific front-end language, and a set of optimized kernels [58]
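To see why a higher token acceptance rate translates into faster decoding, consider the simplest model of speculative decoding: a draft model proposes a chain of tokens, each accepted independently with the same probability, and the target model always contributes one token per verification step. The sketch below computes the expected tokens per step under that assumption. It is a back-of-the-envelope model, not EAGLE-3's actual drafting scheme and not SGLang code; the function name and parameters are hypothetical.

```python
def expected_tokens_per_step(acceptance_rate: float, draft_len: int) -> float:
    """Expected tokens emitted per target-model verification step.

    Assumes a single chain of `draft_len` drafted tokens, each accepted
    independently with probability `acceptance_rate`. Drafted token i only
    counts if all earlier ones were accepted, and the target model always
    adds one token of its own, giving 1 + a + a^2 + ... + a^draft_len.
    """
    if not 0.0 <= acceptance_rate <= 1.0:
        raise ValueError("acceptance_rate must be between 0 and 1")
    if draft_len < 0:
        raise ValueError("draft_len must be non-negative")
    return sum(acceptance_rate ** i for i in range(draft_len + 1))


if __name__ == "__main__":
    for rate in (0.5, 0.8, 0.9):
        tokens = expected_tokens_per_step(rate, draft_len=3)
        print(f"acceptance={rate:.1f} -> {tokens:.3f} tokens per step")
```

Under this model, raising the acceptance rate from 0.5 to 0.8 with three drafted tokens lifts the expected yield from 1.875 to 2.952 tokens per step, which is why draft quality matters so much.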
Robotics: why now? - Quan Vuong and Jost Tobias Springenberg, Physical Intelligence
AI Engineer· 2025-07-26 17:00
Sharing recent progress from Physical Intelligence and why it is an exciting time to push the frontier in general-purpose robotics. About Quan Vuong Quan Vuong is co-founder at Physical Intelligence. His research focuses on generalist robotics and algorithms that enable intelligent behaviors through large scale learning. About Jost Tobias Springenberg Tobias is currently a research scientist at Physical Intelligence where he works on bringing AI into the real world and understanding the fundamentals of seque ...