Workflow
AI Engineer
icon
Search documents
AI Engineer Paris 2025 (Day 2)
AI Engineer· 2025-09-23 18:15
Full schedule at https://www.ai.engineer/paris#schedule - Emil Eifrem, Co-Founder and CEO, Neo4j, “The State of^H^Hin AI Engineering” - Tushar Jain, President, Product & Engineering, Docker, “Democratizing AI Agents: Building, Sharing, and Securing Made Simple” - Martin Woodward, Vice President of DevRel, GitHub, “Building MCP's at GitHub Scale” - Yann Leger, Co-Founder & CEO, Koyeb, “Building for the Agentic Era: The Future of AI Infrastructure” - Andreas Blattmann, Co-founder, Black Forest Labs, “Inside F ...
Opening Keynotes - AIE Paris 2025 (Day 1)
AI Engineer· 2025-09-22 12:00
The opening welcome reception is all about the hallway track and the expo -- meeting and mingling with other founders and engineers who are (mostly) based in Europe. However, for those who can't make it, we'll be streaming 2 talks to kick off the conference: Shawn Swyx Wang, curator of Latent Space and Co-founder of AI Engineer: The Year in Agents Lélio Renard Lavaud, VP of Engineering, Mistral: How open source drives successful enterprise adoption ...
How BlackRock Builds Custom Knowledge Apps at Scale — Vaibhav Page & Infant Vasanth, BlackRock
AI Engineer· 2025-08-23 09:30
Challenges in Building AI Applications at BlackRock - BlackRock faces challenges in prompt engineering, requiring significant time investment from domain experts to iterate, version, and compare prompts effectively [10] - BlackRock encounters difficulties in selecting appropriate LLM strategies (e.g., RAG, chain-of-thought) due to instrument complexity and document size variations, impacting data extraction [11] - BlackRock experiences deployment challenges, including determining suitable cluster types (GPU-based inference vs burstable) and managing cost controls for AI applications [12][14] BlackRock's Solution: Sandbox and App Factory - BlackRock developed a framework with a "sandbox" for domain experts to build and refine extraction templates, accelerating the app development process [15][17] - BlackRock's "sandbox" provides greater configuration capabilities beyond prompt engineering, including QC checks, validations, constraints, and interfield dependencies [19][20] - BlackRock's "app factory" is a cloud-native operator that takes a definition from the sandbox and spins out an app, streamlining deployment [15] Key Takeaways for Building AI Apps at Scale - BlackRock emphasizes investing heavily in prompt engineering skills for domain experts, particularly in the financial space, due to the complexity of financial documents [26] - BlackRock highlights the importance of educating the firm on LLM strategies and how to choose the right approach for specific use cases [27] - BlackRock stresses the need to evaluate the ROI of AI app development versus off-the-shelf products, considering the potential cost [27] - BlackRock underscores the importance of human-in-the-loop design, especially in regulated environments, to ensure compliance and accuracy [28]
Form factors for your new AI coworkers — Craig Wattrus, Flatfile
AI Engineer· 2025-08-22 15:00
AI Development & Application - The industry is moving towards designers, product people, and engineers collaborating to build together, eliminating mock-ups and click-through prototypes [1] - Flat Files AI stack is structured into four buckets: invisible, ambient, inline, and conversational AI, each offering different levels of user interaction [1] - The company is exploring AI agents that can write code to set up demos tailored to specific user use cases, such as creating an HR demo for users from HR companies [1] - The company is developing tools that allow AI to analyze data in the background, identify opportunities for improvement, and provide inline assistance to users working with the data [1] - The company is building no-code/low-code agentic systems that can write Flat File applications, potentially reducing the need for engineers in this process [1] AI Agent Design & Character - The company is shifting from controlling AI agents to character coaching, focusing on building out the desired nature and characteristics of the agents [1] - The company is experimenting with giving AI agents tools like cursors to interact with design tools, exploring how AI can operate in the design space [2] - The company is aiming to create an environment where LLMs can shine, focusing on form factors that help them nail their assignments, stay aligned, and grow as models improve [1] Emergent Behavior & Future Exploration - The industry is seeing emergence in AI, with AI agents exhibiting curiosity, excitability, and focus, leading to unexpected and valuable outcomes [6][7][8] - The company is exploring the use of AI agents with knowledge bases to surface suggestions and help users complete tasks, even when the AI cannot directly fix the issue [12][13][14] - The company is focusing on autocomplete backed by LLMs, designing applications to test and benchmark the performance of different models [16][17]
Building an Agentic Platform — Ben Kus, CTO Box
AI Engineer· 2025-08-21 18:15
AI Platform Evolution - Box transitioned to an agentic-first design for metadata extraction to enhance its AI platform [1] - The shift to agentic architecture was driven by the limitations of pre-generative AI data extraction and challenges with a pure LLM approach [1] - Agentic architecture unlocks advantages in data extraction [1] Technical Architecture - Box's AI agent reasoning framework supports the agentic routine for data extraction [1] - The agentic architecture addresses the challenge of unstructured data in enterprises [1] Key Lessons - Building agentic architecture early is a key lesson learned [1]
Five hard earned lessons about Evals — Ankur Goyal, Braintrust
AI Engineer· 2025-08-21 18:13
AI Development Strategy - Building successful AI applications requires a sophisticated engineering approach beyond just writing good prompts [1] - The industry emphasizes the importance of evaluations (evals) as a core component of the development process [1] - Evaluations should be intentionally engineered to reflect real-world user feedback and drive product improvements [1] Technical Focus - "Context engineering" is emerging as a new frontier, focusing on optimizing the entire context provided to the model [1] - Context engineering includes tool definitions and their outputs [1] - The industry advocates for a flexible, model-agnostic architecture [1] Adaptability - The architecture should quickly adapt to the rapidly evolving landscape of AI models [1] - Optimize the entire evaluation system, not just the prompts [1]
Rishabh Garg, Tesla Optimus — Challenges in High Performance Robotics Systems
AI Engineer· 2025-08-21 16:41
A robot's behavior is influenced by the control policy, the software configuration, and electrical characteristics of the communication protocol. When unexpected behaviors arise, it is not straightforward to root cause them to the RL policy, electrical characteristics, mechanical characteristics. This talk walks through some of these issues and explains what might cause the observed behavior. We will talk about concrete issues that audience will be able to take away from and develop their understanding of p ...
Perceptual Evaluations: Evals for Aesthetics — Diego Rodriguez, Krea.ai
AI Engineer· 2025-08-21 16:30
AI Evaluation Challenges - Current AI evaluations face problems [1] - Limitations exist in both AI and human-centric metrics for evaluating generative media [1] - Evaluating aesthetics and image/generative media is the hardest kind of AI evaluation [1] KREA.ai's Perspective - KREA.ai focuses on perceptual evaluations [1] - Krea's role involves rethinking evaluation and shaping the future of AI [1] - Krea issues a call to action regarding AI evaluation [1] Key Discussion Points - The discussion covers the historical context and compression in relation to AI evaluation [1] - The session emphasizes the importance of evaluating our evaluations [1]
Fuzzing the GenAI Era Leonard Tang
AI Engineer· 2025-08-21 16:26
AI Evaluation Challenges - Traditional evaluation methods are inadequate for assessing GenAI applications' brittleness [1] - The industry faces a "Last Mile Problem" in AI, ensuring reliability, quality, and alignment for any application [1] - Standard evaluation methods often fail to uncover corner cases and unexpected user inputs [1] Haize Labs' Approach - Haize Labs simulates the "last mile" by bombarding AI with unexpected user inputs to uncover corner cases at scale [1] - Haize Labs focuses on Quality Metric (defining criteria for good/bad responses and automating judgment) and Stimuli Generation (creating diverse data to discover bugs) [1] - Haize Labs uses agents as judges to scale evaluation, considering factors like accuracy vs latency [1] - Haize Labs employs RL-tuned judges to further scale evaluation processes [1] - Haize Labs utilizes simulation as a form of prompt optimization [1] Case Studies - Haize Labs has worked with a major European bank's AI app [1] - Haize Labs has worked with a F500 bank's voice agents [1] - Haize Labs scales voice agent evaluations [1]
#define AI Engineer - Greg Brockman, OpenAI (ft. Jensen Huang, NVIDIA)
AI Engineer· 2025-08-10 16:00
People - Greg Brockman 在旧金山 AI 工程师世界博览会上发表职业生涯建议,面向 AI 工程师 [1] Resources - AI 行业可通过订阅时事通讯获取最新活动和内容信息 [1]