AI Engineer
AI powered entomology: Lessons from millions of AI code reviews — Tomas Reimers, Graphite
AI Engineer· 2025-07-22 19:50
[Music] Thank you all so much for coming to this talk. Um, thank you for being at this conference generally. My name is Tomas. I'm one of the co-founders of Graphite and I'm here to talk to you about AI-powered entomology. If you don't know, entomology is the study of bugs. It's something that is very near and dear to our heart and part of what our product does. So, Graphite, for those of you that don't know, builds a product called Diamond. Diamond is an AI-powered code reviewer. You go ahead, you u ...
How to run Evals at Scale: Thinking beyond Accuracy or Similarity — Muktesh Mishra, Adobe
AI Engineer· 2025-07-22 19:46
[Music] Hey everyone. Um, hope you are having a great conference. Um, so I'm going to talk about, uh, how to run evals at scale and thinking beyond accuracy or similarity. Uh, so in the last, uh, presentation we learned about how to architect AI applications, um, and why evals are important. In this presentation I am going to talk about the importance of evals as well as what type of evals we have to choose when we are crafting an application. This is a bit about me. Um, so I work as a lead en ...
Continuous Profiling for GPUs — Matthias Loibl, Polar Signals
AI Engineer· 2025-07-22 19:46
GPU Profiling & Performance Optimization
- The industry emphasizes improving performance and saving costs by optimizing software, potentially reducing server usage by 10% [4]
- Sampled profiling balances data volume against continuous monitoring; for example, sampling 100 times per second results in less than 1% CPU overhead and 4 MB memory overhead [5]
- The industry highlights the importance of profiling in production to observe real-world application performance with low overhead [8]
- The company's solution leverages Linux eBPF, enabling profiling without application instrumentation [9]

Technology & Metrics
- The company's GPU profiling solution uses Nvidia NVML to extract metrics, including overall node utilization (blue line), individual process utilization (orange line), memory utilization, and clock speed [11][12]
- Key metrics include power utilization (with the power limit shown as a dashed line), temperature (important to avoid throttling at 80 degrees Celsius), and PCIe throughput (negative for receiving, positive for sending, e.g. 10 MB/s) [13][14]
- The solution correlates GPU metrics with CPU profiles collected using eBPF to analyze CPU activity during periods of less than full GPU utilization [14]

GPU Time Profiling
- The company introduces GPU time profiling to measure time spent in individual CUDA functions, determining the start and end times of kernels via the Linux kernel [18]
- The solution displays CPU stacks whose leaf nodes represent functions taking time on the GPU, with colors indicating different binaries (e.g. blue for Python) [19][20]

Deployment & Integration
- The company's solution can be deployed as a binary on Linux, with Docker, or as a DaemonSet on Kubernetes, requiring a manifest YAML and a token [21]
- turbopuffer is interested in integrating the company's GPU profiling to improve the performance of their vector engine [22]
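The correlation step described above (looking at CPU activity while the GPU is below full utilization) can be sketched in a few lines. This is only an illustration under stated assumptions, not Polar Signals' implementation: the `GpuSample` and `CpuSample` types, the 90% threshold, and both function names are hypothetical, and the samples are plain in-memory values rather than data read from NVML or eBPF.

```python
from collections import Counter
from typing import List, NamedTuple, Tuple


class GpuSample(NamedTuple):
    ts: float          # timestamp in seconds
    utilization: int   # GPU utilization in percent (0-100)


class CpuSample(NamedTuple):
    ts: float                # timestamp in seconds
    stack: Tuple[str, ...]   # call stack, root first, leaf last


def underutilized_windows(gpu: List[GpuSample],
                          threshold: int = 90) -> List[Tuple[float, float]]:
    """Return [start, end) intervals in which GPU utilization stayed below threshold."""
    windows = []
    start = None
    prev_ts = None
    for sample in gpu:
        if sample.utilization < threshold:
            if start is None:
                start = sample.ts          # a low-utilization window opens here
        elif start is not None:
            windows.append((start, sample.ts))  # window closes when GPU is busy again
            start = None
        prev_ts = sample.ts
    if start is not None and prev_ts is not None:
        windows.append((start, prev_ts))   # window still open at the end of the data
    return windows


def cpu_hotspots(cpu: List[CpuSample],
                 windows: List[Tuple[float, float]]) -> Counter:
    """Count leaf functions of CPU samples that fall inside any window."""
    counts: Counter = Counter()
    for sample in cpu:
        if any(start <= sample.ts < end for start, end in windows):
            counts[sample.stack[-1]] += 1
    return counts
```

Feeding `cpu_hotspots` the windows returned by `underutilized_windows` gives a rough ranking of which CPU functions were running while the GPU sat idle, which is the kind of question the summary says the correlated view is meant to answer.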
Top Ten Challenges to Reach AGI — Stephen Chin, Andreas Kollegger
AI Engineer· 2025-07-22 19:45
[Music] [Music] All right. Hey, so great to see everyone here at AI Engineer World's Fair. Andreas and I have the honor of curating the GraphRAG track which is happening here. And I thought the jokes Simon had about bugs were spot on. Spot on. Hilarious. And that's the reason why we care so much about getting really good data, like building a solid foundation and good grounding for models. And we're going to chat a bit because I think we have a social responsibility. We're we're ge ...
Practical GraphRAG: Making LLMs smarter with Knowledge Graphs — Michael, Jesus, and Stephen, Neo4j
AI Engineer· 2025-07-22 17:59
Graph RAG Overview
- Graph RAG aims to enhance LLMs by incorporating knowledge graphs, addressing limitations like lack of domain knowledge, unverifiable answers, hallucinations, and biases [1][3][4][5][9][10]
- Graph RAG leverages knowledge graphs (collections of nodes, relationships, and properties) to provide more relevant, contextual, and explainable results compared to basic RAG systems using vector databases [8][9][10][12][13][14]
- Microsoft research indicates Graph RAG can achieve better results with lower token costs, supported by studies showing improvements in capabilities and analyst trends [15][16]

Knowledge Graph Construction
- Knowledge graph construction involves structuring unstructured information, extracting entities and relationships, and enriching the graph with algorithms [19][20][21][22]
- Lexical graphs represent documents and elements (chunks, sections, paragraphs) with relationships based on document structure, temporal sequence, and similarity [25][26]
- Entity extraction utilizes LLMs with graph schemas to identify entities and relationships from text, potentially integrating with existing knowledge graphs or structured data like CRM systems [27][28][29][30]
- Graph algorithms (clustering, link prediction, PageRank) enrich the knowledge graph, enabling cross-document topic identification and summarization [20][30][34]

Graph RAG Retrieval and Applications
- Graph RAG retrieval involves an initial index search (vector, full text, hybrid) followed by traversing relationships to fetch additional context, considering user context for tailored results [32][33]
- Modern LLMs are increasingly trained on graph processing, allowing them to effectively utilize node-relationship-node patterns provided as context [34]
- Tools and libraries are available for knowledge graph construction from various sources (PDFs, YouTube transcripts, web articles), with open-source options for implementation [35][36][39][43][45]
- Agentic approaches in Graph RAG break down user questions into tasks, using domain-specific retrievers and tools in sequence or loops to generate comprehensive answers and visualizations [42][44]
- Industry leaders are adopting Graph RAG for production applications, such as LinkedIn's customer support, which saw a 28.6% reduction in median per-issue resolution time [17][18]
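The two-stage retrieval described above (an index search to find entry points, then relationship traversal to gather more context) can be illustrated with a toy in-memory graph. This is a sketch under loud assumptions, not Neo4j's implementation: the `NODES` and `EDGES` data and all function names are hypothetical, and the keyword-overlap scoring is a stand-in for a real vector, full-text, or hybrid index.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy graph: chunk nodes (c*) and entity nodes (e*).
NODES: Dict[str, str] = {
    "c1": "Graph RAG combines vector search with knowledge graph traversal.",
    "c2": "Lexical graphs link chunks by document structure.",
    "e1": "Neo4j",
    "e2": "Knowledge graph",
}
EDGES: List[Tuple[str, str, str]] = [
    ("c1", "MENTIONS", "e2"),
    ("c2", "MENTIONS", "e2"),
    ("e1", "STORES", "e2"),
]


def tokenize(text: str) -> Set[str]:
    return set(text.lower().replace(".", "").split())


def index_search(query: str, nodes: Dict[str, str], k: int = 1) -> List[str]:
    """Stage 1: rank nodes by keyword overlap (stand-in for a vector/full-text index)."""
    q = tokenize(query)
    scored = [(len(q & tokenize(text)), nid) for nid, text in nodes.items()]
    scored = [(score, nid) for score, nid in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [nid for _, nid in scored[:k]]


def expand(seeds: List[str], edges: List[Tuple[str, str, str]],
           hops: int = 1) -> List[Tuple[str, str, str]]:
    """Stage 2: collect relationships reachable within `hops` of the seed nodes."""
    seen = set(seeds)
    frontier = set(seeds)
    triples: List[Tuple[str, str, str]] = []
    for _ in range(hops):
        next_frontier = set()
        for src, rel, dst in edges:
            if src in frontier or dst in frontier:
                if (src, rel, dst) not in triples:
                    triples.append((src, rel, dst))
                for node in (src, dst):
                    if node not in seen:
                        seen.add(node)
                        next_frontier.add(node)
        frontier = next_frontier
    return triples


def build_context(query: str, nodes: Dict[str, str],
                  edges: List[Tuple[str, str, str]],
                  k: int = 1, hops: int = 1) -> List[str]:
    """Return seed texts plus node-relationship-node lines to pass to an LLM."""
    seeds = index_search(query, nodes, k)
    lines = [nodes[seed] for seed in seeds]
    lines += [f"({nodes[a]})-[{rel}]->({nodes[b]})" for a, rel, b in expand(seeds, edges, hops)]
    return lines
```

Raising `hops` widens the context at the cost of more tokens, so the traversal depth is the main knob in this sketch; a production retriever would more likely use a targeted graph query per use case than a blanket neighborhood expansion.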
Knowledge Graphs in Litigation Agents — Tom Smoker, WhyHow
AI Engineer· 2025-07-22 17:00
Core Argument
- Structured Representations, emphasizing relationships between clauses, documents, entities, and parties, are crucial in the legal field [1]
- Structured Context Injection, enabled by Structured Representations, enhances context and reduces hallucinations in legal agents [1]

Case Studies & Applications
- The report highlights production systems built for legal use cases, including recursive contractual clause retrieval and human-in-the-loop (HITL) legal reasoning news agents [1]
- These systems demonstrate the significant improvement in effectiveness and reliability of legal agents through structured representations [1]

Key Technologies
- Structured Representations are presented as a key technology for improving legal agents [1]
When Vectors Break Down: Graph-Based RAG for Dense Enterprise Knowledge - Sam Julien, Writer
AI Engineer· 2025-07-22 16:30
Enterprise knowledge bases are filled with "dense mapping," thousands of documents where similar terms appear repeatedly, causing traditional vector retrieval to return the wrong version or irrelevant information. When our customers kept hitting this wall with their RAG systems, we knew we needed a fundamentally different approach. In this talk, I'll share Writer's journey developing a graph-based RAG architecture that achieved 86.31% accuracy on the RobustQA benchmark while maintaining sub-second response ...
Stop Using RAG as Memory — Daniel Chalef, Zep
AI Engineer· 2025-07-22 16:00
Problem Statement & Solution
- Current memory frameworks struggle with relevance, leading to inaccurate responses or hallucinations due to the storage of arbitrary facts [3][4][5]
- Semantic similarity does not equate to business relevance, as vector databases lack causal or relational understanding [7]
- The industry needs domain-aware memory solutions instead of relying solely on better semantic search [8]
- Zep offers a solution by enabling developers to model memory after their specific business domain, creating more cogent and capable memory [1][2]

Zep's Implementation & Features
- Zep allows developers to define custom entities and edges within its graph framework, tailoring memory to specific business objects [1][9]
- Developers can use Pydantic, Zod, or Go structs to define business rules for these entities and their fields [9][10]
- Zep's SDK allows defining entity types with descriptions and business rules for fields, enabling precise control over the data stored [10]
- Zep allows building tools for agents to retrieve financial snapshots by running multiple searches concurrently and filtering by specific node types [10][11]
- Zep's front end provides a knowledge graph visualization, allowing users to see the relationships and fields defined for each entity [12]

Demonstration & Use Case
- A finance coach application demonstrates Zep's ability to store explicit business objects like financial goals, debts, and income sources [8][9]
- The application captures relevant information, such as a $5,000 monthly rent, and stores it as a debt account entity with defined fields [11][12]
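The "financial snapshot" tool described above (several searches run concurrently, each filtered to one node type) can be sketched with the standard library alone. This does not use Zep's SDK, whose actual calls are not shown in the summary: `Node`, `MemoryStore`, `financial_snapshot`, and the node type names are all hypothetical stand-ins, and the in-memory substring match replaces a real graph search.

```python
import asyncio
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Node:
    node_type: str   # hypothetical domain types, e.g. "Debt", "FinancialGoal"
    summary: str


class MemoryStore:
    """Hypothetical in-memory stand-in for a graph memory service."""

    def __init__(self, nodes: Iterable[Node]) -> None:
        self._nodes = list(nodes)

    async def search(self, query: str, node_type: str) -> List[Node]:
        await asyncio.sleep(0)  # placeholder for a network round trip
        q = query.lower()
        # An empty query matches every node of the requested type.
        return [n for n in self._nodes
                if n.node_type == node_type and q in n.summary.lower()]


SNAPSHOT_TYPES = ["FinancialGoal", "Debt", "IncomeSource"]


async def financial_snapshot(store: MemoryStore, query: str = "") -> Dict[str, List[Node]]:
    """Run one type-filtered search per domain entity concurrently."""
    results = await asyncio.gather(
        *(store.search(query, node_type) for node_type in SNAPSHOT_TYPES)
    )
    return dict(zip(SNAPSHOT_TYPES, results))
```

Because every search is restricted to a declared business type, an unrelated stored fact (say, a node of some other type about music taste) never reaches the agent's context, which is the relevance argument the summary makes against treating semantic similarity as memory.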
HybridRAG: A Fusion of Graph and Vector Retrieval to Enhance Data Interpretation - Mitesh Patel
AI Engineer· 2025-07-22 16:00
[Music] to quickly introduce myself. My name is Mitesh. I lead the developer advocate team at Nvidia. And the goal of my team is to, uh, create technical workflows and notebooks for different applications, and then we release that codebase on GitHub. So developers in general, which is me and you, all of us together, can harness that knowledge and take it further for the application or use case that you're working on. So that is what my team does, including myself. In today's talk, I'm I'm I'm going ...
tldraw.computer - Steve Ruiz, tldraw
AI Engineer· 2025-07-21 19:14
[Music] My name is Steve. Uh, Steve Ruiz. I am from a company that I started called tldraw. tldraw started as, um, well, a couple things. It started as like a digital ink library that then, uh, Christopher had me implement in Excalidraw. When I was working on that, I was like, you know, there should probably be like a really good SDK for building these types of things. And I'd already worked on a couple of projects that, uh, were kind of going in that direction. So, I did. Turned out if you build ...