X @Avi Chawla
Avi Chawla· 2025-07-28 06:30
Technology & Development
- Open-source tools enable building production-grade LLM web apps rapidly [1] (see the sketch after this list)
- Interactive apps are more suitable for users focused on results rather than code [1]

Data Science & Machine Learning
- Data scientists and machine learning engineers commonly use Jupyter for data exploration and model building [1]
- Tutorials and insights on DS, ML, LLMs, and RAGs are shared regularly [1]
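Below is a minimal, hedged sketch of the kind of interactive LLM web app the post gestures at. The original post does not name a framework or model; Streamlit is assumed here as one such open-source tool, and `answer()` is a hypothetical placeholder for whatever model backend would sit behind the app.

```python
# Minimal interactive LLM app sketch. Streamlit is assumed as the open-source
# framework; `answer()` is a hypothetical stand-in for a real model call.
import streamlit as st

def answer(prompt: str) -> str:
    # Placeholder: call your LLM of choice here.
    return f"(model reply to: {prompt})"

st.title("Ask the model")

if "history" not in st.session_state:
    st.session_state.history = []

# Replay earlier turns so users see results, not code.
for role, text in st.session_state.history:
    with st.chat_message(role):
        st.markdown(text)

if prompt := st.chat_input("Ask a question"):
    st.session_state.history.append(("user", prompt))
    with st.chat_message("user"):
        st.markdown(prompt)
    reply = answer(prompt)
    st.session_state.history.append(("assistant", reply))
    with st.chat_message("assistant"):
        st.markdown(reply)
```

Run with `streamlit run app.py`; the same loop can grow out of a Jupyter prototype once the exploratory code settles into a function like `answer()`.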
X @Avi Chawla
Avi Chawla· 2025-07-27 19:23
LLM Technical Analysis
- KV caching in LLMs: the KV caching mechanism in LLMs is clearly explained, with accompanying visual diagrams [1]
X @Avi Chawla
Avi Chawla· 2025-07-27 06:31
Key Takeaways
- The author encourages readers to reshare the content if they found it insightful [1]
- The author shares tutorials and insights on DS (Data Science), ML (Machine Learning), LLMs (Large Language Models), and RAGs (Retrieval-Augmented Generation) daily [1]

Focus Area
- The content clearly explains KV caching in LLMs with visuals [1]

Author Information
- Avi Chawla's Twitter handle is @_avichawla [1]
X @Avi Chawla
Avi Chawla· 2025-07-27 06:30
Technology Overview
- KV caching is utilized in Large Language Models (LLMs) to enhance performance [1] (see the sketch after this list)
- The document provides a clear explanation of KV caching in LLMs with visuals [1]
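As a companion to the post's visuals, here is a minimal single-head sketch of the idea: at each decoding step only the newest token's key and value are computed and appended to a cache, so attention spans the whole prefix without re-projecting earlier tokens. The dimensions, projection matrices, and inputs are toy values, not anything from the original thread.

```python
# Toy single-head KV-cache sketch in NumPy.
import numpy as np

d = 8                      # model/head dimension (toy)
rng = np.random.default_rng(0)
W_q, W_k, W_v = (rng.standard_normal((d, d)) for _ in range(3))

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

k_cache, v_cache = [], []  # grows by one row per generated token

def decode_step(x_t):
    """x_t: (d,) embedding of the newest token only."""
    q = x_t @ W_q
    k_cache.append(x_t @ W_k)            # cache K/V once; never recompute them
    v_cache.append(x_t @ W_v)
    K = np.stack(k_cache)                # (t, d) keys for the whole prefix
    V = np.stack(v_cache)
    attn = softmax(q @ K.T / np.sqrt(d)) # attend over all cached positions
    return attn @ V                      # (d,) context vector for this step

for step in range(5):
    out = decode_step(rng.standard_normal(d))
print("cached positions:", len(k_cache), "output dim:", out.shape)
```

Without the cache, every step would re-project the entire prefix through W_k and W_v; with it, the per-step projection cost stays constant while attention still covers all previous tokens.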
X @Avi Chawla
Avi Chawla· 2025-07-26 06:30
General Overview
- The document is a wrap-up and encourages sharing with the network [1]
- It directs readers to Avi Chawla's profile for tutorials and insights on DS, ML, LLMs, and RAGs (Data Science, Machine Learning, Large Language Models, and Retrieval-Augmented Generation) [1]

Focus Area
- Avi Chawla's content includes explanations of Agentic AI systems [1]
How Intuit uses LLMs to explain taxes to millions of taxpayers - Jaspreet Singh, Intuit
AI Engineer· 2025-07-23 15:51
Intuit's Use of LLMs in TurboTax
- Intuit successfully processed 44 million tax returns for tax year 2023, aiming to give users high confidence in their tax filings and ensure they receive the best deductions [2]
- Intuit's GenAI experiences are built on GenOS, a proprietary generative OS platform designed to address the limitations of out-of-the-box tooling, especially concerning regulatory compliance, safety, and security in the tax domain [4][5]
- Intuit uses Claude (Anthropic) for static queries related to tax refunds and OpenAI's GPT-4 for dynamic question answering, such as user-specific tax inquiries [9][10][12]
- Intuit is one of the biggest users of Claude, with a multi-million dollar contract [9][10]

Development and Evaluation
- Intuit emphasizes a phased evaluation system, starting with manual evaluations by tax analysts and transitioning to automated evaluations using an LLM as a judge [16][17]
- Tax analysts also serve as prompt engineers, leveraging their expertise to ensure accurate evaluations and prompt design [16][17]
- Key evaluation pillars include accuracy, relevancy, and coherence, with a strong focus on tax accuracy [20][24]
- Intuit uses AWS Ground Truth for creating golden datasets for evaluations [22]

Challenges and Learnings
- LLM contracts are expensive, and long-term contracts are slightly cheaper but create vendor lock-in [25][26]
- LLM models have higher latency than backend services (3-10 seconds), which can be exacerbated during peak tax season [27][28]
- Intuit employs safety guardrails and ML models to prevent hallucination of numbers in LLM responses, ensuring data accuracy [40][41] (see the sketch after this list)
- Graph RAG outperforms regular RAG in providing personalized and helpful answers to users [42][43]
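The talk does not detail how the numeric guardrails are implemented; the sketch below only illustrates the general idea of grounding every number in a response against trusted source data. The field names, example figures, and the regex-based check are illustrative assumptions, not Intuit's actual system.

```python
# Illustrative numeric-hallucination guardrail: every number in a model
# response must be traceable to trusted source data (here, a dict of the
# user's tax facts). Not Intuit's implementation; a sketch of the idea.
import re

def extract_numbers(text: str) -> set[str]:
    # Normalize "$1,450" -> "1450" so formatting differences don't matter.
    return {m.replace(",", "").lstrip("$")
            for m in re.findall(r"\$?\d[\d,]*(?:\.\d+)?", text)}

def verify_numbers(response: str, trusted_facts: dict) -> list[str]:
    """Return the numbers in `response` that cannot be grounded in the facts."""
    allowed = set()
    for value in trusted_facts.values():
        allowed |= extract_numbers(str(value))
    return sorted(extract_numbers(response) - allowed)

facts = {"wages": 85000, "federal_withholding": 9200, "refund": 1450}
good = "Based on your reported wages of $85,000, your estimated refund is $1,450."
print(verify_numbers(good, facts))   # [] -> every figure is grounded

bad = "Your refund is $3,200."
print(verify_numbers(bad, facts))    # ['3200'] -> flag for review or regeneration
```

A check like this can gate the response: if any ungrounded number is found, the answer is regenerated or routed to a stricter template instead of being shown to the user.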
POC to PROD: Hard Lessons from 200+ Enterprise GenAI Deployments - Randall Hunt, Caylent
AI Engineer· 2025-07-23 15:50
Core Business & Services
- Caylent builds custom solutions for clients, ranging from Fortune 500 companies to startups, focusing on app development and database migrations [1][2]
- The company leverages generative AI to automate business functions, such as intelligent document processing for logistics management, achieving faster and better results than human annotators [20][21]
- Caylent offers services ranging from chatbot and co-pilot development to AI agent creation, tailoring solutions to specific client needs [16]

Technology & Architecture
- The company utilizes multimodal search and semantic understanding of videos, employing models like Nova Pro and Titan v2 for indexing and searching video content [6][7]
- Caylent uses Postgres (with the pgvector extension) and OpenSearch for vector search implementations [13] (see the sketch after this list)
- The company builds AI systems on AWS, utilizing services like Bedrock and SageMaker, and custom silicon like Trainium and Inferentia for price-performance improvements of approximately 60% over Nvidia GPUs [27]

AI Development & Strategy
- Prompt engineering has proven highly effective, sometimes negating the need for fine-tuning models [40]
- Context management is crucial for differentiating applications, leveraging user data and history to make strategic inferences [33][34]
- UX design is important for mitigating the slowness of inference, with techniques like caching and UI spinners improving user experience [36][37]
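The talk names Postgres with pgvector as one vector-search option but does not show a schema; the sketch below assumes a hypothetical `documents(id, content, embedding vector)` table, a local connection string, and the pgvector cosine-distance operator.

```python
# Sketch of a pgvector similarity search. Table/column names and the DSN are
# illustrative assumptions; psycopg2 and the pgvector extension are assumed
# to be installed on the Postgres side.
import psycopg2

def search(query_embedding: list[float], k: int = 5):
    conn = psycopg2.connect("dbname=app user=app")  # hypothetical connection string
    vec = "[" + ",".join(f"{x:.6f}" for x in query_embedding) + "]"
    with conn, conn.cursor() as cur:
        cur.execute(
            """
            SELECT id, content, embedding <=> %s::vector AS cosine_distance
            FROM documents
            ORDER BY embedding <=> %s::vector   -- pgvector cosine-distance operator
            LIMIT %s
            """,
            (vec, vec, k),
        )
        return cur.fetchall()

# results = search(embed("How do we track this shipment?"))  # embed() is hypothetical
```

The same query shape works whether the embeddings come from Titan v2 or another model; only the vector dimension of the `embedding` column has to match.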
X @Balaji
Balaji· 2025-07-22 21:10
Yes. But then comes the third level of defense, which is trusted human moderators doing occasional bot-or-not flagging to train the algorithms. I think in practice you could get fairly good at this if the system was built for it, and if most humans in the network cooperated.

Yishan (@yishan): @balajis I think this will run into the "motivated bears are smarter than the laziest humans" problem and any system that detects all bots will have a high false positive rate. This is probably ok in practice because huma ...
Practical GraphRAG: Making LLMs smarter with Knowledge Graphs — Michael, Jesus, and Stephen, Neo4j
AI Engineer· 2025-07-22 17:59
Graph RAG Overview
- Graph RAG aims to enhance LLMs by incorporating knowledge graphs, addressing limitations like lack of domain knowledge, unverifiable answers, hallucinations, and biases [1][3][4][5][9][10]
- Graph RAG leverages knowledge graphs (collections of nodes, relationships, and properties) to provide more relevant, contextual, and explainable results compared to basic RAG systems using vector databases [8][9][10][12][13][14]
- Microsoft research indicates Graph RAG can achieve better results with lower token costs, supported by studies showing improvements in capabilities and analyst trends [15][16]

Knowledge Graph Construction
- Knowledge graph construction involves structuring unstructured information, extracting entities and relationships, and enriching the graph with algorithms [19][20][21][22]
- Lexical graphs represent documents and elements (chunks, sections, paragraphs) with relationships based on document structure, temporal sequence, and similarity [25][26]
- Entity extraction utilizes LLMs with graph schemas to identify entities and relationships from text, potentially integrating with existing knowledge graphs or structured data like CRM systems [27][28][29][30]
- Graph algorithms (clustering, link prediction, PageRank) enrich the knowledge graph, enabling cross-document topic identification and summarization [20][30][34]

Graph RAG Retrieval and Applications
- Graph RAG retrieval involves an initial index search (vector, full text, hybrid) followed by traversing relationships to fetch additional context, considering user context for tailored results [32][33] (see the sketch after this list)
- Modern LLMs are increasingly trained on graph processing, allowing them to effectively utilize node-relationship-node patterns provided as context [34]
- Tools and libraries are available for knowledge graph construction from various sources (PDFs, YouTube transcripts, web articles), with open-source options for implementation [35][36][39][43][45]
- Agentic approaches in Graph RAG break down user questions into tasks, using domain-specific retrievers and tools in sequence or loops to generate comprehensive answers and visualizations [42][44]
- Industry leaders are adopting Graph RAG for production applications, such as LinkedIn's customer support, which saw a 28.6% reduction in median per-issue resolution time [17][18]
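A minimal sketch of the retrieval pattern described in the talk: a vector index search over chunks, followed by a traversal to the entities those chunks mention. The index name (`chunk_embeddings`), labels (`Entity`), and `MENTIONS` relationship are assumed schema choices for illustration, not the talk's exact model.

```python
# Graph RAG retrieval sketch against Neo4j: vector search first, then a hop
# through the graph to pull related entities as extra context for the LLM.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

CYPHER = """
CALL db.index.vector.queryNodes($index, $k, $embedding)   // initial index search
YIELD node AS chunk, score
MATCH (chunk)-[:MENTIONS]->(e:Entity)-[r]-(n:Entity)      // traverse relationships
RETURN chunk.text AS text, score,
       collect(DISTINCT e.name + ' -[' + type(r) + ']-> ' + n.name) AS graph_context
ORDER BY score DESC
"""

def graph_rag_retrieve(embedding: list[float], k: int = 5):
    """Return top-k chunks plus the node-relationship-node triples around them."""
    with driver.session() as session:
        result = session.run(CYPHER, index="chunk_embeddings", k=k, embedding=embedding)
        return [record.data() for record in result]

# context = graph_rag_retrieve(embed(question))   # embed() is a hypothetical helper
```

The returned `graph_context` strings are exactly the node-relationship-node patterns the talk says modern LLMs handle well when pasted into the prompt alongside the retrieved chunk text.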
Excalidraw: AI and Human Whiteboarding Partnership - Christopher Chedeau
AI Engineer· 2025-07-21 19:12
[Music] Thank you so much for the intro. I'm so excited to be here talking about figuring out how AI and humans work together in the world of whiteboarding. I built Excalidraw, and if you don't know about it, you'll see plenty of it here. One expectation you probably have about a speaker at the AI Engineer conference is that I'll talk about AI in every single sentence for the entire talk. So I'm just going to give you a warning: I'm only going to do it for the second half ...