LLMs
X @Avi Chawla
Avi Chawla· 2025-08-18 06:30
If you found it insightful, reshare it with your network.
Find me → @_avichawla
Every day, I share tutorials and insights on DS, ML, LLMs, and RAGs.
Avi Chawla (@_avichawla): Get RAG-ready data from any unstructured file!
@tensorlake transforms unstructured docs into RAG-ready data in a few lines of code. It returns the document layout, structured extraction, bounding boxes, etc.
Works on any complex layout, handwritten docs and multilingual data. https://t.co/lZoNWZb2ip ...
X @Ethereum
Ethereum· 2025-08-13 16:52
5/ This unlocks a new kind of internet commerce:
🧠 LLMs paying for model inference (text, image, video)
🕸️ Agents paying for context for task optimization
🗃️ Apps streaming stablecoins for permanent storage
🧾 Browsers paying to read gated content
🚕 A self-driving taxi owns itself and pays for its maintenance
The new web becomes natively monetizable, by machines. ...
X @Avi Chawla
Avi Chawla· 2025-08-12 06:30
AI Agent Fundamentals
- The document covers agent fundamentals, the foundational knowledge needed to understand AI agents [1]
- It differentiates LLMs, RAG, and agents, clarifying their roles and relationships in AI systems [1]
- It explores agentic design patterns for structuring and organizing AI agents [1]
- It outlines the building blocks of agents, the essential components for constructing them [1]
Practical Applications
- The document includes 12 hands-on projects for AI engineers, giving practical experience in building AI agents [1]
- It covers building custom tools via MCP (Model Context Protocol) to customize and extend agent capabilities [1]
Resource Availability
- A PDF collecting all the AI Agents posts is available for download [1]
X @Avi Chawla
Avi Chawla· 2025-08-11 06:31
General Overview
- The document is a wrap-up message encouraging readers to reshare the content if they found it insightful [1]
- It promotes tutorials and insights on Data Science (DS), Machine Learning (ML), Large Language Models (LLMs), and Retrieval-Augmented Generation (RAG) [1]
Call to Action
- The author, Avi Chawla (@_avichawla), invites readers to find him for more content [1]
Specific Topic
- The document mentions fine-tuning OpenAI gpt-oss (100% locally) [1]
The Future of Evals - Ankur Goyal, Braintrust
AI Engineer· 2025-08-09 15:12
Product & Technology
- Braintrust introduces "Loop," an agent built into its platform that automates and improves prompts, datasets, and scorers for AI model evaluation [4][5][7]
- Loop builds on advances in frontier models, in particular Claude 4, which is described as roughly 6x better at prompt engineering than previous models [6]
- Loop lets users compare suggested edits to data and prompts side by side in the UI, so the data stays visible [9][10]
- Loop supports various models, including OpenAI, Gemini, and custom LLMs [9]
User Engagement & Adoption
- The average organization using Braintrust runs approximately 13 evaluations (evals) per day [3]
- Some advanced customers run over 3,000 evaluations daily and spend more than two hours per day in the product [3]
- Braintrust encourages users to try Loop and provide feedback [12]
Future Vision
- Braintrust anticipates a revolution in AI model evaluation, driven by advances in frontier models [11]
- The company is focused on incorporating these advances into its platform [11]
Hiring
- Braintrust is actively hiring for UI, AI, and infrastructure roles [12]
X @Balaji
Balaji· 2025-08-08 06:46
AI Development
- LLM development may have reached a temporary peak [1]
- The broader deployment of AI has only just started [1]
X @Avi Chawla
Avi Chawla· 2025-08-08 06:34
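RAG Applications
- Enterprises are building RAG systems over more than 100 data sources [1]
- Microsoft offers RAG in its M365 products [1]
- Google offers RAG in Vertex AI Search [1]
- AWS offers RAG in Amazon Q Business [1]
Technology Trends
- The industry is building MCP-powered RAG systems that span more than 200 data sources and run 100% locally [1]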
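To make "RAG over many data sources" concrete, here is a minimal sketch of the retrieval step, assuming a few in-memory sources and simple bag-of-words cosine similarity in place of a real embedding model. The source names, documents, and helper functions are hypothetical and are not taken from the post or from any vendor product.

```python
import math
import re
from collections import Counter

# Hypothetical in-memory "data sources"; a real system would wrap databases,
# file stores, SaaS APIs, and so on behind a similar interface.
SOURCES = {
    "wiki": ["MCP is a protocol for connecting LLMs to tools and data."],
    "tickets": ["Customer reported login failures after the password reset."],
    "docs": ["Reset your password from the account settings page."],
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, sources, k=2):
    """Score every document in every source and return the top-k matches."""
    q = Counter(tokenize(query))
    scored = []
    for name, docs in sources.items():
        for doc in docs:
            scored.append((cosine(q, Counter(tokenize(doc))), name, doc))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(name, doc) for score, name, doc in scored[:k] if score > 0]


def build_prompt(query, hits):
    """Assemble the retrieved passages into a grounded prompt for an LLM."""
    context = "\n".join(f"[{name}] {doc}" for name, doc in hits)
    return f"Answer using only the context below.\n\n{context}\n\nQuestion: {query}"


if __name__ == "__main__":
    question = "how do I reset my password"
    print(build_prompt(question, retrieve(question, SOURCES)))
```

Tagging each passage with its source name keeps the answer traceable, which matters more as the number of sources grows.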
X @Avi Chawla
Avi Chawla· 2025-08-08 06:34
In this demo, we used mcp-use.
It lets us connect LLMs to MCP servers & build local MCP clients in a few lines of code.
- Compatible with Ollama & LangChain
- Stream Agent output async
- Built-in debugging mode, etc
Repo: https://t.co/PWcuwMFvzi (don't forget to star ⭐) ...
X @Avi Chawla
Avi Chawla· 2025-08-06 19:13
AI Engineering Resources
- The document provides 12 cheat sheets for AI engineers covering various topics [1]
- The cheat sheets include visuals to aid understanding [1]
Key AI Topics Covered
- Function calling & MCP (Model Context Protocol) for LLMs (Large Language Models) [1]
- The 4 stages of training LLMs from scratch [1]
- Training LLMs using other LLMs [1]
- Supervised & reinforcement fine-tuning techniques [1]
- RAG (Retrieval-Augmented Generation) vs. Agentic RAG [1]
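For readers new to function calling, the sketch below shows the general idea under simplifying assumptions: the application advertises a tool schema, the model replies with a JSON tool call, and the application dispatches it to a local function. The `get_weather` tool, the schema, and the call format are hypothetical illustrations, not the contents of the cheat sheets or any specific vendor API.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical tool: a real implementation would call a weather service."""
    return f"Sunny in {city}"


# Registry mapping tool names to the Python callables that implement them.
TOOLS = {"get_weather": get_weather}

# JSON-Schema-style description that would be sent to the model so it knows
# which tools exist and what arguments they take (the shape is an assumption).
TOOL_SCHEMAS = [
    {
        "name": "get_weather",
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }
]


def dispatch(tool_call_json: str) -> str:
    """Run the tool named in a model-produced call and return a JSON result."""
    call = json.loads(tool_call_json)
    fn = TOOLS.get(call["name"])
    if fn is None:
        return json.dumps({"error": f"unknown tool: {call['name']}"})
    return json.dumps({"result": fn(**call["arguments"])})


if __name__ == "__main__":
    # Stand-in for what a model might emit after seeing TOOL_SCHEMAS.
    model_output = '{"name": "get_weather", "arguments": {"city": "Paris"}}'
    print(dispatch(model_output))
```

MCP addresses the same need at the protocol level: tools live behind a server that any compatible client can discover and call, instead of being registered inside one application.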
Evals Are Not Unit Tests — Ido Pesok, Vercel v0
AI Engineer· 2025-08-06 16:14
Key Takeaways on LLM Evaluation
- LLMs can be unreliable, which hurts user experience and application usability [6]
- AI applications are prone to failure in production despite successful demos [7]
- It is crucial to build reliable software on top of LLMs through methods like prompt engineering [8]
Evaluation Strategies and Best Practices
- Evals should focus on relevant user queries and avoid out-of-bounds scenarios [19]
- Data collection methods include thumbs up/down feedback, log analysis, and community forums [21][22][23]
- Evals should test across the entire data distribution to understand system performance [20][24]
- Constants should be factored into data, and variables into tasks, for clarity and reuse [25][26]
- Evaluation scores should be deterministic and simple for easier debugging and team collaboration [29][30]
- Evals should be integrated into CI pipelines to detect improvements and regressions [34][35]
Vercel's Perspective
- Vercel's v0 is a full-stack web coding platform designed for rapid prototyping and building [1]
- v0 recently launched GitHub sync, enabling code push and pull directly from the platform [2]
- Vercel emphasizes the importance of continuous evaluation to improve AI app reliability and quality [37]
- Vercel has reached 100 million messages sent on v0 [2]
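The practices above can be sketched as a tiny eval harness. This is a minimal illustration under stated assumptions, not Vercel's tooling: the dataset, the `fake_model` task, and the `ci_gate` threshold check are all hypothetical. The fixed cases live in the data, the part under test is passed in as the task, the scorer is a deterministic exact match, and the gate is what a CI job would assert on.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Case:
    """One eval example: the constant inputs and expected output live in data."""
    query: str
    expected: str


# Hypothetical dataset; in practice this is sampled from real user queries.
DATASET = [
    Case("2 + 2", "4"),
    Case("capital of France", "Paris"),
    Case("3 * 3", "9"),
]


def exact_match(output: str, expected: str) -> float:
    """Deterministic scorer: 1.0 on a case-insensitive exact match, else 0.0."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0


def run_eval(
    task: Callable[[str], str],
    dataset: Sequence[Case],
    scorer: Callable[[str, str], float] = exact_match,
) -> float:
    """Run the task (the variable under test) over the data and average scores."""
    scores = [scorer(task(case.query), case.expected) for case in dataset]
    return sum(scores) / len(scores)


def fake_model(query: str) -> str:
    """Stand-in for an LLM call so the sketch runs offline."""
    answers = {"2 + 2": "4", "capital of France": "paris"}
    return answers.get(query, "unknown")


def ci_gate(score: float, baseline: float, tolerance: float = 0.0) -> bool:
    """Return True when the score has not regressed below the stored baseline."""
    return score >= baseline - tolerance


if __name__ == "__main__":
    score = run_eval(fake_model, DATASET)
    print(f"score={score:.2f} pass={ci_gate(score, baseline=0.5)}")
```

Because the scorer is deterministic, a change in the score between two CI runs points to a change in the task (prompt, model, or pre-processing), not to noise in the grading.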