RAG (Retrieval-Augmented Generation)
Demos Win Applause, Production Wins Survival: A 64-Page AI Agent Implementation Guide for Founders | Jinqiu Select
锦秋集· 2025-09-25 05:54
Over the past two years, nearly every AI founder has been able to win applause with a demo: a conversational prototype or a multi-tool showcase is enough to make investors' eyes light up. But reality quickly pours cold water on this: users will not pay for a flashy demo, and enterprises will not hand critical processes to an "unreliable model toy." What separates a demo from production is usually not the model, but a chasm of engineering, reliability, and commercialization. Google recently published an in-depth technical guide to AI agent development that systematically lays out its thinking and methodology for taking an agent from an early prototype to a production-grade application. The guide highlights several core techniques and priorities. Jinqiu Capital (WeChat account: 锦秋集; ID: jqcapital) believes this piece not only systematically organizes the technical knowledge needed to build advanced AI agents, but also provides a set of actionable engineering practices and automation tools, giving startups and developers who want to tap the potential of agent systems a clear, operations-driven roadmap. Based on Google's report, you will see:
- Core concepts of AI agents: a deep understanding of the key components that make up an agent, including its "brain" (the model), its "hands" (tools), its executive function (orchestration), and the grounding mechanism that keeps its answers anchored in accurate information (a minimal illustration of this loop follows this entry).
- A code-first approach to building: learn how to use Google's ...
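The anatomy described above (model as the "brain", tools as the "hands", orchestration as the executive function, grounding for factual accuracy) can be pictured as a small control loop. The following is a minimal, illustrative sketch under that framing only; model_call, search_web, and lookup_docs are invented placeholders, not code or APIs from Google's guide.

```python
# Minimal agent loop: model ("brain") + tools ("hands") + orchestration + grounding.
# Illustrative only: model_call and both tools are hypothetical placeholders,
# not APIs from Google's guide.
from typing import Callable

def search_web(query: str) -> str:
    # Tool: an external action the agent can take.
    return f"[web results for: {query}]"

def lookup_docs(query: str) -> str:
    # Tool: grounding the agent in private, authoritative documents.
    return f"[retrieved passages for: {query}]"

TOOLS: dict[str, Callable[[str], str]] = {
    "search_web": search_web,
    "lookup_docs": lookup_docs,
}

def model_call(context: str) -> dict:
    # Placeholder for the LLM: returns either a tool request or a final answer.
    if "Observation:" not in context:
        return {"action": "lookup_docs", "content": context}
    return {"action": "final", "content": f"(answer grounded in) {context[-120:]}"}

def run_agent(task: str, max_steps: int = 5) -> str:
    context = f"Task: {task}"
    for _ in range(max_steps):                      # orchestration: a bounded act/observe loop
        decision = model_call(context)
        if decision["action"] == "final":
            return decision["content"]
        observation = TOOLS[decision["action"]](decision["content"])
        context += f"\nObservation: {observation}"  # grounding: feed real data back to the model
    return "Stopped: step budget exhausted."

print(run_agent("Summarize our Q3 incident reports."))
```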
X @Avi Chawla
Avi Chawla· 2025-08-18 18:56
RT Avi Chawla (@_avichawla): Get RAG-ready data from any unstructured file! @tensorlake transforms unstructured docs into RAG-ready data in a few lines of code. It returns the document layout, structured extraction, bounding boxes, etc. Works on any complex layout, handwritten docs, and multilingual data. https://t.co/lZoNWZb2ip ...
X @Avi Chawla
Avi Chawla· 2025-08-18 06:30
Product Overview
- Tensorlake transforms unstructured documents into RAG-ready data with a few lines of code [1]
- The solution provides document layout, structured extraction, and bounding boxes [1]
- It supports complex layouts, handwritten documents, and multilingual data [1]
Technology Focus
- The company focuses on enabling RAG (Retrieval-Augmented Generation) applications [1]
- The technology extracts structured information from unstructured files [1]
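To make "RAG-ready data" concrete downstream of such a parser, here is a hedged sketch that assumes the parser has already produced page elements with text, layout type, and bounding boxes; the DocElement shape and chunk_elements helper are invented for illustration and are not Tensorlake's SDK.

```python
# Hypothetical post-processing of parsed document elements into retrieval-ready chunks
# that keep page and bounding-box metadata for citations. Not Tensorlake's API.
from dataclasses import dataclass

@dataclass
class DocElement:
    page: int
    kind: str                                     # "heading", "paragraph", "table", ...
    text: str
    bbox: tuple[float, float, float, float]       # x0, y0, x1, y1 on the page

def chunk_elements(elements: list[DocElement], max_chars: int = 800) -> list[dict]:
    chunks: list[dict] = []
    buf: list[str] = []
    meta: list[dict] = []

    def flush() -> None:
        if buf:
            chunks.append({"text": "\n".join(buf), "sources": list(meta)})
            buf.clear()
            meta.clear()

    for el in elements:
        if el.kind == "heading":                  # start a fresh chunk at each heading
            flush()
        buf.append(el.text)
        meta.append({"page": el.page, "bbox": el.bbox})
        if sum(len(t) for t in buf) >= max_chars:
            flush()
    flush()
    return chunks

docs = [
    DocElement(1, "heading", "Warranty Terms", (50, 40, 400, 60)),
    DocElement(1, "paragraph", "Coverage lasts 24 months from purchase.", (50, 70, 400, 120)),
]
print(chunk_elements(docs))
```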
One-Click Enterprise RAG Pipeline with DDN Infinia & NVIDIA NeMo | High-Performance AI Data Solution
DDN· 2025-08-08 16:27
Solution Overview
- DDN provides a one-click high-performance RAG (Retrieval-Augmented Generation) pipeline for enterprise use, deployable across various environments [1]
- The RAG pipeline incorporates NVIDIA NIM within the NVIDIA NeMo framework, hosting embedding, reranking, and LLM models, along with a Milvus vector database [2]
- A DDN Infinia cluster, with approximately 0.75 petabytes of capacity, serves as the backend for the Milvus vector database [3]
Technical Details
- Infinia's AI-optimized architecture, combined with KVS, accelerates NVIDIA GPU indexing [3]
- The solution uses an NVIDIA AI data platform reference design to facilitate the creation of custom knowledge bases that extend LLM capabilities [4]
- The one-click RAG pipeline supports multiple foundation models for query optimization [7]
Performance and Benefits
- Integration between DDN Infinia and NVIDIA NeMo Retriever, along with NVIDIA KVS, results in faster response times, quicker data updates, and rapid deployment of custom chatbots [9]
- The RAG pipeline enables detailed and accurate responses to specific queries, as demonstrated by the Infinia CLI hardware management example [8][9]
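As a rough sketch of the query path such a pipeline implies (embed the question, search the vector database, rerank, then generate), the snippet below uses pymilvus' MilvusClient against an assumed local Milvus endpoint and a collection named "docs"; embed(), rerank(), and generate() are stubs standing in for the NIM-hosted embedding, reranking, and LLM services, so this illustrates the flow rather than DDN's actual implementation.

```python
# Illustrative embed -> vector search -> rerank -> generate query path.
# The endpoint, collection name, and the three stub functions are assumptions;
# in the described solution they would be Milvus on DDN Infinia plus NIM services.
from pymilvus import MilvusClient

client = MilvusClient(uri="http://localhost:19530")        # assumed Milvus endpoint

def embed(text: str) -> list[float]:
    return [0.0] * 1024                                     # stub for the embedding NIM

def rerank(query: str, passages: list[str]) -> list[str]:
    return passages                                         # stub for the reranking NIM

def generate(prompt: str) -> str:
    return f"(LLM answer to a {len(prompt)}-char prompt)"   # stub for the LLM NIM

def answer(question: str, top_k: int = 20, keep: int = 5) -> str:
    hits = client.search(
        collection_name="docs",                             # assumed collection name
        data=[embed(question)],
        limit=top_k,
        output_fields=["text"],
    )[0]
    passages = [hit["entity"]["text"] for hit in hits]
    context = "\n\n".join(rerank(question, passages)[:keep])
    prompt = f"Answer using only this context:\n\n{context}\n\nQuestion: {question}"
    return generate(prompt)
```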
When Vectors Break Down: Graph-Based RAG for Dense Enterprise Knowledge - Sam Julien, Writer
AI Engineer· 2025-07-22 16:30
Enterprise knowledge bases are filled with dense, overlapping content: thousands of documents where similar terms appear repeatedly, causing traditional vector retrieval to return the wrong version or irrelevant information. When our customers kept hitting this wall with their RAG systems, we knew we needed a fundamentally different approach. In this talk, I'll share Writer's journey developing a graph-based RAG architecture that achieved 86.31% accuracy on the RobustQA benchmark while maintaining sub-second response ...
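As a generic illustration of the graph-based retrieval idea (not Writer's architecture), the sketch below indexes passages against the entities they mention and expands a query's matched entities to their graph neighbors before collecting passages; the toy graph and the extract_entities() heuristic are invented for this example.

```python
# Generic graph-based retrieval: match query entities, then pull passages attached to
# those entities and their neighbors. Illustrative only; not Writer's system.
from collections import defaultdict

edges: dict[str, set[str]] = defaultdict(set)             # entity -> related entities
passages_by_entity: dict[str, list[str]] = defaultdict(list)

def add_fact(subject: str, obj: str, passage: str) -> None:
    edges[subject].add(obj)
    edges[obj].add(subject)
    passages_by_entity[subject].append(passage)
    passages_by_entity[obj].append(passage)

def extract_entities(text: str) -> list[str]:
    # Placeholder: a real system would use an LLM or NER model here.
    return [tok.strip("?.,").lower() for tok in text.split() if tok.istitle()]

def graph_retrieve(query: str, hops: int = 1) -> list[str]:
    frontier = {e for e in extract_entities(query) if e in edges}
    for _ in range(hops):                                  # expand to neighbors for related context
        frontier |= {n for e in list(frontier) for n in edges[e]}
    results: list[str] = []
    for entity in frontier:
        for passage in passages_by_entity[entity]:
            if passage not in results:
                results.append(passage)
    return results

add_fact("policy-v2", "policy-v1", "Policy v2 supersedes v1 as of 2024.")
print(graph_retrieve("What does Policy-V2 change?"))
```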
X @Avi Chawla
Avi Chawla· 2025-07-10 06:30
RAG Architectures
- Naive RAG is contrasted with Agentic RAG, highlighting architectural differences [1]
- The explanation includes visuals to aid understanding of the two RAG approaches [1]
Key Concepts
- The document clearly explains the concepts of Naive RAG and Agentic RAG [1]
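For readers without the visuals, the sketch below contrasts the two control flows in generic terms (it is not code from the thread): naive RAG retrieves once and answers, while agentic RAG lets the model grade the retrieved context and reformulate the query before answering; retrieve(), generate(), grade(), and rewrite_query() are illustrative stubs.

```python
# Naive RAG: one retrieve-then-generate pass.
# Agentic RAG: the model judges the context and can re-query before answering.
# All four helpers below are placeholders, not a real implementation.

def retrieve(query: str) -> list[str]:
    return [f"passage about {query}"]

def generate(query: str, context: list[str]) -> str:
    return f"answer to '{query}' using {len(context)} passages"

def grade(query: str, context: list[str]) -> bool:
    # Placeholder relevance check; a real system would ask an LLM or a reranker.
    return bool(context)

def rewrite_query(query: str) -> str:
    return query + " (reformulated)"

def naive_rag(query: str) -> str:
    return generate(query, retrieve(query))          # single pass, no self-correction

def agentic_rag(query: str, max_rounds: int = 3) -> str:
    for _ in range(max_rounds):
        context = retrieve(query)
        if grade(query, context):                    # agent decides the context is sufficient
            return generate(query, context)
        query = rewrite_query(query)                 # otherwise it adapts and retries
    return generate(query, retrieve(query))
```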
RAG in 2025: State of the Art and the Road Forward — Tengyu Ma, MongoDB (Voyage AI)
AI Engineer· 2025-06-27 09:59
Retrieval Augmented Generation (RAG) & Large Language Models (LLMs)
- RAG is essential for enterprises to incorporate proprietary information into LLMs, addressing the limitations of out-of-the-box models [2][3]
- RAG is considered a more reliable, faster, and cheaper approach than fine-tuning or long context windows for utilizing external knowledge [7]
- The industry has seen significant improvements in retrieval accuracy over the past 18 months, driven by advances in embedding models [11][12]
- The industry averages approximately 80% accuracy across 100 datasets, indicating roughly 20% of remaining headroom in retrieval tasks [12][13]
Vector Embeddings & Storage Optimization
- Techniques like Matryoshka learning and quantization can reduce vector storage costs by up to 100x with minimal performance loss (5-10%) [15][16][17]; a storage-arithmetic sketch follows this summary
- Domain-specific embeddings, such as those customized for code, offer better trade-offs between storage cost and accuracy [21]
RAG Enhancement Techniques
- Hybrid search, combining lexical and vector search with re-rankers, improves retrieval performance [18]
- Query decomposition and document enrichment, including adding metadata and context, enhance retrieval accuracy [18][19][20]
Future of RAG
- The industry predicts a shift toward more sophisticated models that minimize the need for manual "tricks" to improve RAG performance [29][30]
- Multimodal embeddings, which can process screenshots, PDFs, and videos, simplify workflows by eliminating separate data extraction and embedding steps [32]
- Context-aware and auto-chunking embeddings aim to automate the chunking process and incorporate cross-chunk information, optimizing retrieval and cost [33][36]
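To make the "up to 100x" storage claim above concrete, here is a rough, illustrative calculation assuming Matryoshka-style embeddings (the leading dimensions remain usable on their own) combined with int8 and binary quantization; the dimensions and resulting factors are assumptions for illustration, not figures from the talk.

```python
# Back-of-the-envelope storage arithmetic for truncation + quantization.
# Numbers are illustrative, not measurements from the talk.
import numpy as np

dim_full, dim_trunc = 1024, 256
full = np.random.randn(dim_full).astype(np.float32)          # stand-in for a real embedding

truncated = full[:dim_trunc]                                  # Matryoshka: keep leading dims only
int8_vec = np.clip(np.round(truncated / np.abs(truncated).max() * 127), -127, 127).astype(np.int8)
binary_vec = np.packbits(truncated > 0)                       # 1 bit per kept dimension

bytes_full = full.nbytes                                      # 1024 * 4 = 4096 bytes
bytes_int8 = int8_vec.nbytes                                  # 256 bytes  -> ~16x smaller
bytes_bin = binary_vec.nbytes                                 # 32 bytes   -> ~128x smaller
print(bytes_full / bytes_int8, bytes_full / bytes_bin)
```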
X @Avi Chawla
Avi Chawla· 2025-06-17 19:11
I built an MCP-powered RAG over video files. You can:
- Ingest a video file.
- Ask questions to get natural language responses.
- Extract precise timestamps that answer the question.
Check the explainer thread below: https://t.co/hy5aVCQj9p
Avi Chawla (@_avichawla): Let's build an MCP-powered RAG over videos, step-by-step: ...
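As a generic sketch of how a video RAG can return timestamps (not the implementation from the thread), the snippet below indexes transcript segments with their start and end times and returns the best-matching segments for a question; the keyword scorer is a stand-in for real embedding-based retrieval.

```python
# Timestamp-returning retrieval over a video transcript. Illustrative only;
# the keyword score() is a placeholder for embedding similarity.
from dataclasses import dataclass

@dataclass
class Segment:
    start: float      # seconds
    end: float
    text: str

def score(question: str, segment: Segment) -> int:
    q_words = set(question.lower().split())
    return len(q_words & set(segment.text.lower().split()))

def ask(question: str, segments: list[Segment], top_k: int = 2) -> list[tuple[str, float, float]]:
    ranked = sorted(segments, key=lambda s: score(question, s), reverse=True)[:top_k]
    return [(s.text, s.start, s.end) for s in ranked if score(question, s) > 0]

transcript = [
    Segment(0.0, 12.5, "Welcome, today we cover vector databases and indexing."),
    Segment(12.5, 40.0, "HNSW indexing trades memory for faster approximate search."),
]
print(ask("How does HNSW indexing work?", transcript))
```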