RAG

The large-model fundamentals for the embodied AI field are all here......
具身智能之心· 2025-09-20 16:03
Core Viewpoint
- The article emphasizes the importance of a comprehensive community for learning and sharing knowledge about large models, particularly in the fields of embodied AI and autonomous driving, highlighting the establishment of the "Large Model Heart Tech Knowledge Planet" as a platform for collaboration and technical exchange [1][3].

Group 1: Community and Learning Resources
- The "Large Model Heart Tech" community aims to provide a platform for technical exchange related to large models, inviting experts from renowned universities and leading companies in the field [3][67].
- The community offers a detailed learning roadmap for various aspects of large models, including RAG, AI Agents, and multimodal models, making it suitable for beginners and advanced learners [4][43].
- Members can access a wealth of resources, including academic progress, industrial applications, job recommendations, and networking opportunities with industry leaders [7][70].

Group 2: Technical Roadmaps
- The community has outlined specific learning paths for RAG, AI Agents, and multimodal large models, detailing subfields and applications to facilitate systematic learning [9][43].
- For RAG, the community provides resources on various subfields such as Graph RAG, Knowledge-Oriented RAG, and applications in AIGC [10][23].
- The AI Agent section includes comprehensive introductions, evaluations, and advancements in areas like multi-agent systems and self-evolving agents [25][39].

Group 3: Future Plans and Engagement
- The community plans to host live sessions with industry experts, allowing members to engage with leading figures in academia and industry [66].
- There is a focus on job sharing and recruitment information to empower members in their career pursuits within the large model domain [70].
But I still want to say: individuals and small teams should stay away from training large models!
自动驾驶之心· 2025-09-20 16:03
This hot take needs a lot of qualifiers attached, but what I want to say really is the title; consider it an attempt to align our understanding. The hot take naturally raises a question: if you don't train a large model, what do you do? Why not fine-tune? Because you don't have the model's original data mixture, and more likely you don't even have the original training data, so fine-tuning is very likely to destroy most of the model's performance. Then what if an open-source model performs very poorly in a specific domain? For a highly vertical domain, try RAG first; if that fails, try in-context learning, teaching the LLM some domain knowledge inside the context. Only after every low-cost option has been exhausted should you consider fine-tuning a vertical domain model! Some experience from actual use: hand the tasks that need the most thinking to o1-class models, and the tasks that need moderate thinking to the 4o tier. Beyond paid models, also consider domestic large models; DeepSeek, Doubao, Qwen, and other open-source models deserve a shout-out. This is essentially the agentic AI approach. If your business cannot be made to work with any of the schemes above, then training your own model is most likely wasted effort too. In the era of large models, every jump in base-model capability counts as a version update of "Earth Online". AI practitioners outside the big companies' base-model teams first need to understand the performance boundaries of existing LLMs, keenly distinguish how current model capabilities differ from past approaches, and whether they can bring new changes to the business at hand, ...
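The escalation path the author recommends (RAG first, then in-context learning, fine-tuning last) can be sketched as prompt assembly. This is a minimal illustration with a toy keyword retriever; the function names, corpus, and scoring are all hypothetical, not any particular framework's API.

```python
def retrieve(query, corpus, top_k=2):
    """Toy keyword retriever standing in for a real RAG search step."""
    scored = sorted(
        corpus.items(),
        key=lambda kv: sum(w in kv[1].lower() for w in query.lower().split()),
        reverse=True,
    )
    return [text for _, text in scored[:top_k]]

def build_prompt(query, corpus, examples=()):
    """Assemble a prompt: retrieved context first, then few-shot examples."""
    parts = ["Answer using only the context below."]
    for snippet in retrieve(query, corpus):
        parts.append("Context: " + snippet)
    for q, a in examples:  # in-context learning: teach domain knowledge in context
        parts.append("Q: " + q + "\nA: " + a)
    parts.append("Q: " + query + "\nA:")
    return "\n\n".join(parts)

corpus = {"doc1": "The XR-7 valve tolerates 40 bar of pressure.",
          "doc2": "Maintenance intervals are every 500 hours."}
prompt = build_prompt("What pressure does the XR-7 valve tolerate?", corpus)
```

Both cheap options compose in one prompt, which is the point: they cost nothing but tokens, while fine-tuning risks the performance loss described above.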
X @Avi Chawla
Avi Chawla· 2025-09-20 06:33
The ultimate Full-stack AI Engineering roadmap to go from 0 to 100. This is the exact mapped-out path on what it actually takes to go from Beginner → Full-Stack AI Engineer.
> Start with Coding Fundamentals.
> Learn Python, Bash, Git, and testing.
> Every strong AI engineer starts with fundamentals.
> Learn how to interact with models by understanding LLM APIs.
> This will teach you structured outputs, caching, system prompts, etc.
> APIs are great, but raw LLMs still need the latest info to be effective.
> Learn h ...
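The "structured outputs" step in the roadmap boils down to asking the model for JSON and validating the reply before trusting it. A minimal sketch, assuming a stand-in reply string rather than a real LLM API call; the schema and field names are illustrative.

```python
import json

# Expected schema for the model's JSON reply (illustrative, not a real API's).
REQUIRED_KEYS = {"title": str, "year": int, "tags": list}

def parse_structured(reply):
    """Parse a model reply as JSON and check the expected fields and types."""
    data = json.loads(reply)  # raises on malformed JSON
    for key, typ in REQUIRED_KEYS.items():
        if not isinstance(data.get(key), typ):
            raise ValueError(f"field {key!r} missing or not {typ.__name__}")
    return data

# Stand-in for what an LLM might return when prompted for JSON output.
model_reply = '{"title": "RAG survey", "year": 2024, "tags": ["rag", "llm"]}'
record = parse_structured(model_reply)
```

Validating at the boundary like this is what lets downstream code treat model output as data rather than free text.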
It really took ages to compile this large-model technology roadmap......
具身智能之心· 2025-09-16 00:03
Core Insights
- The article emphasizes the transformative impact of large models on various industries, highlighting their role in enhancing productivity and driving innovation in fields such as autonomous driving, embodied intelligence, and generative AI [2][4].

Group 1: Large Model Technology Trends
- The large model industry is undergoing significant changes characterized by technological democratization, vertical application, and open-source ecosystems [2].
- There is a growing demand for talent skilled in technologies like RAG (Retrieval-Augmented Generation) and AI Agents, which are becoming core competencies for AI practitioners [2][4].
- The article introduces a comprehensive learning community focused on large models, offering resources such as videos, articles, learning paths, and job exchange opportunities [2][4].

Group 2: Learning Pathways
- The community provides detailed learning pathways for various aspects of large models, including RAG, AI Agents, and multimodal models [4][5].
- Specific learning routes include Graph RAG, Knowledge-Oriented RAG, and Reasoning RAG, among others, aimed at both beginners and advanced learners [4][5].
- The pathways are designed to facilitate systematic learning and networking among peers in the field [5].

Group 3: Community Benefits
- Joining the community offers benefits such as access to the latest academic advancements, industrial applications, and job opportunities in the large model sector [7][9].
- The community aims to create a collaborative environment for knowledge sharing and professional networking [7][9].
- There are plans for live sessions with industry leaders to further enrich the community's offerings [65][66].
RAG is a terrible concept that makes everyone overlook the most critical problem in application building
Founder Park· 2025-09-14 04:43
Core Viewpoint
- The article emphasizes the importance of Context Engineering in AI development, criticizing the current trend of RAG (Retrieval-Augmented Generation) as a misleading concept that oversimplifies complex processes [5][6][7].

Group 1: Context Engineering
- Context Engineering is considered crucial for AI startups, as it focuses on effectively managing the information within the context window during model generation [4][9].
- The concept of Context Rot, where the model's performance deteriorates with an increasing number of tokens, highlights the need for better context management [8][12].
- Effective Context Engineering involves two loops: an internal loop for selecting relevant content for the current context and an external loop for learning to improve information selection over time [7][9].

Group 2: Critique of RAG
- RAG is described as a confusing amalgamation of retrieval, generation, and combination, which leads to misunderstandings in the AI community [5][6].
- The article argues that RAG has been misrepresented in the market as merely using embeddings for vector searches, which is seen as a shallow interpretation [5][7].
- The author expresses a strong aversion to the term RAG, suggesting that it detracts from more meaningful discussions about AI development [6][7].

Group 3: Future Directions in AI
- Two promising directions for future AI systems are continuous retrieval and remaining within the embedding space, which could enhance performance and efficiency [47][48].
- The potential for models to learn to retrieve information dynamically during generation is highlighted as an exciting area of research [41][42].
- The article suggests that the evolution of retrieval systems may lead to a more integrated approach, where models can generate and retrieve information simultaneously [41][48].
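The "internal loop" of context engineering described above is essentially ranking candidate snippets against the query and packing only what fits a token budget. A minimal sketch under stated assumptions: token counts are approximated by whitespace splitting, and the scoring is a toy word-overlap measure, not a real relevance model.

```python
def pack_context(query, snippets, budget):
    """Greedily pack the highest-overlap snippets under a word-count budget."""
    def score(text):
        q = set(query.lower().split())
        return len(q & set(text.lower().split()))
    chosen, used = [], 0
    for text in sorted(snippets, key=score, reverse=True):
        cost = len(text.split())  # crude stand-in for a tokenizer count
        if used + cost <= budget:
            chosen.append(text)
            used += cost
    return chosen

snippets = [
    "Context rot: accuracy drops as the window fills with irrelevant tokens.",
    "Unrelated marketing copy about our product launch next quarter.",
    "Selecting fewer, more relevant passages often beats adding more text.",
]
picked = pack_context("why does accuracy drop as the context window fills",
                      snippets, budget=12)
```

The point the article makes about Context Rot is visible even in this toy: the budget forces a choice, and choosing is the whole discipline.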
Group 4: Chroma's Role
- Chroma is positioned as a leading open-source vector database aimed at facilitating the development of AI applications by providing a robust search infrastructure [70][72].
- The company emphasizes the importance of developer experience, aiming for a seamless integration process that allows users to quickly deploy and utilize the database [78][82].
- Chroma's architecture is designed to be modern and efficient, utilizing distributed systems and a serverless model to optimize performance and cost [75][86].
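At its core, the search infrastructure a vector database like Chroma provides is nearest-neighbour lookup over embeddings. This brute-force toy shows only that core idea; real systems add indexing, persistence, and distribution, and the vectors and store here are made-up values, not Chroma's API.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def query(store, vec, top_k=1):
    """Return the ids of the top_k stored vectors most similar to vec."""
    ranked = sorted(store, key=lambda doc_id: cosine(store[doc_id], vec),
                    reverse=True)
    return ranked[:top_k]

store = {"doc_a": [1.0, 0.0, 0.0],
         "doc_b": [0.0, 1.0, 0.0],
         "doc_c": [0.7, 0.7, 0.0]}
hits = query(store, [0.9, 0.1, 0.0], top_k=2)
```

This is the "merely using embeddings for vector searches" piece the article calls a shallow interpretation of RAG; useful, but only one component of context engineering.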
After Unitree announced its IPO, Wang Xingxing speaks out for the first time: what I regret most is not learning AI earlier; Oracle signs a $300 billion compute deal with OpenAI | AIGC Daily
创业邦· 2025-09-12 00:12
More AIGC news... Recruitment is open for the 2025 DEMO CHINA "AI Innovative Applications" track: if your product can demo, it has a chance to be selected for the "2025 Early-Stage AI Innovation Pioneers Top 50" and meet 200+ investment institutions directly. Click the link for the dedicated registration channel ➡️ 创业邦 · 2025 Early-Stage AI Innovation Pioneers Top 50 registration form
1. [Tencent open-sources Youtu-GraphRAG] On September 11, Tencent's Youtu Lab open-sourced Youtu-GraphRAG. It is described as a new graph retrieval-augmented generation framework built around the large-language-model-plus-RAG pattern: it organizes knowledge into a graph, then hands it to the LLM for retrieval and reasoning, helping large models answer complex question-answering tasks more accurately and traceably. It is especially suited to knowledge-intensive scenarios such as enterprise knowledge-base Q&A, research document parsing, personal knowledge bases, and private-domain knowledge management. (Jiemian News)
2. [After Unitree announced its IPO, Wang Xingxing speaks out for the first time: what I regret most is not learning AI earlier] On September 11, during the 2025 Inclusion·Bund Summit, Unitree founder and CEO Wang Xingxing said in a panel discussion: "AI already writes and paints better than 99.99% of people. But getting AI to actually do work is still a wasteland." This was his first public appearance since Unitree announced its IPO plan, discussing the opportunities and challenges facing the robotics industry in the era of large models. Wang Xingxing and the Unitree he founded ...
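The graph-retrieval idea behind a framework like Youtu-GraphRAG can be sketched as follows: store facts as an entity graph, then pull a matched entity plus its neighbourhood as context for the LLM. The graph contents and helper below are illustrative only, not Youtu-GraphRAG's actual data model or API.

```python
# Toy knowledge graph: entity -> {relation: object}. Illustrative facts only.
graph = {
    "EmbeddingGemma": {"developed_by": "Google", "parameters": "308M"},
    "Google": {"released": "EmbeddingGemma"},
}

def graph_context(entity, depth=1):
    """Collect (subject, relation, object) triples around an entity by BFS."""
    triples, frontier, seen = [], [entity], set()
    for _ in range(depth):
        next_frontier = []
        for node in frontier:
            if node in seen or node not in graph:
                continue
            seen.add(node)
            for relation, obj in graph[node].items():
                triples.append((node, relation, obj))
                next_frontier.append(obj)
        frontier = next_frontier
    return triples

triples = graph_context("EmbeddingGemma")
```

Because every retrieved fact is an explicit triple, the answer the LLM assembles from it is traceable back to graph edges, which is the property the announcement emphasizes.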
0.3B: Google open-sources a new model that runs on a phone even without internet, needing only 0.2GB of memory
36Ke· 2025-09-05 07:14
Core Insights
- Google has launched a new open-source embedding model called EmbeddingGemma, designed for edge AI applications with 308 million parameters, enabling deployment on devices like laptops and smartphones for retrieval-augmented generation (RAG) and semantic search [2][3].

Group 1: Model Features
- EmbeddingGemma ranks highest among open multilingual text embedding models under 500 million parameters on the MTEB benchmark, trained on over 100 languages and optimized to run on less than 200MB of memory [3][5].
- The model is designed for flexible offline work, providing customizable output sizes and a 2K token context window, making it suitable for everyday devices [5][13].
- It integrates seamlessly with popular tools such as sentence-transformers, MLX, and LangChain, facilitating user adoption [5][12].

Group 2: Performance and Quality
- EmbeddingGemma generates high-quality embedding vectors, crucial for accurate RAG processes, enhancing the retrieval of relevant context and the generation of contextually appropriate answers [6][9].
- The model's performance in retrieval, classification, and clustering tasks surpasses that of similarly sized models, approaching the performance of larger models like Qwen-Embedding-0.6B [10][11].
- It utilizes Matryoshka representation learning (MRL) to offer various embedding sizes, allowing developers to balance quality and speed [12].

Group 3: Privacy and Efficiency
- EmbeddingGemma operates effectively offline, ensuring user data privacy by generating document embeddings directly on device hardware [13].
- The model's inference time on EdgeTPU is under 15ms for 256 input tokens, enabling real-time responses and smooth interactions [12][13].
- It supports new functionalities such as offline searches across personal files and personalized chatbots, enhancing user experience [13][15].

Group 4: Conclusion
- The introduction of EmbeddingGemma signifies a breakthrough in miniaturization, multilingual capabilities, and edge AI, potentially becoming a cornerstone for the proliferation of intelligent applications on personal devices [15].
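The Matryoshka representation learning (MRL) property mentioned above means a full embedding can be cut down to a prefix and re-normalized, trading quality for speed and storage. A minimal sketch of that truncation step; the 4-dimensional vector is a toy value, not a real EmbeddingGemma output.

```python
import math

def truncate_embedding(vec, dim):
    """Keep the first `dim` components and L2-normalize the result (MRL-style)."""
    prefix = vec[:dim]
    norm = math.sqrt(sum(x * x for x in prefix))
    return [x / norm for x in prefix]

full = [0.6, 0.8, 0.05, 0.02]        # pretend "full" embedding
small = truncate_embedding(full, 2)  # smaller, faster-to-compare prefix
```

Because MRL training front-loads the most informative dimensions, the truncated vector keeps most of the retrieval quality at a fraction of the comparison and storage cost.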
The job market for programmers has hit rock bottom...
猿大侠· 2025-09-04 04:11
Finding a job lately is brutal! "Frontend, backend, data, or another technical area + 3 years of experience" used to be enough for a stable position at a mid-sized company; now HR asks "are you familiar with RAG and Agents?" and "do you know fine-tuning?"... Under the continued impact of AI, programmers' traditional skill advantages suddenly seem to count for little. But don't panic, and don't give up your existing strengths: people who can combine their current skills with AI are actually in higher demand! More AI applications will be landing, and that will be your home turf! My friend Xiao Li used to do e-commerce backend work, writing pure Java APIs; this year he started integrating large models into a recommendation system, handling model calls and prompt design and even discussing bot interactions with the frontend team. He is now the only person on his team who understands the whole pipeline; his boss has him mentoring new hires, and his salary is up 30%! You don't need to start from zero: build on your existing abilities and fill in the principles, skills, and hands-on practice around large models, and within a year your competitive edge will leave your peers far behind! Today I'm sharing a free course that can quickly improve your large-model skills, "Large Model Application Development: Job-Ready Practice", carefully developed by Alibaba MVP instructor Chen Yang, deconstructing the full AI application development process from 0 to 1; put the projects on your resume and sail through to a high-paying offer! Free registration, limited to the first 50 people; the channel closes once full. Technical principles + hands-on projects + career guidance: master large models and get an early share of the AI employment dividend! ...
Opening several large-model technical discussion groups (RAG/Agents/general large models, etc.)
自动驾驶之心· 2025-09-04 03:35
Group 1
- A technical communication group focused on large models has been established, inviting participants to discuss topics such as RAG, AI Agents, multimodal large models, and deployment of large models [1]
- Interested individuals can join by adding the designated WeChat assistant and providing their nickname along with a request to join the large model discussion group [2]
AI reading web pages is truly different this time: Google Gemini unlocks a new "explain web pages in detail" capability
机器之心· 2025-09-02 03:44
Core Viewpoint
- Google is returning to its core business of search by introducing the Gemini API's URL Context feature, which allows AI to "see" web content like a human [1].

Group 1: URL Context Functionality
- The URL Context feature enables the Gemini model to access and process content from URLs, including web pages, PDFs, and images, with a content limit of up to 34MB [1][5].
- Unlike traditional methods where AI reads only summaries or parts of a webpage, URL Context allows for deep and complete document parsing, understanding the entire structure and content [5][6].
- The feature supports various file formats, including PDF, PNG, JPEG, HTML, JSON, and CSV, enhancing its versatility [7].

Group 2: Comparison with RAG
- URL Context Grounding is seen as a significant advancement over the traditional Retrieval-Augmented Generation (RAG) approach, which involves multiple complex steps such as content extraction, chunking, vectorization, and storage [11][12].
- The new method simplifies the process, allowing developers to achieve accurate results with minimal coding, eliminating the need for extensive data processing pipelines [13][14].
- URL Context can accurately extract specific data from documents, such as financial figures from a PDF, which would be impossible with just summaries [14].

Group 3: Operational Mechanism
- The URL Context operates on a two-step retrieval process to balance speed, cost, and access to the latest data, first attempting to retrieve content from an internal index cache [25].
- If the URL is not cached, it performs real-time scraping to obtain the content [25].
- The pricing model is straightforward, charging based on the number of tokens processed from the content, encouraging developers to provide precise information sources [27].

Group 4: Limitations and Industry Trends
- URL Context has limitations, such as being unable to access content behind paywalls, specialized tools like YouTube videos, and having a maximum capacity of processing 20 URLs at once [29].
- The emergence of URL Context indicates a trend where foundational models are increasingly integrating external capabilities, reducing the complexity previously handled by application developers [27].