Retrieval-Augmented Generation (RAG)
AI搜索解
Microsoft · 2026-03-25 09:57
Investment Rating
- The report does not provide a specific investment rating for the industry

Core Insights
- The industry is undergoing a rapid transformation driven by AI technologies, particularly in how information is searched and content is discovered [4][5]
- The shift from keyword-based search to conversational AI is significant, requiring marketers to adapt their strategies accordingly [8][10]
- The report emphasizes the importance of understanding large language models (LLMs) and their implications for AI search [11][17]

Summary by Sections
Introduction
- The industry is experiencing one of the fastest transformations in history due to advancements in AI, which are changing how people search for information and make decisions [4][5]
Purpose of the Guide
- The guide aims to provide actionable insights for marketers to navigate the transition from keyword to conversational search, highlighting the rapid pace of AI innovation [8][9]
Understanding Large Language Models (LLMs)
- LLMs are trained on vast amounts of data and are evolving towards multimodal capabilities, allowing them to understand and respond to various forms of input [11][13]
- The report discusses the limitations of LLMs, emphasizing that they do not possess true understanding but rather generate responses based on statistical patterns [15][16]
How AI Search Works
- AI search engines like Bing and Google utilize LLMs and retrieval-augmented generation (RAG) to provide accurate responses by integrating pre-trained knowledge with real-time data [17][19]
Brand Presentation in AI Search
- Brands can be presented in AI search through paid advertisements and natural visibility, with the latter relying on traditional SEO principles [20][30]
- The report outlines a structured process for how AI generates responses that include brand information, emphasizing the importance of credible sources [21][22][23]
Transition from SEO to GEO
- The report highlights the emergence of Generative Engine Optimization (GEO) as a new discipline that builds on traditional SEO practices while adapting to the AI-driven search landscape [30][34]
Content Strategy Recommendations
- Clear and structured content is essential for visibility in AI search, with a focus on semantic clarity and user intent [36][37]
- The report advises against vague expressions and emphasizes the need for context and specificity in content creation [38][39]
Practical Advice for Content Strategy
- The report suggests that general content should be designed to enhance user experience rather than solely drive traffic, with a focus on localized and culturally relevant content [49][50]
Maximizing AI Value through Paid Strategies
- Paid advertising remains a crucial avenue for brands to reach users in AI search environments, with a shift towards more integrated and contextually relevant ad formats [51][52][54]
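The RAG flow this report describes — grounding an LLM's answer in retrieved, up-to-date sources — can be sketched in a few lines. The toy corpus, the word-overlap scorer, and the prompt template below are illustrative assumptions, not any particular search engine's implementation:

```python
# Minimal sketch of a retrieval-augmented answer flow: retrieve evidence,
# then assemble it into the prompt handed to the generation model.

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Rank documents by naive word overlap with the query (stand-in retriever)."""
    q_words = set(query.lower().split())
    scored = sorted(corpus, key=lambda d: -len(q_words & set(d.lower().split())))
    return scored[:k]

def build_prompt(query: str, passages: list[str]) -> str:
    """Combine retrieved real-time evidence with the user question."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using the sources below.\nSources:\n{context}\nQuestion: {query}"

corpus = [
    "Bing integrates web results into LLM answers.",
    "Classic SEO relies on keyword matching.",
    "RAG grounds generation in retrieved documents.",
]
query = "How does RAG ground answers?"
prompt = build_prompt(query, retrieve(query, corpus))
print(prompt)
```

A real engine would replace the overlap scorer with dense embeddings and a web-scale index, but the prompt-assembly step is the part that decides which brands and sources the model ever sees.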
Letting External Knowledge "Grow Into" the Model: Exploring Dynamic and Parameterized RAG Techniques
AI前线· 2026-03-25 04:22
Core Viewpoint
- The article discusses the advancements in Retrieval-Augmented Generation (RAG) techniques, emphasizing the need for dynamic and parameterized approaches to enhance the integration of external knowledge into large language models (LLMs) [2][5][21].

Group 1: Background and Motivation
- The emergence of large language models has transformed various aspects of life, providing natural interaction, superior language understanding, and remarkable task generalization [6][7].
- Despite their advantages, LLMs face significant limitations, including the "hallucination" problem, lack of traceability in generated results, and high inference costs [7][8].

Group 2: Challenges in Traditional RAG
- Traditional RAG methods treat LLMs as static black boxes, relying on external document retrieval and prompt engineering, which leads to three core challenges: when to trigger retrieval, what content to retrieve, and how to inject external knowledge into the model [11][12][14].
- Current systems either default to always retrieving or rely on user-triggered searches, lacking the ability for models to autonomously determine when to retrieve information [11].

Group 3: Dynamic and Parameterized RAG Techniques
- The proposed dynamic and parameterized RAG techniques aim to address the challenges of when to retrieve, what to retrieve, and how to inject knowledge by monitoring the internal state of the model in real time [21][27].
- A lightweight monitor module can observe the model's internal state and determine when new information is needed, allowing for more efficient retrieval [27][29].

Group 4: Experimental Results
- The dynamic retrieval model, named DRAGIN, outperformed several baseline models in accuracy while significantly reducing the number of retrieval calls, demonstrating its efficiency [32][35].
- In various public datasets, DRAGIN achieved notable improvements in evaluation metrics compared to traditional static retrieval methods [33][36].
Group 5: Decoupling Retrieval and Generation
- The article introduces a framework that decouples the injection of external knowledge from the context input, allowing for real-time dynamic retrieval without overwhelming the model with excessive context [44][46].
- This approach enhances efficiency and performance by processing external documents offline and using a cross-attention mechanism to integrate knowledge without diluting the original instructions [46][49].

Group 6: Parameterized Knowledge Injection
- The concept of parameterized knowledge injection involves encoding external documents into low-dimensional vectors or learnable parameters, which can be integrated into the model's feed-forward network during inference [55][62].
- This method allows for seamless integration of external knowledge, enabling the model to utilize it as if it were internal memory, thus overcoming the limitations of traditional prompt-based methods [58][64].

Group 7: Future Directions
- The future research agenda includes developing sustainable learning frameworks that bridge the gap between internal parameters, external memory, and real-time perception, ultimately redefining the role of retrieval in general artificial intelligence [75][79].
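The idea of a lightweight monitor that decides when retrieval is needed can be illustrated with a toy uncertainty check: trigger a search only when the model's next-token distribution is flat (high entropy). The threshold and the probability vectors below are illustrative assumptions — a simplified stand-in for the article's monitor module, not DRAGIN's actual algorithm:

```python
import math

def token_entropy(probs: list[float]) -> float:
    """Shannon entropy (in bits) of the model's next-token distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def should_retrieve(probs: list[float], threshold: float = 1.5) -> bool:
    """Trigger retrieval only when the model is uncertain about the next token."""
    return token_entropy(probs) > threshold

confident = [0.9, 0.05, 0.05]         # peaked distribution: keep generating
uncertain = [0.25, 0.25, 0.25, 0.25]  # flat distribution: fetch evidence

print(should_retrieve(confident))  # entropy ~0.57 bits -> no retrieval
print(should_retrieve(uncertain))  # entropy 2.0 bits -> retrieve
```

The payoff of such gating is exactly what the experiments report: far fewer retrieval calls than an always-retrieve baseline, with searches spent only where the model actually lacks knowledge.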
After the 3·15 Exposé, GEO Has Changed Its Disguise
财联社· 2026-03-16 12:01
Core Viewpoint
- The article discusses the ongoing GEO "AI poisoning" business despite the exposure during the 3·15 event, highlighting the manipulation of AI responses through the creation of false information [1][2].

Group 1: Current Status of GEO Business
- Following the 3·15 exposure, platforms like Xianyu have blocked direct keywords related to "GEO optimization," but related services can still be found using more obscure terms like "engine optimization" [2].
- GEO optimization services are offered in two models: software service priced at 398 yuan/month or 1980 yuan/year, and agency operation priced at 3980 yuan/quarter or 9800 yuan/year [4].

Group 2: Mechanism of GEO Optimization
- GEO (AI search recommendation/optimization) aims to ensure that brand and product information appears prominently in AI-generated answers, operating on the principle that "answers are advertisements" [5].
- The process involves "AI distillation," where keywords are inputted to generate related questions, followed by automated content creation and distribution across various self-media platforms [5][6].

Group 3: Risks and Implications
- The GEO black market represents a shift from traditional search engine optimization to more dangerous "cognitive manipulation," undermining user trust in AI commercialization [7][8].
- The exploitation of vulnerabilities in the RAG (retrieval-augmented generation) technology used by mainstream AI models allows for the systematic pollution of AI's training data with false information [9].

Group 4: Recommendations for Mitigation
- To combat the issue, a comprehensive defense system involving technology, ecology, and legal frameworks is necessary, including the establishment of a high-confidence "whitelist" for credible sources [10].
- Content platforms must take responsibility as "data gatekeepers" for AI, and legal definitions of malicious data poisoning should be established to deter such practices [11].
江波龙 (301308) - Investor Relations Activity Record, February 25, 2026
2026-02-27 09:40
Group 1: Company Technology and Product Development
- The company has launched multiple main control chips for UFS, eMMC, SD cards, and high-end USB, utilizing advanced foundry processes and self-developed core IP, resulting in significant performance and power consumption advantages [3]
- As of Q3 2025, the cumulative deployment of the company's self-developed main control chips has exceeded 100 million units, indicating strong market penetration [3]
- The mSSD product, as an upgrade to traditional SSDs, offers advantages such as lightweight design, low power consumption, and competitive performance, with a broad market outlook [3][4]

Group 2: Market Trends and Future Outlook
- The demand for storage is expected to surge due to the structural changes in AI inference, particularly with the application of key-value caching and retrieval-augmented generation technologies [4]
- The rapid expansion of AI infrastructure and HDD supply shortages are anticipated to drive explosive growth in storage demand, although short-term output growth may be limited due to the lag in capacity construction [4]

Group 3: New Product Innovations
- The company has released several cutting-edge storage products, including MRDIMM and CXL 2.0 memory expansion modules, and is actively innovating based on market needs [4]
- The company has established a robust intellectual property portfolio around mSSD, facilitating the transition from R&D validation to commercial implementation [3][4]

Group 4: Investor Relations and Communication
- The investor relations activity was conducted on February 25, 2026, with participation from institutions such as Dongfang Securities and Penghua Fund, indicating active engagement with key stakeholders [2]
Nature and Science Simultaneously Covered a Paper That Aims to Cure AI Hallucination
36Ke · 2026-02-05 12:24
Core Insights
- The article discusses the release of OpenScholar, an 8 billion parameter model that surpasses flagship models in scientific literature review tasks, signaling a shift away from "parameter worship" towards a more reliable knowledge retrieval approach [1][4][6]

Model Performance
- OpenScholar, with only 8 billion parameters, outperformed flagship models in scientific literature review tasks, demonstrating a significant reduction in reasoning costs to approximately $0.003 per query [4][6]
- In benchmark tests, OpenScholar-8B achieved higher accuracy rates compared to existing models, showcasing its effectiveness in retrieving and verifying information [6][8]

Methodology
- OpenScholar employs a unique process that includes retrieving relevant segments from a database of 45 million open-access papers, reordering them for accuracy, and generating answers through self-review to ensure evidence-backed responses [5][6]
- The model's approach contrasts with traditional models that rely on memorization, instead teaching the AI to "look up" information like a human researcher [5][8]

Future Developments
- The upcoming model, DR Tulu, aims to tackle deeper research tasks by utilizing Reinforcement Learning with Evolving Rubrics, allowing the model to dynamically generate evaluation criteria during research [9][10]
- DR Tulu is designed to enhance planning capabilities, enabling it to create outlines and synthesize information from multiple sources for comprehensive reports [9][10]

Key Contributors
- Akari Asai, a prominent figure in the development of OpenScholar and DR Tulu, emphasizes the importance of democratizing access to advanced AI tools for researchers worldwide [13][15]
- Asai's philosophy advocates for models that embrace the vastness of knowledge rather than attempting to encapsulate it entirely within their parameters [15][16]
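The retrieve → rerank → draft → self-review loop described above can be sketched schematically. Every function here is a stand-in stub chosen for illustration (word-overlap reranking, substring-based evidence checking) — not OpenScholar's actual components:

```python
# Schematic of an evidence-backed literature-review pipeline:
# retrieve passages, rerank them, draft an answer, then self-review
# that the answer actually cites retrieved evidence.

def rerank(query: str, passages: list[str]) -> list[str]:
    """Reorder passages by naive relevance (overlap stands in for a reranker)."""
    q = set(query.lower().split())
    return sorted(passages, key=lambda p: -len(q & set(p.lower().split())))

def draft_answer(query: str, passages: list[str]) -> str:
    """Draft an answer that quotes the strongest passage as support."""
    return f"{query} -> supported by: {passages[0]}"

def self_review(answer: str, passages: list[str]) -> bool:
    """Accept the draft only if some retrieved passage appears in it verbatim."""
    return any(p in answer for p in passages)

papers = ["retrieval improves factuality", "bigger models cost more"]
query = "does retrieval improve factuality"
ranked = rerank(query, papers)
answer = draft_answer(query, ranked)
print(answer, self_review(answer, papers))
```

The self-review gate is the step that distinguishes this design from plain generation: a draft that cannot point back to retrieved evidence is rejected rather than emitted.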
Lü Benfu: Governing AI's "Hidden Ads" Requires Both Internal and External Remedies
Huan Qiu Wang Zi Xun· 2026-02-01 23:05
Core Insights
- The article discusses the emergence of Generative Engine Optimization (GEO), a new advertising method that integrates digital marketing with AI technology, driven by changes in user behavior, technological upgrades, market demand, and the decline of traditional SEO [1][2].

Group 1: GEO Market Dynamics
- As of June 2025, the user base for generative AI in China has surpassed 515 million, with significant applications in smart search and content creation [2].
- The domestic GEO market is projected to exceed 4.2 billion yuan by 2025, with a compound annual growth rate of 38% over the past three years [2].
- The shift in user interaction towards AI has led to a decline in traditional search engine usage, with the proportion of search engine users among internet users dropping from previous surveys [2].

Group 2: Technical Aspects of GEO
- GEO operates on a Retrieval-Augmented Generation (RAG) architecture, utilizing vector databases, dynamic knowledge graphs, and multimodal adaptation to create a comprehensive content production and AI citation system [2].
- The optimization techniques in GEO include semantic vectorization, which adjusts content to increase its proximity to user queries in vector space, thereby enhancing the likelihood of being referenced by AI [3].
- GEO practitioners may engage in "data pollution" by flooding AI with low-quality or repetitive content to manipulate AI responses [3].

Group 3: Ethical and Regulatory Challenges
- The unregulated growth of GEO raises legal and ethical challenges, creating conflicts between commercial interests and information neutrality, as well as between technological manipulation and ecological fairness [4].
- There is an urgent need to establish standards for the adoption and purification of corpora to prevent content pollution and ensure the integrity of AI-generated information [4].
- The article emphasizes the importance of distinguishing between advertising and regular content, advocating for clear labeling of GEO-adjusted content to avoid user confusion [5].
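The "semantic vectorization" tactic described above — rewriting content so it sits closer to likely user queries in embedding space — reduces to a similarity comparison. The three-dimensional vectors below are toy illustrations, not output of any real embedding model:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot product over the product of vector norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Illustrative embeddings: the "optimized" page is steered toward the query.
query_vec     = [1.0, 0.2, 0.0]
original_page = [0.2, 1.0, 0.0]
tuned_page    = [0.9, 0.3, 0.1]

print(cosine(query_vec, original_page))  # lower similarity
print(cosine(query_vec, tuned_page))     # higher similarity -> more likely cited
```

This is also why the regulatory concern is hard to address purely in ranking: content moved closer to a query in vector space is indistinguishable, to the retriever, from content that is genuinely more relevant.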
Scale Up Retrieval, Keep Generation Light: CMU Team Systematically Evaluates RAG's Corpus-Model Trade-off
机器之心· 2026-01-06 00:31
Core Insights
- The core argument of the research is that expanding the retrieval corpus can significantly enhance Retrieval-Augmented Generation (RAG) performance, often providing benefits that can partially substitute for increasing model parameters, although diminishing returns occur at larger corpus sizes [4][22].

Group 1: Research Findings
- The study reveals that the performance of RAG is determined by both the retrieval module, which provides evidence, and the generation model, which interprets the question and integrates evidence to form an answer [7].
- The research indicates that smaller models can achieve performance levels comparable to larger models by increasing the retrieval corpus size, with a consistent pattern observed across multiple datasets [11][12].
- The findings show that the most significant performance gains occur when moving from no retrieval to having retrieval, with diminishing returns as the corpus size increases [13].

Group 2: Experimental Design
- The research employed a full factorial design, varying only the corpus size and model size while keeping other variables constant, using a large dataset of approximately 264 million real web documents [9].
- The evaluation covered three open-domain question-answering benchmarks: Natural Questions, TriviaQA, and Web Questions, using common metrics such as F1 and ExactMatch [9].

Group 3: Mechanisms of Improvement
- The increase in corpus size enhances the probability of retrieving answer-containing segments, leading to more reliable evidence for the generation model [16].
- The study defines the Gold Answer Coverage Rate, which measures the probability that at least one of the top chunks provided to the generation model contains the correct answer string, showing a monotonic increase with corpus size [16].
Group 4: Practical Implications
- The research suggests that when resources are constrained, prioritizing the expansion of the retrieval corpus and improving coverage can allow medium-sized generation models to perform close to larger models [20].
- The study emphasizes the importance of tracking answer coverage and utilization rates as diagnostic metrics to identify whether bottlenecks are in the retrieval or generation components [20].
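The Gold Answer Coverage Rate defined above is straightforward to compute as a diagnostic: for each question, check whether any of the top-k retrieved chunks contains the gold answer string. The example data below is hypothetical, not from the paper:

```python
def coverage_rate(examples: list[dict], k: int = 3) -> float:
    """Fraction of questions where at least one top-k chunk contains the gold answer."""
    hits = 0
    for ex in examples:
        top_chunks = ex["chunks"][:k]  # chunks assumed pre-sorted by retrieval score
        if any(ex["answer"].lower() in chunk.lower() for chunk in top_chunks):
            hits += 1
    return hits / len(examples)

examples = [
    {"answer": "Paris",
     "chunks": ["Paris is the capital of France.", "France is in Europe."]},
    {"answer": "1969",
     "chunks": ["The moon landing was televised.", "Apollo program history."]},
]
print(coverage_rate(examples))  # 0.5: only the first question is covered
```

Tracking this number alongside end-task accuracy is how one localizes the bottleneck: low coverage implicates retrieval, while high coverage with low accuracy implicates the generation model's evidence utilization.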
For a Systematic Introduction to Deep Research, This One Survey Is Enough
机器之心· 2026-01-01 04:33
Core Insights
- The article discusses the evolution of Deep Research (DR) as a new direction in AI, moving from simple dialogue and creative writing applications to more complex research-oriented tasks. It highlights the limitations of traditional retrieval-augmented generation (RAG) methods and introduces DR as a solution for multi-step reasoning and long-term research processes [2][30].

Summary by Sections
Definition of Deep Research
- DR is not a specific model or technology but a progressive capability pathway for research-oriented agents, evolving from information retrieval to complete research workflows [5].

Stages of Capability Development
- **Stage 1: Agentic Search** - Models gain the ability to actively search and retrieve information dynamically based on intermediate results, focusing on efficient information acquisition [5].
- **Stage 2: Integrated Research** - Models evolve to understand, filter, and integrate multi-source evidence, producing coherent reports [6].
- **Stage 3: Full-stack AI Scientist** - Models can propose research hypotheses, design and execute experiments, and reflect on results, emphasizing depth of reasoning and autonomy [6].

Core Components of Deep Research
- **Query Planning** - Involves deciding what information to query next, incorporating dynamic adjustments in multi-round research [10].
- **Information Retrieval** - Focuses on when to retrieve, what to retrieve, and how to filter retrieved information to avoid redundancy and ensure relevance [12][13][14].
- **Memory Management** - Essential for long-term reasoning, involving memory consolidation, indexing, updating, and forgetting [15].
- **Answer Generation** - Stresses the logical consistency between conclusions and evidence, requiring integration of multi-source evidence [17].
Training and Optimization Methods
- **Prompt Engineering** - Involves designing multi-step prompts to guide the model through research processes, though its effectiveness is highly dependent on prompt design [20].
- **Supervised Fine-tuning** - Utilizes high-quality reasoning trajectories for model training, though acquiring annotated data can be costly [21].
- **Reinforcement Learning for Agents** - Directly optimizes decision-making strategies in multi-step processes without complex annotations [22].

Challenges in Deep Research
- **Coordination of Internal and External Knowledge** - Balancing reliance on internal reasoning versus external information retrieval is crucial [24].
- **Stability of Training Algorithms** - Long-term task training often faces issues like policy degradation, limiting exploration of diverse reasoning paths [24].
- **Evaluation Methodology** - Developing reliable evaluation methods for research-oriented agents remains an open question, with existing benchmarks needing further exploration [25][27].
- **Memory Module Construction** - Balancing memory capacity, retrieval efficiency, and information reliability is a significant challenge [28].

Conclusion
- Deep Research represents a shift from single-turn answer generation to in-depth research addressing open-ended questions. The field is still in its early stages, with ongoing exploration needed to create autonomous and trustworthy DR agents [30].
2025 AI Large Model Resource Compilation
Sou Hu Cai Jing· 2025-12-24 10:45
Group 1: Core Insights
- The AI large model industry is undergoing a structural transformation in 2025, shifting competition from mere capability to sustainability across four dimensions: technological paradigms, market structure, application forms, and global governance [1]
- Significant breakthroughs in technology include a shift from RLHF to RLVR training paradigms, enabling models to achieve leaps in reasoning capabilities through self-verification [1]
- The mixture-of-experts (MoE) architecture is making a strong comeback, balancing parameter scale and computational costs through sparse activation modes, thus achieving extreme cost-effectiveness [1]

Group 2: Market Dynamics
- The market is experiencing a dual tension of centralization and democratization, with Google's Gemini 3 ending OpenAI's long-standing lead, while Chinese models achieve competitive advantages through cost-effectiveness [2]
- The market is concentrating towards leading players, with top startups like Anthropic receiving significant funding, while second- and third-tier players face elimination [2]
- Open-source models, led by Chinese firms, are approaching the performance of closed-source products, creating a counterbalance in the market [2]

Group 3: Application Evolution
- Applications are evolving into a new stage of deep integration, transitioning from general chat assistants to specialized tools and autonomous agents embedded in professional workflows [2]
- The rise of "AI-native application layers" is transforming software development, with developers shifting roles from coders to system designers and AI trainers [2]
- Deployment models are trending towards "cloud + edge collaboration," with local deployments gaining traction due to privacy compliance needs [2]

Group 4: Global Governance
- Global governance is entering a phase of differentiated competition, with the EU prioritizing safety through strict regulations, the US focusing on industry self-regulation, and China advocating a balanced approach to development and safety [3]
- The regulatory competition is driven by the struggle for technological standard-setting authority, emerging as a new battleground in tech competition [3]
- The societal impact of AI is beginning to show through employment structure adjustments and educational model transformations, with human-AI collaboration becoming a new trend [3]

Group 5: Future Outlook
- The AI large model industry is transitioning from scale competition to a new phase emphasizing efficiency, depth, and integration [3]
- Future winners will need to navigate the complex interactions of four forces: technological efficiency, scenario integration, ecological positioning, and compliance adaptation [3]
- Key opportunities include "cloud + edge collaboration," parallel tracks of open-source and closed-source development, and the evolution of the agent ecosystem [3]
Memory in the Era of AI Agents: A Survey of Forms, Functions, and Dynamics
Xin Lang Cai Jing· 2025-12-17 04:42
Core Insights
- Memory is identified as a core capability for agents based on foundational models, facilitating long-term reasoning, continuous adaptation, and effective interaction with complex environments [1][11][15]
- The field of agent memory research is rapidly expanding but is becoming increasingly fragmented, with significant differences in motivation, implementation, assumptions, and evaluation schemes [1][11][16]
- Traditional classifications of memory, such as long-term and short-term memory, are insufficient to capture the diversity and dynamics of contemporary agent memory systems [1][11][16]

Summary by Sections
Introduction
- Over the past two years, powerful large language models (LLMs) have evolved into robust AI agents, achieving significant progress across various fields such as deep research, software engineering, and scientific discovery [4][14]
- There is a growing consensus in academia that agents require capabilities beyond just LLMs, including reasoning, planning, perception, memory, and tool usage [4][14][15]

Importance of Memory
- Memory is crucial for transforming static LLMs into adaptive agents capable of continuous adaptation through environmental interaction [5][15]
- Various applications, including personalized chatbots, recommendation systems, social simulations, and financial investigations, depend on agents' ability to manage historical information actively [5][15]

Need for New Classification
- The increasing importance of agent memory systems necessitates a new perspective on contemporary agent memory research [6][16]
- Existing classification systems are outdated and do not reflect the breadth and complexity of current research, highlighting the need for a coherent classification that unifies emerging concepts [6][16]

Framework and Key Questions
- The review aims to establish a systematic framework to reconcile existing definitions and connect emerging trends in agent memory [19]
- Key questions addressed include the definition of agent memory, its relationship with related concepts, its forms, functions, and dynamics, as well as emerging research frontiers [19]

Emerging Research Directions
- The review identifies several promising research directions, including automated memory design, integration of reinforcement learning with memory systems, multimodal memory, shared memory in multi-agent systems, and issues of trustworthiness [20][12]

Contributions of the Review
- The review proposes a multidimensional classification of agent memory from a "form-function-dynamics" perspective, providing a structured view of current developments in the field [20]
- It explores the applicability and interaction of different memory forms and functions, offering insights on aligning various memory types with different agent objectives [20]
- A comprehensive resource collection, including benchmark tests and open-source frameworks, is compiled to support further exploration of agent memory systems [20]
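The "form-function-dynamics" perspective can be pictured as a three-axis schema for cataloging memory designs. The field names and example values below are illustrative placeholders, not the survey's exact taxonomy:

```python
from dataclasses import dataclass

@dataclass
class MemoryDesign:
    """One point in a three-axis classification of agent memory systems."""
    form: str       # how memory is stored, e.g. "external store", "parametric"
    function: str   # what it is for, e.g. "personalization", "long-horizon reasoning"
    dynamics: str   # how it changes, e.g. "consolidation", "updating", "forgetting"

# Classifying two hypothetical systems along the same three axes:
chatbot_memory = MemoryDesign("external store", "personalization", "updating")
agent_memory = MemoryDesign("parametric", "long-horizon reasoning", "consolidation")

print(chatbot_memory.form, agent_memory.dynamics)
```

The value of such a schema is that systems with very different implementations become directly comparable: two designs that differ only on the dynamics axis can share evaluation protocols for the other two.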