In-Context Learning
Notre Dame team builds a new tool for molecular design: letting AI create molecules the way it writes articles
仪器信息网· 2025-11-19 09:08
Core Insights
- The article covers DemoDiff, an AI system from a University of Notre Dame team that designs new molecular structures by learning from a few examples, which could significantly accelerate drug and materials development [7][8][10].
Group 1: AI Understanding of Molecular Design
- DemoDiff mimics human learning: it analyzes a few successful molecular examples to infer design patterns, then generates new candidates quickly [10][11].
- The system can also learn from negative examples, generating high-quality molecules from poorly performing ones [21][22].
Group 2: Innovative Molecular Representation
- The team introduced a method called "node-pair encoding," which simplifies complex molecular structures and improves efficiency by 5.5 times [9][12].
- The method significantly reduces the number of atoms needed to describe a molecule, so the model can process more examples [12][13].
Group 3: Comprehensive Molecular Database
- DemoDiff was trained on a database of over 1 million molecular structures and 155,000 different molecular properties [14][15].
- The database draws on several sources, including ChEMBL, which records millions of drug molecules and their biological activities [14][15].
Group 4: Diffusion Model for Molecular Generation
- DemoDiff's core technology is a "diffusion model," which generates molecular structures through progressive refinement while ensuring chemical validity [16][17].
- The model incorporates in-context learning, adapting its output to different sets of example molecules [18].
Group 5: Performance Testing and Validation
- DemoDiff was tested on 33 different molecular design tasks and performed comparably to much larger AI models [19][20].
- The system excels at generating diverse molecular structures, giving researchers multiple options for further exploration [20].
Group 6: Negative Learning Capability
- Learning from negative examples lets the model infer what makes a molecule successful [21][22].
- This is particularly valuable in early drug development, where researchers often have more negative examples than positive ones [21][22].
Group 7: Technical Innovations
- The system uses a "graph attention mechanism" to attend to multiple important parts of a molecule at once, keeping a holistic view during generation [23].
- A multi-layer validation mechanism checks generated molecules against fundamental chemical rules to ensure feasibility [23][24].
Group 8: Implications for Molecular Design
- DemoDiff represents a paradigm shift in molecular design and could significantly reduce the time and cost of drug development [25][26].
- The technology may democratize molecular design, allowing a broader range of researchers to participate [26].
Group 9: Future Considerations
- Despite its capabilities, DemoDiff still needs improvement, particularly on specific design tasks [27].
- Future work may expand the model's scale and improve data quality to tackle more complex challenges [27][28].
NeurIPS 2025 | In-context meta-learning enables cross-subject brain activity prediction without fine-tuning
机器之心· 2025-11-19 04:07
Core Insights
- The article covers BraInCoRL, a brain encoding model that uses meta-learning and in-context learning to predict brain responses to visual stimuli from very little data [3][32].
- It addresses a limitation of traditional visual encoding models, which require extensive data collection for each individual and are therefore costly and hard to deploy in clinical settings [6][32].
Background and Innovation
- The human higher visual cortex differs significantly in function across individuals, so brain encoding models need to represent these differences [2][6].
- BraInCoRL predicts brain responses from only a small number of example images and their corresponding brain activity data, with no model fine-tuning [3][32].
Methodology
- The framework treats each voxel as an independent function mapping visual stimuli to neural responses, and uses meta-learning and in-context learning to improve data efficiency and generalization [7][10].
- During training, the model learns the shared structure of visual cortex responses across multiple subjects; at test time, it generates a subject-specific voxel encoder from just a few image-brain response pairs [11][20].
Experimental Results
- BraInCoRL is data-efficient: with only 100 context images, it explains variance comparably to models trained on thousands of images [20][22].
- The model performs robustly across datasets and scanning protocols, confirming that it generalizes across devices and protocols [22][23].
- Semantic clustering visualizations reveal clear functional organization within the visual cortex, with distinct areas for faces, scenes, and other categories [26][27].
Conclusion
- BraInCoRL brings in-context learning to computational neuroscience, yielding a data-efficient, interpretable, and language-interactive framework for visual cortex encoding [32].
- It significantly lowers the barrier to building individualized brain encoding models, paving the way for applications in clinical neuroscience and other data-limited scenarios [32].
In depth | Andrej Karpathy: The industry is too optimistic about agents; an agent that can truly do work for you is still ten years away
Z Potentials· 2025-11-05 02:57
Core Insights
- The article discusses the evolution of AI, focusing on the development of agent systems and the challenges they face in achieving true intelligence [4][5][6][7][8][9][10].
Group 1: Future of AI Agents
- Andrej Karpathy emphasizes that the next decade will be crucial for AI agents, suggesting that current systems are not yet mature enough to be fully used in practical applications [5][6][7].
- He introduces the concept of a "cognitive core": a stripped-down version of knowledge that retains intelligent algorithms and problem-solving strategies, which highlights the need for better data quality in training models [5][16].
- Karpathy expresses concern that society may lose understanding and control of AI systems as they become more integrated into daily life, leading to a disconnect between users and the underlying mechanisms of these systems [5][6].
Group 2: Historical Context and Learning Mechanisms
- The article outlines significant milestones in AI development, such as the introduction of AlexNet and the Atari reinforcement learning era, which shaped the current landscape of AI research [8][9][10].
- Karpathy argues that human learning differs fundamentally from reinforcement learning: humans build rich world models through experience rather than relying solely on reward signals [40].
- The discussion covers the limitations of current AI models in continuous learning and the need for a more sophisticated handling of context and memory [22][23].
Group 3: AI's Current Limitations
- Karpathy critiques the current state of AI, stating that much generated code is of mediocre quality and that the industry is in a phase of over-optimism about AI capabilities [5][6][37].
- The article highlights the difficulty AI has in understanding complex code structures and the limits of code generation models in producing original, contextually appropriate code [30][31][36].
- Improvements must occur across multiple dimensions, including algorithms, data, and computational power [24][25][27].
Meta defuses the biggest bomb on the road to continual learning in AI, and "fine-tuning" is back in the fight
36Kr· 2025-10-27 05:13
Core Insights
- The article discusses recent advances in the ability of large language models (LLMs) to achieve continual learning and self-evolution, responding to criticism that they lack genuine learning capabilities [1][2].
Group 1: Paths to Continual Learning
- An LLM's ability to learn continuously is fundamentally linked to the depth and plasticity of its memory; the article identifies three main paths for improving it [2].
- The first path modifies the model's "context" or "working memory" through in-context learning (ICL), where new information is provided in the prompt to help the model solve specific problems [4][6].
- The second path adds an "external memory bank," as in retrieval-augmented generation (RAG), letting the model access and maintain an external database for comparison and retrieval; Google DeepMind's "ReasoningBank" is one example [7].
- The third path is parameter-level continual learning, which has struggled because of the complexity and instability of methods such as reinforcement learning (RL) and Low-Rank Adaptation (LoRA) [10][11].
Group 2: Sparse Memory Fine-Tuning
- Meta AI's recent paper introduces sparse memory fine-tuning as a solution to the problems of traditional supervised fine-tuning (SFT), in particular catastrophic forgetting [11][28].
- The method has three steps: modify the architecture to include a memory layer, use TF-IDF to identify which parameters to update, and apply sparse updates to only the most relevant parameters [12][22][23].
- The approach shows significant improvements: after learning new facts, models lost only 11% of their performance on the original tasks, compared with drops of 71% with LoRA and 89% with full fine-tuning [23][25].
Group 3: Implications for the Future of LLMs
- The advances in sparse memory fine-tuning suggest that models could be updated safely and effectively, moving from static tools to dynamic agents capable of continuous learning [31][32].
- If these methods succeed in practice, they could mark the start of a new era of self-evolving models that grow and adapt through experience [31][32].
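The TF-IDF selection step described for sparse memory fine-tuning can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Meta's implementation: the function name `rank_memory_slots`, the add-one smoothing in the IDF term, and the choice to treat each background batch as one "document" are all hypothetical. The idea it shows is that slots accessed often by the new data but rarely by background data are the ones worth updating.

```python
import math
from collections import Counter


def rank_memory_slots(batch_accesses, background_batches, top_t):
    """Return the top_t memory-slot ids ranked by a TF-IDF-style score.

    batch_accesses: slot ids accessed while processing the new data
        (hypothetical input format, one entry per access).
    background_batches: list of slot-id lists, one per background batch.
    """
    tf_counts = Counter(batch_accesses)
    total_accesses = sum(tf_counts.values())
    n_background = len(background_batches)

    # Document frequency: in how many background batches each slot appears.
    doc_freq = Counter()
    for batch in background_batches:
        doc_freq.update(set(batch))

    scores = {}
    for slot, count in tf_counts.items():
        tf = count / total_accesses
        # Add-one smoothing is an assumption made for this sketch.
        idf = math.log((n_background + 1) / (doc_freq[slot] + 1))
        scores[slot] = tf * idf

    # Highest score first; ties broken by slot id for determinism.
    ranked = sorted(scores, key=lambda s: (-scores[s], s))
    return ranked[:top_t]
```

Only the returned slots would receive gradient updates; everything else stays frozen, which is what limits interference with previously learned behavior.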
Training-Free GRPO, viewed by 630,000 people on X: moving GRPO learning into the context space
机器之心· 2025-10-22 08:46
Core Viewpoint
- The article introduces Training-Free Group Relative Policy Optimization (GRPO), a method that performs reinforcement learning (RL) without updating model parameters, making it more accessible and cost-effective for developers and smaller teams [4][20][28].
Summary by Sections
GRPO Overview
- GRPO has gained popularity in large-model reinforcement learning, particularly for tasks such as mathematical reasoning and multi-agent collaboration [2].
- Its core mechanism is "multi-path parallelism + group advantage," which is powerful but costly because it optimizes model parameters [3].
Training-Free GRPO
- A recent paper from Tencent Youtu addresses the high cost of parameter updates by moving the GRPO learning process into the context space: multiple answer paths are generated and evaluated without changing model parameters [4][6].
- The method generates multiple rollout paths for the same problem, scores them, and uses the advantage signals to refine the model's preference for high-quality solutions [4][10].
Experimental Results
- On mathematical reasoning tasks, Training-Free GRPO improves performance using only 100 training samples, at a cost of approximately $8 to $18 on a 671-billion-parameter model [13][24].
- It also shows significant gains elsewhere, such as a 4.6% increase in Pass@1 in web search scenarios, again without updating model parameters [17][18].
Advantages of Training-Free GRPO
- The approach keeps the advantages of GRPO, including multi-path exploration and independent training/testing sets, while drastically reducing cost by eliminating parameter updates [20][21].
- It generalizes better across tasks without the complexity and maintenance cost of multiple specialized models [25].
Conclusion
- Training-Free GRPO changes how reinforcement learning is understood: effective RL can be achieved without traditional parameter updates, making it a viable option for developers with limited resources [26][28].
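The "group advantage" idea behind GRPO can be illustrated with a short sketch. It shows the standard group-relative normalization, in which each rollout's reward is compared with the mean of its own group; how the training-free variant turns these signals into context rather than gradients is not specified in the summary and is not shown here. The function name and the `eps` parameter are assumptions for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each rollout's reward against its own group.

    rewards: scores for several rollouts of the same problem.
    eps: small constant (an assumption of this sketch) that avoids
        division by zero when every rollout gets the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]
```

A group in which every rollout scores the same yields all-zero advantages, so it carries no preference signal; only groups with a mix of better and worse rollouts tell the learner anything.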
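A heavyweight opens fire: agents are just going through the motions, reinforcement learning is terrible, and AGI won't arrive within ten years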
自动驾驶之心· 2025-10-22 00:03
Core Insights
- The article discusses the current state and future of AI, focusing on the limitations of reinforcement learning and the timeline for achieving Artificial General Intelligence (AGI) [5][6][10].
Group 1: AGI and AI Development
- AGI is expected to take about ten years to develop, contrary to the belief that this year would be the year of agents [12][13].
- Current AI agents, such as Claude and Codex, are impressive but still lack essential capabilities, including multi-modal abilities and continuous learning [13][14].
- The industry has been overly optimistic about the pace of AI development, leading to inflated expectations [12][15].
Group 2: Limitations of Reinforcement Learning
- Reinforcement learning is criticized as inadequate for replicating human learning, because it often relies on trial and error without a deep understanding of the problem [50][51].
- It can introduce noise into learning, because it weights every action by the final outcome rather than by the quality of the individual steps [51][52].
- Human learning involves more complex reflection on successes and failures, which current AI models do not replicate [52][53].
Group 3: Future of AI and Learning Mechanisms
- The future of AI may involve more sophisticated attention mechanisms and learning algorithms that better mimic human cognitive processes [33][32].
- AI models need mechanisms for long-term memory and knowledge retention, which are currently lacking [31][32].
- The integration of AI into programming and development is seen as a continuous evolution rather than a sudden leap to superintelligence [45][47].
Andrej Karpathy opens fire: agents are just going through the motions, reinforcement learning is terrible, and AGI won't arrive within ten years
机器之心· 2025-10-18 05:44
Core Viewpoint
- AI is projected to contribute an annual GDP increase of 2%, but the current state of the industry is criticized as overly optimistic and disconnected from reality [2][5].
Group 1: AGI and Learning
- AGI is expected to take about ten years to develop, because current AI agents lack the necessary cognitive abilities and continuous learning capabilities [9][11].
- Current AI models, particularly large language models (LLMs), exhibit cognitive deficiencies that hinder their performance [34][36].
- Reinforcement learning is deemed inadequate for replicating human learning, because it oversimplifies the complexity of human decision-making [44][46].
Group 2: AI Development and Challenges
- The industry is developing rapidly, but there is skepticism about the actual capabilities of AI models, which are often overhyped [5][41].
- Current AI agents struggle to understand and integrate unique coding implementations, leading to inefficiencies and misunderstandings in code generation [36][41].
- The reliance on pre-trained models and the limitations of current AI tools highlight the need for further advances in AI technology [20][42].
Group 3: Future of AI
- The future of AI is expected to involve more sophisticated attention mechanisms and possibly a shift toward more efficient learning algorithms [29][30].
- AI will continue to evolve, but it is still expected to rely on foundational principles such as gradient descent for training large neural networks [29][30].
- Ongoing improvements in AI tools and models suggest a continuous integration of new techniques and methodologies to improve performance [42][43].
A ten-thousand-word deep dive! RAG in practice, fully explained: a year of exploration
自动驾驶之心· 2025-08-07 09:52
Core Viewpoint
- The article discusses Retrieval Augmented Generation (RAG), which combines retrieval-based and generative models to improve the quality and relevance of generated text. It addresses hallucination, knowledge timeliness, and long-text processing in large models [1].
Group 1: Background and Challenges
- RAG was proposed by Meta in 2020 to let language models access external information beyond their internal knowledge [1].
- RAG faces three main challenges: retrieval quality, the augmentation process, and generation quality [2].
Group 2: Challenges in Retrieval Quality
- Vector representations can introduce semantic ambiguity, leading to irrelevant results [5].
- User input has become more complex, moving from keywords to natural dialogue, which complicates retrieval [5].
- Document segmentation methods affect how well document blocks match user queries [5].
- Extracting and representing multimodal content (e.g., tables, charts) is difficult [5].
- Integrating context from retrieved paragraphs into the current generation task is crucial for coherence [5].
- Redundancy and repetition in retrieved content can lead to duplicated information in generated output [5].
- Determining the relative importance of multiple retrieved paragraphs for the generation task is hard [5].
- Over-reliance on retrieved content can worsen hallucination [5].
- Generated answers may be irrelevant to the query [5].
- Generated answers may be toxic or biased [5].
Group 3: Overall Architecture
- The product architecture has four layers: the model layer, the offline understanding layer, the online Q&A layer, and the scenario layer [7].
- The RAG framework has three main components: query understanding, the retrieval model, and the generation model [10].
Group 4: Query Understanding
- The query understanding module improves retrieval by interpreting user queries and generating structured queries [14].
- Intent recognition selects the relevant modules based on the user query [15].
- Query rewriting uses an LLM to rephrase user queries for better retrieval [16].
- Query expansion breaks complex questions into simpler sub-questions for more effective retrieval [22].
Group 5: Retrieval Model
- The retrieval model's effectiveness depends on the accuracy of the embedding model [33].
- Document loaders load document data from various sources [38].
- Text converters prepare documents for retrieval by segmenting them into smaller, semantically meaningful chunks [39].
- Document embedding models create vector representations of text to enable semantic search [45].
- Vector databases support efficient storage and search of embedded data [47].
Group 6: Generation Model
- The generation model uses retrieved information to produce coherent responses to user queries [60].
- Different prompt assembly strategies are used to improve response generation [62][63].
Group 7: Attribution Generation
- Attribution in RAG aligns generated content with reference information to ensure accuracy [73].
- Dynamic computation methods can improve generation by matching generated text with reference sources [76].
Group 8: Evaluation
- The article stresses the importance of defining metrics and evaluation methods for assessing RAG system performance [79].
- It introduces evaluation frameworks such as RGB and RAGAS for benchmarking RAG systems [81].
Group 9: Conclusion
- The article summarizes the key modules in RAG practice and notes that continued research and development is needed to refine these technologies [82].
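The chunk, embed, retrieve, and prompt-assembly steps summarized above can be sketched end to end in a few lines. This is a toy illustration, not the article's system: the word-count "embedding" stands in for a real embedding model, the in-memory list stands in for a vector database, and every function name here is hypothetical.

```python
import math
from collections import Counter


def chunk_text(text, max_words=40):
    """Split text into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real model)."""
    return Counter(w.strip(".,?!").lower() for w in text.split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query, retrieved):
    """Assemble a prompt that numbers each chunk so answers can cite it."""
    context = "\n".join(f"[{i + 1}] {c}" for i, c in enumerate(retrieved))
    return (f"Answer using only the context below.\n"
            f"{context}\nQuestion: {query}")
```

Numbering the chunks in the prompt is one simple way to support the attribution step, since the generator can be asked to cite `[1]`, `[2]`, and so on.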
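How do you feed a large model precisely without massive data? SJTU's Data Whisperer: a training-free data selection method that approaches full-dataset performance with 10% of the data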
机器之心· 2025-07-29 06:38
Core Viewpoint
- The article introduces "Data Whisperer," a framework for efficient data selection when fine-tuning large language models (LLMs). It needs no additional training and achieves near-full-dataset performance with only 10% of the data [2][4][36].
Group 1: Methodology and Mechanism
- Data Whisperer uses the in-context learning (ICL) capabilities of pre-trained models to select "golden training samples" without a separate scoring model [2][6].
- Its scoring mechanism is based on the model's own outputs and attention weights, which keeps the selection process stable and reasonable [10][12].
- It introduces a new efficiency metric, the Selection-to-Tuning Ratio (STR), on which Data Whisperer significantly outperforms traditional methods in time efficiency [17][18].
Group 2: Performance Metrics
- On GSM8K, Data Whisperer reached 72.46% accuracy using only 10% of the data, surpassing the full-dataset result of 71.39% [19].
- The framework also outperformed existing state-of-the-art methods on the DialogSum and BioInstruct tasks [19][21].
Group 3: Robustness and Adaptability
- Data Whisperer is robust to input scale: optimal configurations were identified for the number of demonstration and query samples, indicating that it selects core samples rather than relying on sheer volume [26][28].
- The framework supports a weak-to-strong mechanism in which a smaller model selects data for a larger model, reducing computational cost while maintaining performance [22][24].
Group 4: Comparative Analysis
- Data Whisperer outperforms all mainstream data selection methods in accuracy, efficiency, and stability, particularly in low-budget scenarios [35].
- Its theoretical foundation is the relationship between ICL and fine-tuning, which lets it estimate how useful a sample will be for training without adjusting model parameters [36][37].
Group 5: Future Directions
- Possible future work includes applying the method to complex structured tasks in fields such as law and medicine, improving task alignment, and integrating human feedback [41][42].
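The selection step and the STR metric can be made concrete with a small sketch. The summary does not define STR, so this example assumes it is the time spent selecting data divided by the time needed to fine-tune on the full dataset, meaning values below 1 indicate that selection is cheaper than the tuning it replaces. The per-sample scores are taken as given; how Data Whisperer derives them from ICL outputs and attention weights is not reproduced here, and both function names are hypothetical.

```python
def select_top_fraction(scores, fraction=0.1):
    """Keep the highest-scoring fraction of samples (at least one).

    scores: mapping from sample id to a precomputed usefulness score.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    k = max(1, round(len(scores) * fraction))
    # Highest score first; ties broken by sample id for determinism.
    ranked = sorted(scores, key=lambda s: (-scores[s], s))
    return ranked[:k]


def selection_to_tuning_ratio(selection_seconds, full_tuning_seconds):
    """Assumed STR definition: selection time / full fine-tuning time."""
    if full_tuning_seconds <= 0:
        raise ValueError("full_tuning_seconds must be positive")
    return selection_seconds / full_tuning_seconds
```

Under this assumed definition, a selection pass that takes 30 seconds against a 600-second full fine-tuning run would have an STR of 0.05.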
50 calls per task, costs slashed by 90%? Manus reveals its context engineering secrets for the first time: lessons learned from repeated rewrites
AI前线· 2025-07-21 07:04
Core Insights
- The article stresses the importance of context engineering in building AI agents, and the need to iterate quickly as models and technologies evolve [1][2].
Group 1: KV Cache Design
- KV cache hit rate is identified as the most critical metric for AI agents in production, because it directly affects latency and cost [4].
- The average input-to-output token ratio in Manus is approximately 100:1, so KV caching pays off heavily: cached input tokens cost $0.30 per million tokens (MTok), compared with $3 per MTok for uncached tokens [5].
- Key practices for improving KV cache hit rate include keeping prompt prefixes stable, making the context append-only, and marking cache breakpoints explicitly [8][9][10].
Group 2: Tool Management
- As agents gain capabilities, the action space grows more complex, and dynamically adding or removing tools during iterations can cause inefficiency [11][14].
- Manus uses a context-aware state machine to manage tool availability without removing tools, which prevents confusion and preserves the KV cache [14][15][16].
Group 3: Context as a File System
- The article discusses the limits of context windows in modern large language models and suggests treating the file system as an unlimited context: agents read and write files as structured external memory [21].
- Manus implements a recoverable compression strategy, retaining essential information such as URLs while reducing context length [24].
Group 4: Attention Manipulation
- Manus uses a "todo.md" file to track tasks, which helps the agent stay focused and keep its goals in view during complex tasks [26][30].
- Keeping errors in the context is proposed as a way to improve agent behavior, letting the model learn from mistakes and making it less likely to repeat them [32][35].
Group 5: Sample Diversity
- The article warns against the pitfalls of few-shot prompting in agent systems, which can lead to repetitive and suboptimal actions [36].
- Introducing structured variation in actions and observations can break these patterns and adjust the model's attention, improving overall performance [37][38].
Group 6: Conclusion
- Context engineering is deemed essential for AI agents, affecting their speed, recovery capabilities, and scalability [39].
- The future of agents will depend on constructing context effectively, which underscores the importance of thoughtful design [40].
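The cost argument for KV caching can be worked through with the two prices quoted above. The sketch below is plain arithmetic under stated assumptions: the function names are hypothetical, output-token cost is ignored (with a roughly 100:1 input-to-output ratio, input dominates), and `shared_prefix_len` only illustrates why a change early in the prompt invalidates everything cached after it.

```python
CACHED_USD_PER_MTOK = 0.30    # price quoted in the summary
UNCACHED_USD_PER_MTOK = 3.00  # price quoted in the summary


def input_cost_usd(input_tokens, cache_hit_rate):
    """Input-token cost for a given fraction of cache hits."""
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be in [0, 1]")
    cached = input_tokens * cache_hit_rate
    uncached = input_tokens - cached
    return (cached * CACHED_USD_PER_MTOK
            + uncached * UNCACHED_USD_PER_MTOK) / 1_000_000


def shared_prefix_len(previous_tokens, current_tokens):
    """Number of leading tokens two prompts share (the reusable part)."""
    n = 0
    for a, b in zip(previous_tokens, current_tokens):
        if a != b:
            break
        n += 1
    return n
```

With these prices, one million input tokens cost about $3.00 with no cache hits, about $0.57 at a 90% hit rate, and about $0.30 when fully cached. Because only the shared prefix is reusable, a single changed token near the start of the prompt, such as a timestamp, pushes the hit rate toward zero, which is why stable prefixes and append-only context matter.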