Model Interpretability
Nature Health: Qian Xuejun and Pei Jing jointly develop a CT foundation model, enabling "one scan, multiple screenings" for multi-cancer detection
生物世界 · 2026-02-09 01:00
Core Viewpoint
- Early cancer screening is crucial for reducing incidence and mortality rates, with a need for cost-effective, high-throughput screening methods that can address the demands of asymptomatic populations [2][5][6].

Group 1: Research Development
- The research team developed OMAFound, a foundation model for multi-cancer screening using non-contrast computed tomography (CT), enabling simultaneous detection of lung and breast cancer [3][7].
- OMAFound's predictive performance matches that of specialized single-organ AI models and surpasses experienced radiologists in sensitivity for screening scenarios [3][11].

Group 2: Cancer Statistics and Screening Needs
- In 2022, approximately 20 million new cancer cases and 9.7 million deaths were reported globally, highlighting the ongoing rise in cancer burden due to aging populations and lifestyle factors [5][6].
- Early-stage cancer patients have significantly higher five-year survival rates than late-stage patients, underscoring the urgency of effective early screening strategies for high-risk, asymptomatic populations [5][6].

Group 3: Screening Methodology and Model Features
- Traditional single-cancer screening methods cannot meet high-throughput needs, necessitating the exploration of cost-effective multi-cancer early screening strategies [6][9].
- OMAFound is pre-trained on over 200,000 CT images with self-supervised learning, improving robustness to variations in equipment and acquisition settings and strengthening multi-cancer screening capability [7][9].

Group 4: Performance Metrics
- In a cohort of over 20,000 individuals, OMAFound achieved detection accuracies of 82.2% for breast cancer and 88.0% for lung cancer in women, and 86.1% for lung cancer in men [9][11].
- The model improved experienced radiologists' sensitivity by 38.9% for breast cancer and 16.0% for lung cancer while maintaining specificity [9][11].

Group 5: Clinical Implications
- OMAFound emphasizes model interpretability, helping physicians locate potential lesions more effectively and thereby enhancing the clinical applicability and acceptance of opportunistic breast cancer screening [11].
- The model is a practical tool for AI-enabled multi-cancer screening, aiming to facilitate early detection and diagnosis, with significant clinical and societal implications [11].
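The figures in Group 4 are standard screening metrics. As a minimal sketch of how sensitivity, specificity, and accuracy fall out of a confusion matrix (the counts below are hypothetical and illustrative only, not data from the OMAFound study):

```python
def sensitivity(tp: int, fn: int) -> float:
    """True-positive rate: fraction of actual cancers the screen detects."""
    return tp / (tp + fn)

def specificity(tn: int, fp: int) -> float:
    """True-negative rate: fraction of healthy cases correctly cleared."""
    return tn / (tn + fp)

def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Overall fraction of correct calls across the cohort."""
    return (tp + tn) / (tp + tn + fp + fn)

# Hypothetical confusion-matrix counts for a screening cohort.
tp, fn, fp, tn = 88, 12, 40, 860

print(f"sensitivity: {sensitivity(tp, fn):.3f}")      # 88 / 100  = 0.880
print(f"specificity: {specificity(tn, fp):.3f}")      # 860 / 900 ≈ 0.956
print(f"accuracy:    {accuracy(tp, tn, fp, fn):.3f}")  # 948 / 1000 = 0.948
```

"Improved sensitivity while maintaining specificity" means the true-positive rate rose without a corresponding rise in false alarms among healthy individuals.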
OpenAI opens up a little again: new interpretability research, from authors on Ilya's Superalignment team
量子位 · 2025-11-15 02:08
Core Insights
- OpenAI has introduced a new method for training smaller models that enhances interpretability, making models' internal mechanisms easier for humans to understand [5][6][7].
- The research focuses on sparse models with many neurons but few connections, simplifying neural networks for better comprehension [7][11].

Summary by Sections

Model Interpretability
- OpenAI's language models have complex internal structures that are not fully understood, and the new method aims to bridge this gap [6].
- The core idea is to train sparse models that keep a large number of neurons while limiting their connections, making them simpler and more interpretable [7][11].

Research Methodology
- The researchers designed a series of simple algorithmic tasks to evaluate the model's interpretability, identifying the "circuit" for each task [13][18].
- A "circuit" is defined as the smallest computational unit that allows the model to perform a specific task, represented as a graph of nodes and edges [15][16].

Example of Circuit
- An example task involves predicting the correct closing quote for a string in Python, showing how the model remembers the type of opening quote in order to complete the string [19][22].

Findings and Implications
- The research indicates that larger, sparser models can realize increasingly powerful functions while keeping their circuits simple [26].
- This suggests the method could be extended to understand more complex model behaviors [27].

Current Limitations
- Sparse models remain far smaller than state-of-the-art models and still contain many "black box" components [30].
- Training sparse models is currently inefficient; two proposed remedies are extracting sparse circuits from existing dense models and developing more efficient training techniques [31][32].
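The quote-matching task in the "Example of Circuit" section can be illustrated with a toy function. This is only a sketch of the task the model solves, not OpenAI's actual circuit: the sparse circuit performs the analogue of this lookup internally, with one channel remembering which opening quote was seen and copying it to the output position.

```python
def closing_quote(literal_prefix: str) -> str:
    """Toy version of the circuit's task: given the start of a Python
    string literal opened with either ' or ", return the quote character
    that must close it."""
    opening = literal_prefix[0]
    if opening not in ("'", '"'):
        raise ValueError("prefix must start with a quote character")
    # The whole "computation" is remembering and copying the opening quote.
    return opening

print(closing_quote('"hello'))  # → "
print(closing_quote("'world"))  # → '
```

The point of the paper's example is that in a sparse model this tiny copy operation is traceable to a handful of nodes and edges, whereas in a dense model it is smeared across many weights.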
Interview with Claude 4 core team members: improving agents' ability to work independently hinges on strengthening models' long-horizon task capability
Founder Park · 2025-05-28 13:13
Core Insights
- The main change expected in 2025 is the effective application of reinforcement learning (RL) in language models, particularly through verifiable rewards, leading to expert-level performance in competitive programming and mathematics [4][6][7].

Group 1: Reinforcement Learning and Model Development
- Reinforcement learning activates knowledge already present in models, letting them organize solutions rather than learn from scratch [4][11].
- The introduction of Opus 4 significantly improved context management for multi-step actions and long-horizon tasks, enabling models to carry out meaningful reasoning and execution over extended periods without frequent user intervention [4][32].
- The industry currently prioritizes computational power over data and human feedback, a balance that may shift as models become more capable of learning in real-world environments [4][21].

Group 2: Future of AI Agents
- Automating intellectual tasks with AI agents could substantially reshape the global economy and labor market, with "plug-and-play" white-collar AI employees predicted within the next two years [7][9].
- The interaction cadence between users and models is expected to stretch from seconds and minutes to hours, allowing users to supervise multiple models simultaneously in a "fleet management" style [34][36].
- Development of AI agents that complete tasks independently is expected to accelerate, with models handling several hours of work autonomously by the end of the year [36][37].

Group 3: Model Capabilities and Limitations
- Current models lack self-awareness in the philosophical sense, though they exhibit a form of meta-cognition by expressing uncertainty about their answers [39][40].
- Models can simulate self-awareness but have no continuous identity or memory unless explicitly equipped with external memory systems [41][42].
- Understanding of model behavior and decision-making is still evolving, with ongoing research into interpretability mechanisms and the features that drive model outputs [46][48].

Group 4: Future Developments and Expectations
- Model releases are expected to become significantly more frequent, with advances in reinforcement learning driving rapid capability gains [36][38].
- Long-term learning mechanisms, and the ability of models to improve through practical experience, are a key focus of future research [29][30].
- The ultimate goal of interpretability work is a clear account of how models make decisions, which is crucial for their reliability and safety in deployment [46][47].
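The "verifiable rewards" idea in the Core Insights can be sketched as a binary reward produced by an automatic grader rather than a human labeler. The checker below is a hypothetical exact-match grader for a toy math task, assumed purely for illustration; it is not Anthropic's training setup.

```python
from typing import Callable

def verifiable_reward(candidate_answer: str,
                      checker: Callable[[str], bool]) -> float:
    """Binary reward from an automatic checker (e.g. unit tests or an
    exact-match grader) -- no human feedback in the loop."""
    return 1.0 if checker(candidate_answer) else 0.0

def math_checker(answer: str) -> bool:
    """Hypothetical grader: does the model's answer equal 42?"""
    try:
        return int(answer.strip()) == 42
    except ValueError:
        return False

# Score a batch of sampled model answers.
samples = ["42", "41", " 42 ", "forty-two"]
rewards = [verifiable_reward(s, math_checker) for s in samples]
print(rewards)  # [1.0, 0.0, 1.0, 0.0]
```

Because the reward is mechanically checkable, an RL loop can score millions of samples in domains like competitive programming and mathematics without human raters, which is why these domains saw expert-level gains first.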