Continual Learning
Peking University Team Proposes SHINE: Turning Any Text into an LLM LoRA in a Single Forward Pass!
机器之心· 2026-03-23 07:10
Core Insights
- The article introduces SHINE, a novel hypernetwork architecture that converts any text into LoRA parameters with a single forward pass, enabling multi-turn dialogue grounded in that text [2][3][43]
- SHINE addresses key challenges in building hypernetworks for large language models (LLMs), striking a balance between parameter scale and expressive capability [9][23]
- The method demonstrates significant practical potential and scalability, providing a new technical pathway for knowledge injection and rapid adaptation in large models [8][10][43]

Background and Methodology
- Hypernetworks are specialized neural networks that output the parameters of another neural network; SHINE trains a hypernetwork to generate LoRA parameters directly from any text input [3][5] (see the illustrative sketch after this summary)
- The architecture consists of two main components, an LLM and an M2P Transformer, which together strengthen the hypernetwork's capacity to generate parameters without adding extra parameters to the target model [19][20]
- Training follows a "pre-training then instruction fine-tuning" paradigm, using large-scale training data to improve model performance continuously [10][25]

Performance and Efficiency
- SHINE achieves high-quality LoRA generation with minimal time and token overhead, outperforming traditional methods such as Supervised Fine-Tuning (SFT) and In-Context Learning (ICL) in efficiency [11][36][39]
- Experimental results show that SHINE closely approaches the performance of the in-context method while significantly reducing inference time and computational cost [36][37]
- Compared with Test-Time Training (TTT), SHINE delivers superior performance with negligible latency and no additional training overhead [38][39]

Scalability and Future Prospects
- The architecture exhibits strong scalability, with performance improving as the base model size, LoRA rank, and the number of M2P Transformer layers increase [41]
- The article emphasizes the growing importance of hypernetwork-based parameter generation for LLMs, with potential applications expanding into various domains [44]
- Future research directions include better handling of long texts, integrating reasoning mechanisms, and optimizing the architecture for stronger performance [44]
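The summary gives no implementation details, so the following is a rough illustration only: the generic text-to-LoRA idea can be sketched as a small PyTorch module that maps a pooled text embedding to the two LoRA factors of one target layer in a single forward pass. Everything here (class name, dimensions, the pooled-embedding input) is an assumption made for illustration; SHINE's actual design pairs an LLM with an M2P Transformer and is not reproduced here.

```python
import torch
import torch.nn as nn

class TextToLoRAHypernet(nn.Module):
    """Toy hypernetwork: pooled text features -> LoRA factors (A, B) for one
    linear layer. A generic illustration of text-to-LoRA generation, NOT the
    SHINE architecture (which pairs an LLM with an M2P Transformer)."""

    def __init__(self, text_dim=768, target_in=4096, target_out=4096, rank=8):
        super().__init__()
        self.rank, self.target_in, self.target_out = rank, target_in, target_out
        # One projection head per LoRA factor.
        self.to_A = nn.Linear(text_dim, rank * target_in)
        self.to_B = nn.Linear(text_dim, target_out * rank)

    def forward(self, text_emb):                  # text_emb: (batch, text_dim)
        A = self.to_A(text_emb).view(-1, self.rank, self.target_in)
        B = self.to_B(text_emb).view(-1, self.target_out, self.rank)
        return A, B

hypernet = TextToLoRAHypernet()
doc_embedding = torch.randn(1, 768)   # stand-in for an encoded document
A, B = hypernet(doc_embedding)        # one forward pass -> adapter weights
delta_W = B @ A                       # (1, 4096, 4096) low-rank weight update
print(A.shape, B.shape, delta_W.shape)
```

A real system would emit factors for every adapted layer and train the hypernetwork end to end against the adapted model's loss on text-grounded dialogue.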
In an Era Ruled by Speed, We Need a "Slow" Reading List
财富FORTUNE· 2026-03-08 13:32
Core Insights
- The article presents an annual reading list curated by influential female business leaders, marking a shift from traditional business-focused literature toward broader, more philosophical works that encourage readers to reflect on their lives and the world around them [2][3]

Group 1: Reading List Overview
- The list includes 16 books recommended by 12 prominent figures, with notably little business-related content, steering readers toward deeper existential and philosophical inquiries [2]
- Key titles include "Civilization" by Professor Feng Shi, "The Universe" by Carl Sagan, and "The Meaning of Human Existence" by Edward O. Wilson, all of which encourage a long-term perspective on life and existence [2][38]

Group 2: Themes and Messages
- The list emphasizes maintaining perspective and order in a fast-paced world, suggesting that understanding broader contexts can ease the urgency of present pressures [2][3]
- Books such as "You Should Be Like a Bird Flying to Your Mountain" and "The Daily Stoic" focus on personal growth, resilience, and learning as a means of self-reinvention rather than mere competition [2][3][17][29]

Group 3: Personal Development and Leadership
- "First Choice" by Indra Nooyi discusses the challenge of balancing personal and professional life, advocating prioritizing what truly matters at each stage of life [42][43]
- "The Infinite Game" draws the distinction between finite and infinite games in business, encouraging a mindset focused on long-term sustainability rather than short-term victories [24][25]

Group 4: Health and Well-being
- "Outlive" by Peter Attia frames health as the foundation of life and career, advocating a proactive approach to personal well-being [21]
- "The Pursuit of Happiness" by Martin Seligman introduces a framework for cultivating happiness systematically, aligning with contemporary discussions on education and personal fulfillment [39][41]
The "AI 2028 Crisis": How Much Has Already Come True?
Xi Niu Cai Jing· 2026-02-26 06:57
Group 1
- The capital market is in a peculiar mood as it anticipates AI-driven structural change, with concerns about a potential macroeconomic crisis driven by self-reinforcing AI capabilities [1][3]
- Citrini's memo outlines a recursive chain of events leading to systemic risk: job losses in white-collar sectors, a decline in consumer spending, and rising default rates in private credit and mortgages [3][4]
- The narrative that "software is just the beginning" is gaining traction, as recent advances by Anthropic suggest that high-value knowledge work may soon be systematically replaced by AI [3][4]

Group 2
- Anthropic's Claude Code is challenging the long-term value of legacy IT services: its capabilities extend beyond programming into various vertical industries, raising concerns about the future of SaaS companies [4][5]
- The emergence of AI agents capable of transacting directly marks a shift from traditional e-commerce to "AI commerce," potentially disrupting business models that rely on intermediaries [8][9]
- The job market faces turbulence, with executives predicting lower employment even as they expect productivity gains from AI, highlighting a disconnect between employee and executive outlooks [11][14]
Positioning the AI Theme at the Start of the Year: Large-Model Industry Developments at Home and Abroad During the Spring Festival
2026-02-24 14:15
Summary of Key Points from the Conference Call

Industry Overview
- The call covers developments in the AI industry, focusing on domestic models such as Zhipu and Minimax, which have shown strong performance in agent AI and cost optimization and lead in usage on third-party platforms like OpenRouter [1][2]

Core Insights and Arguments
- **Domestic Model Performance**: Zhipu and Minimax have released new versions (GLM-5 and M2.5) that excel in coding and agent capabilities, with Zhipu performing well on benchmarks and Minimax leading in agent capability and cost optimization [2]
- **Token Demand Growth**: The rise of agent AI has sharply increased token demand, making global developers more price-sensitive; domestic models are capturing substantial demand on the strength of their cost-performance ratio [1][2]
- **Revenue Growth**: Within 20 days of launch, Kimi's K2.5 generated revenue equal to its entire previous year's income, with a growing share coming from overseas [4]
- **ByteDance's C-DOS 2.0**: ByteDance's C-DOS 2.0 is regarded as the leader in video generation, outperforming competitors in quality, cost-performance, and usability, especially during the Spring Festival [5]
- **Alibaba's Progress**: Alibaba's Qianwen 3.5 has improved in multimodal understanding and reasoning and maintains a strong open-source stance, though its consumer-facing rollout trails ByteDance's [6]
- **OpenAI's Revenue Goals**: OpenAI targets $280 billion in revenue by 2030 and plans to invest $665 billion in computing power, signaling strong commercial expectations [7]
- **Google's Gemini 3.1**: Google released Gemini 3.1, seen as having the most comprehensive capabilities globally and competing closely with OpenAI's GPT-5.2 [7]

Additional Important Insights
- **Future Trends**: The AI industry is expected to see major advances in reasoning technology by 2026, with unified models that integrate content understanding and generation across media emerging as a key trend [3][9]
- **SaaS Model Challenges**: The SaaS model faces challenges, particularly with user-based pricing, but underlying demand for AI infrastructure remains strong, benefiting cloud computing and related companies [11]
- **Investment Opportunities**: Despite short-term pressure, companies with deep industry knowledge and customer barriers should prove their value over the long term; high-margin companies like TaxFriend and Glodon retain significant advantages in the AI era [12]
- **Multi-Agent Collaboration**: The multi-agent scaling law suggests that collaborating agents can significantly raise overall efficiency, as demonstrated by Kimi K2.5, which uses multiple agents to improve task performance [17]

Conclusion
The AI industry is evolving rapidly, with domestic companies gaining ground through innovative models and competitive pricing. ByteDance and Alibaba are advancing multimodal capabilities, while global players like OpenAI and Google set ambitious revenue targets. Investors should watch the ongoing demand for AI solutions and the potential for major advances in technology and infrastructure.
In Depth | Gemini 3's Pre-training Lead Reveals the Key to Its Huge Leap: the Industry Is Shifting from a "Data-Unlimited" to a "Data-Limited" Paradigm
Z Potentials· 2026-02-21 03:43
Core Insights
- Gemini 3's success is attributed to high-quality pre-training and post-training, resting on collaboration and many accumulated innovations rather than raw computational power alone [5][6][23]
- The industry is transitioning from a "data-unlimited" to a "data-limited" paradigm, requiring careful use of synthetic data and improvements in model architecture to get better results from less data [5][29]
- Continual learning is emerging as a significant trend, allowing models to absorb new knowledge as it becomes available, which could change the approach to retraining [43][44]

Group 1: Gemini 3 Development
- Gemini 3's advances are the product of a large team's collaborative effort, integrating many separate improvements and innovations [5][6]
- The model uses a Transformer-based mixture-of-experts architecture, decoupling per-token compute from total parameter scale [5][24] (see the sketch after this summary)
- The architecture has not changed drastically from previous versions; instead, many individual enhancements add up to the significant performance leap [23][24]

Group 2: Industry Trends
- The AI industry is seeing technologies converge while labs also pursue differentiated research paths, each focusing on distinct aspects of AI [9][10]
- Concern about data exhaustion is growing, but the industry is adapting to a model that emphasizes efficiency and effective use of the data that exists [28][29]
- Evaluation is critical in pre-training: it must accurately predict the performance of larger models and guide subsequent improvements [34][35]

Group 3: Future Directions
- Long-context capability is a promising area for future innovation, letting models take on larger tasks effectively [32]
- Integrating retrieval-augmented generation and search directly into models is seen as a likely future direction, extending their functionality [33]
- Balancing short-term and long-term goals in research is crucial: immediate improvements matter, but so do more exploratory research avenues [20][21]
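To make the compute/parameter decoupling in Group 1 concrete, here is a minimal top-k mixture-of-experts layer. It is a generic illustration with invented sizes; nothing about Gemini's internal design is disclosed in this summary.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Minimal top-k mixture-of-experts feed-forward layer.

    Generic illustration: total parameters scale with num_experts, but each
    token activates only k experts, so per-token compute stays nearly
    constant as capacity grows.
    """

    def __init__(self, d_model=512, d_ff=2048, num_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, num_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x):                        # x: (tokens, d_model)
        gate = F.softmax(self.router(x), dim=-1)
        weight, idx = gate.topk(self.k, dim=-1)  # pick k experts per token
        weight = weight / weight.sum(-1, keepdim=True)
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e         # tokens routed to expert e
                if mask.any():
                    out[mask] += weight[mask, slot, None] * expert(x[mask])
        return out

layer = TopKMoE()
tokens = torch.randn(16, 512)
print(layer(tokens).shape)  # (16, 512): 8 experts' parameters, only 2 active per token
```

The design choice the bullet describes falls out directly: adding experts grows capacity without growing the two-expert compute each token pays.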
$14 Billion ARR, $30 Billion in New Funding: Anthropic's CEO Says AI Will Be a Trillion-Dollar Industry by 2030 | Jinqiu Select
锦秋集· 2026-02-14 09:08
Core Insights
- Anthropic recently closed a $30 billion Series G at a $380 billion valuation, the second-largest single funding round in venture capital history, with annual revenue of $14 billion [2]
- Anthropic CEO Dario Amodei predicts the AI industry will likely reach trillion-dollar revenue by 2030, driven by the compounding of technology and diffusion [3][17]
- Amodei's aggressive forecast is that within 1 to 3 years, AI systems will reach or exceed the capabilities of Nobel Prize winners across various fields [5]

Company Strategy and Growth
- Anthropic's revenue is projected to grow roughly tenfold each year, from nearly zero to $1 billion in 2023, $10 billion in 2024, and around $90-100 billion in 2025, with significant increases already noted in January 2025 [14][48]
- The company has adopted an aggressive yet calculated strategy for computing investment, emphasizing early procurement to avoid the risk of bankruptcy from demand-forecasting errors [15]
- Internally, AI tools are perceived to have raised productivity substantially, contributing an overall acceleration of 15-20% to operations [12]

Industry Dynamics and Predictions
- The AI industry's competitive landscape is expected to resemble cloud computing: a few dominant players and high entry barriers, so profits will not be driven to zero [16]
- Amodei believes AI diffusion into the economy will be rapid but not instantaneous, constrained by corporate procurement processes and compliance reviews [13]
- The anticipated "nation of geniuses in a data center" is expected within 1 to 3 years, fundamentally transforming professional fields [8][41]

Technological Insights
- Scaling laws for pre-training and reinforcement learning (RL) remain effective, supporting the hypothesis that large blocks of computation are essential to AI development [9]
- Continual learning may not be necessary for models: pre-training and RL generalization, combined with longer context windows, are likely sufficient for performance [10]
- Coding capability spans a spectrum, from AI writing 90% of code to potentially replacing software engineering entirely, though full replacement remains some distance away [11]

Safety and Ethical Considerations
- Amodei advocates transparency in AI safety standards, suggesting regulation should tighten as risks are validated rather than imposing blanket bans [21][22]
- He is optimistic that AI could dissolve authoritarian structures, an expectation reminiscent of early hopes for social media [23]
- He stresses building data centers in developing countries so they do not fall behind in an AI-driven economy [24]

Cultural and Operational Insights
- Maintaining company culture is a priority at Anthropic, with regular all-hands meetings and open communication to foster cohesion among employees [27]
- Decision-making speed is critical: historically significant decisions may have to be made in brief windows [28]
2026's Opening Keyword: Self-Distillation, as Large Models Truly Move Toward "Continual Learning"
机器之心· 2026-02-10 03:46
Core Insights
- The article describes an emerging consensus among researchers in the large language model (LLM) field that self-distillation is a solution to the challenges of continual learning in AI models [3][4]

Group 1: Self-Distillation in Continual Learning
- Traditional supervised fine-tuning (SFT) is criticized for causing "catastrophic forgetting," where acquiring new knowledge leads to a significant drop in existing capabilities [7]
- The proposed Self-Distillation Fine-Tuning (SDFT) method lets models learn from demonstrations while preserving their original capabilities, addressing catastrophic forgetting [11] (a generic sketch of the idea follows this summary)
- SDFT shows superior performance on skill-learning and knowledge-acquisition tasks, achieving higher accuracy on new tasks while significantly reducing catastrophic forgetting [14]

Group 2: Reinforcement Learning via Self-Distillation
- Current reinforcement learning methods often rely on binary feedback, which creates severe credit-assignment problems and can stall model evolution [16]
- The Self-Distillation Policy Optimization (SDPO) framework introduces a "rich feedback" environment that turns vague scalar rewards into dense supervision signals, improving learning efficiency [19]
- SDPO delivers a significant improvement in sampling efficiency, needing only about one-third the attempts to reach the same discovery rate as traditional algorithms [21]

Group 3: On-Policy Self-Distillation for Large Language Models
- The OPSD framework tackles large models' difficulties on complex reasoning tasks by creating "information asymmetry" within the model to guide self-evolution [23][25]
- OPSD achieves high learning efficiency, using tokens 4-8x more effectively than traditional algorithms on challenging reasoning benchmarks [27]
- Together, the three papers emphasize leveraging a model's existing capabilities through context construction to drive self-improvement, positioning self-distillation as a standard part of the post-training phase for large models [27]
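None of the three objectives is spelled out in the summary, so the snippet below is only a minimal sketch of the shared intuition: keep a frozen copy of the model as its own teacher and regularize fine-tuning with a KL term toward the teacher's token distribution, which damps catastrophic forgetting. It assumes a Hugging Face-style causal LM whose forward returns `.logits`; it is not the exact SDFT, SDPO, or OPSD objective.

```python
import torch
import torch.nn.functional as F

def self_distill_loss(student, teacher, input_ids, labels, alpha=0.5, tau=1.0):
    """Cross-entropy on the new data plus KL toward a frozen pre-update copy.

    Sketch of the self-distillation intuition (the model supervises itself to
    stay close to its own prior behavior); not the papers' exact objectives.
    """
    s_logits = student(input_ids).logits              # (batch, seq, vocab)
    with torch.no_grad():
        t_logits = teacher(input_ids).logits
    ce = F.cross_entropy(s_logits.view(-1, s_logits.size(-1)),
                         labels.view(-1), ignore_index=-100)
    kl = F.kl_div(F.log_softmax(s_logits / tau, dim=-1),
                  F.log_softmax(t_logits / tau, dim=-1),
                  log_target=True, reduction="batchmean") * tau ** 2
    return ce + alpha * kl   # alpha trades plasticity for retention

# Usage (hypothetical): freeze a snapshot before updating.
#   import copy
#   teacher = copy.deepcopy(model).eval()
#   loss = self_distill_loss(model, teacher, batch["input_ids"], batch["labels"])
```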
CICC: Large Models Will Achieve More Breakthroughs in 2026, Moving a Step Closer to the Long-Term Goal of AGI
Zhi Tong Cai Jing· 2026-02-05 01:39
Core Insights
- CICC's report finds that in 2025, global large-model technology advanced significantly in productivity scenarios, with notable gains in reasoning, programming, agentic capabilities, and multimodality, though shortcomings remain in generalization, stability, and hallucination rates [1]
- Looking ahead to 2026, CICC anticipates further breakthroughs in reinforcement learning, model memory, and context engineering, moving from short-context generation to long reasoning-chain tasks and from text interaction to native multimodality, a step closer to the long-term goal of AGI [1]

Group 1: Model Development and Architecture
- CICC expects pre-training scaling laws to reassert themselves in 2026, with flagship model parameter counts reaching new highs [1]
- The Transformer-based architecture will persist, with consensus on balancing performance and efficiency through Mixture of Experts (MoE), while different attention-mechanism routes continue to be optimized and swapped [1]
- The paradigm will combine pre-training scaling laws, high-quality data, and reinforcement learning to jointly raise model capability [1]

Group 2: Importance of Reinforcement Learning
- Reinforcement learning is crucial for unlocking advanced model capabilities, enabling models to think and reason more logically and in line with human preferences [2]
- The essence of reinforcement learning is "self-generated data + multi-round iteration," with effectiveness dependent on large-scale compute and high-quality data [2] (see the sketch after this summary)
- Model makers at home and abroad, such as OpenAI, Gemini, DeepSeek, and Alibaba Qianwen, place heavy emphasis on reinforcement learning, whose share of training is expected to rise through 2026 [2]

Group 3: New Directions in Learning
- Continual learning and model memory are set for core breakthroughs, addressing large models' "catastrophic forgetting" through selective memory mechanisms [3]
- Algorithms and architectures such as Google's Titans, MIRAS, and Nested Learning aim to let models dynamically adjust what they learn and remember based on task duration and importance, enabling continual and even lifelong learning [3]
- World models focused on understanding causal relationships in the physical world present breakthrough opportunities along paths such as Genie 3 and Marble [3]
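One concrete reading of "self-generated data + multi-round iteration" is a rejection-sampling loop: sample candidates from the model, keep only those a scorer endorses, fine-tune on them, and repeat. This is an interpretation rather than anything the report specifies, and `generate`, `reward`, and `finetune` are hypothetical caller-supplied helpers.

```python
def self_improvement_loop(model, prompts, generate, reward, finetune,
                          rounds=3, samples_per_prompt=8):
    """One reading of "self-generated data + multi-round iteration":
    sample candidates, keep the highest-reward ones, retrain, repeat.
    generate/reward/finetune are hypothetical caller-supplied helpers."""
    for _ in range(rounds):
        kept = []
        for prompt in prompts:
            candidates = [generate(model, prompt) for _ in range(samples_per_prompt)]
            scored = sorted(candidates, key=lambda c: reward(prompt, c), reverse=True)
            best = scored[0]
            if reward(prompt, best) > 0:   # keep only outputs the scorer endorses
                kept.append((prompt, best))
        model = finetune(model, kept)      # next round learns from its own best outputs
    return model
```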
CICC | AI Decade Outlook (26): Key Trends for 2026, Model Technology Edition
中金点睛· 2026-02-04 23:52
Core Insights
- The article reviews advances in large-model technology, highlighting gains in reasoning, programming, agentic capabilities, and multimodality, while noting remaining shortcomings in general reliability and memory [1][4]

Model Architecture and Optimization
- The Transformer architecture continues to dominate, with consensus around the efficiency of the Mixture of Experts (MoE) design, which activates only a subset of parameters and significantly reduces computational cost [17][18]
- The industry is exploring attention mechanisms that balance precision and efficiency, including Full-Attention, Linear-Attention, and Hybrid-Attention (see the sketch after this summary) [20]

Model Capabilities
- Significant progress has been made in reasoning, programming, agentic tasks, and multimodal applications, with models reaching real productivity levels across domains [3][4]
- Reinforcement learning is crucial for unlocking advanced model capabilities, enabling more logical reasoning aligned with human preferences [2][23]

Competitive Landscape
- Major players such as OpenAI, Gemini, and Anthropic are intensifying competition: OpenAI is focused on strengthening reasoning and multimodal integration, while Gemini has made major strides in model capability and is leveraging high-quality data for improvement [11][42][43]
- Domestic models are catching up, holding a static gap of roughly six months behind international counterparts, with companies like Alibaba and ByteDance producing competitive models [12][14]

Future Directions
- The 2026 focus includes further advances in reinforcement learning, continual learning, and world models, with models expected to take on more complex tasks en route to long-term goals such as AGI [27][40]
- Continual learning and model memory are seen as essential to lifelong learning, with new algorithms such as MIRAS and HOPE pivotal in this evolution [28][32]
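On the attention trade-off, the essential difference between the routes is asymptotic cost: full softmax attention builds a T x T score matrix (quadratic in sequence length T), while the kernelized linear-attention family reorders the matmuls to stay linear in T. Below is a generic textbook comparison, not any vendor's production kernel.

```python
import torch
import torch.nn.functional as F

def full_attention(q, k, v):
    """Standard softmax attention: O(T^2) time and memory in sequence length T."""
    scores = q @ k.transpose(-2, -1) / q.size(-1) ** 0.5   # (T, T) score matrix
    return torch.softmax(scores, dim=-1) @ v

def linear_attention(q, k, v, eps=1e-6):
    """Kernelized linear attention with the classic elu+1 feature map: O(T),
    because phi(q) @ (phi(k)^T v) never materializes a (T, T) matrix."""
    phi = lambda x: F.elu(x) + 1
    q, k = phi(q), phi(k)
    kv = k.transpose(-2, -1) @ v                                 # (d, d) summary
    z = q @ k.sum(dim=-2, keepdim=True).transpose(-2, -1) + eps  # (T, 1) normalizer
    return (q @ kv) / z

T, d = 1024, 64
q, k, v = (torch.randn(T, d) for _ in range(3))
print(full_attention(q, k, v).shape, linear_attention(q, k, v).shape)  # both (T, d)
```

Hybrid designs interleave a few full-attention layers (for precision) with linear ones (for length), which is the balance the report describes.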
OpenAI Faces a Wave of Departures
36Kr· 2026-02-04 02:46
Core Insights
- OpenAI is shifting its focus from long-term foundational research to accelerating ChatGPT's development, prompting the departure of several senior employees [1][2]
- The company, valued at $500 billion, is adapting to intensifying competition from rivals like Google and Anthropic [1]
- OpenAI is reallocating resources to its flagship chatbot, ChatGPT, while cutting funding for experimental research [1][2]

Group 1
- Several employees, including VP of Research Jerry Tworek and model policy researcher Andrea Vallone, have left over dissatisfaction with the strategic shift [1][2]
- Under CEO Sam Altman, OpenAI is transitioning from a research lab into one of Silicon Valley's largest tech companies, and must show revenue growth to justify its valuation [1][3]
- Chief Research Officer Mark Chen maintains that foundational research remains a core focus, with significant resources still allocated to long-term projects [1][3]

Group 2
- Researchers not working on large language models have faced resource constraints that limit their ability to validate research hypotheses [2]
- Teams behind video and image generation models such as Sora and DALL-E feel neglected as resources are funneled to ChatGPT [2]
- Competition is intense, with companies racing to release the strongest models each quarter, concentrating resources on the most promising directions [2][3]

Group 3
- Tworek left after seven years to pursue research that is difficult inside the company, such as continual learning [3]
- Vallone joined competitor Anthropic after being assigned the difficult task of handling user mental-health concerns around ChatGPT [3]
- Investors remain optimistic, believing OpenAI's real moat is ChatGPT's enormous user base [3][4]

Group 4
- Whether OpenAI has the strongest model is seen as the wrong question; the company is converting its technological lead into platform lock-in [4]
- Its competitive edge has shifted from research capability to user behavior, which is far harder to disrupt [4]