Stanford's Latest Paper Uncovers the Basis of Theory of Mind in Large Language Models
36Kr · 2025-09-24 11:04
Core Insights
- The article discusses how AI, specifically large language models (LLMs), is beginning to exhibit "Theory of Mind" (ToM) capabilities traditionally considered unique to humans [2][5]
- A recent Stanford University study finds that the capacity for complex social reasoning in these models is concentrated in a mere 0.001% of their total parameters, challenging previous assumptions about how cognitive abilities are distributed across neural networks [8][21]
- The research highlights structured order and an understanding of sequence in language processing as foundational to the emergence of advanced cognitive abilities in AI [15][20]

Group 1: Theory of Mind in AI
- "Theory of Mind" refers to the ability to understand others' thoughts, intentions, and beliefs, which is crucial for social interaction [2][3]
- Recent benchmarks indicate that LLMs such as Llama and Qwen can respond accurately to tests designed to evaluate ToM, suggesting they can simulate perspectives and reason about information gaps [5][6]

Group 2: Key Findings from the Stanford Study
- The study finds that the parameters driving ToM capabilities are highly concentrated, contradicting the belief that such abilities are widely distributed across the model [8][9]
- The researchers used a sensitivity analysis method based on the Hessian matrix to pinpoint the parameters responsible for ToM, revealing a "mind core" that is critical for social reasoning [7][8] (a sketch of this style of analysis follows this summary)

Group 3: Mechanisms Behind Cognitive Abilities
- The findings suggest that the attention mechanism, particularly in models using RoPE (Rotary Positional Encoding), is directly linked to social reasoning capability [9][14] (a RoPE sketch also follows below)
- Disrupting the identified "mind core" parameters in models that use RoPE collapses their ToM abilities, while models that do not use RoPE are more resilient [8][14]

Group 4: Emergence of Intelligence
- The study posits that advanced cognitive abilities in AI emerge from a foundational understanding of sequence and structure in language, which is essential for higher-level reasoning [15][20]
- The emergence of ToM is seen as a byproduct of mastering basic language structures and statistical patterns in human language, rather than a standalone cognitive module [20][23]
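The summary above does not specify which Hessian-based estimator the Stanford paper uses, so the following is only a minimal sketch of one common approach from the pruning literature (LeCun et al., "Optimal Brain Damage"): approximate the Hessian diagonal with averaged squared gradients (the empirical Fisher) and score each parameter by theta^2 * H_ii. The names `parameter_sensitivity`, `loss_fn`, and `batches` are illustrative assumptions, not from the paper.

```python
import torch

def parameter_sensitivity(model, loss_fn, batches):
    """Score each parameter by theta_i^2 * H_ii, with the Hessian
    diagonal approximated by the empirical Fisher, E[grad^2].

    This is one conventional saliency estimator, not necessarily the
    paper's exact method. `loss_fn(model, batch)` is assumed to return
    a scalar loss.
    """
    sq_grads = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    for batch in batches:
        model.zero_grad()
        loss_fn(model, batch).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                sq_grads[n] += p.grad.detach() ** 2  # accumulate grad^2
    # Saliency: squared parameter value times averaged squared gradient.
    return {n: (p.detach() ** 2) * (sq_grads[n] / len(batches))
            for n, p in model.named_parameters()}
```

Ranking all scores and keeping only the top 0.001% would localize a candidate set of highly sensitive parameters, analogous to the "mind core" described above.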
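Since Group 3 ties the finding to RoPE specifically, here is a minimal sketch of standard Rotary Positional Encoding (Su et al., 2021) for readers unfamiliar with it; it illustrates the general technique, not the paper's setup, and `apply_rope` is an illustrative name. Each pair of channels is rotated by an angle proportional to the token's position, so the dot product between a rotated query at position m and a rotated key at position n depends only on the offset m - n.

```python
import torch

def apply_rope(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Rotate channel pairs of x by position-dependent angles.

    x: (seq_len, dim) with dim even. Channel pair (2i, 2i+1) at
    position p is rotated by angle p * base^(-2i/dim).
    """
    seq_len, dim = x.shape
    half = dim // 2
    # One rotation frequency per channel pair.
    freqs = base ** (-torch.arange(half, dtype=torch.float32) * 2.0 / dim)
    # angles[p, i] = p * freqs[i]
    angles = torch.arange(seq_len, dtype=torch.float32)[:, None] * freqs[None, :]
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[:, 0::2], x[:, 1::2]      # even / odd channels
    out = torch.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin   # standard 2-D rotation
    out[:, 1::2] = x1 * sin + x2 * cos
    return out

# Usage: rotate queries and keys before the attention dot product.
q, k = torch.randn(128, 64), torch.randn(128, 64)
q_rot, k_rot = apply_rope(q), apply_rope(k)
```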
3,700 Pre-training Runs in Search of the "Linear Attention" Non-Consensus: A MiniMax-01 Developer on a Four-Year Exploration
晚点LatePost · 2025-03-09 12:00
"我们跑的是下半场,赌的就是未来的长文本需求。" MiniMax 在今年 1 月发布了参数为 4560 亿的开源大模型 MiniMax-01,该模型就用到了他们开发的线 性注意力机制 "Lightning Attention"。 我们邀请了这个项目的负责人,MiniMax 高级研究总监钟怡然,来与我们一起聊线性注意力的研发过 程。钟怡然在 MiniMax 负责大模型网络架构设计,目前正开发多模态深度推理模型。 钟怡然曾担任上海人工智能实验室青年科学家,是新架构探索组的 PI(项目负责人);他在澳洲国立大 学获得博士学位,师从李宏东教授和 Richard Hartley 院士。他和他的团队已在一些国际顶级学术会议和 期刊上发表了 20 余篇关于模型新架构的论文,覆盖了当前多类非 Transformer 架构,如线性注意力机制 (线性注意力)、长卷积(Long Convolution)和线性循环网络(Linear RNN)。 在 2021 年,线性注意力还是一个 "看起来很美好的泡泡",怡然和团队就开始探索线性架构的实现。 嘉宾 丨 钟怡然 整理 丨 刘倩 程曼祺 上期播客中, 我们与清华的两位博士生,肖朝军和傅 ...