Tencent Research Institute AI Daily Digest 20250912
Tencent Research Institute · 2025-09-11 16:01
Generative AI

I. Thinking Machines, valued at $12 billion, publishes its first research blog post
1. Thinking Machines published its first research blog post, tackling the problem of non-determinism in LLM inference; the core idea is batch invariance;
2. By reworking RMSNorm, matrix multiplication, and the attention mechanism, the research team achieved fully reproducible inference results at an acceptable performance cost;
3. The company is valued at $12 billion, its founding team largely comes from OpenAI, and its first product is named Connection Machine.
https://mp.weixin.qq.com/s/2m_8ZPYBBIs3SuKoEJWiIw

II. ChatGPT finally supports MCP: one prompt for full automation
1. OpenAI announced that ChatGPT officially supports MCP (Model Context Protocol); Plus and Pro users can automate operations with a single prompt;
2. MCP standardizes interaction among AI models, tools, and data sources, letting different models share context in a plug-and-play way;
3. Users can enable developer mode to connect third-party services (such as Stripe) and complete complex tasks, but this currently cannot be used alongside other ChatGPT features.
https://mp.weixin.qq.com/s/09par8_260tRn10VEEg ...
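The numerical root of the non-determinism discussed in the first item can be shown in a few lines: floating-point addition is not associative, so regrouping the same operands changes the result. A minimal illustrative sketch (my own example, not from the original post):

```python
# Floating-point addition is not associative: the same three operands,
# grouped differently, round differently and yield different results.
a, b, c = 0.1, 1e16, -1e16

left = (a + b) + c   # 0.1 is absorbed into 1e16, then cancelled: 0.0
right = a + (b + c)  # b and c cancel first, so 0.1 survives: 0.1

print(left, right)   # 0.0 0.1
```

When concurrent GPU kernels reorder such additions from run to run, identical inputs can produce different low-order bits, which sampling then amplifies into visibly different completions.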
Breaking its silence seven months after founding, the ten-billion-dollar unicorn publishes a 10,000-word essay: cracking the non-determinism problem in LLM inference
36Kr · 2025-09-11 08:11
Core Insights
- Thinking Machines Lab has launched its flagship product, named "Connection Machine", and introduced a research blog titled "Connectionism" to share advances in AI research [1][3][4]
- The blog's first article discusses the challenge of achieving reproducible results in large language model (LLM) inference, highlighting the non-deterministic nature of LLM outputs [6][9][20]

Group 1: Product and Research Focus
- The "Connectionism" blog will evolve with the company's research, covering topics from numerical computation to prompt engineering [3]
- The name "Connection Machine" is a historical reference to early AI research focused on neural networks [4]

Group 2: Non-Determinism in LLM Inference
- Achieving reproducibility in LLM inference is crucial yet challenging, as identical inputs can yield different outputs due to sampling processes [9][11]
- The research examines the hypothesis linking non-determinism to floating-point arithmetic and concurrent execution, under which the order of operations can affect results [14][20]

Group 3: Solutions for Reproducibility
- The study proposes that non-determinism in LLM inference arises from batch-size variation rather than atomic contention, emphasizing the need for batch-invariant operations [20][21]
- Implementing batch-invariant strategies for operations such as RMSNorm, matrix multiplication, and attention is essential for achieving reproducibility [21][28][33]

Group 4: Performance and Experimentation
- Initial experiments with a deterministic kernel showed that while performance declines, it remains acceptable, with a drop of about 20% compared to standard methods [29][43]
- Using the deterministic kernel produced identical outputs across multiple completions, in contrast with the variability seen in non-deterministic settings [42]
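The batch-size mechanism described in Group 3 can be sketched in numpy. In the sketch below, `row_sum` and its chunk sizes are my own illustrative stand-ins for how a GPU kernel might split a reduction, not the authors' code: when the chunking depends on batch size, the same row can reduce in a different order and land on different bits; fixing the chunking makes the result batch-invariant.

```python
import numpy as np

def row_sum(row, chunk):
    # Reduce one row in chunks of `chunk` elements, combining the
    # float32 partial sums sequentially -- a stand-in for how a GPU
    # kernel splits a reduction across thread blocks.
    total = np.float32(0.0)
    for i in range(0, len(row), chunk):
        total += row[i:i + chunk].sum(dtype=np.float32)
    return total

rng = np.random.default_rng(0)
row = rng.standard_normal(4096).astype(np.float32)

# Batch-dependent strategy: chunking varies with how many requests
# share the batch, so the same row may reduce in a different order.
alone = row_sum(row, chunk=256)     # row processed on its own
batched = row_sum(row, chunk=1024)  # same row inside a larger batch
print(alone, batched)               # may differ in the low-order bits

# Batch-invariant strategy: fix the chunk size once, so the reduction
# order -- and therefore the result -- never depends on batch size.
fixed = [row_sum(row, chunk=256) for _ in range(3)]
assert fixed[0] == fixed[1] == fixed[2]
```

The roughly 20% performance cost reported above comes from exactly this trade: one fixed reduction strategy for every batch size, instead of the fastest strategy for each.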
Valued at 84 billion yuan, they just released their first AI result
量子位 (QbitAI) · 2025-09-11 01:58
Core Insights
- Thinking Machines, valued at $12 billion, has released its first research blog post, focused on overcoming nondeterminism in large language model (LLM) inference [1][51]
- The research emphasizes the challenge of reproducibility in LLM outputs, attributing it to batch non-invariance [3][12]

Group 1: Research Focus
- The main theme of the research is "Defeating Nondeterminism in LLM Inference", which addresses why LLM inference results are often non-reproducible [3][8]
- The root cause identified is batch non-invariance, where the output of a single request is influenced by the number of requests in the same batch [14][15]

Group 2: Technical Findings
- Floating-point non-associativity and concurrent execution lead to different results in LLM inference, but this explanation alone is incomplete [9][10]
- The study reveals that the lack of batch invariance is the primary issue, as dynamic adjustments to batch size during deployment change the computation order of key operations [15][16]

Group 3: Proposed Solutions
- To achieve batch invariance, the research fixes the reduction order in operations such as RMSNorm and matrix multiplication, regardless of batch size [18][19]
- The proposed method compiles a single kernel configuration for all input shapes, avoiding switches of parallel strategy as batch size changes, even at a performance cost of about 20% [21][22]

Group 4: Experimental Validation
- Three types of experiments validated the findings: inference-determinism verification, performance verification, and a real online-policy reinforcement learning application [25]
- With batch-invariant kernels, 1,000 runs produced 1,000 identical outputs, achieving deterministic inference, while non-invariant kernels produced 80 different results [27][28]
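The fixed-reduction-order idea in Group 3 can be illustrated with RMSNorm, one of the operations the research targets. The numpy sketch below is my own minimal illustration, assuming a reference (non-kernel) RMSNorm: because the reduction runs along each row's hidden dimension in a fixed order, the same request normalized alone or inside a batch comes out bit-identical.

```python
import numpy as np

def rmsnorm(x, weight, eps=1e-6):
    # RMSNorm: scale each row by the inverse of its root-mean-square.
    # The reduction runs over the hidden dimension of each row in a
    # fixed order, independent of how many rows share the batch.
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + np.float32(eps))
    return weight * (x / rms)

rng = np.random.default_rng(1)
weight = rng.standard_normal(64).astype(np.float32)
x = rng.standard_normal((8, 64)).astype(np.float32)

# The same row, normalized alone or inside a batch of 8, is
# bit-identical because the per-row reduction order never changes.
single = rmsnorm(x[2:3], weight)
in_batch = rmsnorm(x, weight)[2:3]
assert np.array_equal(single, in_batch)
```

A production GPU kernel that tiles this reduction differently for different batch sizes would lose exactly this guarantee, which is why the research pins one reduction layout for all shapes.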
Group 5: Company Background
- Thinking Machines was co-founded by Mira Murati, former CTO of OpenAI, and its team includes notable figures from the AI industry, primarily from OpenAI [36][38][46]
- The company recently closed a $2 billion seed funding round, a record for AI funding, and is now valued at $12 billion despite not yet having shipped a product [51][50]