Under probabilistic-statistical mechanisms, does LLM reasoning truly "understand the world"?
机器之心 · 2025-06-21 06:32
机器之心 PRO · Member Newsletter, Week 25 --- This week we unpack 2 noteworthy developments in AI & Robotics ---

1. Under probabilistic-statistical mechanisms, does LLM reasoning truly "understand the world"? Simple parroting vs. explicit reasoning paths: what role does CoT play in reasoning? Is Next Token Prediction a dynamic modeling process, making CoT more than mere parroting? Is the reasoning ability of statistically grounded LLMs simple pattern matching, or a different expression of causal understanding? "True knowledge comes from practice": could reinforcement-learning post-training break the "knowledge illusion" that confines LLMs? ...

2. In 2025, how are enterprises spending their AI procurement budgets? What is driving enterprises to increase spending on generative AI? What motivates using multiple models in production use cases? Why is AI procurement increasingly taking on the characteristics of traditional software procurement? Why the shift from building in-house to buying third-party AI applications? Which key factors make up the evaluation framework for selecting AI models? ...

The full issue contains 2 feature analyses plus 31 AI & Robotics industry briefs: 12 on technology, 8 domestic, and 11 international. This issue totals 22,632 characters; about 7% can be read for free, and the full issue can be redeemed for 99 WeChat Beans ...
A first look at Sebastian Raschka's new book "Reasoning From Scratch", revealing the foundations of reasoning models
机器之心 · 2025-05-02 04:39
Core Viewpoint
- The article discusses the advancements in reasoning capabilities of large language models (LLMs) and introduces the book "Reasoning From Scratch" by Sebastian Raschka, which aims to provide practical insights into building reasoning models from the ground up [2][5][59].

Group 1: Definition and Importance of Reasoning in LLMs
- Reasoning in the context of LLMs refers to the model's ability to generate intermediate steps before arriving at a final answer, often described as chain-of-thought (CoT) reasoning [8][10]; a prompt-level sketch follows this summary.
- The distinction between reasoning and pattern matching is crucial, as traditional LLMs primarily rely on statistical correlations rather than logical reasoning [23][25].
- Understanding reasoning methods is essential for enhancing LLMs' ability to tackle complex tasks, such as solving logical puzzles or multi-step arithmetic problems [5][39].

Group 2: Training Process of LLMs
- The typical training process for LLMs consists of two main phases: pre-training and fine-tuning [16][19]; the second sketch below illustrates the pre-training objective.
- During pre-training, LLMs are trained on vast amounts of unlabelled text (up to several terabytes) to learn language patterns, which can cost millions of dollars and take months [17][21].
- Fine-tuning comprises supervised fine-tuning (SFT) and preference fine-tuning, which improve the model's ability to respond to user queries [20][21].

Group 3: Pattern Matching vs. Logical Reasoning
- LLMs learn to predict the next token from statistical patterns in the training data, which lets them generate coherent text without true understanding [23][24]; the third sketch below makes this concrete with a toy bigram model.
- In contrast, logical reasoning requires deriving conclusions step by step, identifying contradictions and causal relationships [25][26].
- The article notes that most LLMs do not actively identify contradictions but instead rely on patterns learned from training data [30][34].

Group 4: Enhancing Reasoning Capabilities
- The reasoning capabilities of LLMs gained significant attention with the release of OpenAI's o1 model, which emphasizes a more human-like thought process [41][43].
- LLM reasoning can be enhanced through inference-time compute scaling, reinforcement learning, and knowledge distillation [44][46][48]; the fourth sketch below shows one inference-time technique.
- Inference-time compute scaling in particular improves reasoning without retraining the underlying model weights [46][48].

Group 5: Importance of Building Reasoning Models from Scratch
- Building reasoning models from scratch provides valuable insights into the capabilities, limitations, and computational trade-offs of LLMs [50][57].
- The shift toward reasoning models reflects a broader trend in the AI industry, emphasizing models that can handle complex tasks effectively [52][55].
- Understanding the underlying mechanisms of LLMs and reasoning models is crucial for optimizing their performance across applications [57].
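As a companion to the chain-of-thought definition in Group 1, here is a minimal Python sketch of CoT at the prompt level. The prompt wording, the "Answer:" convention, and the extract_final_answer helper are illustrative assumptions for this sketch, not code or conventions from Raschka's book:

```python
import re

question = "A train travels 60 km in 45 minutes. What is its speed in km/h?"

# Direct prompting: the model must produce the answer in a single step.
direct_prompt = f"Q: {question}\nA:"

# CoT prompting: elicit intermediate steps before the final answer.
cot_prompt = (
    f"Q: {question}\n"
    "A: Let's think step by step, then give the final answer "
    "on a line starting with 'Answer:'."
)

def extract_final_answer(cot_trace: str) -> str | None:
    """Parse the final answer out of a CoT trace ending in 'Answer: ...'."""
    match = re.search(r"Answer:\s*(.+)", cot_trace)
    return match.group(1).strip() if match else None

# A trace of the kind a model might return under the CoT prompt:
trace = "45 minutes is 0.75 hours. Speed = 60 / 0.75 = 80 km/h.\nAnswer: 80 km/h"
print(extract_final_answer(trace))  # -> 80 km/h
```

The point of the convention is that intermediate steps become visible text, so the final answer can be checked, parsed, or voted on downstream.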
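Group 2's pre-training phase reduces to next-token prediction under a cross-entropy loss. The toy PyTorch sketch below shows that objective in isolation; the embedding-plus-linear "model" and all sizes are placeholders standing in for a full transformer, not an implementation from the book:

```python
import torch
import torch.nn as nn

vocab_size, d_model, seq_len, batch = 100, 32, 16, 4

embed = nn.Embedding(vocab_size, d_model)
lm_head = nn.Linear(d_model, vocab_size)  # stand-in for a full transformer

tokens = torch.randint(0, vocab_size, (batch, seq_len))
inputs, targets = tokens[:, :-1], tokens[:, 1:]  # predict token t+1 from t

logits = lm_head(embed(inputs))  # (batch, seq_len - 1, vocab_size)
loss = nn.functional.cross_entropy(
    logits.reshape(-1, vocab_size), targets.reshape(-1)
)
loss.backward()  # gradients for one optimization step
print(float(loss))
```

Scaled up to terabytes of text and billions of parameters, this same loss is what the multi-month, multi-million-dollar pre-training runs described above optimize.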
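The "statistical patterns" half of Group 3's contrast can be made concrete with a toy bigram model: it predicts the next token purely from co-occurrence counts and has no representation of logic, contradiction, or causality. The corpus and helper below are invented for illustration:

```python
from collections import Counter, defaultdict

corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# Count how often each word follows each other word.
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def predict_next(word: str) -> str:
    """Return the most frequent follower of `word` (ties: first seen)."""
    return counts[word].most_common(1)[0][0]

print(predict_next("the"))  # -> 'cat'
print(predict_next("sat"))  # -> 'on'
```

A real LLM is vastly more expressive, but the training signal is of the same kind: frequencies of continuations, not step-by-step derivations.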
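For the inference-time compute scaling listed in Group 4, one widely used concrete form is self-consistency: sample several reasoning paths at nonzero temperature and majority-vote over the parsed final answers. Here sample_answer is a hypothetical stub standing in for "run one sampled CoT and extract its answer"; a real version would call an LLM:

```python
import random
from collections import Counter

def sample_answer(question: str, rng: random.Random) -> str:
    # Stub: pretend the sampled model answers "80 km/h" 70% of the time.
    return rng.choices(["80 km/h", "75 km/h"], weights=[0.7, 0.3])[0]

def self_consistency(question: str, n_samples: int = 9, seed: int = 0) -> str:
    """Majority vote over n_samples independently sampled answers."""
    rng = random.Random(seed)
    votes = Counter(sample_answer(question, rng) for _ in range(n_samples))
    return votes.most_common(1)[0][0]

print(self_consistency("A train travels 60 km in 45 minutes. Speed in km/h?"))
```

The design point matches the last bullet of Group 4: the extra compute is spent at inference time, so the underlying model weights are never retrained.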