A 5-Minute Guide to Lilian Weng's Long-Form Essay: How Do Large Models Think?
Hu Xiu · 2025-05-22 09:54
Core Insights
- The article discusses the latest paradigms in AI, particularly the concept of "test-time compute" and how large language models (LLMs) can enhance their reasoning capabilities through various methods [3][12][26].

Group 1: AI Paradigms
- The blog systematically organizes the latest paradigms in AI, emphasizing "test-time compute" [3].
- LLMs exhibit similarities to human thought processes, drawing parallels with Daniel Kahneman's "Thinking, Fast and Slow" [4][5].
- The reasoning process in LLMs can be likened to human cognitive systems, where "System 1" represents quick, intuitive responses and "System 2" denotes slower, analytical thinking [6][7].

Group 2: Enhancing Reasoning in LLMs
- The concept of "Chain of Thought" (CoT) allows models to allocate variable computational resources based on problem complexity, which is particularly beneficial for complex reasoning tasks [9].
- Reinforcement learning (RL) has been scaled up for reasoning, with significant changes initiated by OpenAI's developments [14].
- The training process of models like DeepSeek R1 involves parallel sampling and sequential improvement, enhancing the reasoning capabilities of LLMs [15][16].

Group 3: External Tool Utilization
- Using external tools during the reasoning process can improve efficiency and accuracy, such as employing code interpreters for complex calculations [19].
- OpenAI's recent models, o3 and o4-mini, emphasize the importance of tool usage, which marks a paradigm shift in AI development [20][21].

Group 4: Future Research Directions
- The article raises open questions for future research, such as improving RNNs to dynamically adjust computation layers and enhancing Transformer architectures for better reasoning [28].
- It also discusses the challenge of training models to generate human-readable CoTs that accurately reflect their reasoning processes while avoiding reward hacking [29][30].
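The "parallel sampling" idea mentioned in Group 2 can be illustrated with a minimal sketch of self-consistency voting: sample several independent answers at test time, then take the majority. Everything here is an assumption for illustration — `sample_model` is a hypothetical stand-in (a deterministic fake below) for a real LLM call at nonzero temperature, not any actual API.

```python
from collections import Counter
from itertools import cycle

# Hypothetical stand-in for an LLM sampling call (name and behavior are
# assumptions for this sketch). A real implementation would query a model
# at temperature > 0; here we fake stochastic answers with a fixed cycle.
_FAKE_SAMPLES = cycle(["42", "42", "41", "42"])

def sample_model(prompt: str) -> str:
    return next(_FAKE_SAMPLES)

def self_consistency(prompt: str, n_samples: int = 16):
    """Parallel sampling: draw n answers independently, majority-vote the result.

    Spending more test-time compute (larger n_samples) buys a more reliable
    answer without changing the model itself.
    """
    answers = [sample_model(prompt) for _ in range(n_samples)]
    winner, count = Counter(answers).most_common(1)[0]
    return winner, count / n_samples
```

With the fake sampler above, `self_consistency("What is 6 * 7?")` returns `("42", 0.75)`: the occasional wrong sample ("41") is outvoted. Sequential improvement, by contrast, would feed earlier attempts back into the prompt for revision rather than sampling independently.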
LatePost Podcast | How OpenAI o1 Extends the Scaling Law: Discussing the New o1 Paradigm with SiliconFlow's Yuan Jinhui
LatePost (晚点LatePost) · 2024-09-20 15:22
"If you deal with developers every day, you won't feel that this industry is stagnating or cooling off."

By Cheng Manqi (程曼祺) and He Qianming (贺乾明)

Scan the QR code in the lower right of the image to listen to the podcast. This is episode 80 of LateTalk (晚点聊); you can follow and listen to us on Xiaoyuzhou (小宇宙), Ximalaya (喜马拉雅), Apple Podcasts, and other platforms. LateTalk is a podcast produced by LatePost (晚点LatePost) that, beyond written reporting, uses audio interviews to capture the shifting currents and enduring logic of the business world, and the people and stories within it.

The day after OpenAI released its new model o1, we invited SiliconFlow founder Yuan Jinhui to share his view of o1's technical significance and to discuss the changes he has observed in the AI developer community since January of this year.

A key change in o1 is the increased compute allocated to the inference stage (inference, i.e., when the large model is actually used): the importance of test-time compute has risen. SiliconFlow, which Yuan Jinhui founded early this year, is an AI Infra (middleware) company focused on inference acceleration and optimization. He is a serial entrepreneur: in 2017 he founded OneFlow (一流科技), and in 2023 he joined Wang Huiwen's (王慧文) large-model startup Guangnian Zhiwai (光年之外) as a co-founder. (For Yuan Jinhui's previous two startup stories, listen ...