Junyang Lin (林俊旸) Publishes His First Long-Form Post Since Leaving Alibaba
Yicai (第一财经) · 2026-03-26 15:05
Word count: 1,192; reading time: about 2 minutes. Author | Yicai reporter 陈杨园

On the evening of March 26, Junyang Lin, former technical lead of the Qwen (千问) large model, posted on social media. It is his first long-form article since leaving Alibaba, laying out his understanding of how large models have developed and his predictions for AI's next phase.

Lin writes that the past two years have reshaped how the industry evaluates large models and what it expects of them. OpenAI's o1 showed that "thinking" can be a trained capability. DeepSeek-R1 followed, proving that reasoning-style post-training can be reproduced and scaled outside the original lab. That phase was critical. But through the first half of 2025, the industry's focus stayed mostly on "reasoning-style thinking" itself: how to make a model think a bit longer at inference time. Now it is time to ask what comes next. His answer is agentic thinking: thinking for the sake of acting, interacting with an environment, and continuously updating plans based on feedback from the world.

The real difficulty lies in the data. When people discuss merging thinking and instruction modes, they tend to think first of model-side compatibility; the deeper problem is that the two modes differ significantly in data distribution and behavioral objectives. While trying to balance model merging against improvements to post-training data quality and diversity, the team did not get everything right, and the result was often mediocrity in both directions: "thinking" behavior became noisy, verbose, or indecisive, while "instruction" behavior became less clear and insufficiently ...
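Lin's "agentic thinking" — thinking in order to act, with plans revised from environmental feedback — contrasts with one-shot inference-time reasoning. A minimal sketch of that loop; every name here (`Agent`, `think`, `act`, `run`, the toy environment) is illustrative and not taken from his post:

```python
from dataclasses import dataclass, field

@dataclass
class Agent:
    """Toy agent: it does not reason once up front, but re-plans each
    turn from the latest observation. All names are hypothetical."""
    plan: list = field(default_factory=list)

    def think(self, goal, observation):
        # Thinking for the sake of acting: the plan is conditioned on
        # feedback from the world, not fixed at the start.
        if observation is None:
            return [f"step toward {goal}"]
        return [f"revised step after seeing {observation!r}"]

    def act(self, step, env):
        return env(step)

def run(agent, goal, env, max_turns=3):
    observation = None
    history = []
    for _ in range(max_turns):
        agent.plan = agent.think(goal, observation)  # update the plan
        observation = agent.act(agent.plan[0], env)  # feedback from the world
        history.append(observation)
    return history
```

The point of the sketch is the interleaving: each `think` call sees the previous `act` result, so the plan keeps changing, which is exactly what "reason longer at inference time" does not give you.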
1,500 Academic Papers on Prompt Engineering Show That Everything You Know Is Wrong
36Kr · 2025-08-22 03:12
Core Insights
- Companies with annual recurring revenue (ARR) exceeding $50 million are adopting strategies that contradict popular social-media advice on prompt engineering [1][11]
- The research indicates that traditional prompt-engineering wisdom is often based on anecdotal evidence and small-scale tests, leading to ineffective practices [2]

Misconceptions in Prompt Engineering
- Misconception 1: Longer, more detailed prompts yield better results. Research shows structured short prompts are more effective and cost-efficient, reducing API costs by 76% [3]
- Misconception 2: More examples always help. Recent studies indicate that excessive examples can confuse advanced models such as GPT-4 and Claude [4][5]
- Misconception 3: Perfect wording is crucial. The format and structure of a prompt matter more than its specific wording; for certain models, XML-formatted prompts outperform natural language by 15% [6]
- Misconception 4: Chain-of-thought prompts are universally applicable. They are effective for math and logic tasks but can hurt performance in data analysis, where table-based reasoning works better [7]
- Misconception 5: Human experts create the best prompts. AI systems can optimize prompts faster and more effectively than human experts, taking about 10 minutes versus 20 hours [8]
- Misconception 6: Prompt engineering is a one-time task. Prompt performance declines over time, and systematic, ongoing optimization can raise performance by 156% over 12 months [9][10]

Effective Strategies of High-Performing Companies
- They optimize business metrics rather than model metrics, prioritizing user satisfaction and task-completion rates [11]
- They automate prompt optimization, using systematic methods for continuous testing and improvement rather than manual iteration [11]
- They emphasize structure, organization, and clarity over clever wording or lengthy examples [11]
- They tailor techniques to specific task types, e.g. chain of thought for math and direct instructions for other applications [11][14]
- They treat prompts as products that require ongoing maintenance and improvement based on real user data [11]

Methodological Gap
- The persistence of these misconceptions stems from a fundamental methodological gap between academic research and industry practice: academia relies on controlled experiments, while industry often depends on intuition [12]
- Understanding these findings is crucial for anyone building AI capabilities; they emphasize structure over content and the importance of automated optimization [12][13]

Competitive Advantage
- Companies that ground their prompt engineering in research rather than received wisdom gain significant competitive advantages, achieving higher performance at lower cost [17][18]
- They can focus human expertise on high-value activities such as defining goals and evaluating outcomes, instead of hand-crafting prompts [18]

Questions for Teams
- Teams should shift from asking "How can we write better prompts?" to "How can we systematically optimize our AI interactions based on empirical evidence?" [19]
- This perspective encourages data-driven approaches, enabling scalable AI capabilities that deliver sustainable value [19]
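The structure-over-wording finding (Misconception 3) suggests templating prompts as short, explicitly tagged sections rather than prose paragraphs. A minimal sketch; the tag names and template are illustrative assumptions, not a standard or any vendor's required format:

```python
def build_prompt(task: str, context: str, output_format: str) -> str:
    """Assemble a prompt from XML-style tagged sections instead of one
    long natural-language paragraph. Tag names are illustrative."""
    return (
        f"<task>{task}</task>\n"
        f"<context>{context}</context>\n"
        f"<output_format>{output_format}</output_format>"
    )
```

Keeping each section short and labeled also makes it straightforward to A/B-test one section at a time, which fits the continuous-optimization advice above.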
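The advice to automate prompt optimization (Misconceptions 5 and 6) amounts to a search loop: score candidate prompts on a business metric such as task-completion rate and keep the best, re-running as performance drifts. A hedged sketch under that reading; `score` and `mutate` are placeholders for a real evaluation harness, not part of any cited system:

```python
import random

def optimize_prompt(candidates, score, rounds=20, mutate=None, seed=0):
    """Greedy search over candidate prompts against a task-level metric.
    `score(prompt) -> float` and `mutate(prompt, rng) -> prompt` are
    caller-supplied placeholders for a real evaluation pipeline."""
    rng = random.Random(seed)
    best = max(candidates, key=score)
    for _ in range(rounds):
        trial = mutate(best, rng) if mutate else rng.choice(candidates)
        if score(trial) > score(best):  # keep only measured improvements
            best = trial
    return best
```

In practice `score` would run the prompt against an evaluation set and return a user-facing metric, which is what distinguishes this from hand-tuning wording by intuition.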