o1
Search documents
林俊旸离职后首次发声,复盘千问的弯路,指出AI的新路
36氪· 2026-03-27 11:12
Core Insights - The article discusses the transition from "Reasoning Thinking" to "Agentic Thinking" in AI, emphasizing the need for models to adapt and interact with their environments rather than just providing static answers [4][14][73] - Lin Junyang acknowledges that the previous approaches did not fully succeed, indicating a need for improvement in AI model integration and performance [7][30] Group 1: Transition in AI Thinking - The past two years have defined the mission of Reasoning Thinking, with significant advancements in training models for reasoning capabilities [11][13] - The emergence of Agentic Thinking is seen as the next step, focusing on continuous interaction with the environment and adjusting plans based on real-world feedback [14][49] - Key differences between Reasoning Thinking and Agentic Thinking include the ability to decide when to act, manage tool selection dynamically, and maintain coherence across multiple interactions [11][50] Group 2: Infrastructure and Environment Design - The rise of reasoning models highlights the importance of robust infrastructure and the need for scalable feedback signals in reinforcement learning [16][21] - As the focus shifts to Agentic Thinking, the design of the environment becomes crucial, emphasizing stability, authenticity, and the ability to generate diverse trajectories [59][60] - The integration of tools and the environment into the training process is essential for developing effective AI systems, moving beyond traditional model training [56][71] Group 3: Future Directions and Challenges - The future of AI is expected to revolve around training intelligent agents rather than just models, with a focus on system-level training that includes both the model and its environment [71][73] - The definition of "good thinking" is evolving, prioritizing the ability to maintain effective action under real-world constraints rather than merely producing lengthy reasoning outputs [75] - Competitive advantages in the Agentic Thinking era will stem from better environmental design, tighter training-reasoning coupling, and effective orchestration of multiple agents [77]
堆推理链全错了!林俊旸离职首曝:曾在阿里 Qwen 踩中一个“致命”技术误区
AI前线· 2026-03-27 03:45
Core Insights - The article discusses the transition from "reasoning thinking" to "agentic thinking" in AI, emphasizing that future large models should focus on thinking for action and continuous feedback correction rather than merely extending reasoning chains [2][6][24] Group 1: Key Developments in AI Models - Lin Junyang reflects on a significant attempt by the Qwen team to merge thinking and instruct modes into a single model, aiming for a system that can autonomously determine the level of reasoning required based on context [3][11] - Qwen3 represents a bold attempt to introduce a hybrid thinking model, but the results were not satisfactory, as merging led to verbosity and hesitation in responses [4][12] - The core issue identified was not the model switches but the data itself, as the two modes correspond to different data distributions and objectives, leading to suboptimal outcomes when not finely calibrated [4][13] Group 2: Shift in AI Thinking Paradigms - Lin Junyang argues that the most effective direction for AI is to enable models to think for action, drawing inspiration from Anthropic's Claude models, which emphasize that thinking should be shaped by target workloads [5][15] - The transition to "agentic thinking" involves continuous interaction with the environment, using tools, obtaining feedback, and embedding thinking into execution processes [6][18] - The future of AI models will not only focus on problem-solving but also on handling tasks that pure reasoning models struggle with, highlighting the importance of the surrounding environment and feedback mechanisms [7][20] Group 3: Importance of Environment and Infrastructure - The article emphasizes that the success of future AI models will increasingly depend on the quality of the environment, tools, constraints, and feedback loops, rather than solely on the models themselves [7][20] - The shift from reasoning to agentic thinking necessitates a new infrastructure that decouples training from reasoning, allowing for more efficient rollout generation and feedback integration [19][23] - The environment is now considered a primary research focus, with an emphasis on stability, authenticity, coverage, and feedback richness, marking a shift from data diversity to environment quality [20][24] Group 4: Challenges and Future Directions - The article highlights the challenges of reward hacking in agentic models, where models with tool access may exploit shortcuts, necessitating robust environment design and evaluation protocols [21][23] - The future of AI thinking is expected to prioritize actionable insights over lengthy reasoning processes, aiming for robust and efficient problem-solving capabilities [21][24] - The evolution of AI will transition from training models to training agents and ultimately to training systems, with a focus on harnessing engineering to enhance collaborative intelligence [23][24]
林俊旸离职后首度发声:万字复盘,大模型下一站「智能体式思考」
机器之心· 2026-03-27 00:10
Core Insights - The article discusses the evolution of large language models over the past two years, particularly focusing on the transition from "reasoning" thinking to "agentic" thinking in AI development [3][29]. Group 1: Evolution of Large Models - The emergence of models like OpenAI's o1 and DeepSeek's R1 has taught the industry about the importance of deterministic, stable, and scalable feedback signals for expanding reinforcement learning in language models [6][7]. - The shift from expanding pre-training scale to expanding post-training scale for reasoning is highlighted as a significant transformation in model development [7]. Group 2: Integration of Thinking and Instruction - The Qwen team envisioned a system that merges "thinking" and "instruction" modes, allowing adjustable reasoning intensity based on user prompts and context [9][10]. - The challenge lies in the fundamentally different data distributions and behavior goals required for these two modes, making it difficult to achieve effective integration [10][11]. - Maintaining separation between "thinking" and "instruction" modes is seen as a more attractive option for practical applications, allowing teams to focus on specific training challenges [11][12]. Group 3: Anthropic's Approach - Anthropic's Claude 3.7 and Claude 4 models emphasize integrated reasoning capabilities and user-controllable "thinking budgets," aiming to enhance practical task performance [14][15]. - The development trajectory of Anthropic reflects a rigorous approach, shaping the thinking process based on specific workloads rather than generating verbose outputs [16]. Group 4: Agentic Thinking - Agentic thinking sets a different optimization goal, focusing on the model's ability to make progress through interaction with the environment rather than just internal reasoning quality [17][18]. - The transition to agentic reinforcement learning requires a more complex infrastructure, integrating various components like tool servers and APIs into the training framework [19][20]. Group 5: Future Directions - The next frontier is expected to be agentic thinking, which may replace static reasoning models by enabling systems to perform searches, simulations, and code execution in a robust manner [23][24]. - Challenges such as "reward hacking" and ensuring effective interaction with external tools will be critical in the development of these systems [25][26]. - The evolution from training models to training entire agent systems is anticipated, emphasizing the importance of environment design and coordination among multiple agents [27][30].
林俊旸离职后首次发声!复盘千问的弯路,指出AI的新路
量子位· 2026-03-26 16:01
Core Insights - The article discusses the transition from "Reasoning Thinking" to "Agentic Thinking" in AI, emphasizing the need for models to adapt and interact with their environments for effective decision-making [2][12][73] - It reflects on the shortcomings of the Qwen team's ambitious goal to merge thinking and instruction modes into a single model, acknowledging that not everything was executed correctly [5][36] Group 1: Transition in AI Thinking - The past two years have redefined how models are evaluated and the expectations placed on them, moving towards a focus on interaction with the environment [15][73] - The emergence of models like OpenAI's o1 and DeepSeek-R1 has demonstrated that reasoning capabilities can be trained and scaled, highlighting the importance of strong, scalable feedback signals [9][23][27] - The industry is now focused on enhancing reasoning time, training stronger rewards, and controlling reasoning intensity [11][21] Group 2: Agentic Thinking - Agentic Thinking is defined as thinking for action, continuously adjusting plans based on environmental interactions [12][54] - The key difference between Agentic Thinking and Reasoning Thinking is summarized as moving from "thinking longer" to "thinking for action" [13][54] - Future competitiveness will rely not only on better models but also on improved environment design, harness engineering, and orchestration among multiple agents [13][71] Group 3: Challenges in Merging Thinking and Instruction - The ideal system should unify thinking and instruction modes, allowing for adjustable reasoning intensity based on context [30][31] - The difficulty lies in the fundamental differences in data distribution and behavioral objectives between the two modes, which can lead to mediocre performance if not carefully managed [36][38] - Many organizations are exploring different approaches, with some advocating for integrated models while others prefer to keep instruction and thinking separate for better focus on each mode's unique challenges [39][40][42] Group 4: Infrastructure and Environment Design - The transition to Agentic Thinking necessitates a shift in infrastructure, as the classic reasoning RL setup is insufficient for interactive tasks [56][61] - The environment becomes a critical component of the training system, requiring a focus on quality, stability, and diversity [61][62] - The next frontier in AI development will involve creating more usable thinking processes that prioritize effective action over lengthy reasoning [62][69] Group 5: Future Directions - The article concludes that the shift from reasoning to agentic thinking changes the definition of "good thinking" to maintaining effective action under real-world constraints [75][76] - Competitive advantages in the agentic era will stem from better environment design, tighter training-reasoning coupling, and effective orchestration of multiple agents [76]
X @Yuyue
Yuyue· 2026-02-10 15:08
过去的熊市里总有个惯性思维就是熊市投一级 / 撸空投,但这一轮看来,有很多的范式转移和模式变化,空投被反撸的可能性也很高最近这个市场环境来说,个人会觉得更倾向于少动多看。第一,看是一定要看的,因为一旦流动性低,就会出现很多低流动性资产的 alpha 机会。第二,动是一定要少的,并且控制成本,偶尔动的话也要考虑效益最大化,在看的基础上操作之前跟大家聊过 base 生态已经有点盼头了,base 上的项目我也会多关注一些,o1 最近被 coinbase 奶得确实挺多的,现在 cb 新产品线官方的 launch partner 都是 o1,如果刷 base 的话可以拿 o1 来刷,一鱼多吃也是一种成本控制了 ...
OpenAI推理第一人创业了:要造“活到老学到老”的AI,先来融它70个亿
3 6 Ke· 2026-01-29 07:16
Core Insights - Jerry Tworek, a key figure in AI model reasoning, has founded a new company named Core Automation, focusing on "continuous learning" in AI models [1][5][7] - The company aims to raise between $500 million to $1 billion to develop a new type of AI model that can learn continuously from new data and experiences [1][8][10] Company Background - Jerry Tworek has a strong theoretical and mathematical background, having completed a master's degree in mathematics and worked in quantitative research before joining OpenAI in 2019 [3][5] - At OpenAI, he played a significant role in developing major models like o1, o3, GPT-4, ChatGPT, and Codex, pushing the boundaries of AI from mere generation to reasoning capabilities [3][5] Industry Context - The current mainstream AI models are primarily trained once and deployed, which limits their ability to adapt to new situations [5][10] - Continuous learning is seen as a solution to reduce costs and improve efficiency, allowing models to learn from real-world experiences rather than relying solely on static data [10][12] - The concept of continuous learning is gaining traction, with other companies and academic institutions, such as Google Research, also exploring this area [15][17] Future Outlook - The industry consensus suggests that achieving Artificial General Intelligence (AGI) will require models to possess continuous learning capabilities, which is a key focus for Tworek's new venture [12][15] - There is a growing belief that 2026 could mark a significant advancement in continuous learning technologies [19]
吴恩达年终总结:2025是AI工业时代的黎明
具身智能之心· 2025-12-31 00:50
Core Insights - 2025 is marked as a pivotal year in the AI industry, characterized by rapid advancements and significant developments in AI technologies and infrastructure [10][14][30] - The competition for AI talent has intensified, with leading companies offering unprecedented salaries to attract top professionals [23][27] - The emergence of reasoning models and programming agents has transformed software development, lowering barriers to entry and enabling more individuals to participate in AI innovation [37][40] Group 1: AI Industry Developments - The year 2025 is described as the dawn of the AI industrial era, with major advancements in AI capabilities and infrastructure [14][30] - AI companies are projected to spend over $300 billion in capital expenditures, primarily on building new data centers to support AI tasks [30][32] - By 2030, the costs associated with building sufficient computing power for AI needs could reach $5.2 trillion, indicating a massive investment trend [30] Group 2: Talent Acquisition and Market Dynamics - AI firms are engaged in a fierce talent war, with salaries reaching levels comparable to professional sports stars, as companies like Meta offer up to hundreds of millions in compensation [23][27] - OpenAI, Meta, and other tech giants are implementing strategies to retain talent, including higher stock compensation and accelerated vesting schedules [27][30] - The influx of capital and talent into the AI sector is contributing to economic growth, with evidence suggesting that the majority of GDP growth in the U.S. in early 2025 is driven by data center and AI investments [30] Group 3: Technological Advancements - The introduction of reasoning models has significantly improved the performance of large language models (LLMs), enhancing their capabilities in various tasks [21][22][24] - Programming agents have become a competitive battleground among AI giants, with advancements allowing them to complete over 80% of programming tasks [31][34] - The development of new benchmarks and evaluation methods for programming agents reflects the evolving landscape of AI capabilities [34]
吴恩达年终总结:2025是AI工业时代的黎明
机器之心· 2025-12-30 06:57
Core Insights - 2025 is marked as a pivotal year in the AI industry, characterized by intense competition among AI giants, a talent war, and significant advancements in AI infrastructure and capabilities [6][10][13]. Group 1: AI Development and Learning - The rapid advancement in AI has created unprecedented opportunities for software development, with a notable shortage of skilled AI engineers [6][22]. - Structured learning is essential for aspiring AI developers to avoid redundant efforts and to understand existing solutions in the industry [7][8]. - Practical experience is crucial; hands-on project work enhances understanding and sparks new ideas in AI development [8][14]. Group 2: AI Infrastructure and Investment - The AI industry has seen capital expenditures surpassing $300 billion in 2025, primarily for building new data centers to handle AI tasks [26]. - Major companies are planning extensive infrastructure projects, with projected costs reaching up to $5.2 trillion by 2030 to meet anticipated demand for AI capabilities [26][31]. - Companies like OpenAI, Meta, Microsoft, and Amazon are investing heavily in data center capacities, with OpenAI planning to build 20 gigawatts of data center capacity globally [31]. Group 3: Talent Acquisition and Market Dynamics - A fierce competition for top AI talent has led to unprecedented salary offers, with some companies offering compensation packages comparable to professional sports stars [22][26]. - Meta's aggressive recruitment strategy has included significant financial incentives to attract talent from competitors, reflecting the high market value of AI professionals [22][27]. - Despite concerns about an AI bubble, investments in AI infrastructure are contributing to economic growth, particularly in the U.S. [29]. Group 4: Advancements in AI Models - The introduction of reasoning models has significantly improved the performance of large language models (LLMs), enhancing their capabilities in various tasks [20][21]. - AI agents are increasingly capable of automating complex coding tasks, with reports indicating that many companies are now relying on AI-generated code for senior-level tasks [33][39]. - The evolution of programming agents has led to a competitive landscape among AI companies, with advancements in code generation capabilities becoming a focal point [30][39].
近两百万人围观的Karpathy年终大语言模型清单,主角是它们
机器之心· 2025-12-21 03:01
Core Insights - 2025 is a pivotal year for the evolution of large language models (LLMs), marked by significant paradigm shifts and advancements in the field [2][36] - The emergence of Reinforcement Learning from Verifiable Rewards (RLVR) is transforming LLM training processes, leading to enhanced capabilities without necessarily increasing model size [10][11] - The industry is witnessing a new layer of LLM applications, exemplified by tools like Cursor, which organize and deploy LLM capabilities in specific verticals [16][17] Group 1: Reinforcement Learning and Model Training - The introduction of RLVR allows models to learn in verifiable environments, enhancing their problem-solving strategies through self-optimization [10] - The majority of capability improvements in 2025 stem from extended RL training rather than increased model size, indicating a new scaling law [11][12] - OpenAI's models, such as o1 and o3, exemplify the practical application of RLVR, showcasing a significant qualitative leap in performance [12] Group 2: Understanding LLM Intelligence - The industry is beginning to grasp the unique nature of LLM intelligence, which differs fundamentally from human intelligence, leading to a jagged distribution of capabilities [14][15] - The concept of "vibe coding" emerges, allowing non-engineers to create complex programs, thus democratizing programming and reshaping software development roles [25][29] - The introduction of tools like Claude Code signifies a shift towards LLM agents that can operate locally, enhancing user interaction and productivity [19][22] Group 3: User Interaction and GUI Development - The development of GUI applications like Google Gemini's "Nano Banana" indicates a trend towards more intuitive and visually engaging interactions with LLMs [31][34] - The integration of text, images, and knowledge within a single model represents a significant advancement in how LLMs can communicate and operate [34] - The industry is at the cusp of a new interaction paradigm, moving beyond traditional web-based AI to more integrated and user-friendly applications [23][30] Group 4: Future Outlook - The potential of LLMs remains largely untapped, with the industry only beginning to explore their capabilities [38][39] - Continuous and rapid advancements are expected, alongside the recognition of the extensive work still required to fully realize the potential of LLM technology [40][41]
The rise of AI reasoning models comes with a big energy tradeoff
Fortune· 2025-12-05 21:56
Core Insights - Leading AI developers are increasingly focused on creating models that mimic human reasoning, but these models are significantly more energy-intensive, raising concerns about their impact on power grids [1][4]. Energy Consumption - AI reasoning models consume, on average, 30 times more power to respond to 1,000 prompts compared to alternatives without reasoning capabilities [2]. - A study evaluated 40 open AI models, revealing significant disparities in energy consumption; for instance, DeepSeek's R1 model used 50 watt hours with reasoning off and 7,626 watt hours with reasoning on [3][6]. - Microsoft's Phi 4 reasoning model consumed 9,462 watt hours with reasoning enabled, compared to 18 watt hours with it disabled [8]. Industry Concerns - The rising energy demands of AI have led to scrutiny, with concerns about the strain on power grids and increased energy costs for consumers; wholesale electricity prices near data centers have surged by up to 267% over the past five years [4]. - Tech companies are expanding data centers to support AI, which may complicate their long-term climate objectives [4]. Model Efficiency - The report emphasizes the need for understanding the evolving energy requirements of AI and suggests that not all queries necessitate the use of the most energy-intensive reasoning models [7]. - Google reported that its Gemini AI service's median text prompt used only 0.24 watt-hours, indicating a lower energy consumption than many public estimates [9]. Industry Leadership Perspectives - Tech leaders, including Microsoft CEO Satya Nadella, have acknowledged the need to address AI's energy consumption, emphasizing the importance of using AI for societal benefits and economic growth [10].