Has AGI Gone Cold? Andrew Ng, Stanford, and Google Cloud Are Rarely This Aligned: Agents Are Upending the Logic of AI Evaluation

Core Insights
- As of early 2026, the AI industry is shifting its focus from "can it be done" to "under what conditions, at what cost, and for whom does it create value" [2][4][6]
- Reports from Stanford HAI and other institutions indicate that 2026 will mark a transition from evangelism to evaluation in AI [2][7]

Group 1: Investment and ROI
- Many companies have completed their first round of generative AI deployment and are beginning to assess their investments and returns [4][5]
- Google Cloud's report "The ROI of AI 2025" surveyed 3,466 executives at companies with revenues over $10 million. It found that sustainable returns come from system-level implementation of "Agent + Process + Organization" rather than from isolated generative AI capabilities [6][29]
- Approximately 88% of early adopters of Agentic AI have seen positive returns in at least one generative AI scenario, and that success is linked to clear C-level strategy and organizational alignment [30][31]

Group 2: Evaluation Standards and AGI
- The traditional Scaling Law, which posits that larger models and more data lead to better performance, is becoming inadequate as AI enters high-risk fields such as law and medicine [9][10]
- There is a growing consensus that AI evaluation must evolve to account for the complexity of real-world applications, moving beyond pure capability assessments [10][21]
- Andrew Ng's proposed Turing-AGI test aims to redefine the standard for evaluating AI, focusing on its ability to perform tasks in unpredictable environments rather than just to solve predefined problems [14][19]

Group 3: Agentic AI and System Integration
- The focus in AI has shifted from making models stronger to integrating them effectively into operational systems [31][32]
- Google Cloud's report emphasizes that successful AI implementations are characterized by clear processes and by Agents deployed in production environments, with over 52% of companies using Agents [33][34]
- The report categorizes Agents into three levels. Level 2 Agents can understand goals and complete tasks within a defined process, while Level 3 involves collaborative workflows among multiple Agents [37][40]

Group 4: Future Directions and Challenges
- The future of AI will depend less on increasing the number of Agents than on managing them effectively, so that collaboration is stable and accountability is clear [40][41]
- The concept of "Skill" is emerging as a critical component: each task is broken down into manageable, verifiable units that can be monitored and reused [43][44]
- There are warnings of a potential bubble in AI investment, along with calls for more empirical research to clarify what AI can and cannot do [27][28]