Agent Systems
Jensen Huang's 150-Minute Interview: Agents Are the "iPhone Moment for Tokens" | Jinqiu Select
锦秋集· 2026-03-24 07:24
Core Insights
- The core moat of companies in the AI era is not just the product itself but the ability to integrate technology, ecosystem, organization, and infrastructure into a complete system [1][2].

NVIDIA's Technology and Vision
- Jensen Huang defines AI infrastructure as a problem of "extreme co-design," in which algorithms, chips, networks, power, cooling, software, and organizational structure must be optimized in sync [3].
- Amdahl's Law has become critical in large-scale distributed AI: faster computation does not equate to faster systems, because networking and scheduling can become the bottlenecks [4].
- NVIDIA's competitive focus has shifted from chip-level engineering to rack-level, pod-level, and data-center-level engineering [5].
- Power is a major constraint, but efficiency can be continuously optimized through "tokens per second per watt" [6].
- Over the past decade, computational scale has increased roughly a million-fold, far exceeding traditional Moore's Law projections [7].
- The NVL72 and Vera Rubin architectures integrate "supercomputing" into the supply chain, making the manufacturing system itself a core capability [8].
- A single rack contains about 1.3 million components, and a Rubin pod exceeds 10,000 chips, a level of complexity that has entered the realm of industrial systems engineering [9].

NVIDIA's Frontiers and Trends
- Huang categorizes AI scaling into four types: pre-training scaling, post-training scaling, test-time scaling, and agentic scaling [10].
- In his view, inference is not light computation; it is essentially "thinking," and therefore requires more computational power than many expect [11].
- The next critical phase is not the capability of individual models but the "concurrent replication capability" of agent systems [12].
- The significance of OpenClaw for agent systems is likened to the impact of ChatGPT on generative AI [13].
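The Amdahl's Law point above can be made concrete with a short calculation: overall speedup is capped by the fraction of work that cannot be parallelized, so adding chips eventually stops helping once networking and scheduling dominate. A minimal sketch (the 95% figure below is an illustrative assumption, not from the interview):

```python
def amdahl_speedup(parallel_fraction: float, n_workers: int) -> float:
    """Overall speedup when only `parallel_fraction` of the work
    scales with the number of workers (Amdahl's Law)."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_workers)

# Even with 95% of the work parallelized, 10,000 chips yield only ~20x,
# because the remaining 5% serial portion (networking, scheduling,
# synchronization) dominates the wall-clock time.
print(round(amdahl_speedup(0.95, 10_000), 1))  # → 20.0
print(round(amdahl_speedup(0.95, 100), 1))     # → 16.8
```

Note how close 100 chips and 10,000 chips land: past a point, faster computation genuinely does not mean a faster system.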
- NVIDIA proposes a "2/3 permission constraint" for agent security: access to sensitive data, code execution, and external communication must never all be open simultaneously [14].

Organizational Design for Systems
- Huang manages over 60 direct reports and intentionally builds a "high-density information flow" organization rather than a traditional hierarchy [17].
- He emphasizes that a company's architecture should reflect its environment and desired outputs; organizational design is itself crucial to systemic output [26].

Transition from Accelerators to Computing Platforms
- NVIDIA started as an accelerator company but recognized it needed to evolve into a general computing company to expand its market impact [36][37].
- Introducing CUDA was a strategic decision that significantly broadened the application of NVIDIA's technology, despite initial profit-margin pressure [42][50].

Scaling Laws and Future Challenges
- Huang believes the limits of high-quality data will not block the arrival of intelligent AI; model size will continue to grow on the strength of synthetic data [65].
- Post-training scaling will expand as computational power, rather than data volume, becomes the limiting factor [66].
- Test-time scaling shows that inference is a complex, computation-intensive process, and agent systems will create a feedback loop that strengthens both training and testing [67].

Energy Efficiency and Supply Chain Dynamics
- Power is a significant concern; NVIDIA is focused on raising the number of tokens generated per watt while also seeking to secure more power [82][84].
- Huang actively engages supply-chain partners so they understand NVIDIA's growth dynamics and future needs, fostering trust and collaboration [86][92].
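The "2/3 permission constraint" is easy to express as a policy check: an agent may hold at most two of the three sensitive capabilities at any one time. The sketch below is a hypothetical illustration of that rule (the class and field names are ours, not NVIDIA's):

```python
from dataclasses import dataclass

@dataclass
class AgentPermissions:
    """Hypothetical model of the '2/3 permission constraint':
    sensitive-data access, code execution, and external communication
    must never all be granted at once."""
    sensitive_data: bool   # can read private or sensitive data
    code_execution: bool   # can run arbitrary code
    external_comms: bool   # can reach the outside network

    def is_allowed(self) -> bool:
        granted = sum([self.sensitive_data, self.code_execution, self.external_comms])
        return granted <= 2  # at most two of the three capabilities

print(AgentPermissions(True, True, False).is_allowed())  # → True
print(AgentPermissions(True, True, True).is_allowed())   # → False
```

The intuition: an agent that can read secrets and run code but not phone home, or one that can phone home but not read secrets, has a bounded blast radius; an agent with all three can exfiltrate.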
China's Rapid Technological Advancement
- China has produced many world-class companies and engineering teams thanks to a combination of high-quality education, competitive local markets, and a culture that values open-source collaboration [119][120].
Having Eaten My Fill of Fully Automated Lobster, I Decided to Grab Back the AI Steering Wheel
量子位· 2026-03-13 03:51
Core Viewpoint
- The article discusses MorphMind, billed as the world's first controllable AI platform, which gives users greater control over AI outputs and interactions and addresses common frustrations with traditional AI systems [2][51].

Group 1: Introduction to MorphMind
- MorphMind turns AI from a black box into a controllable work system in which users can intervene at any time [2].
- The platform lets users manage a team of AI experts, ensuring transparency and clarity in the workflow [3][4].

Group 2: User Experience and Functionality
- Users can oversee each step of the AI's process, making real-time adjustments and interventions [10][80].
- The platform takes a structured approach to tasks, breaking them into manageable steps and assigning roles to different AI experts [16][76].

Group 3: Unique Features of MorphMind
- MorphMind emphasizes a collaborative structure in which users continuously engage with the AI, enhancing its understanding and capabilities over time [66][68].
- The system learns from user feedback, so the AI becomes more aligned with the user's preferences and logic [67][70].

Group 4: Comparison with Traditional AI
- Traditional AI systems often operate as a single model generating outputs, leaving users with little transparency or control [92][94].
- MorphMind allows a more interactive, iterative process: users can pause, modify, or roll back steps without starting over [80][82].

Group 5: Future Implications
- The platform aims to redefine the human-AI relationship, promoting a symbiotic interaction in which AI handles repetitive tasks while humans retain control over critical decisions [109][110].
- MorphMind envisions a future in which an individual user manages a team of AI agents, turning the traditional team-based work structure into a more flexible and efficient model [121][124].
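MorphMind's internals are not public, but the "pause, modify, or roll back steps without starting over" behavior described in Group 4 implies a checkpointed step pipeline: each step's result is saved, so discarding later work does not force a rerun of earlier steps. A minimal sketch under that assumption (all names are illustrative):

```python
class ControllablePipeline:
    """Sketch of a user-controllable step pipeline: every completed
    step is checkpointed, so the user can roll back to any step and
    resume from there instead of starting over."""

    def __init__(self, steps):
        self.steps = steps   # list of (name, fn) pairs, run in order
        self.history = []    # checkpoints: (step_name, state_after_step)

    def run(self, state):
        for name, fn in self.steps:
            state = fn(state)
            self.history.append((name, state))
        return state

    def rollback(self, step_name):
        # Discard checkpoints after `step_name`; earlier work is kept.
        while self.history and self.history[-1][0] != step_name:
            self.history.pop()
        return self.history[-1][1] if self.history else None

pipe = ControllablePipeline([
    ("draft",  lambda s: s + " -> draft"),
    ("polish", lambda s: s + " -> polish"),
])
print(pipe.run("plan"))           # → plan -> draft -> polish
print(pipe.rollback("draft"))     # → plan -> draft
```

After the rollback the user could edit the draft state and rerun only the remaining steps, which is the iterative loop the article contrasts with single-shot generation.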
Retracing the Development of ByteDance's Coze Space: Targeting Work Scenarios to Build an Agent System
晚点LatePost· 2025-04-21 09:36
A sample of how a big-company team develops an AI product.

On Friday evening, April 18, ByteDance's agent product "Coze Space (space.coze.cn)" opened its invite-only beta. The team had prepared substantial compute resources, yet within just a few hours the servers were overwhelmed by the flood of incoming users. The user enthusiasm, which exceeded expectations, confirmed once again a judgment the Coze team had made: users have been waiting for AI products that actually work, ones that solve problems in their jobs.

ChatGPT made the chat window the default interface for large-model applications. The reasoning: once AI is smart enough, users seemingly need to learn nothing; there are no buttons or menus to master, and natural-language commands suffice.

In the second half of 2023, ByteDance built the AI application development platform "Coze (扣子)," which lets developers connect their own data to frontier large models and build all kinds of applications without mastering complex technical skills. By mid-2024, the Coze team found that although chatbot applications were appearing by the thousands, from knowledge Q&A to emotional companionship, covering nearly every popular scenario, most faced the same problem: user growth was hard, and retention was even harder.

This is a case of a product form diverging from real user needs. The chat interface is simple and easy to use, but it places extremely high demands on the model, which is why in both China and the US one or two general AI chat apps lead the field by a wide margin. The Coze team found that one category of application on its platform showed markedly better growth and retention: large-model applications embedded in users' workflows, ...