海外独角兽
When Humans Can't Read AI-Written Code, How Does Traversal Become the AI Doctor for Enterprise Operations?
海外独角兽· 2026-02-11 12:06
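Author: Haozhen · Editor: Cage
Code operations have long been a pain point for developers, and the rapid progress of AI coding has made them harder: code pushes contributed by Claude Code already account for 4% of public GitHub, yet system logic written by AI contains problems that humans find hard to catch, a phenomenon developers call the "Claude Hole". Traditional observability tools, typified by Datadog, can display metrics but struggle to explain root causes or guide fixes, so engineers still rely on experience for costly troubleshooting. The result is a clear and steadily growing industry pain point.
Traversal is a startup founded by MIT and Berkeley professors and quantitative traders. This rare background kept the team out of the traditional log-analysis path and led it to approach SRE problems from first principles. The company builds an autonomous decision-making SRE Agent on top of causal inference and, through simulation and code-level scanning, maps fault localization directly to specific changes and automates the handling. This capability has already shown significant results in the real production environments of top-tier customers.
• Clear industry pain point: although tools such as Datadog dominate data visualization, they only show metric fluctuations and cannot explain the causality behind them, so engineers facing a dashboard still have to guess manually. In particular, as AI coding drives code complexity up exponentially, humans often find it hard to ...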
In-Depth Discussion of OpenClaw: High-Value Agents Unlock 10x Token Consumption as Anthropic Begins Its Path to Surpassing Microsoft
海外独角兽· 2026-02-05 12:18
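Discussion topic: OpenClaw · Participants: 拾象 Best Ideas community
... effect. From Cowork and Clawdbot (OpenClaw) to Claude in Excel, these products are no longer just "smarter assistants"; they are starting to take over complex tasks directly and embed themselves in core workflows, putting real pressure on existing SaaS forms and on the division of labor between humans and machines.
This article is a summary transcript of a Best Ideas in-depth discussion we organized last Sunday. Rather than focusing only on the experience of individual products, we took a more fundamental view and debated key questions such as the value boundary of Agents, Infra opportunities, the 2026 token explosion, and changes in business models:
1. OpenClaw's cleverest move is shipping with Claude Skills preinstalled;
2. Excel amplifies and extends productivity; Cowork and Claude Code in Excel will open up a "10x Microsoft" market;
3. High-value Agents must be ...
7. Three projected paths by which Agents can truly generalize.
We hope this interim summary from the Best Ideas community gives AI practitioners a thinking framework that is closer to real-world constraints and helps them understand the deep changes quietly taking place in the Agent era.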
How To Play AI Beta: 拾象's 2026 AGI Investment Thinking, Open-Sourced
海外独角兽· 2026-02-02 01:14
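Authors: Guangmi, Penny, Cage, Haina, Feihong, Siqi, Nathan
The pace of change and the evolution of the competitive landscape in AI are always faster than the market imagines; market consensus and narratives flip almost every month.
This report is a systematic review of these changes by the 拾象 team. It recalibrates our judgment of the current state of AI competition and breaks down some of the core technology and product trends that could become the main themes of 2026.
We are open-sourcing this report in the hope of discussing together which changes are structural opportunities and which are merely temporary noise:
1. Google has returned to the top of the narrative, but AI is not a zero-sum game, and OpenAI and Anthropic still have strong odds of winning;
2. Continual learning has become the new-paradigm consensus that nearly all AI labs are betting on, and new signals will appear in 2026;
3. The AGI race closely resembles autonomous driving: going from L3 to full L4 is extremely hard, but in vertical domains such as knowledge work, partial L3/L4 has already delivered substantial efficiency gains and economic value;
4. The "NVIDIA + OpenAI" thread may be undervalued by the market in the short term; continuing to bet on OpenAI today is a bet on the "something never seen" of the AI era;
5. ...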
Nine Key Questions on OpenAI: A Narrative Reversal After the 2026 AI Battle Escalates
海外独角兽· 2026-01-30 10:53
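Author: Penny
Pessimists argue that OpenAI's moat is no longer visible: models have no barriers, ChatGPT has no network effects, its traffic and compute cannot match Google's, and it trails Anthropic on high-value tasks.
Objectively, these points have merit. Only one month into 2026, the model landscape has not stabilized; it has grown more intense. This is also the first time since launching ChatGPT that OpenAI has had to play against the wind.
We nonetheless remain optimistic about OpenAI and believe 2026 can bring a narrative reversal. Below are our 9 key judgments.
Insight 01: How much has Gemini affected OpenAI?
Gemini's impact on OpenAI comes mainly from three directions: narrative, models, and traffic.
The narrative impact is the largest. Google's return as king knocked OpenAI off the SOTA position and made the public realize that OpenAI has not shipped a substantially improved model since 4o. The problem is not scaling itself but OpenAI's scaling.
The narrative shows up directly in stock prices: since the Gemini 3 release, Google has risen 20%, while SoftBank (the public-market proxy for OpenAI) has fallen 17%.
Google's return as king, Anth ...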
With 270,000 Hours of Real-Robot Data, Generalist May Be the Top Robotics Team Closest to a "GPT-1 Moment"
海外独角兽· 2026-01-29 12:06
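Author: Haozhen · Editor: Penny
Robotics is a sector we follow over the long term, and Generalist is one of the very few companies in robotics today with long-term competitive potential. Its core advantages lie in data scale, team capability, and a clear scaling path.
1. High-quality real-robot data is widely recognized as the core scarce resource in robotics. With 270,000 hours of training data, Generalist may be the first robotics team in the world to reach GPT-1 scale in data, with a 6-12 month lead over other teams. More notably, in November last year Generalist claimed the first validation in robotics of a scaling law similar to that of language models.
2. Core team members come from OpenAI, Boston Dynamics, Google DeepMind, and other institutions, and were main contributors to embodied-intelligence milestones such as PaLM-E and RT-2, which gives the team very strong technical capability.
3. Through a series of demos, the team has shown a clear research path and outstanding model dexterity.
We believe that although robot data remains very scarce, if model performance can keep improving through a mix of human video and real-robot data, the competitive focus may shift from data scale to data mix. The team that first gets the optimal data mix working and engineers it may not only take the lead in performance ...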
Sequoia in Conversation with LangChain's Founder: In 2026 AI Leaves the Chat Box and Enters Year One of Long-Horizon Agents
海外独角兽· 2026-01-27 12:33
Core Insights
- The article asserts that AGI represents the ability to "figure things out," marking a shift from the era of "Talkers" to "Doers" in AI by 2026, driven by Long Horizon Agents [2]
- Long Horizon Agents are characterized by their ability to autonomously plan, operate over extended periods, and exhibit expert-level features across complex tasks, expanding from coding to various domains [3][4]
- The emergence of these agents is seen as a significant turning point, with the potential to revolutionize how complex tasks are approached and executed [3][21]

Long Horizon Agents' Explosion
- Long Horizon Agents are finally beginning to work effectively, with the core idea being to allow LLMs to operate in a loop and make autonomous decisions [4]
- The ideal interaction with agents combines asynchronous management and synchronous collaboration, enhancing their utility in various applications [3][4]
- The coding domain has seen the most rapid adoption of these agents, with examples like AutoGPT demonstrating their capabilities in executing complex multi-step tasks [4][5]

Transition from General Framework to Harness Architecture
- The distinction between models, frameworks, and harnesses is crucial, with harnesses being more opinionated and designed for specific tasks, while frameworks are more abstract [8][9]
- The evolution of harness engineering is particularly advanced in coding companies, which have successfully integrated these concepts into their products [12][14]
- The integration of file system permissions into agents is essential for effective context management and task execution [24]

Future Interactions and Production Forms
- Memory is identified as a critical component for self-improvement in agents, allowing them to retain and utilize past interactions to enhance performance [35]
- The future of agent interaction is expected to blend asynchronous and synchronous modes, facilitating better user engagement and task management [36]
- The necessity for agents to access file systems is emphasized, as it significantly enhances their operational capabilities [39]
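
The summary describes the core idea as letting an LLM run in a loop, decide its own next step, use a file system for context, and keep memory of past steps. A minimal sketch of that loop shape follows. Everything in it is an illustrative assumption rather than anything from the interview: `Action`, `AgentState`, `run_agent`, and `scripted_policy` are hypothetical names, the file system is an in-memory dict, and the policy is a hard-coded stand-in for a real model call.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Action:
    tool: str   # one of "write_file", "read_file", "finish" (hypothetical tool set)
    args: dict


@dataclass
class AgentState:
    goal: str
    files: dict = field(default_factory=dict)    # in-memory stand-in for a file system
    memory: list = field(default_factory=list)   # log of past steps and observations


def run_agent(goal: str, policy: Callable[[AgentState], Action], max_steps: int = 10) -> AgentState:
    """Repeatedly ask the policy for the next action until it finishes or steps run out."""
    state = AgentState(goal=goal)
    for _ in range(max_steps):
        action = policy(state)  # in a real agent this would be an LLM call
        if action.tool == "finish":
            state.memory.append("finish")
            break
        if action.tool == "write_file":
            state.files[action.args["path"]] = action.args["content"]
            observation = f"wrote {action.args['path']}"
        elif action.tool == "read_file":
            observation = state.files.get(action.args["path"], "<missing>")
        else:
            observation = f"unknown tool {action.tool}"
        # Feed the result back so the next decision can depend on it.
        state.memory.append(f"{action.tool}: {observation}")
    return state


def scripted_policy(state: AgentState) -> Action:
    """Hypothetical stand-in for a model: write a plan, read it back, then stop."""
    if "plan.md" not in state.files:
        return Action("write_file", {"path": "plan.md", "content": "1. draft\n2. review"})
    if not any(step.startswith("read_file") for step in state.memory):
        return Action("read_file", {"path": "plan.md"})
    return Action("finish", {})


final = run_agent("write a plan", scripted_policy)
print(final.memory)
```

The `max_steps` cap is a design choice of this sketch: a loop that decides for itself when to stop needs an external bound so that a policy which never returns `finish` cannot run forever.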
Excel Is the Coding Moment of 2026
海外独角兽· 2026-01-26 12:46
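Author: Freda Duan (Partner @ Altimeter) · Translated by: Haozhen
The Excel capability recently launched by Claude Code is very impressive, and we believe Excel may become the next high-value domain, after Coding, to reach its "aha moment" and take off quickly.
This article is an in-depth reading of Coding and Excel, two AI verticals, by Altimeter partner Freda Duan. The original was published on her Substack, Robonomics.
In short, just as Coding rapidly rose to become one of the strongest AI applications thanks to its huge market size, its ability to extend naturally into adjacent scenarios, and its product-led GTM model, Excel meets the same conditions:
• Global spreadsheet MAU is about 1.5-1.6 billion;
• Excel can extend into finance, operations, analytics, and other scenarios; from one angle, most software can be viewed as "Excel wrappers" layered on top of Excel;
• Excel can achieve rapid adoption through user self-service (self-serve adoption).
Coding has already proven the explosive potential of this path, and Excel is likely the next, larger stop.
In ...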
When Top Video Models Have a Half-Life of Just 30 Days, Why Did fal.ai's Revenue Grow 60x in a Year?
海外独角兽· 2026-01-16 08:05
Core Insights
- The article discusses the rapid rise of fal.ai as a generative media infrastructure company, providing a unified, low-latency API and cloud inference platform for high-performance access to multimodal generative models, including images, videos, and audio [2][4].
- fal.ai experienced explosive growth in 2025, with a revenue increase of 60 times over the past 12 months and a valuation tripling to $4.5 billion following a $140 million Series D funding round [2][5].

Group 1: Company Overview
- fal.ai focuses on high-performance AI generative media platforms, enabling quick inference and deployment of various AI models through its API and cloud acceleration engine [4].
- The company completed a $140 million Series D funding round in December 2025, led by Sequoia Capital, with participation from other notable investors, raising its valuation to $4.5 billion [5].

Group 2: Market Positioning
- fal.ai strategically chose to invest in generative video early on, recognizing the rapid growth in customer demand despite the market being perceived as niche at the time [6][8].
- The company believes that the market for generative video should be as large, if not larger, than that for large language models (LLMs), as video accounts for over 80% of internet bandwidth [8].

Group 3: Technical Advantages
- fal.ai's team identified that video generation models face unique computational challenges, requiring significantly more processing power compared to LLMs and image generation [12][13].
- The company has developed a specialized tracing compiler to optimize performance across various video model architectures, allowing for efficient execution on heterogeneous hardware [15].

Group 4: Cost Management
- fal.ai manages a distributed computing infrastructure across approximately 35 data centers, allowing for efficient resource allocation and cost management [17][18].
- The company strategically avoids traditional hyperscalers, opting instead to leverage emerging cloud providers (Neo-clouds) for more competitive pricing, which can be up to 2-3 times lower than hyperscalers [20][23].

Group 5: Ecosystem Development
- fal.ai serves as a single hub connecting multiple model suppliers, allowing developers to utilize a wide range of models without being tied to a single provider [24][26].
- The platform supports over 600 generative media models, enabling developers to adapt quickly to the rapidly changing landscape of model performance and capabilities [24][26].

Group 6: User Engagement and Use Cases
- Developers on fal.ai's platform typically use an average of 14 different models simultaneously, reflecting a modular approach to media production that allows for greater control and flexibility [32].
- The company highlights innovative use cases in education and gaming, such as personalized training videos and the potential for text-to-game applications, showcasing the versatility of generative media [35][37].

Group 7: Future Predictions
- fal.ai predicts that within a year, fully AI-generated short films will emerge, with animation styles likely to see faster adoption than photorealistic styles due to lower production costs [41][42].
- The company emphasizes that the generative media industry will face a scenario where computational resources will be exhausted before data, indicating a unique growth trajectory compared to other sectors [41].
TPU vs GPU, a Full Technical Comparison: Who Has the Optimal Solution for AI Compute?
海外独角兽· 2026-01-15 12:06
Core Insights
- The article emphasizes that the Total Cost of Ownership (TCO) is highly dependent on the specific use case, suggesting that TPU is preferable for training and latency-insensitive inference, while GPU is better for prefill and latency-sensitive inference scenarios [3][4][5]
- The fundamental difference between the 3D Torus and Switch Fabric (NVSwitch/Fat-tree) interconnect systems lies not in speed but in their assumptions about traffic patterns [4][5]
- Google's historical TCO advantage established through TPU has been significantly weakened in the v8 generation [6]

TCO Analysis
- TPU v7 offers a cost advantage of 45-56% in training scenarios, based on the assumption that TPU's Model FLOPs Utilization (MFU) is 5-10 percentage points higher than that of GPUs [4][16]
- In inference scenarios, GPUs (GB200/GB300) outperform TPU v7 by approximately 35-50% during the prefill phase due to their FP4 computational advantage [4][18]
- The TCO comparison shows that TPU v8's cost efficiency has decreased, with the TCO ratio dropping from 1.52x for GB200/TPUv7 to 1.23x for VR200/TPUv8p [6]

Interconnect Architecture
- The 3D Torus architecture assumes predictable and orchestrated communication patterns, maintaining high MFU in large-scale training tasks, while Switch Fabric accommodates uncertain traffic patterns [5][38]
- TPU Pods utilize a 3D Torus topology for high bandwidth and low latency communication, with a maximum cluster size limited by the number of OCS ports [31][34]

Performance Bottlenecks
- In training, the bottleneck typically arises from computational power and scale-out communication bandwidth, while in inference, the prefill phase is limited by computational power and the decode phase is constrained by memory bandwidth [12][22]
- The performance requirements differ across training and inference scenarios, with TPU needing FP8 and scale-out bandwidth for training, while GPU requires FP4 and scale-up bandwidth for inference [12][13]

Software Optimization
- TPU's software optimizations aim to mitigate its inherent weaknesses in handling irregular traffic, transforming unpredictable workloads into stable data flows [46][47]
- The introduction of SparseCore in TPU is designed to enhance its capability to handle dynamic all-to-all routing, acknowledging the need for communication-computation decoupling similar to NVSwitch [48]

Competitive Landscape
- Google TPU v8 adopts a dual-supplier strategy to reduce costs, collaborating with Broadcom and MediaTek for different SKUs, which impacts the overall design and production timeline [49][50]
- Nvidia's Rubin architecture aggressively enhances performance and TCO for inference, with significant improvements in FP4 computational power and HBM bandwidth, positioning it as a strong competitor against TPU [51][52]
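
The claim that prefill is limited by compute while decode is limited by memory bandwidth can be illustrated with a rough roofline-style estimate. The sketch below is a simplification under stated assumptions, not the article's own model: it uses the common rule of thumb of about 2 FLOPs per parameter per token for a dense forward pass, counts only one read of the weights per step, ignores KV cache and batching, and the hardware figures (`PEAK_FLOPS`, `MEM_BANDWIDTH`) and model size are hypothetical round numbers that do not describe any real TPU or GPU.

```python
def bottleneck(flops: float, bytes_moved: float, peak_flops: float, mem_bandwidth: float) -> str:
    """Return which resource takes longer for one step: 'compute' or 'memory'."""
    compute_time = flops / peak_flops          # seconds if only compute mattered
    memory_time = bytes_moved / mem_bandwidth  # seconds if only memory traffic mattered
    return "compute" if compute_time >= memory_time else "memory"


# Hypothetical accelerator and model, chosen only to make the arithmetic easy to follow.
PEAK_FLOPS = 1e15        # assumed 1 PFLOP/s of usable compute
MEM_BANDWIDTH = 4e12     # assumed 4 TB/s of memory bandwidth
PARAMS = 70e9            # assumed 70B-parameter dense model
BYTES_PER_PARAM = 2      # assumed 16-bit weights

weight_bytes = PARAMS * BYTES_PER_PARAM

# Prefill: a 2048-token prompt is processed in one pass over the weights.
prefill_tokens = 2048
prefill = bottleneck(2 * PARAMS * prefill_tokens, weight_bytes, PEAK_FLOPS, MEM_BANDWIDTH)

# Decode: each new token still requires reading all the weights once.
decode = bottleneck(2 * PARAMS * 1, weight_bytes, PEAK_FLOPS, MEM_BANDWIDTH)

print(prefill, decode)
```

Under these assumptions prefill performs 2048 FLOPs per byte of weights read while decode performs about 1, and the hypothetical chip breaks even at 250 FLOPs per byte, which is why the two phases land on opposite sides of the roofline.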
When AI Takes Over the Wallet: How Does Agentic Commerce Restructure the Internet Economy?
海外独角兽· 2026-01-14 04:05
Core Insights
- Agentic Commerce represents a significant shift in the way commerce operates, potentially transforming the landscape of internet advertising, e-commerce, and payment infrastructure if successfully implemented [2][5]
- The article explores two main questions: 1) Can Agentic Commerce be commercially viable? 2) If successful, how will it reshape the distribution of benefits across the internet ecosystem? [5]

Commercial Viability
- The article reviews past failures of Meta and Google in e-commerce, contrasting their approaches with those of OpenAI and Perplexity, to identify which third-party models (3P) are most likely to succeed in the future [5][24]
- The potential total addressable market (TAM) for three consumer behavior categories (Impulse Buys, Routine Essentials, and Life Purchases) is estimated to be $3 trillion, with Lifestyle and Functional Purchases being the most promising areas for Agentic Commerce [8][9]

E-commerce Spectrum
- E-commerce is described as a continuous spectrum, with Amazon and Shopify at opposite ends, defined by who acts as the Merchant of Record (MoR) [10][11]
- The distinction between "Platform is the MoR" (e.g., Amazon) and "Merchant is the MoR" (e.g., Shopify) affects the business scale, merchant control over customer data, and the potential for disruption in payment systems [12][13]

Agentic Commerce Paths
- Perplexity and ChatGPT represent two different approaches to Agentic Commerce, with Perplexity acting as the MoR and ChatGPT allowing merchants to retain that role [14][19]
- OpenAI's Agentic Commerce Protocol (ACP) decouples the front-end checkout experience from back-end payment processing, allowing merchants to maintain their existing payment service providers while integrating with ACP [15][18]

Historical Context
- Google and Meta's reluctance to become MoR contributed to their struggles in e-commerce, as they prioritized advertising revenue over the complexities of managing e-commerce transactions [24][26]
- The article suggests that if Google or Meta had developed a protocol similar to ACP, their e-commerce trajectories might have been different [26]

Impact on Advertising and Payment
- The article discusses how Agentic Commerce could redefine the relationship between advertising costs and commission rates, likening both to a form of "digital tax" [32][33]
- Shopify is positioned as a structural winner in the Agentic Commerce context, benefiting from its lack of MoR responsibilities and the potential for increased market penetration among small and medium-sized businesses (SMBs) [38][39]

Future Considerations
- The article envisions a future where a Universal Catalog could be developed to facilitate AI-driven shopping experiences, requiring rich and structured metadata to support precise consumer needs [44]