Musk's Grok-4 Crushes GPT-5 at Selling Goods for Revenue as AI Sales Leaderboard Is Revealed: Is the Endpoint of AGI Selling Potato Chips?
36Kr·2025-08-22 10:11

Core Insights
- The article discusses how AI models performed in an unusual competition called "Vending Bench," in which each model manages a virtual vending machine. Grok 4 outperformed GPT-5, selling nearly twice as many units and finishing with about 31% higher net worth [1][36].

Group 1: Performance Metrics
- Grok 4 finished with a net worth of $4,694.15, with 4,569 units sold and sales continuing for 324 days (99.5% of its run) [2][5].
- GPT-5 finished with a net worth of $3,578.90, with 2,471 units sold and sales continuing for 363 days (100% of its run) [2][5].
- Claude Opus 4, another competitor, finished with a net worth of $2,077.41, with 1,412 units sold and sales continuing for 132 days (also 99.5% of its run) [2][5].

Group 2: Competitive Landscape
- Grok 4's sales performance is significantly higher than that of its competitors, including GPT-5, which finished roughly $1,100 behind [1][36].
- The Claude series models show varied performance: Opus 4 performs well, while the Sonnet series models lag behind [4][38].
- The competition highlights the potential of AI models to manage long-term business tasks effectively, with Grok 4 demonstrating the strongest sales capability [1][4].

Group 3: Implications for AGI
- Vending Bench serves as a benchmark for evaluating an AI's ability to perform complex, long-term tasks, suggesting one pathway toward AGI [14][20].
- The results indicate that while some models perform well in short-term scenarios, their long-term reliability and decision-making remain a challenge [30][31].
- Elon Musk expressed optimism that Grok 5 could exhibit characteristics of AGI, which would mark a significant advance in AI capabilities [33][36].
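
The headline comparisons between Grok 4 and GPT-5 can be checked directly against the reported figures. The short Python sketch below recomputes the net worth gap, the percentage lead, and the units-sold ratio. The numbers are copied from the summary above; the dictionary layout and the `compare` helper are illustrative names, not part of the benchmark itself.

```python
# Figures copied from the summary above; structure and names are illustrative.
results = {
    "Grok 4": {"net_worth": 4694.15, "units_sold": 4569},
    "GPT-5": {"net_worth": 3578.90, "units_sold": 2471},
    "Claude Opus 4": {"net_worth": 2077.41, "units_sold": 1412},
}


def compare(a, b):
    """Return (net worth gap, net worth ratio, units-sold ratio) of model a over model b."""
    gap = results[a]["net_worth"] - results[b]["net_worth"]
    worth_ratio = results[a]["net_worth"] / results[b]["net_worth"]
    units_ratio = results[a]["units_sold"] / results[b]["units_sold"]
    return gap, worth_ratio, units_ratio


gap, worth_ratio, units_ratio = compare("Grok 4", "GPT-5")
print(f"Net worth gap: {gap:.2f}")                        # about 1,100
print(f"Net worth lead: {(worth_ratio - 1) * 100:.1f}%")  # about 31%
print(f"Units sold ratio: {units_ratio:.2f}x")            # a bit under 2x
```

On these figures, the "about 31%" lead corresponds to net worth, while "nearly double" corresponds to units sold.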