At Making Money, DeepSeek Really Does Come First! Six of the World's Top AI Models Battle in Live Trading, Each Starting With $10,000

Core Insights
- The article covers Alpha Arena, a new experiment launched by nof1.ai in which top AI models compete in a live trading market to determine which one trades best [2][51]
- The competitors are OpenAI's GPT-5, Google's Gemini 2.5 Pro, Anthropic's Claude Sonnet 4.5, xAI's Grok 4, Alibaba's Qwen3 Max, and DeepSeek V3.1 Chat [3][51]
- Each model starts with $10,000 in capital and receives identical market data and trading instructions, so all six are evaluated on a level playing field [5][51]

Performance Summary
- DeepSeek V3.1 Chat was the top performer, with an account value of $13,677 and a return of +36.77% [8]
- Grok 4 followed, with an account value of $13,168 and a return of +31.68% [8]
- Claude Sonnet 4.5 ranked third, with an account value of $11,861 and a return of +18.61% [8]
- Qwen3 Max had an account value of $10,749, a return of +7.49% [8]
- GPT-5 and Gemini 2.5 Pro lost money, with account values of $7,491 and $6,787 and returns of -25.09% and -32.13% respectively [8]

Trading Dynamics
- The models' trading behavior varied significantly; DeepSeek and Grok followed similar paths, with initial losses followed by substantial gains [28]
- GPT-5 and Gemini 2.5 Pro moved the opposite way, gaining at first and then declining [34][35]
- The competition highlights the volatility of financial markets and the difficulty AI models have in adapting to real-time data and changing market conditions [46][51]

Market Environment
- The article argues that financial markets are the ultimate testing ground for AI because they are dynamic and unpredictable, unlike traditional static benchmarks [46][48]
- Alpha Arena aims to assess how well AI models interpret market fluctuations, manage risk, and learn from mistakes, effectively turning trading into a new form of Turing test [51][53]
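
The return figures in the performance summary follow directly from each account value and the $10,000 starting capital. The short Python sketch below reproduces that arithmetic and the resulting ranking. The account values are the ones reported in the article; the names `INITIAL_CAPITAL`, `account_values`, `percent_return`, and `leaderboard` are illustrative choices made here and are not part of Alpha Arena or any nof1.ai interface.

```python
INITIAL_CAPITAL = 10_000  # starting capital per model in USD, as stated in the article

# Account values in USD, copied from the performance summary.
account_values = {
    "DeepSeek V3.1 Chat": 13_677,
    "Grok 4": 13_168,
    "Claude Sonnet 4.5": 11_861,
    "Qwen3 Max": 10_749,
    "GPT-5": 7_491,
    "Gemini 2.5 Pro": 6_787,
}


def percent_return(account_value, initial=INITIAL_CAPITAL):
    """Percentage gain or loss relative to the starting capital."""
    return (account_value - initial) / initial * 100


# Rank the models from highest to lowest account value.
leaderboard = sorted(account_values.items(), key=lambda item: item[1], reverse=True)

for name, value in leaderboard:
    print(f"{name:<20} ${value:>7,} {percent_return(value):+.2f}%")
```

Because every model starts from the same capital, ranking by account value and ranking by percentage return give the same order.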