$10,000 Investment Showdown: Alibaba's Qwen Goes All-In and Takes the Top Spot, While GPT-5 Becomes the "Contrarian Indicator King"
36Kr · 2025-10-23 12:09

Core Insights
- The "Alpha Arena" competition, launched by nof1.ai, tests the real-world trading capabilities of six leading AI models. The stated goal is to maximize risk-adjusted returns rather than simply to chase the highest profit [1][9]
- As of October 23, 2025, the models' results have diverged sharply, with Alibaba's Qwen in the lead and OpenAI's GPT-5 at the bottom of the rankings [1][9]

Group 1: AI Model Performances
- Qwen3-Max (Alibaba): Total account value of $11,252.34 (+12.52%). Characterized as a decisive trend-catcher that focuses on mainstream assets and trades at a moderate frequency [4]
- DeepSeek V3.1 Chat: Total account value of $10,868.84 (+8.69%). Known for a patient long-term holding strategy with minimal trading activity [5]
- Grok 4 (xAI): Total account value of $8,427.12 (-15.73%). Described as a follower that failed to capitalize on market changes [6]

Group 2: Additional AI Model Insights
- Claude 4.5 Sonnet (Anthropic): Total account value of $8,119.46 (-18.81%). Characterized as a luck-based trader whose few significant wins were outweighed by losses [7]
- Gemini 2.5 Pro (Google): Total account value of $4,444.67 (-55.55%). Identified as a high-frequency trader that placed a large number of trades but ended with heavy losses [8]
- GPT-5 (OpenAI): Total account value of $3,119.38 (-68.81%). Noted for gambler-like behavior that led to substantial losses and the lowest win rate, 4.5% [9]

Group 3: Key Takeaways from the Competition
- The two Chinese-developed models (Qwen and DeepSeek) were the only ones to hold positive returns, which the article reads as a clear advantage in financial applications [9]
- High-frequency trading does not guarantee high returns: Gemini 2.5 Pro's result shows how costly large directional errors can be [9]
- The competition illustrates the models' differing investment styles and underscores how much underlying strategy and risk preference determine performance [9]
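
The percentage figures above follow directly from the account values and the $10,000 starting stake given in the headline. The short Python sketch below recomputes them and reproduces the ranking. The function and variable names are illustrative only and are not part of the Alpha Arena platform; the sketch computes simple returns, not the risk-adjusted metric the competition refers to, since the summary gives no volatility data.

```python
# Starting stake per model, taken from the article headline ($10,000).
INITIAL_CAPITAL = 10_000.00

# Account values as reported in the summary (USD).
account_values = {
    "Qwen3-Max": 11_252.34,
    "DeepSeek V3.1 Chat": 10_868.84,
    "Grok 4": 8_427.12,
    "Claude 4.5 Sonnet": 8_119.46,
    "Gemini 2.5 Pro": 4_444.67,
    "GPT-5": 3_119.38,
}


def pct_return(value: float, initial: float = INITIAL_CAPITAL) -> float:
    """Simple (not risk-adjusted) return, in percent."""
    return (value - initial) / initial * 100


# Rank models from highest to lowest account value.
ranking = sorted(account_values.items(), key=lambda kv: kv[1], reverse=True)

for name, value in ranking:
    print(f"{name:<20} ${value:>10,.2f}  {pct_return(value):+.2f}%")
```

Running it prints each model with its return rounded to two decimals, which matches the figures quoted in the summary (for example, +12.52% for Qwen3-Max and -68.81% for GPT-5).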