GPT-4.5 Loses the Top Spot Just 6 Hours After Taking It! Grok-3 Stages a 1-Point Comeback
量子位 (QbitAI) · 2025-03-04 04:51

Core Viewpoint
- Competition among foundational AI models is intensifying: GPT-4.5 took the top spot but was overtaken by Elon Musk's Grok-3 within hours [1][2][5].

Group 1: Model Performance
- Both GPT-4.5 and Grok-3 received over 3,000 votes. Grok-3 scored 1412 and GPT-4.5 scored 1411, a one-point gap [2].
- Grok-3 currently leads in overall score and trails GPT-4.5 only slightly in specific categories such as style control and difficult prompts [3][5].
- DeepSeek-R1 ranks sixth overall, tying with GPT-4.5 in math and in difficult prompts with style control [4].

Group 2: User Perception and Reception
- Initial reactions to GPT-4.5 were mixed. Critics pointed to its high cost and to its claimed emotional intelligence, which performance metrics did not seem to support [8][9].
- User sentiment has since shifted positively, with more users praising GPT-4.5's emotional intelligence shortly after its release [7][9].
- Notably, one user expressed a desire for GPT-4.5 to remain available, highlighting its impact and the community's interest [11].

Group 3: AI Model Capabilities
- GPT-4.5 has performed strongly in various assessments, achieving 71.4% on GPQA (science) and 36.7% on AIME '24 (math), and outperforming its predecessor GPT-4o in several areas [9].
- The model's capabilities extend beyond traditional tasks: it excelled in a unique competition format involving strategy and social interaction, outperforming human participants [15][16].

Group 4: Future Implications
- The discourse surrounding AI models like GPT-4.5 suggests that AI could reshape human thought, creativity, and communication, although the full implications remain uncertain [13][14].
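
To put the one-point gap between 1412 and 1411 in perspective, the sketch below converts a rating difference into an expected head-to-head win rate. It assumes the scores are Elo-style ratings on the conventional 400-point logistic scale. The article does not state the leaderboard's scoring formula, so the function name `expected_win_probability` and the `scale` parameter are illustrative assumptions, not the leaderboard's actual method.

```python
def expected_win_probability(rating_a: float, rating_b: float,
                             scale: float = 400.0) -> float:
    """Expected win rate of A over B under a standard Elo-style logistic model.

    Assumption: the 400-point scale is the usual Elo convention; the
    article does not say which formula the leaderboard uses.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


# Scores reported in the article.
grok3_score = 1412
gpt45_score = 1411

p = expected_win_probability(grok3_score, gpt45_score)
print(f"Expected Grok-3 win rate vs. GPT-4.5: {p:.4f}")  # 0.5014
```

Under this assumption, a one-point lead corresponds to an expected win rate of roughly 50.1%, which is consistent with the article's description of the two models as very closely matched.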