Core Insights
- OpenAI's release of the GPT-5.2 series marks a significant advance in AI capabilities, shifting the focus from technical demonstration to value creation in professional fields [1][2]
- The models perform at human-expert level on abstract reasoning and complex knowledge tasks, indicating that AI can generate economic value in high-end professional domains [1][2]

Summary by Sections

Model Performance
- GPT-5.2 scored 52.9% on the ARC-AGI-2 test, up from 17.6% for GPT-5.1, a roughly threefold improvement in abstract reasoning [2]
- On the GDPval benchmark, GPT-5.2 Thinking outperformed or matched industry experts on 70.9% of tasks and GPT-5.2 Pro on 74.1%, marking the first time an AI model has achieved top human-level performance in a comprehensive knowledge-work assessment [2]

Specialized Tasks
- The average score on investment-banking financial modeling tasks rose from 59.1% to 68.4%, indicating deeper penetration of AI into core productivity processes [2]

Multimodal Capabilities
- GPT-5.2 improved significantly in code generation, long-context understanding, and visual comprehension, scoring 55.6% on the SWE-Bench Pro evaluation [3]
- Long-context processing took a qualitative leap: accuracy on a "multi-needle retrieval" test at 256K tokens reached nearly 100%, compared with only 30% for GPT-5.1 [3]
- In visual tasks, error rates in scientific chart question answering and GUI understanding fell by nearly half, reflecting stronger spatial reasoning [3]

Enterprise Application
- Tool invocation became markedly more reliable: GPT-5.2 scored 98.7% on multi-step complex tool invocation tests, demonstrating strong end-to-end task execution [4]
- OpenAI is deploying in phases, offering the GPT-5.2 series to paid users while keeping GPT-5.1 available for three months to ensure a smooth transition [4]
- Although API prices increased by approximately 40%, OpenAI emphasized that improvements in token efficiency would keep overall costs manageable [4]
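The pricing point can be checked with simple arithmetic. The sketch below is illustrative only: it assumes total cost is per-token price times tokens consumed, and the function names are hypothetical. The report gives only the approximately 40% price increase, not the size of the token-efficiency gain, so the 30% reduction used here is an assumed example figure.

```python
def break_even_token_reduction(price_increase: float) -> float:
    """Fraction by which token usage must fall to keep total cost flat
    after a per-token price increase (0.40 means +40%).

    Assumes cost = price_per_token * tokens_used (a simplification)."""
    if price_increase <= -1.0:
        raise ValueError("price_increase must be greater than -1")
    return 1.0 - 1.0 / (1.0 + price_increase)


def relative_cost(price_increase: float, token_reduction: float) -> float:
    """New total cost as a multiple of the old total cost."""
    return (1.0 + price_increase) * (1.0 - token_reduction)


if __name__ == "__main__":
    # With the ~40% price increase cited in the report:
    needed = break_even_token_reduction(0.40)
    print(f"Token reduction needed to break even: {needed:.1%}")  # 28.6%

    # Hypothetical efficiency gain of 30% fewer tokens per task (assumed):
    print(f"Relative cost: {relative_cost(0.40, 0.30):.2f}")  # 0.98
```

Under these assumptions, a 40% price increase is fully offset only if a task consumes about 28.6% fewer tokens than before. Smaller efficiency gains leave the total bill higher.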
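Guotai Haitong | Computer Sector: GPT-5.2 Series Released: Redefining AI Productivity, Driving AI from Model Competition to Scenario Deployment
Guotai Haitong Securities Research · 2025-12-18 14:09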