Core Viewpoint
- The article discusses the difficulty large AI models have in recognizing a specific artwork by Japanese artist Akiyoshi Kitaoka, highlighting their limitations in visual perception and recognition tasks [1][3].

Group 1: AI Model Performance
- ChatGPT could only identify the image as a face and failed to recognize the specific individual depicted [4][12].
- Gemini misidentified the person entirely [6][15].
- Grok was unable to recognize the image and asked for a clearer photo [16].

Group 2: Domestic (Chinese) AI Model Analysis
- Domestic model Doubao performed similarly to Gemini, recognizing the image style and facial contours but failing to identify the specific person [18].
- In deep thinking mode, Doubao incorrectly concluded that the image depicted Albert Einstein [20].
- Qwen3-235B-A22B identified the image as a silhouette but did not name the individual, reflecting only a partial understanding of the visual content [21][22].

Group 3: Successful Recognition
- The o3-Pro model stood out by successfully recognizing the artwork, which the article attributes to its stronger reasoning capabilities compared with its non-Pro counterpart [26][29].
- Readers questioned whether o3-Pro had used search to reach its answer; it was clarified that it did not rely on search functions [31].
- The article suggests that giving the model hints about the artwork can lead to better recognition, much like a guessing game [34].
Mona Lisa Nearly Wipes Out the Large Models! Netizens: Got It, AI Can't Squint
QbitAI (量子位) · 2025-07-06 05:12