When AI Becomes a "Visual Detective": How Accurate Is It, and How Can Privacy Exposure Risks Be Countered?

Core Insights
- The article discusses Zhipu AI's launch of GLM-4.5V, a 100-billion-parameter visual reasoning model that the company claims is the best in its class. The model is said to identify image details accurately and infer background information without relying on search tools [1][6]
- Major AI players, including OpenAI, Google, and domestic players such as Doubao and Tongyi Qianwen, are competing on visual reasoning, which underscores the growing importance of multimodal capabilities in AI models [1][6]
- AI's ability to pinpoint locations from images raises privacy concerns, particularly because earlier models have already sparked "open box" (doxxing) worries [1][6][7]

Model Performance
- In a practical test of identifying locations from images, Doubao achieved 100% accuracy, Zhipu's GLM-4.5V achieved 60%, and Tongyi Qianwen's QVQ-Max reached only 20% [2][3]
- Results varied with the clarity and type of image; landmark photos were the easiest to identify accurately [3][4]
- Doubao's superior performance is attributed to its ability to connect to the internet and compare against real-time data [5]

Technical Developments
- Visual reasoning technology has advanced rapidly, with several new models released this year, including OpenAI's o3 and o4-mini and Google's Gemini 2.5 Pro, all showing strong visual reasoning capabilities [6][7]
- Zhipu AI's GLM-4.5V has been tested in a global competition against top human players, where it proved competitive in visual reasoning tasks [7]

Privacy Concerns
- The ability of AI models to infer geographic locations from images raises significant privacy concerns. One study indicates that advanced multimodal models can lower the barrier to extracting user location data from social media images [7][8]
- Experts recommend that AI companies set safety boundaries on image analysis capabilities to mitigate privacy risks, for example by restricting access to sensitive data such as Exif information [8]
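
To make the Exif recommendation concrete, the sketch below shows one way a service might drop Exif metadata from a JPEG before the image reaches a model. It is an illustration only, not something described in the article: the function name `strip_exif` is hypothetical, the code uses only the Python standard library, and it assumes a well-formed baseline JPEG whose metadata segments all appear before the Start of Scan marker.

```python
import struct

SOI = b"\xff\xd8"              # JPEG Start of Image marker
EXIF_HEADER = b"Exif\x00\x00"  # identifier at the start of an Exif APP1 payload


def strip_exif(jpeg: bytes) -> bytes:
    """Return a copy of a JPEG byte string with Exif APP1 segments removed.

    Hypothetical helper for illustration; it handles only the common layout
    in which every metadata segment precedes the Start of Scan marker.
    """
    if not jpeg.startswith(SOI):
        raise ValueError("not a JPEG stream (missing SOI marker)")

    out = bytearray(SOI)
    pos = 2
    while pos < len(jpeg):
        if pos + 2 > len(jpeg) or jpeg[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = jpeg[pos + 1]

        # Start of Scan (0xDA) or End of Image (0xD9): no further metadata
        # segments are expected, so copy the remainder verbatim.
        if marker in (0xDA, 0xD9):
            out += jpeg[pos:]
            break

        if pos + 4 > len(jpeg):
            raise ValueError("truncated segment header")
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        segment = jpeg[pos:pos + 2 + length]
        payload = segment[4:]

        # APP1 (0xE1) carries Exif when its payload starts with "Exif\0\0".
        is_exif = marker == 0xE1 and payload.startswith(EXIF_HEADER)
        if not is_exif:
            out += segment
        pos += 2 + length

    return bytes(out)
```

Removing Exif only addresses embedded metadata such as GPS tags. It does not touch other metadata containers (for example XMP, which also lives in APP1 segments), and it does nothing about the visual clues that the models in the article use to infer a location from the picture itself.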