A Subjective Evaluation of Five AI Assistants' Image Recognition: A Showdown Over Bizarre Restroom Signs
Hu Xiu·2025-08-17 04:08

Core Viewpoint - The recent advancements in AI, particularly the launch of the GLM-4.5V visual reasoning model by Zhipu, have garnered significant attention due to its impressive performance in visual benchmark tests, achieving first place in 41 out of 42 tests [2][3].

Group 1: Company Developments
- Zhipu has introduced the GLM-4.5V visual reasoning model, which is an open-source model that has excelled in visual benchmark tests [2].
- The GLM-4.5 model has shown substantial improvements in logical reasoning, code writing, and tool invocation [1].
- Despite rapid developments in the AI sector, Zhipu, being a more technology-focused company, has not received as much public attention [3].

Group 2: Evaluation Process
- An evaluation task was conducted to test the visual recognition capabilities of various AI tools, inspired by a recent international AI competition [5][12].
- The evaluation involved ten images of confusing restroom signs, with a scoring system based on correct identification [11][15].
- Zhipu's GLM-4.5 model (without reasoning) scored the highest at 86 points, while the reasoning version and ChatGPT's GPT-5 both scored 78 points [12].

Group 3: Performance Insights
- The evaluation revealed that Zhipu's models made errors in identifying restroom signs, with the non-reasoning version being the only one to answer incorrectly on one of the ten questions [26].
- Other AI tools, such as Doubao and Kimi, performed better in certain instances, showcasing the varying capabilities of different models [26][23].
- The evaluation highlighted the potential for AI tools to improve in visual recognition tasks, which could have significant applications across various industries [39][42].