Gestalt Psychology
Behind the color-blindness test that AI cannot read lies a war between pixels and poetry.
数字生命卡兹克 · 2026-02-03 01:31
Core Viewpoint - The article discusses the limits of AI visual perception, particularly on color-blindness tests, and argues that AI lacks the holistic understanding humans bring to interpreting visual information [13][62].

Group 1: AI's Color Recognition Limitations
- In recent tests, advanced AI models including Gemini 3 Pro and Claude Opus 4.5 failed to read the numbers in color-blindness tests, giving answers such as "74" and "8" where the correct answer was "45" [5][6].
- The only model that succeeded was GPT 5.2 Thinking, which wrote code to visualize the number, indicating reliance on an external method rather than genuine visual understanding [7].

Group 2: Human vs. AI Perception
- Humans perceive images as cohesive wholes and quickly organize visual information into meaningful patterns, while AI processes images as fragmented parts and so lacks overall comprehension [22][56].
- The article draws on Gestalt psychology: humans naturally integrate visual elements into a unified percept, whereas AI struggles with this holistic step [30][22].

Group 3: Research Findings
- A paper titled "Pixels, Patterns, but No Poetry: To See The World like Humans" concludes that current AI does not "see" the world as humans do but computes it, and cannot appreciate the abstract, meaningful connections between visual elements [13][14].
- The study used the Turing Eye Test (TET) to evaluate AI visual perception, revealing significant shortcomings in recognizing patterns and meanings in visual data [32][38].

Group 4: AI's Processing Mechanism
- AI models analyze an image by breaking it into small patches and focusing on local details rather than the overall context, which produces a fragmented understanding of the image [54][56].
- Grad-CAM was used to visualize where the models attend during image processing, showing that they often fixate on irrelevant details rather than the features needed for a correct reading [39][41].

Group 5: Conclusion on AI's Visual Understanding
- The article concludes that AI's inability to prioritize and integrate visual information amounts to a kind of "attention deficit": it can identify colors and patterns but fails to assemble them into a meaningful whole [62][60].
- This points to a fundamental difference between human cognition and AI processing: AI can mimic human intelligence, but it lacks the wisdom to discern what actually matters in a visual scene [62][66].
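
The patch-based processing described under Group 4 can be illustrated with a toy sketch. It assumes a ViT-style split into non-overlapping square patches, which the summary does not confirm for any specific model; the function name `split_into_patches` and the 8x8 "plate" are hypothetical and chosen only for illustration.

```python
from typing import List

Grid = List[List[int]]


def split_into_patches(image: Grid, patch: int) -> List[Grid]:
    """Split a 2-D grid into non-overlapping patch x patch tiles, in row-major order."""
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image sides must be multiples of the patch size")
    tiles = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            # Each tile keeps only its own local window of pixels.
            tiles.append([row[left:left + patch] for row in image[top:top + patch]])
    return tiles


# Hypothetical 8x8 toy plate: 1 marks the figure (a diagonal stroke), 0 the background.
image = [[1 if r == c else 0 for c in range(8)] for r in range(8)]
patches = split_into_patches(image, 4)

# Count figure pixels per patch: the single stroke is spread over several tiles.
figure_pixels_per_patch = [sum(map(sum, tile)) for tile in patches]
print(len(patches), figure_pixels_per_patch)  # 4 [4, 0, 0, 4]
```

No single tile contains the whole stroke, so a model that starts from tiles has to reassemble the figure from fragments. That reassembly is the step the article argues humans perform effortlessly and current models handle poorly.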
The heart that AI cannot see has become the best AI detector.
数字生命卡兹克 · 2025-10-31 01:33
Core Viewpoint - The article discusses the limitations of AI in recognizing visual patterns that humans can easily identify, focusing on the concept of "Time Blindness" in video-language models [22][26][70].

Group 1: AI Limitations
- AI models, including Gemini 2.5 Pro and GPT-5, failed to recognize a simple heart shape in a visual illusion, highlighting their inability to perceive certain visual cues that humans pick up easily [8][10][14].
- A benchmark called SpookyBench showed that humans achieved over 98% accuracy in recognizing shapes and patterns in its videos, while AI models scored 0% [35][36][41].
- The inability of AI to recognize moving patterns is attributed to its reliance on spatial analysis rather than temporal understanding, a phenomenon termed "Time Blindness" [43][70].

Group 2: Research Insights
- The article references a paper titled "Time Blindness: Why Video-Language Models Can't See What Humans Can?", which explores the fundamental differences in how humans and AI perceive motion and visual information [22][26].
- The study involved 451 videos categorized by temporal pattern; AI models could not identify any of the content, while humans effortlessly recognized the intended shapes and movements [34][35].
- The research indicates that AI's approach to video analysis is fundamentally flawed: it treats video frames as static images and misses the information conveyed through motion [47][50].

Group 3: Human Perception
- The article emphasizes human perceptual principles such as the "Law of Common Fate," which lets people perceive elements that move together as a cohesive whole, a capability AI lacks [57][67].
- It discusses the involuntary eye movements that help humans maintain perception of static images, which visual illusions exploit to create a sense of motion [81][83].
- The author reflects on the philosophical implications of these findings, suggesting that while AI operates in a discrete, static manner, human perception is inherently fluid and continuous [73][75].
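
To make the "Time Blindness" idea concrete, here is a minimal toy sketch. It is not SpookyBench's actual generation procedure, which the summary does not describe. It assumes a hidden shape whose noise pixels drift right while the background noise drifts left, so each individual frame is plain random noise and the shape exists only in the motion. The names `make_frames` and `rightward_motion_map`, the 12x12 scene, and the 6x6 square are all hypothetical.

```python
import random
from typing import List

Frame = List[List[int]]


def make_frames(mask: Frame, n_frames: int, seed: int = 0) -> List[Frame]:
    """Build noise frames in which masked pixels drift right and the rest drift left."""
    rng = random.Random(seed)
    height, width = len(mask), len(mask[0])
    # Two independent binary noise fields: one for the shape, one for the background.
    fg = [[rng.randint(0, 1) for _ in range(width)] for _ in range(height)]
    bg = [[rng.randint(0, 1) for _ in range(width)] for _ in range(height)]
    return [
        [
            [fg[r][(c - t) % width] if mask[r][c] else bg[r][(c + t) % width]
             for c in range(width)]
            for r in range(height)
        ]
        for t in range(n_frames)
    ]


def rightward_motion_map(frames: List[Frame]) -> Frame:
    """Mark pixels whose value reappears one column to the right in every next frame."""
    height, width = len(frames[0]), len(frames[0][0])
    steps = len(frames) - 1
    result = [[0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width - 1):
            hits = sum(frames[t + 1][r][c + 1] == frames[t][r][c] for t in range(steps))
            result[r][c] = int(hits == steps)
    return result


# Hypothetical 12x12 scene: a 6x6 square is the hidden "shape".
mask = [[int(3 <= r < 9 and 3 <= c < 9) for c in range(12)] for r in range(12)]
frames = make_frames(mask, n_frames=16)
motion = rightward_motion_map(frames)
for row in motion:
    print("".join("#" if v else "." for v in row))
```

Looking at any one frame in isolation gives no cue about the square, which mirrors the summary's point about models that treat video as a stack of static images. Grouping pixels by shared motion across frames, in the spirit of the Law of Common Fate, recovers it. Background pixels can match by chance, so a few stray marks outside the square are possible with a short clip.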