230 Large Models Collectively Fail on Infant Cognition Questions! Revealing the Core Knowledge Deficits of Multimodal Large Models
量子位· 2025-10-10 01:03
Core Insights
- The article highlights that while most AI models perform well on complex tasks, they struggle significantly with basic cognitive abilities that humans develop from a young age [1][4].

Core Cognition Benchmark
- Researchers created the CoreCognition benchmark, which includes 1,503 classic developmental-psychology tests covering 12 core cognitive abilities that emerge in early childhood [2][9].
- The benchmark aims to systematically test models on their understanding of fundamental cognitive concepts such as object permanence and intuitive physics [5][9].

Model Performance
- A comparison of 230 mainstream models revealed a "core knowledge blind spot": many models show significant deficits in basic cognitive abilities, often lagging behind human performance by double-digit percentages [3][4][16].
- Lower-level cognitive abilities (e.g., boundary perception, continuity) are significantly weaker in models than higher-level abilities (e.g., intentional understanding, mechanical reasoning) [16][18].

Key Findings
- The research identified five key findings regarding the cognitive capabilities of models:
  1. Models exhibit systematic shortcomings in foundational "core knowledge" relative to human cognitive development [16].
  2. There is only a weak correlation between lower-level abilities and higher-level reasoning, indicating a lack of scaffolding in cognitive development [18].
  3. Core abilities are positively correlated with performance on public benchmarks, suggesting that stronger core knowledge leads to better task performance [20] (a hedged correlation sketch follows this summary).
  4. Increasing model size does not significantly improve lower-level cognitive abilities; some abilities even deteriorate as model size increases [22].
  5. Concept Hacking experiments showed that larger models do not necessarily perform better, indicating that scaling alone does not eliminate reliance on shortcuts [24] (a second sketch after this summary illustrates such control pairs).

Cognitive Intervention and Model Understanding
- Cognitive instructions can provide short-term gains in performance, but they do not address the underlying gaps in foundational knowledge [27][29].
- The study suggests that true intelligence relies on understanding the most basic rules of the world, rather than on simply increasing model parameters [31][32].

Recommendations
- The article advocates shifting the focus from merely scaling models to ensuring that foundational cognitive knowledge is solidified first, emphasizing that core knowledge is multiplicative rather than additive [33][34].
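The correlation finding above (core abilities vs. public-benchmark performance) can be illustrated with a small analysis sketch. This is a minimal, hypothetical example assuming per-model accuracy scores are already available as arrays; the placeholder values, variable names, and the choice of Pearson/Spearman correlation are assumptions for illustration, not the paper's actual analysis.

```python
# Hypothetical sketch: correlating per-model core-cognition accuracy with
# public-benchmark accuracy across a set of evaluated models.
# The scores below are made-up placeholders, NOT results from the paper.
import numpy as np
from scipy.stats import pearsonr, spearmanr

# One entry per model: accuracy on CoreCognition-style items and on a public benchmark.
core_acc = np.array([0.52, 0.61, 0.58, 0.70, 0.66, 0.74])   # placeholder values
bench_acc = np.array([0.48, 0.57, 0.55, 0.69, 0.63, 0.78])  # placeholder values

r, p = pearsonr(core_acc, bench_acc)
rho, p_rank = spearmanr(core_acc, bench_acc)
print(f"Pearson r = {r:.2f} (p = {p:.3f}), Spearman rho = {rho:.2f} (p = {p_rank:.3f})")
```

A positive, significant correlation in such an analysis would support the article's claim that stronger core knowledge goes hand in hand with better downstream task performance.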
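The Concept Hacking result can be illustrated in a similar spirit. The idea, as summarized above, is to pair each core-knowledge item with a manipulated control that preserves the surface features but flips the correct answer; a model that relies on shortcuts answers the original correctly yet fails the control. The pairing and scoring below are a hedged sketch of that logic under invented item contents and field names, not the benchmark's actual implementation.

```python
# Hedged sketch of a "concept hacking"-style analysis: each original item is
# paired with a manipulated control whose ground-truth answer is flipped.
# Items, answers, and the three-way scoring rule are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class ItemPair:
    question: str
    original_answer: str   # ground truth on the unmodified item
    control_answer: str    # ground truth on the manipulated control

def classify(model_orig: str, model_ctrl: str, pair: ItemPair) -> str:
    """Label one model's responses to an original/control item pair."""
    ok_orig = model_orig == pair.original_answer
    ok_ctrl = model_ctrl == pair.control_answer
    if ok_orig and ok_ctrl:
        return "core knowledge"   # answer stays correct under the manipulation
    if ok_orig and not ok_ctrl:
        return "shortcut"         # right on the original for the wrong reason
    return "failure"

# Toy usage with a single invented object-permanence style pair.
pair = ItemPair(
    question="Is the ball still behind the screen?",
    original_answer="yes",
    control_answer="no",   # the manipulated clip shows the ball removed
)
print(classify("yes", "yes", pair))   # -> "shortcut" under this toy scoring
```

Aggregating such labels over many pairs gives a per-model shortcut-reliance rate; the article's point is that this rate does not reliably shrink as models get larger.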