Large models can spontaneously form a "map of human thought"! Landmark study in a Nature sub-journal reveals brain-like mechanisms in multimodal large models
机器人圈·2025-06-11 11:43

Core Viewpoint
- The research, published in Nature Machine Intelligence, demonstrates that multimodal large language models (MLLMs) can develop human-like object concept representations, challenging the notion that these models merely mimic human language without genuine understanding [2][4].

Group 1: Research Findings
- The study analyzed 4.7 million behavioral judgments to construct a "concept map" of AI models, confirming that MLLMs form object concept representations similar to those of humans [3][6].
- Using a sparse positive similarity embedding (SPoSE) method, the research identified 66 core cognitive dimensions, revealing that both ChatGPT-3.5 and the multimodal Gemini model exhibit stable low-dimensional representational structures [9].
- MLLMs spontaneously formed 18 high-level object concept categories with a classification accuracy of 78.3%, approaching the human accuracy of 87.1% [13].

Group 2: Methodology
- The research employed a novel "behavioral cognitive probe" method, integrating computational modeling, behavioral experiments, and neuroscience to analyze AI cognition [8].
- A triplet odd-one-out task was designed to assess the similarity of object representations between AI and humans, allowing a comparative analysis of their decision-making processes [5][31].

Group 3: Cognitive Dimensions
- The study assigned semantic labels to the cognitive dimensions of the AI models, categorizing them into dimensions related to semantic categories, perceptual features, and physical components [17][19][20].
- The findings indicated a significant correlation between MLLM representations and human brain activity patterns, particularly in areas responsible for processing faces, scenes, and bodies [23][24].

Group 4: Implications and Future Directions
- The research has broad applications, including the development of neuro-aligned AI systems, exploration of the neural mechanisms underlying concept combination and reasoning, and enhancement of brain-computer interface systems [35].
- Future work will focus on extending the analysis to next-generation multimodal models and establishing a cognitive benchmark testing platform to objectively assess AI's semantic understanding [35][36].
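The triplet odd-one-out paradigm mentioned above can be sketched as follows. This is a minimal illustration, not the paper's actual pipeline: the `odd_one_out` helper and the toy object embeddings are invented for demonstration. Given vector representations of three objects, the pair with the highest similarity is treated as belonging together, and the remaining item is the "odd one out" — comparing such choices across humans and models is what allows the representational similarity analysis described in Group 2.

```python
import numpy as np

def odd_one_out(embeddings):
    """Given embeddings for exactly three objects, return the index
    of the object least similar to the other two: the most similar
    pair is kept together, and the leftover item is the odd one out."""
    e = np.asarray(embeddings, dtype=float)
    # Cosine similarity matrix between the three objects.
    unit = e / np.linalg.norm(e, axis=1, keepdims=True)
    sim = unit @ unit.T
    pairs = [(0, 1), (0, 2), (1, 2)]
    best = max(pairs, key=lambda p: sim[p])   # most similar pair
    return ({0, 1, 2} - set(best)).pop()      # the remaining index

# Toy 2-D embeddings: 'apple' and 'banana' are close; 'hammer' is distant.
objects = ["apple", "banana", "hammer"]
vecs = [[1.0, 0.1], [0.9, 0.2], [0.0, 1.0]]
print(objects[odd_one_out(vecs)])  # → hammer
```

Collecting millions of such judgments from a model and from human participants yields two similarity structures that can then be compared dimension by dimension.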