Cattell-Horn-Carroll (CHC) Theory
AGI now has a quantitative standard! Bengio leads the definition; the progress bar currently stands at 58%
量子位 (QbitAI) · 2025-10-17 04:58
Core Viewpoint
- The article presents a measurable definition of Artificial General Intelligence (AGI): an AI that can match or exceed the cognitive versatility and proficiency of a well-educated adult. It emphasizes the need for comprehensive evaluation across multiple cognitive domains [2][4].

Evaluation Framework
- A quantitative method was designed to assess how far current AI is from AGI. It draws on the Cattell-Horn-Carroll (CHC) theory, which breaks human cognitive abilities down into ten independent yet interconnected core cognitive domains [6][8].
- The assessment uses a question bank of over 500 items. A total score of 100 indicates AGI level, and higher scores reflect closer proximity to AGI [8][9].

Current AI Performance
- The evaluation found that AI has made significant progress but still falls short of AGI: GPT-4 scored only 27 and GPT-5 scored 58, a 115% increase over two years that remains below the passing line of 100 [10][11][13].
- Current AI performs strongly in knowledge, reading and writing, and mathematics. GPT-5 scored above 8 in each of these areas, reflecting its strengths in knowledge retention and symbolic processing [18][21][22].

Cognitive Shortcomings
- Significant deficiencies were identified in foundational cognitive areas such as perception, memory, and reasoning, which cannot be compensated for merely by increasing data scale [23][30].
- In the visual and auditory domains, both GPT-4 and GPT-5 performed poorly: GPT-4 scored 0, and GPT-5 achieved only minimal recognition capabilities [24][26].
- Long-term memory storage and retrieval were also highlighted as critical weaknesses, with neither model able to demonstrate effective long-term information retention [27][29].

Misleading Capabilities
- Some AI models appear to possess multi-tasking abilities but are essentially masking their shortcomings through technical means, such as expanded context windows, which do not equate to true long-term memory [30][32].
- The evaluation framework specifically excludes external tools and focuses solely on the intrinsic cognitive capabilities of AI systems, thereby revealing the limitations of models that rely on external knowledge sources [33][34].
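
The 115% figure quoted under Current AI Performance can be checked from the two total scores the summary reports (27 for GPT-4, 58 for GPT-5) and the AGI threshold of 100. The following is a minimal sketch of that arithmetic only. The function names `percent_increase` and `gap_to_agi` are hypothetical helpers introduced for illustration and are not part of the evaluation framework itself.

```python
# Worked arithmetic for the totals reported in the summary.
# Helper names are illustrative assumptions, not the framework's own code.

AGI_THRESHOLD = 100  # total score that the summary equates with AGI level
GPT4_TOTAL = 27      # total score reported for GPT-4
GPT5_TOTAL = 58      # total score reported for GPT-5


def percent_increase(old: float, new: float) -> float:
    """Return the relative change from `old` to `new`, in percent."""
    if old == 0:
        raise ValueError("old score must be non-zero")
    return (new - old) / old * 100.0


def gap_to_agi(score: float, threshold: float = AGI_THRESHOLD) -> float:
    """Return how many points a total score sits below the threshold."""
    return max(threshold - score, 0.0)


if __name__ == "__main__":
    change = percent_increase(GPT4_TOTAL, GPT5_TOTAL)
    print(f"GPT-4 -> GPT-5 increase: {change:.1f}%")
    print(f"GPT-5 points remaining to threshold: {gap_to_agi(GPT5_TOTAL):.0f}")
```

The relative increase is (58 - 27) / 27, which rounds to the 115% quoted in the summary, while the absolute gap between GPT-5's total and the threshold is 42 points.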