Can 174 Peking University Students Outscore AI? The Results Are Surprising
Sina Finance (Xin Lang Cai Jing) · 2025-12-28 17:21

Core Insights
- The article discusses a unique examination conducted at Peking University, where advanced AI models such as GPT, Gemini, and DeepSeek competed against 174 undergraduate students in the College of Chemistry and Molecular Engineering [1][2]
- The examination aimed to assess whether AI truly understands chemistry, utilizing a specially designed test that emphasizes reasoning over rote memorization [3][6]

Group 1: Examination Design
- The test comprised 500 challenging questions derived from high-level academic literature, specifically tailored to prevent AI from relying on memorized content [2][4]
- A collaborative platform was created for the team of nearly 100 students and faculty to design, review, and refine the questions, incorporating a gamified points system to enhance engagement [4]

Group 2: Results and Performance
- The average accuracy of the participating students was 40.3%, indicating the high difficulty level of the exam [6]
- AI models performed at a level comparable to that of first-year undergraduate students, revealing limitations in their ability to process visual information and complex reasoning tasks [7]

Group 3: SUPERChem Project
- The SUPERChem project fills a gap in multi-modal deep reasoning assessments in the field of chemistry, serving as a benchmark for future AI development [8]
- The project has been fully open-sourced, with the intention of contributing to the global scientific and AI community, highlighting the journey from knowledge retention to understanding the physical world [8]