Study: LLMs Now Master Highest CFA Exam Level

Core Insights
- A recent study indicates that leading large language models (LLMs) can now pass the CFA Level III exam, including its challenging essay portion, which AI models previously struggled with [2][4].

Group 1: Study Overview
- The research was conducted by NYU Stern School of Business and Goodfin and focused on the capabilities of LLMs in specialized finance domains [3].
- The study benchmarked 23 leading AI models, including OpenAI's GPT-4 and Google's Gemini 2.5, against a CFA Level III mock exam [4].

Group 2: Performance Metrics
- OpenAI's o4-mini model achieved a composite score of 79.1%, while Google's Gemini 2.5 Flash model scored 77.3% [5].
- Most models performed well on multiple-choice questions, but only a few excelled on the essay prompts, which require analysis and strategic thinking [5].

Group 3: Reasoning and Grading
- NYU Stern Professor Srikanth Jagabathula noted that recent LLMs have shown significant capabilities in quantitative and critical thinking tasks, particularly in essay responses [6].
- An LLM was used to grade the essay portion. It proved stricter than human graders, assigning fewer points overall [7].

Group 4: Impact of Prompting Techniques
- The study found that chain-of-thought prompting improved the performance of AI models on the essay portion, increasing accuracy by 15 percentage points [8].
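
To illustrate what chain-of-thought prompting means in practice, here is a minimal Python sketch that contrasts a direct prompt with one asking the model to reason before answering. The study's actual prompts are not given in this summary, so the prompt wording, the function names, and the "Answer:" convention are all assumptions made for illustration. No model is called.

```python
# Hypothetical helpers: the prompt wording below is an assumption,
# not the text used in the NYU Stern / Goodfin study.

def direct_prompt(question: str) -> str:
    """Build a prompt that asks for the final answer only."""
    return f"{question}\n\nGive only your final answer."


def chain_of_thought_prompt(question: str) -> str:
    """Build a prompt that asks for reasoning first, then the answer."""
    return (
        f"{question}\n\n"
        "Think step by step. Lay out your reasoning first, then state "
        "your final answer on a new line starting with 'Answer:'."
    )


def extract_final_answer(response: str) -> str:
    """Return the text after the last 'Answer:' marker.

    Falls back to the whole response if the marker is missing.
    """
    marker = "Answer:"
    idx = response.rfind(marker)
    if idx == -1:
        return response.strip()
    return response[idx + len(marker):].strip()


if __name__ == "__main__":
    # Illustrative question, not taken from the exam.
    question = "Should a retiree with a short horizon hold mostly equities?"
    print(direct_prompt(question))
    print("---")
    print(chain_of_thought_prompt(question))
```

The only difference between the two prompts is the instruction to show reasoning, which makes it straightforward to compare a model's scores with and without it on the same questions.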