Core Viewpoint
- The "2025 Financial Large Model Evaluation System" was launched in Shanghai, marking a significant step in the intelligent transformation of the financial industry and aiming for higher-quality, more reliable applications of AI technology in finance [1][2].

Group 1: Evaluation System Overview
- The evaluation system is a collaboration between Shanghai Artificial Intelligence Laboratory and KuPass Technology and showcases their technical achievements in financial model assessment [1].
- The system is designed to give financial institutions a scientific benchmark for selecting large models and comparing their capabilities [1][2].
- The comprehensive upgrade of the evaluation system is intended to support Shanghai's goal of becoming a globally influential financial technology center [1].

Group 2: Data and Methodology
- The evaluation system integrates 4 public datasets and 22 self-built datasets, totaling approximately 36,000 evaluation data points [2].
- It employs a robust evaluation process with mechanisms such as randomized options and diverse prompts, and it includes a purpose-built financial referee large model for automated, standardized evaluation [2].
- The system aims to help banks, brokerages, funds, and investment institutions accurately assess large model capabilities, optimize model selection, and manage risks [2].

Group 3: Reports and Applications
- A joint report titled "Financial Large Model Application Evaluation Report (2025)" and a dataset titled "Financial Large Model Evaluation Dataset (2025)" were released at the same time, both focusing on real financial business scenarios [2].
- The report explores new concepts, mechanisms, and methods for applying large models in vertical financial fields, supporting institutions in scientific model selection and cost reduction [2].
- The initiative is expected to accelerate the deployment of large models in key areas such as investment research, risk control, and customer service [2].
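
The article does not describe how the "randomized options" mechanism under Data and Methodology is implemented. As a rough illustration only, the Python sketch below shows one common way such a check can work: the answer choices of a multiple-choice item are shuffled with several seeds, and the model is credited only if it picks the correct choice under every ordering. The function names (`shuffle_options`, `robust_correct`), the model interface, and the sample question are hypothetical and are not taken from the evaluation system itself.

```python
import random
from typing import Callable, List, Sequence, Tuple

# Hypothetical model interface: takes a question and its options,
# returns the index of the option it selects.
Model = Callable[[str, List[str]], int]


def shuffle_options(options: Sequence[str], answer_index: int,
                    seed: int) -> Tuple[List[str], int]:
    """Shuffle the options reproducibly and return the new answer position."""
    rng = random.Random(seed)
    order = list(range(len(options)))
    rng.shuffle(order)
    shuffled = [options[i] for i in order]
    # order[k] is the original index now sitting at position k.
    return shuffled, order.index(answer_index)


def robust_correct(model: Model, question: str, options: Sequence[str],
                   answer_index: int, seeds: Sequence[int]) -> bool:
    """Credit the model only if it is correct under every shuffled ordering."""
    for seed in seeds:
        shuffled, new_index = shuffle_options(options, answer_index, seed)
        if model(question, shuffled) != new_index:
            return False
    return True


# Illustrative item (invented for this sketch, not from the dataset).
QUESTION = "Which instrument is a debt security?"
OPTIONS = ["Common stock", "Corporate bond", "Call option", "Futures contract"]


def content_model(question: str, options: List[str]) -> int:
    """Toy model that recognizes the answer by its text, wherever it appears."""
    return options.index("Corporate bond")


def always_first(question: str, options: List[str]) -> int:
    """Toy position-biased model that always picks the first option."""
    return 0
```

Under this scheme a model that recognizes the answer by content passes regardless of ordering, while a model that relies on answer position cannot stay correct across items whose answers start in different slots.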
2025 Financial Large Model Evaluation System Released in Shanghai
Xin Hua Cai Jing·2025-12-27 13:17