CAICT Launches "Trustworthy AI" AI Safety Benchmark Hallucination Evaluation for Large Models
Securities Times Net (Zheng Quan Shi Bao Wang) · 2025-03-19 08:08

Core Viewpoint
- The China Academy of Information and Communications Technology (CAICT) has launched a hallucination evaluation of large models to assess the current state of hallucinations in large language models and to promote their practical application [1]

Group 1: Testing Overview
- The evaluation focuses on large language models and addresses two types of hallucinations: factual hallucinations and faithfulness hallucinations [1]
- The test set comprises over 7,000 Chinese samples; testing formats include information extraction and knowledge reasoning for detecting faithfulness hallucinations, and fact-judgment questions for detecting factual hallucinations [1]

Group 2: Testing Dimensions
- The evaluation covers five dimensions: humanities, social sciences, natural sciences, applied sciences, and formal sciences [1]
