Investment Rating
- The report does not explicitly provide an investment rating for the industry or companies involved

Core Insights
- The evaluation of large language models in report writing highlights the top performers: 商汤商量, 讯飞星火, and 文心一言 3.5, which excel in various assessment dimensions [8][9][21]
- The report emphasizes the importance of foundational capabilities in industry research, with models like 商汤商量, GPT3.5, and 文心一言 3.5 demonstrating strong performance in logical reasoning, text generation, and intent understanding [24][28][32]

Summary by Sections

Report Writing Capability Assessment
- 商汤商量, 讯飞星火, and 文心一言 3.5 ranked highest in report writing capabilities, with 商汤商量 showing consistent performance across all evaluation modules [10][12][21]
- The assessment identified significant differences in performance among various platforms, with GPT3.5 and 百川 losing points due to outdated information and completeness issues [12][21]
- The evaluation of low-difficulty writing modules showed no significant differences among the 12 models, although some models lost points due to inability to answer specific questions [8][17]

Foundational Research Capability Assessment
- 商汤商量, GPT3.5, and 文心一言 3.5 lead in foundational research capabilities, showcasing strengths in logical reasoning, text generation, and intent understanding [24][28][32]
- The report indicates that models like 智谱清言 and 百川 excel in specific areas, such as intent understanding, despite lower overall scores [26][28]
- The foundational capability assessment revealed that 商汤商量 excels in context switching and knowledge retention, while 讯飞星火 leads in logical reasoning and text generation [27][29]
2023 China Large Model Evaluation (Part 1): A New Paradigm for Industry Research Writing
Tou Bao Yan Jiu Yuan·2024-04-11 16:00