Meituan Releases and Open-Sources the Efficient Reasoning Model LongCat-Flash-Thinking

Core Insights
- Meituan's LongCat team has released LongCat-Flash-Thinking, a new efficient reasoning model that it claims is more powerful and more specialized than its predecessor, LongCat-Flash-Chat [1][2]
- LongCat-Flash-Thinking achieves state-of-the-art (SOTA) performance across a range of reasoning tasks, surpassing leading closed-source models such as GPT-5-Thinking in some areas [1][2]

Performance Metrics
- On the ARC-AGI benchmark, LongCat-Flash-Thinking scored 50.3, outperforming OpenAI's o3 and Gemini 2.5 Pro [1]
- On LiveCodeBench, it scored 79.4, matching the closed-source GPT-5 [2]
- On OJBench, it scored 40.7, closely approaching Gemini 2.5 Pro [2]
- On τ2-Bench, it set a new SOTA for open-source models with a score of 74.0 [2]

Unique Features
- The model combines "deep thinking + tool invocation" with both informal and formal reasoning capabilities, making it the first model of its kind in China [2]
- LongCat-Flash-Thinking has been fully open-sourced on HuggingFace and GitHub, and it can also be tried out on the official website [2]; a loading sketch follows at the end of this note

Hardware and Training
- Notably, LongCat-Flash was trained on domestic (Chinese) accelerator cards rather than NVIDIA GPUs, although the specific hardware manufacturer has not been disclosed [2]
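Since the weights are stated to be open-sourced on HuggingFace, the following is a minimal sketch of loading and prompting the checkpoint with the Hugging Face Transformers library. The repository id "meituan-longcat/LongCat-Flash-Thinking", the use of trust_remote_code, and the chat-template call are assumptions based on common practice for releases of this kind, not details confirmed by the source; consult the official HuggingFace model card and GitHub repository for the exact usage.

```python
# Minimal sketch, assuming the checkpoint is published under the hypothetical
# repo id "meituan-longcat/LongCat-Flash-Thinking" and ships a standard chat
# template; verify against the official model card before use.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "meituan-longcat/LongCat-Flash-Thinking"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype="auto",      # keep the precision the checkpoint was saved in
    device_map="auto",       # shard the large model across available devices
    trust_remote_code=True,  # custom architectures usually ship their own modeling code
)

# Ask a reasoning-style question through the model's chat template.
messages = [{"role": "user", "content": "Prove that the product of two odd integers is odd."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=512)
# Strip the prompt tokens and print only the newly generated reasoning and answer.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

If the released checkpoint is too large for local hardware, the official website mentioned above is the lighter-weight way to try the model.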