Reasoning Language Selection in Large Models
Overseas Users Baffled: Despite English Prompts, DeepSeek Still Insists on Thinking in Chinese
36Ke · 2025-12-03 09:14
Core Insights
- DeepSeek has launched two new models, DeepSeek-V3.2 and DeepSeek-V3.2-Speciale, both with significantly improved reasoning capabilities. DeepSeek-V3.2 competes directly with GPT-5, and Speciale performs comparably to Gemini-3.0-Pro [1]
- Even when a query is written in English, the model sometimes switches to Chinese during its reasoning process, which has confused overseas users [3][5]
- A common explanation is that Chinese characters carry a higher information density, so the same meaning can be expressed more efficiently than in English [5][9]

Model Performance and Efficiency
- Research indicates that reasoning in non-English languages can cut token consumption by 20-40% without sacrificing accuracy. For DeepSeek R1, the reduction ranges from 14.1% (Russian) to 29.9% (Spanish) [9]
- The "EfficientXLang" study supports the idea that reasoning in non-English languages improves token efficiency, which translates into lower reasoning costs and reduced compute requirements [6][9]
- Another study, "One ruler to measure them all," finds that English is not the best-performing language for long-context tasks: it ranks sixth among 26 languages, with Polish in first place [10][15]

Language and Training Data
- It is considered normal for models trained on substantial Chinese datasets to reason frequently in Chinese, as seen in the new version of the AI programming tool Cursor [17]
- OpenAI's o1-pro also occasionally reasons in Chinese despite the higher proportion of English data in its training, which raises questions about how large models select a reasoning language [20]
- As Chinese training data becomes richer, models may eventually exhibit more characteristics associated with Chinese language processing [25]
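
The token-reduction percentages cited under Model Performance and Efficiency are relative savings against an English baseline. As a minimal sketch of that arithmetic (the function name `token_reduction` and the token counts are hypothetical, chosen only to illustrate the calculation and not taken from the cited studies):

```python
def token_reduction(baseline_tokens: int, alt_tokens: int) -> float:
    """Return the percentage of tokens saved relative to the baseline."""
    if baseline_tokens <= 0:
        raise ValueError("baseline_tokens must be positive")
    return 100.0 * (baseline_tokens - alt_tokens) / baseline_tokens


# Hypothetical counts for one reasoning trace (illustrative, not measured data).
english_tokens = 1000   # assumed baseline: reasoning in English
other_tokens = 701      # assumed count: same task, reasoning in another language

print(f"{token_reduction(english_tokens, other_tokens):.1f}% fewer tokens")
```

With these assumed counts the saving comes out at 29.9%, the same size as the upper end of the range reported for DeepSeek R1. Because inference is typically billed per token, a saving of this kind would lower cost roughly in proportion, provided accuracy holds.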