Tencent Hunyuan Announces Open-Source Release of the First Unified Multimodal CoT Reward Model

Core Insights
- Tencent Hunyuan, in collaboration with Shanghai AI Lab, Fudan University, and Shanghai Chuangzhi Academy, announced a new research project called UnifiedReward-Think, which aims to develop the first unified multimodal reward model with long chain-of-thought (CoT) reasoning capabilities [1]

Group 1
- UnifiedReward-Think lets the reward model "learn to think" across a range of visual tasks, significantly improving the accuracy with which it evaluates complex visual generation and understanding tasks [1]
- The project also improves cross-task generalization and makes the model's reasoning more interpretable [1]
- The work has been fully open-sourced, including the model, dataset, training scripts, and evaluation tools [1]