MICROSOFT - One model to unify all offline tasks! Microsoft rebuilds the ad recommendation "reasoning brain" with a 671B large model

Core Insights
- Microsoft has developed AdNanny, a unified offline reasoning hub built on a 671-billion-parameter model, which is reported to significantly outperform previous systems on offline advertising tasks [3][4].
Group 1: Transition from "Model Forest" to "Intelligent Centralization"
- Advertising recommendation systems are currently held back by fragmentation: many small models each handle a separate offline task, which leads to inefficiency and high operational costs [4].
- AdNanny is intended to be a central intelligent hub for offline tasks, replacing numerous task-specific models with a single model that improves performance and reduces costs [4][5].
Group 2: Data Transformation
- AdNanny shifts training data from simple "label mapping" to "decision logic" by means of a three-stage automated data factory that produces high-quality reasoning data [5].
- The three stages are reasoning generation, validation by human experts, and rejection sampling, so that only accurate causal relationships are learned [5][6].
Group 3: Training Mechanisms
- AdNanny uses dynamic re-weighting to focus training on difficult tasks and samples, so that infrequent but high-value tasks are not drowned out by larger datasets [7].
- Reinforcement learning is integrated during fine-tuning to align the model's output with downstream business metrics [8].
Group 4: Engineering Innovations
- Training uses a hybrid parallelism architecture that achieves high computational efficiency across 248 GPUs and minimizes network bottlenecks [10].
- Inference is optimized with FP8 quantization, which cuts offline computational costs by 50% compared with running multiple smaller models [11].
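To make the dynamic re-weighting idea from Group 3 concrete, here is a minimal Python sketch. The summary does not give AdNanny's actual formula, so the softmax-over-loss rule, the `task_weights` function, the `temperature` parameter, and the loss values are all assumptions chosen for illustration. The point it shows is that weights depend on how hard a task currently is rather than on how much data it has, so a small, difficult task is not overshadowed by a large, easy one.

```python
import math


def task_weights(avg_losses, temperature=1.0):
    """Return per-task weights that grow with the task's current average loss.

    Hypothetical rule (not taken from the source): a softmax over per-task
    losses. A lower temperature concentrates more weight on the hardest task.
    """
    if not avg_losses:
        raise ValueError("avg_losses must not be empty")
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the maximum loss before exponentiating for numerical stability.
    max_loss = max(avg_losses.values())
    exps = {
        task: math.exp((loss - max_loss) / temperature)
        for task, loss in avg_losses.items()
    }
    total = sum(exps.values())
    return {task: value / total for task, value in exps.items()}


# Illustrative values only: the task names come from the summary,
# the losses are made up.
avg_losses = {"query_ad_relevance": 0.2, "ad_user_matching": 0.9}
weights = task_weights(avg_losses, temperature=0.5)

# The combined training loss weights each task by difficulty, not by size.
weighted_loss = sum(weights[task] * avg_losses[task] for task in avg_losses)
```

In a real training loop the per-task losses would be refreshed periodically (for example, as a running average over recent batches) so that the weights follow the model's progress; that scheduling is likewise an assumption, not something the summary describes.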
Group 5: Practical Implementation
- AdNanny has shown superior performance on key tasks such as Query-Ad relevance and Ad-User matching, while significantly lowering costs and reducing the need for manual labeling of ambiguous samples [12][13].
- The system architecture has been simplified: numerous independent data and model pipelines are replaced by a more streamlined and maintainable structure [13].
Conclusion
- AdNanny represents a significant shift in the approach to industrial AI in advertising, moving the emphasis from raw computational power to an integrated methodology that spans data, training, and engineering [14].