AdNanny
One Model to Unify All Offline Tasks! Microsoft Uses a 671B Large Model to Rebuild the "Reasoning Brain" of Ad Recommendation
Sohu Finance (Sou Hu Cai Jing) · 2026-02-18 05:37
Core Insights
- Microsoft has developed AdNanny, a unified offline reasoning hub built on a 671-billion-parameter model, which significantly outperforms previous systems on advertising tasks [3][4].

Group 1: Transition from "Model Forest" to "Intelligent Centralization"
- Advertising recommendation systems are currently hindered by a fragmented approach in which many small models each handle a different offline task, leading to inefficiency and high operational costs [4].
- AdNanny aims to serve as a central intelligent hub for offline tasks, replacing numerous task-specific models with a single centralized model that improves performance and reduces costs [4][5].

Group 2: Data Transformation
- AdNanny moves the training data from simple "label mapping" to an understanding of "decision logic" through a three-stage automated data factory that produces high-quality reasoning data [5].
- The three stages are reasoning generation, validation by human experts, and rejection sampling, so that the model learns only accurate causal relationships [5][6].

Group 3: Training Mechanisms
- AdNanny employs dynamic re-weighting to focus on challenging tasks and samples, so that less frequent but high-value tasks are not overshadowed by tasks with larger datasets [7].
- Reinforcement learning is integrated during fine-tuning, aligning the model's output with downstream business metrics to ensure practical effectiveness [8].

Group 4: Engineering Innovations
- The model uses a mixed parallel architecture that achieves high computational efficiency across 248 GPUs and minimizes network bottlenecks [10].
- Inference optimization through FP8 quantization has cut offline computational costs by 50% compared with running multiple smaller models [11].
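The data factory in Group 2 is described only at a high level, so the following is a minimal Python sketch of what a rejection-sampling stage could look like, not AdNanny's actual pipeline. It assumes each sample already carries a trusted label (for example, one confirmed by expert validation) and that a generator returns a reasoning trace ending in a predicted label. The names `Trace`, `rejection_sample`, and `make_toy_generator` are hypothetical, and the stub generator stands in for the large model.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Trace:
    """A generated reasoning trace and the label it concludes with."""
    reasoning: str
    label: str


def rejection_sample(
    sample: str,
    trusted_label: str,
    generate: Callable[[str], Trace],
    num_candidates: int = 4,
) -> List[Trace]:
    """Draw several candidate traces and keep only those whose final
    label agrees with the trusted label; the rest are rejected."""
    candidates = [generate(sample) for _ in range(num_candidates)]
    return [trace for trace in candidates if trace.label == trusted_label]


def make_toy_generator(labels: Sequence[str]) -> Callable[[str], Trace]:
    """Hypothetical stand-in for the model: cycles through fixed labels."""
    label_iter = cycle(labels)

    def generate(sample: str) -> Trace:
        label = next(label_iter)
        return Trace(reasoning=f"{sample!r} judged {label}", label=label)

    return generate


# Illustrative query-ad pair; the data is made up for this sketch.
generator = make_toy_generator(["relevant", "irrelevant", "relevant"])
kept = rejection_sample(
    "query: trail shoes | ad: running sneakers", "relevant", generator, 4
)
```

Only traces that reach the trusted label survive, which is one simple way to keep incorrect reasoning chains out of the training corpus.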
Group 5: Practical Implementation
- AdNanny has demonstrated superior performance in key tasks such as query-ad relevance and ad-user matching, significantly lowering costs and reducing the need for manual labeling of ambiguous samples [12][13].
- The system architecture has been simplified, moving away from numerous independent data and model pipelines to a more streamlined and maintainable structure [13].

Conclusion
- AdNanny represents a significant shift in the approach to industrial AI in advertising, moving beyond mere computational power to a more thoughtful and integrated methodology [14].
One Model to Unify All Offline Tasks! Microsoft Uses a 671B Large Model to Rebuild the "Reasoning Brain" of Ad Recommendation
量子位 (QbitAI) · 2026-02-17 03:58
Core Insights
- Microsoft has developed AdNanny, a unified offline reasoning hub built on the DeepSeek-R1 model with 671 billion parameters, which significantly outperforms previous models in managing advertising systems [4][8][20].

Group 1: Paradigm Shift
- The advertising recommendation system is transitioning from a "model forest" approach, which relies on numerous small models for different tasks, to a centralized intelligent hub that enhances performance and reduces costs [6][8].
- The "model forest" approach leads to knowledge silos, high operational costs, and black-box decision-making, which makes it inefficient [6][7].

Group 2: Data Transformation
- AdNanny's strength lies in its data processing, which transforms raw advertising data into high-quality reasoning-enhanced corpora through a three-stage automated data factory [9].
- The process includes reasoning generation, validation by human experts, and rejection sampling to ensure the model learns correct causal relationships [9][10].

Group 3: Training Mechanisms
- AdNanny employs dynamic re-weighting to focus on challenging tasks and samples, ensuring that less frequent tasks receive adequate attention during training [11][12].
- Reinforcement learning is integrated to align model outputs with business metrics, ensuring that generated reasoning contributes positively to ad clicks and conversions [13].

Group 4: Engineering Innovations
- The model uses a mixed parallel architecture for efficient training, achieving high computational resource utilization across 248 GPUs [14][15].
- Inference optimization through FP8 quantization has reduced offline computational costs by approximately 50% compared with running multiple smaller models [17].

Group 5: Practical Applications
- AdNanny has demonstrated superior performance in key tasks such as query-ad relevance and ad-user matching, significantly lowering costs and simplifying system architecture [18][19].
- The model's ability to provide reliable initial assessments for ambiguous samples reduces the need for extensive manual labeling, streamlining the workflow [18].

Conclusion
- AdNanny represents a significant advancement in industrial AI, moving beyond mere computational power to a deeper logical framework that could influence sectors beyond advertising, such as search, e-commerce, and financial decision-making [20].
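The summary does not give the formula behind the dynamic re-weighting in Group 3. As a rough illustration only, the Python sketch below assumes one common scheme: a softmax over each task's recent average loss, so harder tasks get a larger share of training attention regardless of dataset size. The function name `task_weights`, the `temperature` parameter, and the loss values are all hypothetical.

```python
import math
from typing import Dict


def task_weights(avg_loss: Dict[str, float], temperature: float = 1.0) -> Dict[str, float]:
    """Softmax over per-task average loss: higher loss gives higher weight.

    This is an assumed scheme for illustration, not the method used by AdNanny.
    """
    if not avg_loss:
        raise ValueError("avg_loss must contain at least one task")
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the maximum loss before exponentiating for numerical stability.
    peak = max(avg_loss.values())
    scores = {task: math.exp((loss - peak) / temperature)
              for task, loss in avg_loss.items()}
    total = sum(scores.values())
    return {task: score / total for task, score in scores.items()}


# Made-up losses for the two tasks named in the summary.
recent_losses = {"query_ad_relevance": 0.2, "ad_user_matching": 0.9}
weights = task_weights(recent_losses)
```

Because the weights depend on loss rather than on sample counts, a small but difficult task is not drowned out by a large, easy one. Recomputing the weights periodically during training is what would make the scheme dynamic.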