Generative Reinforcement Learning
Generative Reinforcement Learning in Practice: Automated Ad Bidding
AI前线 · 2025-09-28 05:48
Core Insights
- The article discusses the evolution and challenges of bidding algorithms in real-time bidding (RTB) advertising systems, emphasizing the transition from traditional methods to advanced techniques like generative reinforcement learning [2][3][7].

Group 1: Evolution of Bidding Algorithms
- The bidding algorithm has evolved through three generations: PID, MPC, and reinforcement learning (RL), each improving upon the previous in terms of adaptability and effectiveness in complex bidding environments [5][6][7].
- The introduction of generative reinforcement learning aims to enhance decision-making by utilizing historical bidding sequences for more accurate predictions [8][10].

Group 2: Challenges in Bidding
- Key challenges faced by bidding algorithms include the need to manage daily budgets while minimizing conversion costs, the unpredictability of traffic and competitor behavior, and the complexity of sequential decision-making [5][6].
- The reliance on high-quality datasets poses a challenge, as simple exploration can lead to out-of-distribution (OOD) issues, necessitating efficient offline exploration mechanisms [12][14].

Group 3: GAVE Algorithm
- The GAVE algorithm integrates score-based return-to-go (RTG) and value function-based action exploration to enhance model learning and address the challenges of data quality and exploration [18][19].
- Experimental results show that GAVE outperforms baseline algorithms in various budget settings, demonstrating its effectiveness in maximizing conversion value [22][25].

Group 4: CBD Algorithm
- The CBD algorithm introduces Completer and Aligner modules to improve the alignment of generated sequences with optimization goals, addressing issues of sequence legality and preference alignment [29][31].
- Offline experiments indicate that CBD significantly outperforms other methods in total conversion value, validating its effectiveness in real-world applications [34][36].

Group 5: Future Directions
- Future advancements in bidding technology are expected to focus on developing foundational models that leverage multi-scenario data and enhancing interpretability and decision-making capabilities through the integration of large language models [41].
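For readers unfamiliar with the return-to-go (RTG) signal named under Group 3, here is a minimal sketch of plain RTG as it is commonly computed for sequence-model policies: at each step, the sum of rewards from that step to the end of the episode. It is not the score-based variant attributed to GAVE, which this summary does not specify. The function name `returns_to_go`, the `gamma` parameter, and the sample conversion counts are hypothetical and chosen only for illustration.

```python
from typing import List


def returns_to_go(rewards: List[float], gamma: float = 1.0) -> List[float]:
    """Return-to-go at step t: (optionally discounted) sum of rewards from t onward."""
    rtg = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the total already accumulated after it.
    for t in range(len(rewards) - 1, -1, -1):
        running = rewards[t] + gamma * running
        rtg[t] = running
    return rtg


# Hypothetical data: conversions won in four consecutive bidding intervals.
conversions = [2.0, 0.0, 1.0, 3.0]
print(returns_to_go(conversions))  # [6.0, 4.0, 4.0, 3.0]
```

In a sequence-model bidder, a value like this would typically be fed in alongside the state at each step, so the model can condition its next bid on how much value is still expected before the budget period ends.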
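Kuaishou Demystifies Its "AI Money Printer": First to Propose Generative Reinforcement Learning for Bidding, Lifting Platform Ad Revenue by More Than 3%
机器之心 · 2025-09-23 04:08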
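Reported by 机器之心. Editors: Panda, 张倩

A while ago, Google's parent company Alphabet passed US$3 trillion in market capitalization, becoming the fourth company to reach that threshold. Rewind two and a half years, and Google itself might not have expected this outcome. At the time, the shock of ChatGPT led outsiders to question whether Google could defend its revenue, especially its advertising revenue. Some even asked the pointed question: will Google become the next Nokia?

Events, however, unfolded differently from what many expected: Google not only held on to its core advertising business, but by integrating generative AI into search and ad delivery it also improved user-intent understanding and ad-matching efficiency, further amplifying the value of advertising.

The same trend is visible in China. Last month, Kuaishou released its Q2 earnings report. According to the report, Kuaishou's online marketing services revenue for the quarter was RMB 19.8 billion, up 12.8% year over year. The report explicitly notes that large models made notable progress in ad bidding and marketing recommendation. On bidding, Kuaishou optimized its generative bidding algorithm, applying reinforcement learning and long-term value strategies to improve ad conversion. On marketing recommendation, Kuaishou used the content-understanding and reasoning capabilities of large language models and took a generative approach to selecting ads, mining the relationship between user behavior and ad conversion in depth and generating ad content that matches user interests. After ranking optimization, this significantly raised click-through rates and drove double-digit growth in marketing services revenue.

These signals suggest that A ...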