AI Visual Generation
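ICLR 2026 Oral | Peking University's Peng Yijie Team Proposes a New Efficient Optimization Paradigm: the Recursive Likelihood Ratio Gradient Optimizer for Diffusion Model Post-Training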
机器之心 (Jiqizhixin) · 2026-03-09 03:58
Core Viewpoint
- The article introduces the Recursive Likelihood Ratio (RLR) optimizer from Professor Peng Yijie's team at Peking University, a semi-gradient fine-tuning method for diffusion models that targets the efficiency and performance problems of adapting them to downstream applications [2][10].

Group 1: Background and Challenges
- Diffusion models (DMs) have become a core framework for image synthesis and video generation because they generate high-fidelity data [2].
- The main industry challenge is adapting pre-trained diffusion models efficiently to specific application requirements [2].
- Mainstream fine-tuning methods fall into two categories, reinforcement learning (RL) and truncated backpropagation (BP), and both have significant drawbacks [7].
- Truncated BP introduces structural bias into the gradient estimate, which can cause model collapse and content degradation [7].
- RL methods reduce memory requirements but suffer from high-variance gradient estimates and slow convergence [7].

Group 2: RLR Optimizer Design
- The RLR optimizer introduces a semi-gradient estimation paradigm that exploits the inherent noise of diffusion models to obtain unbiased, low-variance gradient estimates [10].
- The core design comprises three main modules, including:
  1. A first-order estimation module that backpropagates directly through the reward model at the first time step [11].
  2. A zeroth-order estimation module that applies parameter perturbation to the remaining time steps, giving unbiased gradient estimates without caching intermediate latent variables [12].
- The optimizer's tunable parameter, the local sub-chain length (h), directly controls the trade-off between memory usage and gradient variance [14].
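The split between a first-order and a zeroth-order estimator can be illustrated on a toy problem. The sketch below is not the RLR algorithm from the paper: the additive chain, the quadratic reward, the antithetic Gaussian perturbation, and every function name are assumptions chosen so that the exact gradient is known and the two estimators can be compared. It differentiates exactly through the reward and the one step feeding it (which step the summary means by "first time step" is also an assumption here), and estimates the gradient for all other steps from reward evaluations alone, so no intermediate states are stored.

```python
import numpy as np

TARGET = 1.0  # hypothetical reward target, for illustration only


def rollout(theta, x0=0.0):
    """Toy chain x_{t+1} = x_t + theta[t]; only the final state is returned."""
    return x0 + float(np.sum(theta))


def reward(x_final):
    """Toy quadratic reward, maximal when x_final equals TARGET."""
    return -(x_final - TARGET) ** 2


def first_order_last_step(theta):
    """Exact dR/dtheta[-1] by the chain rule (d x_final / d theta[-1] = 1)."""
    return -2.0 * (rollout(theta) - TARGET)


def zeroth_order_rest(theta, rng, sigma=0.1, n_samples=2000):
    """Antithetic perturbation estimate of dR/dtheta[:-1].

    Only final rewards are evaluated; nothing from inside the chain is cached.
    """
    k = theta.size - 1
    grad = np.zeros(k)
    for _ in range(n_samples):
        eps = rng.standard_normal(k)
        plus, minus = theta.copy(), theta.copy()
        plus[:k] += sigma * eps
        minus[:k] -= sigma * eps
        diff = reward(rollout(plus)) - reward(rollout(minus))
        grad += diff / (2.0 * sigma) * eps
    return grad / n_samples


def hybrid_gradient(theta, rng):
    """Exact gradient for the last step, perturbation estimate for the rest."""
    g = np.empty_like(theta)
    g[:-1] = zeroth_order_rest(theta, rng)
    g[-1] = first_order_last_step(theta)
    return g


rng = np.random.default_rng(0)
theta = np.array([0.2, -0.1, 0.4, 0.3])
est = hybrid_gradient(theta, rng)
exact = np.full(theta.size, -2.0 * (rollout(theta) - TARGET))
print("hybrid estimate:", est)
print("exact gradient: ", exact)
```

On this quadratic toy the perturbation estimate is unbiased but noisy, while the backpropagated entry is exact. Moving more steps to the exact side would lower variance at the cost of storing more of the chain, which is the kind of memory-versus-variance trade-off the summary attributes to the sub-chain length h.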
Group 3: Performance Validation
- The RLR optimizer was validated in large-scale experiments on Text2Image and Text2Video tasks, where it outperformed existing RL and truncated BP methods [18].
- On Text2Image, RLR raised the ImageReward score of Stable Diffusion 1.4 from 32.90 to 76.55, outperforming DDPO by approximately 47% and AlignProp by about 14% [18].
- On Text2Video, RLR reached a weighted average score of 84.63, surpassing models such as VideoCrafter and Gen-2 and doing particularly well on the dynamic degree metric [18][20].
- The RLR optimizer is also paired with a diffusion chain-of-thought prompting technique, which improves fine-grained tasks such as hand generation by targeting generation defects at specific scales [22].
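As a quick check on the Text2Image figures, the gain over the base model can be computed directly from the two reported scores. The summary does not give the DDPO or AlignProp scores, so the 47% and 14% margins are not reconstructed here; the variable names are illustrative.

```python
base_score = 32.90  # Stable Diffusion 1.4 before fine-tuning (from the summary)
rlr_score = 76.55   # after RLR fine-tuning (from the summary)

absolute_gain = rlr_score - base_score
relative_gain = absolute_gain / base_score

print(f"absolute gain: {absolute_gain:.2f} points")
print(f"relative gain: {relative_gain:.1%}")
```

The reported scores correspond to a gain of about 43.65 points, or roughly 2.3 times the base model's ImageReward score.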
Meitu Design Studio Integrates with Alibaba's Qianniu to Serve Taobao and Tmall Merchants
Ge Long Hui· 2025-09-30 08:16
Core Insights
- Meitu Design Studio has integrated with Alibaba's Qianniu merchant backend to provide AI visual generation services for Taobao and Tmall merchants [1].
- The integration targets a long-standing pain point for e-commerce sellers: visual content production is costly and time-consuming [1].
- Its core functions are product image generation, photo editing, and poster design, covering the full range of e-commerce design needs [1].
- The move follows from the strategic partnership Meitu and Alibaba established in May 2025, and further features such as "AI model fitting" and a "poster editor" are planned [1].

Company and Industry Summary
- Building AI capabilities into e-commerce platforms is expected to significantly reduce sellers' costs for creating visual content [1].
- Meitu Design Studio's user base overlaps heavily with Taobao and Tmall sellers, which aligns it strategically with the major e-commerce platforms [1].
- The arrival of AI tools in e-commerce reflects a broader shift toward automating visual content production, which is central to online retail [1].
Meitu Design Studio Integrates with Alibaba's Qianniu to Serve Taotian (Taobao and Tmall) Merchants
Xin Lang Ke Ji· 2025-09-30 07:49
Core Insights
- Meitu Design Studio has integrated with Alibaba's Qianniu backend to provide AI visual generation services for Taobao and Tmall merchants [1].
- The strategic partnership between Meitu and Alibaba was established in May and focuses on e-commerce design tools [1].
- The design studio offers product image generation, photo editing, and poster design, with plans to add features such as "AI model fitting" and a "poster editor" [1].