Core Insights
- The article covers advances in AI image generation, focusing on the Qwen model, which now generates 4 high-definition images in about 5 seconds instead of nearly one minute [1][3].

Group 1: Model Performance Improvements
- The latest open-source version of the Qwen model pushes step compression to a state-of-the-art (SOTA) level, cutting forward computation from 80-100 steps to just 2, a roughly 40-fold speedup [2].
- The DMD2 algorithm moves the distillation constraint from sample space to probability space, which addresses detail loss and improves the quality of generated images [8][10].
- The Reverse-KL loss in DMD2 lets the student model generate images on its own while still receiving guidance from the teacher model, improving detail and realism [11][12].

Group 2: Challenges and Solutions
- Traditional trajectory distillation struggles to produce high-quality images at low step counts, and its outputs are often blurry because the student does not learn enough fine detail [6][7].
- To mitigate distribution degradation, the team used PCM distillation as a "warm start", which significantly improved the model's ability to generate realistic shapes [14][17].
- Adding adversarial learning (GAN) further improved the student model's textures and details [20][26].

Group 3: Future Directions
- The team plans to keep releasing faster and more effective generative models, and notes that in complex scenarios more denoising steps may still be needed [32].
- Ongoing work will focus on developing and iterating on diffusion acceleration techniques, with an emphasis on open-source contributions to the community [33][35].
- These advances will be available on the Wuli AI platform, which aims to provide accessible creative tools for designers, content creators, and AI enthusiasts [36].
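
The Reverse-KL point in Group 1 is easier to see on a toy example. The sketch below is not the team's DMD2 implementation. It is a minimal illustration, under assumed 1-D discrete distributions, of why a reverse KL objective (averaged over the student's own samples) favors a sharp student that commits to one mode of the teacher, while a forward KL objective favors a blurry student that spreads over everything. The names `teacher`, `sharp_student`, and `blurry_student`, and all of their parameters, are hypothetical.

```python
import numpy as np

# Shared 1-D grid; every distribution below is a discrete distribution on it.
x = np.linspace(-6.0, 6.0, 241)


def gaussian_bump(mu, sigma):
    """Unnormalized Gaussian evaluated on the grid."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2)


def normalize(weights):
    """Turn non-negative weights into a probability vector."""
    weights = np.asarray(weights, dtype=float)
    return weights / weights.sum()


def kl(p, q):
    """KL(p || q) for strictly positive discrete distributions."""
    return float(np.sum(p * (np.log(p) - np.log(q))))


# Assumed toy "teacher": two well-separated modes.
teacher = normalize(gaussian_bump(-3.0, 0.5) + gaussian_bump(3.0, 0.5))

# Two hypothetical students: one sharp (covers a single mode),
# one blurry (spreads its mass across both modes and the gap between them).
sharp_student = normalize(gaussian_bump(3.0, 0.5))
blurry_student = normalize(gaussian_bump(0.0, 3.0))

# Reverse KL: expectation under the STUDENT, KL(student || teacher).
reverse_sharp = kl(sharp_student, teacher)
reverse_blurry = kl(blurry_student, teacher)

# Forward KL: expectation under the TEACHER, KL(teacher || student).
forward_sharp = kl(teacher, sharp_student)
forward_blurry = kl(teacher, blurry_student)

print(f"reverse KL  sharp={reverse_sharp:.3f}  blurry={reverse_blurry:.3f}")
print(f"forward KL  sharp={forward_sharp:.3f}  blurry={forward_blurry:.3f}")
```

In this toy setting the reverse KL ranks the sharp student above the blurry one, and the forward KL ranks them the other way round. That only illustrates the mode-seeking behavior of reverse KL. How the actual training loss is estimated for a diffusion student and teacher is not shown here.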
Four 2K images in 5 seconds! Alibaba proposes a 2-step generation scheme that maxes out the AI image-generation progress bar