Core Insights
- Alibaba's Intelligent Engine team has sharply accelerated image generation with the Qwen model: producing four 2K HD images now takes about 5 seconds instead of nearly one minute, a reported 40-fold speedup [1][2].

Group 1: Technological Advancements
- The team has released the updated model checkpoints on the HuggingFace and ModelScope platforms, where developers can download and try them [3][4].
- Traditional trajectory distillation methods struggle to produce high-quality images in only a few sampling steps; the student fails to learn fine detail, so outputs are often blurry [5][6].
- Recent distillation methods that operate in probability space, in particular the DMD2 algorithm, preserve detail while reducing the number of steps required for image generation [6][7].

Group 2: Methodology Improvements
- DMD2 moves the constraint from sample space to probability space: rather than matching individual teacher outputs, the student is corrected by the teacher model's signal about where its own outputs go wrong, which improves the detail and quality of generated images [10][11].
- To address mode collapse and an overly sharp output distribution, the team warm-started training with PCM distillation, which made the generated images more realistic [12][14].
- Adding adversarial learning (GAN) further improved detail and realism. Strategies such as mixing real data with generated images and adjusting loss weights were used to stabilize training [22][24].

Group 3: Future Directions
- The Wuli-Qwen-Image-Turbo model is expected to keep evolving, with faster and more effective generation models planned for future release [26].
- The team emphasizes its commitment to open-source culture: it has previously contributed various projects and aims to work with the open-source community on better creative tools [26][27].
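To make the shift from sample space to probability space described in Group 2 more concrete, here is a minimal one-dimensional toy sketch. It is not the team's implementation and not the actual DMD2 training code. It assumes the "real" distribution and the student's output distribution are both Gaussians with known scores, so the two score functions can be written in closed form; in DMD-style methods those scores are instead estimated by diffusion networks. All names (`real_score`, `fake_score`, `train_student`) and all default values are hypothetical.

```python
import random


def real_score(x, mu_real, sigma):
    """Score (gradient of log density) of the target N(mu_real, sigma^2).

    Assumption: stands in for the teacher's score estimate."""
    return -(x - mu_real) / sigma ** 2


def fake_score(x, mu_fake, sigma):
    """Score of the student's current output distribution N(mu_fake, sigma^2).

    Assumption: computed in closed form here; in practice it would be
    estimated by a separately trained network."""
    return -(x - mu_fake) / sigma ** 2


def train_student(mu_real=3.0, sigma=1.0, theta=-2.0, lr=0.1,
                  steps=200, batch=64, seed=0):
    """Toy one-step generator x = theta + sigma * z, with z ~ N(0, 1).

    The update follows the distribution-matching direction
    (fake_score - real_score) * dx/dtheta, averaged over a batch.
    No per-sample teacher output is ever used as a regression target."""
    rng = random.Random(seed)
    for _ in range(steps):
        grad = 0.0
        for _ in range(batch):
            x = theta + sigma * rng.gauss(0.0, 1.0)  # single generation step
            # dx/dtheta == 1 for this generator, so the factor is omitted.
            grad += fake_score(x, theta, sigma) - real_score(x, mu_real, sigma)
        theta -= lr * grad / batch
    return theta


if __name__ == "__main__":
    print("learned mean:", train_student())
```

In this toy setting the student's mean moves toward the target mean because the two distributions are compared through their scores, not because generated samples are paired with teacher samples. The warm start and the GAN loss mentioned above are not modeled here.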
Four 2K images in 5 seconds! Alibaba proposes a 2-step generation scheme that maxes out the AI image generation progress bar