BFL & Krea Open-Source a Major New Image Model Focused on Extremely Realistic Detail and Removing the "AI Look"

Core Viewpoint
- The article discusses the launch of FLUX.1-Krea, a new image model developed by Black Forest Labs (BFL) and Krea. The model aims to produce images that lack the typical "AI look" and instead emphasize natural detail and aesthetics [1].

Group 1: AI Style and Model Limitations
- AI-generated images have drawn significant criticism for their distinctive appearance: blurry backgrounds, waxy skin textures, and dull compositions, collectively referred to as the "AI style" [9].
- The pursuit of technical capability and benchmark optimization has come at the expense of the chaotic realism, stylistic diversity, and creative fusion that early image models exhibited [10].
- Many existing benchmarks primarily measure prompt adherence, such as spatial relationships and object counts, rather than aesthetic quality [12].

Group 2: Training Phases and Methodology
- Training an image generation model is divided into two phases, pre-training and post-training, and the latter is crucial to the model's final quality [17][22].
- Pre-training should emphasize "mode coverage" and "world understanding," giving the model a rich base of visual knowledge and maximizing diversity [20].
- Post-training refines the model to reduce undesirable outputs. It needs to start from a "raw" base model that has not already been heavily fine-tuned [24][26].

Group 3: Post-Training Insights
- Post-training has two stages, supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF), both centered on high-quality image datasets [28].
- Data quality matters more than quantity for effective post-training: fewer than 1 million high-quality images are sufficient [31].
- Preference data should be collected from a clear, consistent perspective, because mixing diverse aesthetic preferences can lead to suboptimal model performance [32].
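
The two data points in Group 3 (quality over quantity, and a single clear aesthetic perspective) can be illustrated with a small curation step. The following is only a minimal sketch and not the pipeline used by BFL or Krea: the `PreferencePair` record, its `style` and `quality` fields, the `curate` helper, and the 0.8 threshold are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PreferencePair:
    """One human preference judgment (hypothetical schema)."""
    prompt: str
    chosen: str     # ID of the preferred image
    rejected: str   # ID of the less-preferred image
    style: str      # assumed label for the annotator's aesthetic viewpoint
    quality: float  # assumed 0..1 quality score for the chosen image


def curate(pairs, target_style, min_quality=0.8):
    """Keep only high-quality pairs that share one aesthetic viewpoint."""
    return [
        p for p in pairs
        if p.style == target_style and p.quality >= min_quality
    ]


# Toy data: two viewpoints mixed together, plus one low-quality sample.
pairs = [
    PreferencePair("portrait at dusk", "a.png", "b.png", "film-photo", 0.93),
    PreferencePair("portrait at dusk", "c.png", "d.png", "anime", 0.95),
    PreferencePair("street market", "e.png", "f.png", "film-photo", 0.41),
    PreferencePair("street market", "g.png", "h.png", "film-photo", 0.88),
]

curated = curate(pairs, target_style="film-photo")
print([p.chosen for p in curated])
```

The filter deliberately discards a high-scoring pair from a different viewpoint, reflecting the article's claim that a smaller, opinionated dataset can serve post-training better than a larger mixed one.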