Edit Your Own Photos This Spring Festival: Xiaohongshu Open-Sources a New SOTA Image Editing Model

Core Viewpoint
- The article highlights the launch of Xiaohongshu's foundational model FireRed-Image-Edit, which demonstrates exceptional capabilities in AI image generation and editing, achieving state-of-the-art (SOTA) performance in various benchmarks [2][3].

Group 1: Performance and Evaluation
- FireRed-Image-Edit excels in handling complex editing instructions, style transfers, and high-precision text editing, showcasing superior understanding and efficiency compared to competitors [3][4].
- The model's performance is validated through a newly introduced evaluation framework called RedEdit Bench, which includes 15 sub-tasks covering real-world editing scenarios such as portrait beautification and low-quality enhancement [9][10].
- The RedEdit Bench will be open-sourced to establish a new standard for evaluating image editing models in the open-source community [11].

Group 2: Technical Foundation
- The model's architecture is supported by a robust data engine and a three-phase training process, which includes pre-training, fine-tuning, and reinforcement learning stages to enhance its capabilities [13][16].
- The data engine efficiently generates training data by breaking down complex editing tasks into manageable sub-tasks, ensuring high-quality data through a rigorous cleaning process [14].

Group 3: Core Capabilities
- FireRed-Image-Edit features advanced instruction adherence, allowing it to understand the semantic relationship between commands and images rather than relying on rote memorization [20].
- The model introduces a Layout-Aware OCR-based Reward system during the reinforcement learning phase, improving text editing accuracy by penalizing errors in character placement and layout [26][27].
- It supports creative scene generation and multi-reference image generation, enabling style transfer and image fusion capabilities [33].
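
The article does not give the formula behind the Layout-Aware OCR-based Reward mentioned in Group 3, so the following is only a minimal sketch of the general idea, not FireRed-Image-Edit's actual implementation. It assumes an OCR system has already returned text boxes for the edited image, and it scores each intended text region by combining string similarity with bounding-box overlap (IoU). The names `TextBox`, `layout_aware_reward`, and `layout_weight`, as well as the equal weighting, are hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1), hypothetical convention


@dataclass(frozen=True)
class TextBox:
    """A piece of text and where it sits in the image (illustrative type)."""
    text: str
    box: Box


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def text_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] using the standard library."""
    return SequenceMatcher(None, a, b).ratio()


def layout_aware_reward(
    targets: List[TextBox],
    detections: List[TextBox],
    layout_weight: float = 0.5,  # assumed trade-off between content and placement
) -> float:
    """Average, over target regions, of the best-matching OCR detection.

    A detection scores highly only if it has both the right characters
    and the right position, so misplaced or misspelled text is penalized.
    """
    if not targets:
        return 0.0
    total = 0.0
    for target in targets:
        best = 0.0
        for det in detections:
            score = (
                (1.0 - layout_weight) * text_similarity(target.text, det.text)
                + layout_weight * iou(target.box, det.box)
            )
            best = max(best, score)
        total += best
    return total / len(targets)
```

Under these assumptions, correctly spelled text rendered in the wrong place earns only the content share of the score, and text in the right place with wrong characters is penalized in proportion to the mismatch. A production reward would likely need a real OCR backend and one-to-one matching between regions, neither of which is shown here.
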

Group 4: Future Developments
- Xiaohongshu plans to further enhance the foundational model's capabilities in portrait beautification, consistency, and text editing, with ongoing updates and open-source releases in the coming months [49].