Core Viewpoint
- In 2025, AI-generated content (AIGC) has become part of everyday creative work, from social avatars and e-commerce posters to film storyboards. General-purpose models such as Nano Banana and Qwen Edit have shown strong image editing capabilities, and the popular Nano Banana Pro in particular converts text instructions into high-precision images. However, these models still fall short in specific niches and may not be cost-effective for simple tasks [1].

Group 1: Image Composition Research
- The Niu Li team from Shanghai Jiao Tong University has worked on image composition since late 2018, focusing on object insertion, which the AIGC community commonly calls "fusion". Their work targets common problems in composite images: jagged edges, inconsistent lighting, missing shadows and reflections, and improper perspective [1][2].
- From 2018 to 2025, the team built over 10 datasets, developed more than 30 original models, and published over 25 high-quality academic papers. By the end of 2023, they had launched the Libcom toolbox, which offers out-of-the-box image composition without training or fine-tuning [2].

Group 2: Libcom Toolbox Features
- The Libcom toolbox gets a comprehensive upgrade in 2025, introducing a user-friendly image composition workstation. The workstation offers 12 functions spanning generation, detection, and evaluation, which sets it apart from general image editing models [2][5].
- After registering, users can open the workstation interface and read a detailed description of each function. The 12 functions fall into six groups [5]:
  1. Basic Composition: alpha blending, Poisson blending
  2. Image Harmonization: color transfer, image harmonization, artistic image harmonization
  3. Background Effect Generation: shadow generation, reflection generation
  4. Analysis Tools: disharmony area detection, object placement rationality heatmap
  5. Scoring Tools: harmony score, object placement rationality score
  6. Advanced Composition: integrates the FLUX-Kontext and InsertAnything models

Group 3: Performance Comparison
- A hands-on test using the character Labubu compared the Libcom workstation with Nano Banana Pro. Across several scenarios, Libcom integrated Labubu into different backgrounds effectively, while Nano Banana Pro produced inconsistent results [7][14].
- For instance, when assessing whether the lighting on Labubu matched a forest background, Libcom gave a harmony score of 0.391, indicating poor harmony. Nano Banana Pro returned 0.24, which points to the same conclusion although the two values differ [17][18].
- In artistic scenarios, Libcom allowed more creative adjustments, while Nano Banana Pro took a more conservative approach. Both varied in how well they generated shadows and reflections, with Libcom generally producing more accurate results [20][26][27].
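For readers unfamiliar with the "Basic Composition" group listed under Group 2, alpha blending is the simplest form of object insertion: each output pixel is a weighted mix of foreground and background, with the weight taken from a per-pixel opacity mask. The snippet below is a minimal sketch of that textbook formula in plain Python. It is not Libcom's implementation or API; the function name `alpha_blend` and the list-of-tuples image representation are assumptions made for illustration only.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each channel in 0..255
Image = List[List[Pixel]]      # rows of pixels
Mask = List[List[float]]       # per-pixel opacity in [0, 1]


def alpha_blend(foreground: Image, background: Image, alpha: Mask) -> Image:
    """Composite `foreground` over `background`, pixel by pixel.

    Hypothetical helper for illustration (not part of Libcom).
    Each output channel is alpha * fg + (1 - alpha) * bg.
    All three inputs must share the same height and width.
    """
    if not (len(foreground) == len(background) == len(alpha)):
        raise ValueError("inputs must have the same height")

    result: Image = []
    for fg_row, bg_row, a_row in zip(foreground, background, alpha):
        if not (len(fg_row) == len(bg_row) == len(a_row)):
            raise ValueError("inputs must have the same width")
        out_row: List[Pixel] = []
        for fg_px, bg_px, a in zip(fg_row, bg_row, a_row):
            a = min(1.0, max(0.0, a))  # clamp opacity to [0, 1]
            out_row.append(
                tuple(round(a * f + (1.0 - a) * b) for f, b in zip(fg_px, bg_px))
            )
        result.append(out_row)
    return result


if __name__ == "__main__":
    # One-row image: fully opaque, fully transparent, and half-transparent pixels.
    fg = [[(200, 100, 0), (200, 100, 0), (200, 100, 0)]]
    bg = [[(0, 100, 200), (0, 100, 200), (0, 100, 200)]]
    print(alpha_blend(fg, bg, [[1.0, 0.0, 0.5]]))
```

A hard-edged mask (only 0 and 1) produces the jagged edges mentioned under Group 1, and nothing in this formula adjusts lighting or adds shadows, which is why the toolbox pairs basic composition with harmonization and background effect generation.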
Sparring with Banana Pro: the homegrown Libcom image composition workstation sets off on a Labubu adventure
机器之心 (Jiqizhixin) · 2025-11-25 04:09