A Roundup of All the ICCV Awards: Congratulations to Jun-Yan Zhu's Team on Winning Best Paper
具身智能之心· 2025-10-26 04:02
Core Insights
- The article highlights the significant presence of Chinese authors at ICCV 2025, accounting for 50% of submissions and showcasing China's growing influence in computer vision [1].

Awards and Recognitions
- The Best Paper Award (Marr Prize) went to "Generating Physically Stable and Buildable Brick Structures from Text," which introduced BrickGPT, a model that generates stable brick structures from text prompts [4][24].
- The Best Student Paper Award went to "FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models," which presents a method for editing images without requiring inversion [6][38].
- Best Paper Honorable Mentions included "Spatially-Varying Autofocus," which innovatively allows a camera to focus at different depths simultaneously [7][42].
- Best Student Paper Honorable Mentions included "RayZer: A Self-supervised Large View Synthesis Model," which recovers camera parameters on its own and synthesizes new views from uncalibrated images [9][47].

Notable Research Contributions
- BrickGPT was trained on a dataset of over 47,000 brick structures and can generate aesthetically pleasing, stable designs that can be assembled by hand or by robotic arm [24][26].
- FlowEdit uses an ordinary differential equation that maps directly between the source and target distributions, achieving state-of-the-art results without model-specific dependencies [39][40].
- "Fast R-CNN," awarded the Helmholtz Prize, significantly improved training and testing speeds while raising object-detection accuracy [10][54].
- Research on rectified activation functions introduced the Parametric ReLU (PReLU), achieving a top-5 test error of 4.94% on ImageNet and surpassing human-level performance [58][60].
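The Parametric ReLU mentioned above replaces the fixed negative slope of a Leaky ReLU with a coefficient learned jointly with the network weights. A minimal sketch of the activation itself (the per-channel parameterization and training loop are omitted):

```python
import numpy as np

def prelu(x, a):
    """Parametric ReLU: identity for positive inputs, learned slope `a`
    for negative inputs. a = 0 recovers ReLU; a fixed small constant
    recovers Leaky ReLU."""
    return np.where(x > 0, x, a * x)

x = np.array([-2.0, -0.5, 0.0, 1.0, 3.0])
print(prelu(x, 0.25))  # negative inputs are scaled by 0.25, positives pass through
```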
Awarded Teams and Individuals
- The SMPL body model team developed a highly accurate 3D human model from extensive 3D scan data, with strong compatibility with mainstream rendering pipelines [62][66].
- The VQA team created a visual question answering dataset containing approximately 250,000 images and 7.6 million questions, enabling deeper understanding of and reasoning about image content [68][69].
- David Forsyth and Michal Irani received the Distinguished Researcher Award for their contributions to computer vision and machine learning [72][75].
- Rama Chellappa received the Azriel Rosenfeld Lifetime Achievement Award for his extensive work in computer vision and pattern recognition [78].
Just In: ICCV Best Papers Announced, and Jun-Yan Zhu's Team Takes the Crown with Brick Structures
具身智能之心· 2025-10-23 00:03
Core Insights
- The article covers the recent International Conference on Computer Vision (ICCV) held in Hawaii, highlighting the award-winning papers and their contributions to computer vision [2][5][24].

Group 1: Award Winners
- The Best Paper Award went to a Carnegie Mellon University (CMU) team for "Generating Physically Stable and Buildable Brick Structures from Text," led by noted AI scholar Jun-Yan Zhu [3][7][11].
- The Best Student Paper Award went to a Technion paper, "FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models," which introduces a novel image-editing method [28][30].

Group 2: Conference Statistics
- ICCV is one of the top three computer vision conferences and is held biennially; this year it received 11,239 valid submissions and accepted 2,699 papers, a 24% acceptance rate and a significant increase over the previous conference [5].

Group 3: Research Contributions
- The CMU paper presents BrickGPT, the first method capable of generating physically stable, interconnected brick assembly models from text prompts. The work includes a large dataset of over 47,000 brick structures covering 28,000 unique 3D objects with detailed descriptions [11][13].
- The Technion's FlowEdit proposes an image-editing approach that bypasses the traditional image-to-noise inversion step, achieving higher-fidelity edits by establishing a direct mapping path between the source and target image distributions [32][34].

Group 4: Methodology and Results
- BrickGPT uses an autoregressive large language model trained on the brick-structure dataset, incorporating validity checks and a physics-aware rollback mechanism to ensure the stability of generated designs [13][19].
- Experimental results show that BrickGPT outperforms baseline models in both validity and stability, achieving a 100% validity rate and 98.8% stability in generated structures [20][22].
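The generate-check-rollback loop described above can be sketched roughly as follows. The `propose_next_brick`, `is_valid`, and `is_stable` callables are hypothetical stand-ins for the paper's LLM sampler, collision/connectivity check, and physics reasoning, so this is a structural sketch, not the authors' implementation:

```python
def generate_structure(propose_next_brick, is_valid, is_stable,
                       max_bricks=50, max_retries=10):
    """Autoregressive generate-check-rollback sketch: sample a candidate
    brick, reject it if invalid or destabilizing, and roll the structure
    back one brick when no acceptable continuation is found."""
    structure = []
    for _ in range(max_bricks):
        for _ in range(max_retries):
            brick = propose_next_brick(structure)
            if is_valid(structure, brick) and is_stable(structure + [brick]):
                structure.append(brick)
                break
        else:
            # Physics-aware rollback: undo the last brick and try again,
            # or stop if there is nothing left to undo.
            if structure:
                structure.pop()
            else:
                break
    return structure

# Toy usage: "bricks" are layer indices in a column capped at height 5.
tower = generate_structure(
    propose_next_brick=lambda s: len(s),   # always propose the next layer
    is_valid=lambda s, b: b == len(s),     # must sit directly on top
    is_stable=lambda s: len(s) <= 5,       # toy stability: a height cap
)
assert all(b == i for i, b in enumerate(tower))
```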
A Roundup of All the ICCV Awards: Congratulations to Jun-Yan Zhu's Team on Winning Best Paper
量子位· 2025-10-22 05:48
Core Points
- The ICCV 2025 conference in Hawaii highlighted significant contributions from Chinese researchers, who accounted for 50% of paper submissions [1].
- A range of prestigious awards was announced, showcasing advances in computer vision research [3].

Award Highlights
- Best Paper Award (Marr Prize): "Generating Physically Stable and Buildable Brick Structures from Text" introduced BrickGPT, a model that generates stable brick structures from text prompts, trained on a dataset of over 47,000 structures [4][24][26].
- Best Student Paper Award: "FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models" proposed an inversion-free image-editing method that achieves state-of-the-art results [6][39][40].
- Best Paper Honorable Mention: "Spatially-Varying Autofocus" developed a technique for dynamic depth adjustment in imaging, improving focus clarity across a scene [7][42][44].
- Best Student Paper Honorable Mention: "RayZer: A Self-supervised Large View Synthesis Model" demonstrated 3D perception from uncalibrated images [9][47][49].

Special Awards
- Helmholtz Prize: awarded to "Fast R-CNN" for its efficient object detection, significantly improving training and testing speeds [10][52][54].
- A second Helmholtz Prize recognized research on rectified activation functions, which surpassed human-level accuracy on ImageNet [10][59][60].
- Everingham Prize: recognized teams for their contributions to 3D modeling and visual question answering [12][63][68].
- Distinguished Researcher Award: David Forsyth and Michal Irani were honored for their impactful work in computer vision [14][73][76].
- Azriel Rosenfeld Lifetime Achievement Award: Rama Chellappa was recognized for his extensive contributions to the field [16][79].

Research Contributions
- BrickGPT was developed to generate physically stable structures, using a large dataset and novel mechanisms for ensuring stability [24][26].
- FlowEdit's approach enables seamless image editing across different model architectures, increasing flexibility in applications [39][40].
- The spatially-varying autofocus technique improves image clarity by dynamically adjusting focus according to scene depth [42][44].
- RayZer's self-supervised learning approach enables 3D scene reconstruction without calibrated camera data [47][49].

Conclusion
- ICCV 2025 showcased groundbreaking research and innovation in computer vision, with significant contributions from many teams and individuals, particularly highlighting the achievements of Chinese researchers [1][3].
Just In: ICCV Best Papers Announced, and Jun-Yan Zhu's Team Takes the Crown with Brick Structures
机器之心· 2025-10-22 03:30
Core Insights
- ICCV (the International Conference on Computer Vision) announced its Best Paper and Best Student Paper awards on October 22, 2025, highlighting significant advances in computer vision research [1][2][4].

Group 1: Best Paper
- The Best Paper Award went to a Carnegie Mellon University (CMU) team for "Generating Physically Stable and Buildable Brick Structures from Text," led by noted AI scholar Jun-Yan Zhu [6][9].
- The paper introduces BrickGPT, a novel method that generates physically stable, interconnected brick assembly models from text prompts, marking a significant advance in the field [9][11].
- To train the model, the team built a large-scale dataset of stable brick structures comprising over 47,000 models covering 28,000 unique 3D objects with detailed text descriptions [11][10].

Group 2: Methodology and Results
- The method discretizes a brick structure into a sequence of text tokens and trains a large language model to predict the next brick to add, enforcing physical stability through validity checks and a rollback mechanism [10][17].
- Experimental results show that BrickGPT achieved a 100% validity rate and a 98.8% stability rate, outperforming all baseline models in both validity and stability [20][18].
- The approach generates diverse, aesthetically pleasing brick structures that closely match the input text prompts, demonstrating high design fidelity [11][20].

Group 3: Best Student Paper
- The Best Student Paper Award went to a Technion paper, "FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models," which bypasses the traditional image-editing path to improve image fidelity [25][28].
- FlowEdit establishes a direct mapping path between the source and target image distributions, yielding lower transport cost and better preservation of the original image structure during editing [31][27].
- Validated on state-of-the-art T2I flow models, the method achieves leading results across a range of complex editing tasks, demonstrating its efficiency [31].

Group 4: Other Awards and Recognitions
- The Helmholtz Prize recognized lasting contributions to computer vision benchmarks, honoring two papers including "Fast R-CNN" by Ross Girshick, which improved detection speed and accuracy [36][38].
- The Everingham Prize recognized teams for their contributions to 3D modeling and multimodal AI, including the development of the SMPL model and the VQA dataset [41][43].
- Distinguished Researcher Awards went to David Forsyth and Michal Irani for their impactful contributions to computer vision [50][52].
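The inversion-free mapping can be sketched as an Euler integration of the *difference* between the flow model's velocity fields under the source and target prompts. Everything here is an assumption for illustration: `velocity` is a placeholder for a pretrained flow model's velocity predictor, and the noise coupling and step schedule are a simplified reading of the idea, not the paper's exact algorithm:

```python
import numpy as np

def flowedit_sketch(x_src, velocity, src_prompt, tgt_prompt,
                    n_steps=50, rng=None):
    """Inversion-free editing sketch: rather than inverting the source
    image to noise and regenerating, step the edit along the difference
    of velocities under the target vs. source prompts, tracing a direct
    path from the source distribution to the target distribution."""
    rng = rng or np.random.default_rng(0)
    z_edit = x_src.copy()                     # start from the image, not from noise
    ts = np.linspace(1.0, 0.0, n_steps + 1)
    for t, t_next in zip(ts[:-1], ts[1:]):
        noise = rng.standard_normal(x_src.shape)
        z_src = (1 - t) * x_src + t * noise   # noisy source sample at time t
        z_tgt = z_edit + (z_src - x_src)      # edit shares the same noise realization
        dv = velocity(z_tgt, t, tgt_prompt) - velocity(z_src, t, src_prompt)
        z_edit = z_edit + (t_next - t) * dv   # Euler step on the velocity difference
    return z_edit
```

One property the sketch preserves: if the source and target prompts agree, the velocity difference vanishes and the image passes through unchanged, which is the sense in which the direct path avoids the reconstruction error an inversion step would introduce.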