Surpassing VTM-RA! Kuaishou's Bi-directional Intelligent Video Codec BRHVC Debuts at NeurIPS 2025
机器之心· 2025-11-21 03:56
Core Viewpoint
- The article covers the challenges and recent advances in bi-directional video coding, focusing on BRHVC, a new method from Kuaishou's audio and video technology team that significantly improves compression performance over existing standards [2][29].

Video Coding Challenges
- Video coding resolves the conflict between massive video data and limited transmission and storage resources; uncompressed 4K video can reach 20 GB per minute [4].
- Current video coding techniques can compress video to between 1/100 and 1/1000 of its original bitrate, enabling applications such as short video, live streaming, and cloud gaming [4].

Bi-directional Coding
- Bi-directional coding (random access, or RA, mode) has long been a "secret weapon" for efficient compression, but its complex reference structures make it hard to apply in deep-learning-based intelligent video coding [2][7].
- RA mode can save over 20% of the bitrate compared with low-delay modes while maintaining high quality, which makes it well suited to on-demand and storage scenarios [7].

Key Issues in RA Mode
- Long-span motion is hard to handle: frame intervals grow exponentially across the reference hierarchy and can reach 32 frames, which makes motion significantly more complex [8].
- The two reference frames contribute unequally: the value of the information each one carries can differ significantly, which affects coding efficiency [9][11].

BRHVC Framework
- BRHVC introduces two modules, Bi-directional Motion Converge (BMC) and Bi-directional Contextual Fusion (BCF), which address long-span motion processing and reference contribution imbalance respectively [13][20].
- BMC improves motion estimation by aggregating multi-scale optical flow into a single latent variable, which raises motion compensation accuracy in large-displacement scenarios [16][17].
- BCF generates spatially adaptive weight maps that re-weight reference features by importance, which handles occlusion in long-span frames effectively [20][22].

Experimental Results
- BRHVC achieves an average bitrate saving of 32.0% compared with traditional encoders such as VTM-LDB, with a peak saving of 44.7% on Class D sequences [25].
- It also surpasses the VTM-RA encoder in coding efficiency, demonstrating its effectiveness for bi-directional intelligent video compression [25].

Conclusion
- The research identifies the core challenges in bi-directional intelligent video compression and presents BRHVC as a significant advance that offers a new direction for intelligent video coding [29].
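
The data-volume figures under Video Coding Challenges can be sanity-checked with a little arithmetic. The sketch below is only an illustration: the article does not state the resolution, frame rate, or pixel format behind its "20 GB per minute" figure, so 3840x2160 at 30 fps in 8-bit YUV 4:2:0 is an assumption, and `raw_bytes_per_minute` is a hypothetical helper name.

```python
def raw_bytes_per_minute(width, height, fps, bytes_per_pixel):
    """Size in bytes of one minute of uncompressed video."""
    return width * height * bytes_per_pixel * fps * 60


# Assumed format (not stated in the article): 3840x2160, 30 fps,
# 8-bit YUV 4:2:0, which averages 1.5 bytes per pixel.
raw = raw_bytes_per_minute(3840, 2160, 30, 1.5)
print(f"raw: {raw / 1e9:.1f} GB per minute")

# Compression to 1/100 and 1/1000 of the raw size, as quoted in the article.
for ratio in (100, 1000):
    print(f"at 1/{ratio}: {raw / ratio / 1e6:.1f} MB per minute")
```

Under these assumptions the raw stream comes to about 22.4 GB per minute, the same order of magnitude as the article's figure, and the quoted compression range brings it down to roughly 224 MB and 22 MB per minute.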
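
The exponential growth of frame intervals noted under Key Issues in RA Mode can be made concrete with a small sketch of a dyadic hierarchical-B layout. This is a simplified illustration and not the reference structure used by VTM or BRHVC: it assumes a group of 32 frames whose two end frames are already coded, and it reads the "32 frames" in the article as the gap between the two references of the first bi-directional frame. The function name `hierarchical_gop` is hypothetical.

```python
def hierarchical_gop(gop_size=32):
    """Return (frame, past_ref, future_ref, level) tuples in coding order
    for a simplified dyadic hierarchical-B structure (an assumed layout).

    Frames 0 and gop_size are taken as already coded; every other frame
    is predicted from the two nearest coded frames on either side.
    """
    if gop_size < 2 or gop_size & (gop_size - 1):
        raise ValueError("gop_size must be a power of two >= 2")
    order = []
    segments = [(0, gop_size)]
    level = 1
    # Breadth-first: the midpoint of the whole group is coded first,
    # then the quarter points, and so on down to adjacent frames.
    while segments:
        next_segments = []
        for left, right in segments:
            if right - left < 2:
                continue  # no frame left between these two references
            mid = (left + right) // 2
            order.append((mid, left, right, level))
            next_segments += [(left, mid), (mid, right)]
        segments = next_segments
        level += 1
    return order


for frame, past, future, level in hierarchical_gop(32)[:7]:
    span = future - past  # gap between the two reference frames
    print(f"level {level}: frame {frame:2d} <- refs ({past}, {future}), span {span}")
```

With `gop_size=32` the span between the two references halves at each level (32, 16, 8, 4, 2), so the frames coded first are the ones that face the largest motion between their references.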
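
The idea behind BCF, re-weighting two reference features with spatially adaptive weight maps, can be sketched in a few lines. This is not the BRHVC implementation: the real module is a learned network whose details the article does not give. Here the per-pixel importance scores are simply passed in, the softmax normalisation and the weighted sum are assumptions made for illustration, and `fuse_references` and `softmax_pair` are hypothetical names.

```python
import math
from typing import List, Tuple

Map2D = List[List[float]]  # single-channel H x W map, for simplicity


def softmax_pair(a: float, b: float) -> Tuple[float, float]:
    """Numerically stable softmax over two scores."""
    m = max(a, b)
    ea, eb = math.exp(a - m), math.exp(b - m)
    return ea / (ea + eb), eb / (ea + eb)


def fuse_references(feat_past: Map2D, feat_future: Map2D,
                    score_past: Map2D, score_future: Map2D) -> Tuple[Map2D, Map2D]:
    """Blend two reference features with per-pixel weights.

    Assumption: the weights come from a softmax over the two importance
    scores at each position, so they sum to 1. In a learned codec the
    scores would be predicted by a network; here they are plain inputs.
    Returns the fused map and the weight map of the past reference.
    """
    fused, weight_past = [], []
    rows = zip(feat_past, feat_future, score_past, score_future)
    for fp_row, ff_row, sp_row, sf_row in rows:
        fused_row, w_row = [], []
        for fp, ff, sp, sf in zip(fp_row, ff_row, sp_row, sf_row):
            wp, wf = softmax_pair(sp, sf)
            fused_row.append(wp * fp + wf * ff)
            w_row.append(wp)
        fused.append(fused_row)
        weight_past.append(w_row)
    return fused, weight_past


# Left pixel: both references equally useful. Right pixel: the future
# reference is treated as occluded, so its score is very low.
fused, w_past = fuse_references(
    feat_past=[[1.0, 1.0]], feat_future=[[3.0, 3.0]],
    score_past=[[0.0, 10.0]], score_future=[[0.0, -10.0]],
)
print(fused, w_past)
```

In the example, the left pixel averages the two references, while the right pixel relies almost entirely on the past reference. That is the behaviour one would want where the future frame is occluded.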