Video Generation Technology
Bona Film Group: Company Is Actively Monitoring Domestic and International Video Generation Products and Related Technologies
Zheng Quan Ri Bao Wang· 2025-10-16 09:45
Core Viewpoint
- Bona Film Group (001330) is actively monitoring the development of video generation products and related technologies, both domestically and internationally, and is exploring applications in these areas based on its business layout [1]

Group 1
- The company will disclose relevant progress in accordance with regulations through designated media on the Shenzhen Stock Exchange [1]
- Investors are encouraged to pay attention to the company's subsequent announcements and regular reports [1]
Seres (赛力斯) Obtains a Video Generation Patent
Jin Rong Jie· 2025-08-01 05:38
Core Insights
- Chengdu Seres Technology Co., Ltd. has been granted a patent for a "video generation method, device, electronic equipment, and storage medium", with authorization announcement number CN119743660B; the application was filed in March 2025 [1]

Company Overview
- Chengdu Seres Technology Co., Ltd. was established in 2021 and is located in Chengdu; it is primarily engaged in software and information technology services [1]
- The company has a registered capital of 5 million RMB [1]
- According to Tianyancha data, the company has invested in one external enterprise and holds 324 patent records, in addition to one administrative license [1]
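CVPR 2025: A Unified Evaluation Framework for Video Generation; Shanghai Jiao Tong University and Stanford Jointly Propose Letting MLLMs Score Like Humans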
量子位· 2025-06-12 08:17
Core Viewpoint
- Video generation technology is rapidly transforming visual content creation across film production, advertising design, virtual reality, and social media, making high-quality video generation models increasingly important [1].

Group 1: Video Evaluation Framework
- The Video-Bench framework evaluates AI-generated videos by simulating human cognitive processes, establishing an intelligent assessment system that connects text instructions with visual content [2].
- Video-Bench enables multimodal large language models (MLLMs) to evaluate videos much as human assessors do. It identifies defects in object consistency (0.735 correlation with human ratings) and action rationality, and it also addresses the long-standing difficulty of evaluating aesthetic quality [3].

Group 2: Innovations in Video-Bench
- Video-Bench addresses two main issues in existing video evaluation methods: the inability to capture complex dimensions such as video fluency and aesthetic performance, and the difficulty of cross-modal comparison when assessing video-condition alignment [5].
- The framework introduces two core innovations: a dual-dimensional evaluation framework covering video-condition alignment and video quality [7], and the chain-of-query and few-shot scoring techniques [8].

Group 3: Evaluation Dimensions
- The dual-dimensional framework breaks video generation quality down into "video-condition alignment" and "video quality": the first measures how accurately the generated content matches the text prompt, the second measures the visual quality of the video itself [10].
- Video-condition alignment covers object category consistency, action consistency, color consistency, scene consistency, and video-text consistency. Video quality covers imaging quality, aesthetic quality, temporal consistency, and motion quality [10].

Group 4: Performance Comparison
- Video-Bench significantly outperforms traditional evaluation methods, achieving an average Spearman correlation with human ratings of 0.733 in video-condition alignment and 0.620 in video quality [18].
- On the critical metric of object category consistency, Video-Bench shows a 56.3% improvement over the GRiT-based method, reaching a correlation of 0.735 [19].

Group 5: Robustness and Reliability
- Video-Bench's evaluation results were validated by a team of 10 experts who annotated 35,196 video samples, achieving a Krippendorff's α of 0.52, comparable to human self-assessment levels [21].
- The framework demonstrated high stability and reliability, with a TARA@3 score of 67% and a Krippendorff's α of 0.867, confirming the effectiveness of its component designs [23].

Group 6: Current Model Assessment
- Video-Bench evaluated seven mainstream video generation models and found that commercial models generally outperform open-source models, with Gen3 scoring an average of 4.38 compared to VideoCrafter2's 3.87 [25].
- The assessment highlighted weaknesses across current models in dynamic dimensions such as action rationality (average score of 2.53/3) and motion blur (3.11/5) [26].
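The Spearman correlations quoted above measure how closely an automatic metric's ranking of videos agrees with the ranking given by human raters. The following is a minimal sketch of that statistic in plain Python, not the evaluation code used by Video-Bench; the score lists are made-up illustrative values, and the helper names `average_ranks` and `spearman` are hypothetical.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman correlation: the Pearson correlation of the two rank vectors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one input is constant")
    return cov / sqrt(var_x * var_y)

# Made-up illustrative scores for five videos, NOT data from the paper.
human_scores = [5, 3, 4, 1, 2]
model_scores = [4.6, 3.1, 3.0, 1.2, 2.5]
rho = spearman(human_scores, model_scores)
print(f"Spearman correlation: {rho:.3f}")
```

A value of 1.0 means the metric orders the videos exactly as the human raters do, and a value near 0 means the two rankings are unrelated. In practice a library routine such as `scipy.stats.spearmanr` would normally be used instead of a hand-written version.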
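CVPR 2025: A Unified Evaluation Framework for Video Generation; Shanghai Jiao Tong University and Stanford Jointly Propose Letting MLLMs Score Like Humans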
量子位· 2025-06-12 08:16
Core Viewpoint
- Video generation technology is rapidly transforming visual content creation across various sectors, which makes it important for high-quality video generation models to align with human expectations [1].

Group 1: Video Evaluation Framework
- The Video-Bench framework simulates human cognitive processes to establish an intelligent evaluation system that connects text instructions with visual content [2].
- Video-Bench enables multimodal large language models (MLLMs) to evaluate videos much as human assessors do, identifying defects in object consistency (0.735 correlation with human ratings) and action rationality while also assessing aesthetic quality effectively [3][18].

Group 2: Innovations in Video Evaluation
- Video-Bench addresses two main issues in existing video evaluation methods: the inability to capture complex dimensions such as video fluency and aesthetics, and the difficulty of cross-modal comparison for video-text alignment [5].
- The framework introduces a dual-dimensional evaluation system covering video-condition alignment and video quality [7].
- Key technologies include Chain-of-Query, which resolves cross-modal alignment issues through iterative questioning, and Few-shot scoring, which quantifies subjective aesthetic judgments by comparing multiple videos [8][13].

Group 3: Comprehensive Evaluation Metrics
- Video-Bench divides video generation quality into two orthogonal dimensions, video-condition alignment and video quality, assessing both fidelity to the text prompt and the visual quality of the video itself [10].
- The evaluation framework includes metrics for object category consistency, action consistency, color consistency, scene consistency, imaging quality, aesthetic quality, temporal consistency, and motion quality [10][11].

Group 4: Performance Comparison
- Video-Bench significantly outperforms traditional methods, achieving an average Spearman correlation with human ratings of 0.733 in video-condition alignment and 0.620 in video quality [18].
- On the critical metric of object category consistency, Video-Bench shows a 56.3% improvement over GRiT-based methods, reaching a correlation of 0.735 [19].
- A reliability test with a panel of 10 experts on 35,196 video samples yielded a consistency score (Krippendorff's α) of 0.52, comparable to human self-assessment levels [21].

Group 5: Current Model Evaluations
- Video-Bench evaluated seven mainstream video generation models and found that commercial models generally outperform open-source models, with Gen3 scoring an average of 4.38 compared to VideoCrafter2's 3.87 [25].
- The evaluation highlighted weaknesses in dynamic dimensions such as action rationality (average score of 2.53/3) and motion blur (3.11/5) [26].
- Comparisons among foundation models indicated that GPT-4o typically excels in video quality and consistency scores, particularly in imaging quality (0.807) and video-text consistency (0.750) [27].
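The summary describes Chain-of-Query only as resolving cross-modal alignment "through iterative questioning". One way such a loop could look is sketched below. This is an illustration under stated assumptions, not the Video-Bench implementation: the `ask` callable, the prompt wording, the example questions, and the 1 to 5 rating scale are all hypothetical, and `stub_model` is an offline stand-in for a real MLLM, which would also receive the video frames.

```python
from typing import Callable, List, Tuple

# Hypothetical interface: takes a prompt string, returns the model's text reply.
AskFn = Callable[[str], str]

def format_history(transcript: List[Tuple[str, str]]) -> str:
    """Render earlier question/answer pairs so later queries can build on them."""
    return "\n".join(f"Q: {q}\nA: {a}" for q, a in transcript)

def chain_of_query(
    ask: AskFn, text_prompt: str, questions: List[str]
) -> Tuple[int, List[Tuple[str, str]]]:
    """Ask the questions one at a time, feeding prior answers back in,
    then request a single alignment rating based on the whole exchange."""
    transcript: List[Tuple[str, str]] = []
    for question in questions:
        query = f"Prompt: {text_prompt}\n{format_history(transcript)}\nQ: {question}"
        transcript.append((question, ask(query)))
    final_query = (
        f"Prompt: {text_prompt}\n{format_history(transcript)}\n"
        "Rate alignment from 1 to 5. Reply with one digit."  # assumed scale
    )
    return int(ask(final_query).strip()), transcript

def stub_model(prompt: str) -> str:
    """Offline stand-in for an MLLM so the sketch runs without any API."""
    return "4" if "Rate alignment" in prompt else "yes"

score, transcript = chain_of_query(
    stub_model,
    "a red car driving on a beach",
    ["Is there a car in the video?", "Is the car red?"],
)
print(score, transcript)
```

The point of the design is that each query sees the answers to the earlier ones, so the final rating rests on a step-by-step description of the video rather than a single direct comparison between text and pixels.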
Doubao Releases Video Generation Model Seedance1.0 pro
news flash· 2025-06-11 03:38
Group 1
- Doubao has launched a video generation model called Seedance1.0 pro, priced at 0.015 yuan per thousand tokens [1]
- Producing one 5-second 1080p video with this model costs approximately 3.67 yuan [1]
- Doubao has also fully launched its real-time voice model [1]
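The two prices quoted above can be related by simple arithmetic. The sketch below assumes that billing is purely per token with no fixed fee, which the news flash does not state; under that assumption, a 3.67 yuan video corresponds to roughly 245 thousand tokens. The function names are illustrative only.

```python
PRICE_PER_1K_TOKENS_YUAN = 0.015  # quoted price per thousand tokens
COST_PER_VIDEO_YUAN = 3.67        # quoted cost of one 5-second 1080p video

def implied_tokens(cost_yuan: float, price_per_1k_yuan: float) -> float:
    """Tokens consumed, assuming cost = tokens / 1000 * price (no fixed fee)."""
    return cost_yuan / price_per_1k_yuan * 1000

def cost_for_tokens(tokens: float, price_per_1k_yuan: float) -> float:
    """Inverse of implied_tokens: cost in yuan for a given token count."""
    return tokens / 1000 * price_per_1k_yuan

tokens_per_video = implied_tokens(COST_PER_VIDEO_YUAN, PRICE_PER_1K_TOKENS_YUAN)
print(f"Implied tokens per 5-second 1080p video: {tokens_per_video:,.0f}")
print(f"Implied tokens per second of video: {tokens_per_video / 5:,.0f}")
```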