Video Generation Technology
Seres obtains a patent related to video generation
Jin Rong Jie· 2025-08-01 05:38
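According to Tianyancha data, Chengdu Seres Technology Co., Ltd. was established in 2021, is located in Chengdu, and is primarily engaged in software and information technology services. The company has a registered capital of 5 million RMB. Tianyancha's big-data analysis shows that Chengdu Seres Technology Co., Ltd. has invested in 1 enterprise and has 324 patent records; the company also holds 1 administrative license. Jin Rong Jie, August 1, 2025: information from the China National Intellectual Property Administration shows that Chengdu Seres Technology Co., Ltd. has obtained a patent titled "A video generation method, apparatus, electronic device, and storage medium", with grant publication number CN119743660B and an application date of March 2025. ...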
CVPR 2025 unified evaluation framework for video generation: SJTU and Stanford jointly propose letting MLLMs score like humans
QbitAI (量子位) · 2025-06-12 08:17
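The Video-Bench video evaluation framework simulates the human cognitive process to build an intelligent evaluation system that connects text instructions with visual content. Put simply, it lets multimodal large language models (MLLMs) "evaluate videos the way humans do."

Experimental results show that Video-Bench not only accurately identifies defects in generated videos along dimensions such as object consistency (0.735 correlation) and action rationality, but also stably evaluates traditionally difficult aspects such as aesthetic quality, significantly outperforming existing evaluation methods.

Contributed by the Video-Bench team
QbitAI | WeChat official account QbitAI

Video generation technology is transforming visual content creation at an unprecedented pace. From film production to advertising design, and from virtual reality to social media, high-quality video generation models that meet human expectations are becoming increasingly important.

So how can we evaluate whether AI-generated videos match human aesthetics and needs?

The Video-Bench research team comes from Shanghai Jiao Tong University, Stanford University, Carnegie Mellon University, and other institutions.

Video-Bench: an MLLM-based automated video evaluation framework

When examining existing video evaluation methods, the Video-Bench team identified two problems:

1. Simple scoring rules often fail to capture complex dimensions such as video fluency and aesthetic performance. So when judging "video quality," how can the fuzzy, "intuitive" impressions of humans be turned into quantifiable evaluation metrics?

2. Existing ones based on large language ...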
CVPR 2025 unified evaluation framework for video generation: SJTU and Stanford jointly propose letting MLLMs score like humans
QbitAI (量子位) · 2025-06-12 08:16
Core Viewpoint
- Video generation technology is rapidly transforming visual content creation across various sectors, emphasizing the importance of high-quality video generation models that align with human expectations [1].

Group 1: Video Evaluation Framework
- The Video-Bench framework simulates human cognitive processes to establish an intelligent evaluation system that connects text instructions with visual content [2].
- Video-Bench enables multimodal large models (MLLM) to evaluate videos similarly to human assessments, identifying defects in object consistency (0.735 correlation) and action rationality, while also effectively assessing aesthetic quality [3][18].

Group 2: Innovations in Video Evaluation
- Video-Bench addresses two main issues in existing video evaluation methods: the inability to capture complex dimensions like video fluency and aesthetics, and the challenges in cross-modal comparisons for video-text alignment [5].
- The framework introduces a dual-dimensional evaluation system covering video-condition alignment and video quality [7].
- Key technologies include Chain-of-Query, which resolves cross-modal alignment issues through iterative questioning, and Few-shot scoring, which quantifies subjective aesthetic judgments by comparing multiple videos [8][13].

Group 3: Comprehensive Evaluation Metrics
- Video-Bench dissects video generation quality into two orthogonal dimensions: video-condition alignment and video quality, assessing both the fidelity to text prompts and the visual quality of the video itself [10].
- The evaluation framework includes metrics for object category consistency, action consistency, color consistency, scene consistency, imaging quality, aesthetic quality, temporal consistency, and motion quality [10][11].

Group 4: Performance Comparison
- Video-Bench significantly outperforms traditional methods, achieving an average Spearman correlation of 0.733 in video-condition alignment and 0.620 in video quality [18].
- In the critical metric of object category consistency, Video-Bench shows a 56.3% improvement over GRiT methods, reaching a correlation of 0.735 [19].
- A reliability test with a panel of 10 experts on 35,196 video samples yielded a consistency score (Krippendorff's α) of 0.52, comparable to human self-assessment levels [21].

Group 5: Current Model Evaluations
- Video-Bench evaluated seven mainstream video generation models, revealing that commercial models generally outperform open-source models, with Gen3 scoring an average of 4.38 compared to VideoCrafter2's 3.87 [25].
- The evaluation highlighted weaknesses in dynamic dimensions such as action rationality (average score of 2.53/3) and motion blur (3.11/5) [26].
- Comparisons among foundational models indicated that GPT-4o typically excels in video quality and consistency scores, particularly in imaging quality (0.807) and video-text consistency (0.750) [27].
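
The summary reports agreement with human judgment as a Spearman correlation. As a rough illustration of what that number measures, the sketch below computes Spearman's rank correlation between automated scores and human scores for the same set of videos. The function names and the sample scores are hypothetical and are not taken from Video-Bench; the article does not describe how the authors computed their figures.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one input is constant")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical scores for five videos (illustrative only, not real data).
human_scores = [5, 3, 4, 1, 2]
model_scores = [4, 3, 5, 1, 2]
rho = spearman(human_scores, model_scores)
```

A value near 1 means the automated evaluator ranks videos in almost the same order as the human raters; a value near 0 means the two rankings are unrelated.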
Doubao releases video generation model Seedance 1.0 pro
news flash· 2025-06-11 03:38
Group 1
- Doubao has launched a video generation model called Seedance 1.0 pro, priced at 0.015 yuan per thousand tokens [1].
- Producing one 5-second 1080p video with this model costs approximately 3.67 yuan [1].
- Additionally, Doubao has fully launched its real-time voice model [1].
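
The two quoted figures can be related with simple arithmetic. The sketch below assumes that cost scales linearly with token count and that the 3.67 yuan figure follows from the per-token price alone; neither assumption is stated in the source, and the function names are hypothetical.

```python
PRICE_PER_1K_TOKENS_YUAN = 0.015  # quoted price per thousand tokens
COST_PER_VIDEO_YUAN = 3.67        # quoted cost of one 5-second 1080p video
VIDEO_SECONDS = 5


def implied_tokens(cost_yuan, price_per_1k):
    """Token count implied by a total cost, assuming purely linear pricing."""
    return cost_yuan / price_per_1k * 1000


def cost_for_tokens(tokens, price_per_1k):
    """Cost in yuan for a given token count under the same assumption."""
    return tokens / 1000 * price_per_1k


tokens_per_video = implied_tokens(COST_PER_VIDEO_YUAN, PRICE_PER_1K_TOKENS_YUAN)
tokens_per_second = tokens_per_video / VIDEO_SECONDS
print(f"about {tokens_per_video:,.0f} tokens per video")
print(f"about {tokens_per_second:,.0f} tokens per second of video")
```

Under these assumptions, a 5-second 1080p clip corresponds to roughly 245,000 tokens, or about 49,000 tokens per second of video.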