Vidu Q2 Reference Pro (参考生 Pro)
Tencent Research Institute AI Express 20260129
Tencent Research Institute · 2026-01-28 16:03
Group 1: OpenAI Developments
- OpenAI launched Prism, a cloud-based LaTeX workspace powered by GPT-5.2 that integrates drafting, editing, collaboration, and publishing, and can read the overall structure and context of a paper [1]
- Prism offers intelligent literature search, sketch-to-LaTeX conversion, and voice editing; it allows unlimited collaborators and is free for all ChatGPT users [1]
- OpenAI anticipates that AI will transform software development by 2025 and the scientific field by 2026, and positions Prism as a pioneer in accelerating scientific discovery [1]

Group 2: Google AI Plus Initiative
- Google officially launched the AI Plus plan globally, priced at $7.99 per month in the U.S. with a 50% discount for the first two months, targeting budget-conscious users [2]
- The plan includes access to Gemini 3 Pro, Flow video creation, NotebookLM research assistance, and 200GB of cloud storage, and can be shared with up to six family members [2]
- Existing Google One Premium 2TB users will automatically receive all AI Plus benefits; the move is seen as a direct response to OpenAI's ChatGPT Go [2]

Group 3: Clawdbot Rebranding
- The open-source project Clawdbot was forced to rebrand as Moltbot after trademark infringement claims from Anthropic, with the developers humorously noting "same lobster spirit, new shell" [3]
- During the rebranding, a GitHub issue led to the old ID being seized by cryptocurrency scammers for blockchain fraud, prompting the author to clarify that no tokens were ever issued [3]
- The author also advised that "most non-technical users should not install this," as the project is still in its early stages and poses security risks [3]

Group 4: Tencent's HunyuanImage 3.0
- Tencent has open-sourced HunyuanImage 3.0, a state-of-the-art image generation model built on an 80-billion-parameter mixture-of-experts architecture; it ranks seventh globally on the LMArena image editing leaderboard [4]
- The model follows a "think before edit" workflow and supports diverse editing capabilities such as addition, deletion, style transformation, and old photo restoration [4]
- Training involved constructing a dataset of millions of image generation examples spanning more than 80 task types, and used a proprietary MixGRPO algorithm to align the model with user preferences [4]

Group 5: Kunlun Tiangong's Mureka V8
- Kunlun Tiangong released the Mureka V8 music model, which uses MusiCoT technology to improve musicality, arrangement completeness, and vocal expression, moving from "generable" to "publishable" [5][6]
- V8 surpassed Suno in subjective scoring for Chinese song generation, and Kunlun Tiangong has formed a strategic partnership with Taihe Music Group to bring AI music into mainstream production and distribution [6]
- The platform has served over 8,000 global clients and plans to ship 2-3 versions annually, aiming to become the leading platform in the global AI music sector [6]

Group 6: Vidu's Q2 Reference Model
- Vidu launched the Q2 Reference Pro model, built around an "everything can be referenced" capability that supports six types of references: effects, expressions, textures, actions, characters, and scenes [7]
- The model enables fine-grained video editing, letting users add, delete, modify, and replace any element, with one-click switching between realistic and animated styles [7]
- Users can create special-effects films without learning professional tools such as C4D or AE, which accelerates the production of AI-driven short dramas [7]

Group 7: Ant Group's LingBot-VLA
- Ant Group released LingBot-VLA, an embodied-intelligence base model trained on approximately 20,000 hours of real data covering nine dual-arm robot configurations; it outperforms Pi0.5 on the GM-100 benchmark [8]
- The model uses a Mixture-of-Transformers architecture and integrates visual distillation to achieve strong generalization across different robot embodiments and scenes [8]
- The research revealed a scaling law for the VLA model: performance kept improving as data expanded from 3,000 to 20,000 hours, without saturating [8]

Group 8: Establishment of the Interstellar Navigation Academy
- The Interstellar Navigation Academy was officially established at the Chinese Academy of Sciences, with Academician Zhu Junqiang as director, aiming to build a curriculum system covering 14 primary disciplines [9]
- The academy will introduce 22 core courses focusing on cutting-edge topics such as interstellar dynamics and governance, along with six specialized teaching practice platforms [9]
- The initiative is positioned as a key measure to seize the technological high ground, providing talent for national deep space exploration and space science research [9]

Group 9: OpenAI's CEO Acknowledgment
- OpenAI's CEO acknowledged at a developer meeting that GPT-5.2 sacrificed writing capability for improved reasoning and coding, stating "we messed up," and plans to address this in future versions [10]
- The CEO predicted that by the end of 2027 the cost of GPT-5.2-level intelligence will fall at least 100-fold, leading to personalized app versions for everyone [10]
- He emphasized that the most important skills in the AI era will be high adaptability and the ability to generate ideas, noting that while the definition of an engineer may change, demand will remain [10]

Group 10: AI for Science Competition
- OpenAI Vice President Kevin Weil stated that GPT-5's reasoning capabilities have reached the forefront of human performance, scoring 92% on the GPQA doctoral-level test against GPT-4's 39% [11]
- Weil believes the greatest value of large language models lies in discovering interdisciplinary connections and forgotten research, and he is exploring ways to instill "cognitive humility" and self-fact-checking abilities in models [11]
- He predicts that 2026 will be a pivotal year for AI-enabled research and warns that researchers who do not make deep use of AI tools will miss opportunities to improve their efficiency [11]
What is it like when everything can be referenced? Vidu Q2 Reference Pro: effects, acting, and detail, all at once
Jiqizhixin (机器之心) · 2026-01-28 04:59
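Editor | +0

Recently, a then-and-now comparison video of "Will Smith eating spaghetti" swept social media and prompted a flood of reactions. Two years ago, fledgling AI video was a byword for surreal, glitchy absurdity, with facial features flying around and logic falling apart. Just two years later, with the same subject rendered again, from the pull of the muscles during swallowing to the subtle play of light and shadow across the face, AI has evolved to a lifelike level of genuine intelligence.

These two years condense the sweeping technical leap of the AI video generation industry. Yet the industry has not stopped at an arms race over image quality. With vendors now competing for the high ground of "controllability", AI video stands at a key turning point: from solving "can it be done" to pursuing "how well is it done".

Looking back at Vidu's evolution: in September 2025, Vidu Q2 made its global debut and impressed with its image-to-video and reference-to-video capabilities; in December, the Q2 "image generation suite" went live and exceeded 500,000 uses on its first day, confirming the market's appetite for high-quality generation.

Yesterday, Vidu Q2 Reference Pro was officially released. Visit Vidu.cn or the Vidu API at platform.vidu.cn to try the latest features.

In just a few months, it has closed the loop from "generation" to "editing" and launched the world's first "everything can be referenced" video model, extending reference modalities from static images to dynamic video and multi-dimensional elements. Its new slogan, " ...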
[Pacific Tech (太平洋科技): Daily Views & News] (2026-01-28)
Yuanfeng Electronics (远峰电子) · 2026-01-27 13:06
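Domestic News / Part 02
① 半导体投资联盟 (Semiconductor Investment Alliance): Montage Technology (澜起科技) announced that it is the first in China to launch a high-performance active electrical cable (AEC) solution based on the PCIe 6.x/CXL 3.x standards. The solution targets data centers evolving from single-rack to complex multi-rack architectures, uses Montage's in-house PCIe 6.x/CXL 3.x Retimer chip, and aims to provide high-bandwidth, low-latency interconnect for large-scale data centers and high-performance server platforms.
② 半导体芯闻 (Semiconductor News): Digital terminals with edge inference capability will grow rapidly and become an important driver of expansion for China's semiconductor industry, especially in mature process technologies. The latest data from Q4 2025 show that China's semiconductor market is expected to grow 31.26% in 2026, reaching USD 546.5 billion.
③ 大话芯片: Goke Microelectronics (国科微) announced price adjustments across its full range of solid-state storage chips, SSD controller chips, and supporting storage modules, with increases of 20% to 80%. Enterprise SSDs and high-end DDR-compatible products saw the largest increases, up to 80%.

Market Snapshot / Part 01
① Major indices: STAR 50 (+1.51%) / ChiNext Index (+0.71%) / Shanghai Composite (+0.18%) / Shenzhen Component (+0.09%) / BSE 50 (-0.05%)
② Leading TMT sectors: SW Discrete Devices (+5.70% ...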