Vidu Q2 Reference Pro
Tencent Research Institute AI Express 20260129
Tencent Research Institute · 2026-01-28 16:03
Group 1: OpenAI Developments
- OpenAI launched Prism, a cloud-based LaTeX workspace powered by GPT-5.2 that integrates drafting, editing, collaboration, and publishing and can read the overall structure and context of a paper [1]
- Prism offers intelligent literature search, sketch-to-LaTeX conversion, and voice editing, allows unlimited collaborators, and is free for all ChatGPT users [1]
- OpenAI anticipates that AI will transform software development by 2025 and the scientific field by 2026, positioning Prism as a pioneer in accelerating scientific discovery [1]

Group 2: Google AI Plus Initiative
- Google officially launched the AI Plus plan globally, priced at $7.99 per month in the U.S. with a 50% discount for the first two months, targeting budget-conscious users [2]
- The plan includes access to Gemini 3 Pro, Flow video creation, NotebookLM research assistance, and 200GB of cloud storage, and supports up to six family members [2]
- Existing Google One Premium 2TB users will automatically receive all AI Plus benefits; the plan is seen as a direct response to OpenAI's ChatGPT Go [2]

Group 3: Clawdbot Rebranding
- The open-source project Clawdbot was forced to rebrand as Moltbot after trademark infringement claims from Anthropic, with developers humorously noting "same lobster spirit, new shell" [3]
- During the rebranding, a GitHub issue led to the old ID being seized by cryptocurrency scammers for blockchain fraud, prompting the author to clarify that no tokens were ever issued [3]
- The author also advised that "most non-technical users should not install this," as the project is still in its early stages and poses security risks [3]

Group 4: Tencent's Hunyuan Image 3.0
- Tencent has open-sourced Hunyuan Image 3.0, a state-of-the-art image generation model built on an 80-billion-parameter mixture-of-experts architecture and ranked seventh globally on the LMArena image editing leaderboard [4]
- The model employs a "think before edit" workflow and supports diverse editing capabilities such as addition, deletion, style transformation, and old photo restoration [4]
- Training involved constructing a dataset of millions of image generation examples spanning over 80 tasks and using a proprietary MixGRPO algorithm to align with user preferences [4]

Group 5: Kunlun Tiangong's Mureka V8
- Kunlun Tiangong released the Mureka V8 music model, which uses MusiCoT technology to improve musicality, arrangement completeness, and vocal expression, moving output from "generable" to "publishable" [5][6]
- The V8 model surpassed Suno in subjective scoring for Chinese song generation, and the company has formed a strategic partnership with Taihe Music Group to integrate AI music into mainstream production and distribution [6]
- The platform has served over 8,000 global clients and plans to iterate 2-3 versions annually, aiming to become the leading platform in the global AI music sector [6]

Group 6: Vidu's Q2 Reference Model
- Vidu launched the Q2 Reference Pro model, featuring an "everything can be referenced" capability that supports six types of references: effects, expressions, textures, actions, characters, and scenes [7]
- The model enables fine-grained video editing, allowing users to add, delete, modify, and replace any element, with one-click switching between real and animated styles [7]
- Users can create special-effects films without learning professional tools such as C4D or AE, accelerating the production of AI-driven short dramas [7]

Group 7: Ant Group's LingBot-VLA
- Ant Group released LingBot-VLA, an embodied-intelligence base model trained on approximately 20,000 hours of real data covering nine dual-arm robot configurations, outperforming Pi0.5 on the GM-100 benchmark [8]
- The model uses a Mixture-of-Transformers architecture and integrates visual distillation to achieve strong generalization across different embodiments and scenes [8]
- The research revealed a scaling law for the VLA model: performance improved continuously, without saturating, as data expanded from 3,000 to 20,000 hours [8]

Group 8: Establishment of the Interstellar Navigation Academy
- The Interstellar Navigation Academy was officially established at the Chinese Academy of Sciences, with Academician Zhu Junqiang as director, aiming to build a curriculum system covering 14 primary disciplines [9]
- The academy will introduce 22 core courses focusing on cutting-edge topics such as interstellar dynamics and governance, along with six specialized teaching practice platforms [9]
- The initiative is positioned as a key measure to seize the technological high ground, providing talent for national deep space exploration and space science research [9]

Group 9: OpenAI's CEO Acknowledgment
- OpenAI's CEO acknowledged at a developer meeting that GPT-5.2 sacrificed writing capabilities for improved reasoning and coding, stating "we messed up," with plans to address this in future versions [10]
- The CEO predicted that by the end of 2027 the cost of GPT-5.2-level intelligence will fall by at least a factor of 100, leading to personalized app versions for everyone [10]
- He emphasized that the most important skills in the AI era will be high adaptability and the ability to generate ideas, noting that while the definition of an engineer may change, demand will remain [10]

Group 10: AI for Science Competition
- OpenAI Vice President Kevin Weil stated that GPT-5's reasoning capabilities have reached the forefront of human performance, scoring 92% on the GPQA doctoral-level test, far above GPT-4's 39% [11]
- Weil believes the greatest value of large language models lies in discovering interdisciplinary connections and forgotten research, and he is exploring ways to instill "cognitive humility" and self-fact-checking abilities in models [11]
- He predicts that 2026 will be a pivotal year for AI-enabled research, warning that researchers who do not make deep use of AI tools will miss opportunities to improve efficiency [11]
What is it like when everything can be referenced? Vidu Q2 Reference Pro: effects, acting, and detail, all at once
Jiqizhixin (Synced) · 2026-01-28 04:59
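Editor | +0
Recently, a then-and-now comparison video of "Will Smith eating spaghetti" went viral on social media and prompted widespread reflection. Two years ago, fledgling AI video was a byword for glitchy absurdity, with facial features flying about and logic falling apart. Only two years later, with the same subject rendered again, AI has evolved to a lifelike level of genuine intelligence, from the muscle movement of swallowing to the subtle play of light and shadow across the face.
These two years encapsulate the sweeping technical leap of the AI video generation industry. Yet the industry has not stopped at competing on image quality. With vendors now racing for the high ground of "controllability," AI video stands at a key turning point: from solving "can it be done" to pursuing "how refined is it."
Looking back at Vidu's evolution: in September 2025, Vidu Q2 made its global debut and impressed with its image-to-video and reference-to-video capabilities. In December, the Q2 image generation suite went live and passed 500,000 uses on its first day, confirming the market's appetite for high-quality generation.
Yesterday, Vidu Q2 Reference Pro was officially released. Visit Vidu.cn or the Vidu API at platform.vidu.cn to try the latest features.
In just a few months, it has closed the loop from "generation" to "editing" and introduced the world's first "everything can be referenced" video model, extending reference modalities from static images to dynamic video and multi-dimensional elements. Its new slogan " ...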
[Pacific Technology: Daily Views & News] (2026-01-28)
Yuanfeng Electronics (远峰电子) · 2026-01-27 13:06
Market Overview
- Major indices showed mixed performance, with the STAR 50 up 1.51%, the ChiNext Index up 0.71%, the Shanghai Composite Index up 0.18%, the Shenzhen Component Index up 0.09%, and the North Exchange 50 down 0.05% [1]
- Top TMT gainers: SW discrete devices up 5.70%, SW analog chip design up 3.60%, and SW integrated circuit packaging and testing up 3.59% [1]
- Top TMT decliners: SW security equipment down 1.11%, SW other computer equipment down 1.07%, and SW education publishing down 1.03% [1]

Domestic News
- Lanke Technology announced high-performance active electrical cable solutions based on the PCIe 6.x/CXL 3.x standards, aimed at supporting data centers transitioning from single-rack to multi-rack architectures [2]
- The Chinese semiconductor market is projected to grow by 31.26% by Q4 2026, reaching a market size of $546.5 billion [2]
- Guokewai announced price adjustments for its solid-state storage chips and SSD controllers, with increases ranging from 20% to 80%, particularly for enterprise-grade SSDs and high-end DDR products [2]
- Hefei Guoxian's 8.6-generation AMOLED production line project is 65% complete, with cleanroom delivery expected in Q2 this year [2]

Overseas News
- Micron has begun construction on an advanced wafer manufacturing facility in Singapore, planning to invest approximately $24 billion over 10 years, with production expected to start in the second half of 2028 [2]
- Counterpoint Research forecasts that global shipments of AI server-specific ASICs will triple by 2027 compared to 2024, driven by strong demand for Google's TPU infrastructure and AWS Trainium clusters [2]
- Microsoft launched a new AI accelerator, Microsoft Azure Maia 200, with peak FP4 computing power of 10 petaflops, three times that of Amazon's Trainium3 [2]
- The U.S. Patent and Trademark Office rejected Yangtze Memory Technologies Co.'s request to invalidate two core Micron patents related to 3D NAND flash memory manufacturing processes [2]

AI Insights
- DeepSeek released OCR 2, which uses a new DeepEncoder V2 method to dynamically adjust visual token distribution based on image content [3]
- Vidu launched the world's first video generation model supporting "everything can be referenced," allowing users to replicate effects and edit videos with ease [3]
- Kimi released the open-source K2.5 model, achieving state-of-the-art performance on various benchmarks and supporting multi-modal inputs [3]
- Alibaba introduced the Qwen3-Max-Thinking model, with over 1 trillion parameters and significant improvements across multiple dimensions, comparable to leading models such as GPT-5.2-Thinking [3]

Industry Tracking
- Guoxing Aerospace disclosed plans for the world's first space computing network serving silicon-based intelligences, aiming to establish a comprehensive computing infrastructure by 2035 [4]
- The "Stone Worker Zhuoling" ultrasonic Lamb wave scanning imaging logging instrument has been successfully applied in major oil fields, enhancing wellbore integrity diagnostics [4]
- China's machine tool exports rose 18% year-on-year, capturing a 21.6% global market share and surpassing Germany for the first time [4]
- Zhejiang Renxing completed a 450 million yuan Pre-A financing round for its humanoid robots, which are already deployed at leading companies across various sectors [4]

Earnings Updates
- Gallen Electronics expects 2025 revenue of approximately 487 million yuan, a year-on-year increase of 16.21%, with a projected net profit of 36 million yuan [5]
- Lante Optical anticipates a net profit of 375 to 400 million yuan for 2025, representing year-on-year growth of 70.04% to 81.38% [5]
- Nanya New Materials forecasts a net profit of 220 to 260 million yuan for 2025, a year-on-year increase of 337.20% to 416.69% [5]
- Shijia Photon expects 2025 revenue to reach 2.129 billion yuan, year-on-year growth of approximately 98.13%, with a projected net profit of 342 million yuan [5]