量子位
Hands-On with a New AI Video Generator: How Is This Not Cinema-Grade?
量子位· 2025-08-25 15:47
Core Viewpoint
- The article reviews Baidu's latest video generation model, MuseSteamer 2.0, highlighting its advances in audio-visual integration and in storytelling through video generation [1][53].
Model Performance
- MuseSteamer 2.0 is described as the world's first Chinese audio-video integrated I2V (image-to-video) model, excelling at natural Chinese voice generation and lip-syncing [6][44].
- The upgraded model handles complex camera movements and storytelling better than its predecessor, with improved video quality [7][44].
- In hands-on tests, MuseSteamer 2.0 captured animal expressions well but struggled with certain actions such as "running" [15][45].
Comparison with Competitors
- Compared with the popular model Veo3, MuseSteamer 2.0 takes significantly longer to generate a video: about 3 minutes versus under 1 minute for Veo3 [16][17].
- Videos generated by MuseSteamer 2.0 are also larger (20.8 MB versus 3 MB for Veo3), which may contribute to the longer processing time [18].
- Despite some limitations, MuseSteamer 2.0 is positioned as the more cost-effective option, with pricing significantly lower than Veo3's subscription model [52].
Creative Applications
- The model is presented as a valuable tool for creators with imaginative ideas, turning static images into dynamic videos [32][36].
- Examples include animating characters from classic literature or popular culture, showcasing its potential for creative storytelling [34][36].
User Feedback and Market Position
- Users have praised the model's realistic video generation, with some calling it a transformative innovation in the field [53][55].
- Its integration into Baidu's mobile ecosystem and its adaptation to the Chinese-language context are seen as advantages for local creators [57].
Up to 8x More Efficient! Tencent Games Releases a Professional Game AI Model, So Artists No Longer Have to Grind So Hard on Animation
量子位· 2025-08-25 15:47
Core Viewpoint
- The article discusses significant advances in AI technology within the gaming industry, focusing on Tencent's VISVISE, a comprehensive AI solution aimed at making game art production more efficient.
Group 1: AI in Game Development
- Major international companies including Microsoft, Tencent, Google, and Meta presented over 20 AI-related topics at the Devcom developer conference, highlighting AI's role in improving game art production efficiency and in bringing AI tools into traditional workflows [1].
- The demand for precision in game art has increased exponentially, leading to a geometric rise in workload [2].
Group 2: VISVISE Overview
- Tencent Games launched VISVISE, an AI-driven solution with a full suite of tools for animation production, model creation, digital asset management, and intelligent NPCs, aimed at relieving game artists of repetitive, labor-intensive tasks [4].
- VISVISE's MotionBlink tool can automatically complete animation sequences from minimal user input, significantly reducing the time required for animation production [5][7].
Group 3: Efficiency Gains
- Traditionally, animators spent 60%-70% of their time on manual frame completion, with a 10-second animation taking 3-7 person-days to finalize. In contrast, AI can generate 200 frames of animation in just 4 seconds [6][7].
- The GoSkinning tool within VISVISE has been deployed in popular games such as "PUBG Mobile" and "Peacekeeper Elite," demonstrating its practical application [11].
Group 4: Challenges in Game Art Production
- In traditional game production, 50%-60% of the work goes into creating art assets, with 3D modeling and animation being the most labor-intensive processes [13].
- The complexity of game data poses challenges for AI, which must integrate seamlessly into existing workflows while still letting artists make adjustments [17].
Group 5: Development of VISVISE
- VISVISE grew out of actual development needs. Tencent began exploring AI in gaming as early as 2016, shifted its focus to art production pipelines by 2018, and introduced the GoSkinning tool in 2022 [33][34].
- GoSkinning has improved efficiency in animation skinning by over 60%, demonstrating the effectiveness of AI in production workflows [34].
Group 6: Future of AI in Gaming
- The article suggests that the gaming industry will remain a testing ground for AI technologies, with the potential for AI to transform NPC behavior and interactions, moving beyond scripted responses toward more human-like understanding [40][45].
Latest Leak on Apple's Foldable: Touch ID Officially Returns, Four-Camera System Debuts!
量子位· 2025-08-25 15:47
Core Viewpoint
- The article discusses the anticipated launch of Apple's first foldable iPhone, highlighting features such as Touch ID, four cameras, and a slim design approximately 9.5mm thick when folded, with a release targeted for the second half of next year [1][3][4].
Design and Features
- The foldable iPhone opens like a book to reveal a larger internal display and can be used like a regular iPhone when folded [6].
- The device is designed to be under 5mm thick when opened, reflecting Apple's commitment to thinness [7].
- Because of the slim design, the complex TrueDepth camera system required for Face ID cannot be accommodated, so authentication relies on Touch ID in the side button [8][10].
Camera System
- The foldable iPhone will have four cameras, the most of any iPhone to date: one on the front, one on the inside, and two on the back [12].
- The rear cameras consist of a high-resolution main camera and a second camera for ultra-wide or telephoto shots [13].
- The internal camera will be used for selfies when the device is unfolded [14].
Display and Technology
- To address the creasing common to foldable screens, Apple is shifting from on-cell to in-cell display technology, which is closer to the technology used in existing iPhones [14].
- The initial color options are conservative, with black and white as the primary choices, though more vibrant colors may appear in the final version [15][17].
Hardware and Connectivity
- The foldable iPhone will use Apple's in-house C2 cellular modem chip, said to be comparable to Qualcomm's modems, and will drop the physical SIM card slot in favor of eSIM [18].
- It remains uncertain whether consumers who are still accustomed to physical SIM cards will accept this all-or-nothing approach [20].
Production and Release Timeline
- Suppliers are reportedly ramping up production for the new model, aiming for a launch in the fall of next year [21].
Future Developments
- In 2027, Apple plans to mark the 20th anniversary of the iPhone with the curved-screen "iPhone 20," featuring an all-around glass body that allows interaction from every surface [22].
Pricing Expectations
- Predictions put the price of the foldable iPhone between $2,300 and $2,500 (approximately 16,491 to 17,925 RMB) [25].
Science Has a New Most-Cited Researcher! AI Reaches a Historic Peak
量子位· 2025-08-25 05:54
Core Viewpoint
- Yoshua Bengio is recognized as the most cited living scientist across all disciplines, not just in computer science, underscoring his impact on deep learning and artificial intelligence [4][19].
Group 1: Background and Contributions
- Yoshua Bengio, born in 1964 in Paris, is a prominent figure in deep learning and is regarded as one of the field's founders alongside Geoffrey Hinton and Yann LeCun [8][11].
- His early academic journey included a PhD at McGill University, where he shifted focus from classical statistical models to neural networks [10][12].
- Bengio's major contributions span probabilistic modeling, high-dimensional word embeddings, attention mechanisms, and generative adversarial networks (GANs) [13][16].
Group 2: Key Publications
- Bengio's influential papers include "A Neural Probabilistic Language Model" (2000), which addressed the "curse of dimensionality" in language modeling and laid the groundwork for modern language models [14].
- "Generative Adversarial Nets" (2014), co-authored with Ian Goodfellow, is his most cited work, with 100,904 citations [17].
- The 2015 paper "Deep Learning," co-authored with Hinton and LeCun, is considered a foundational text that summarizes the field's evolution and theoretical underpinnings [16][17].
Group 3: Recent Developments
- In June 2025, Bengio announced the establishment of LawZero, a non-profit organization aimed at developing the next generation of AI systems, with initial funding of $30 million [19][20].
- LawZero focuses on AI that learns to understand the world rather than act in it, aiming to provide verifiable answers that accelerate scientific discovery and address AI risks [20].
Group 4: Citation Rankings
- Bengio currently leads living scientists in citation count; his closest competitor is Geoffrey Hinton, with nearly 940,000 citations [21].
- The AD Scientific Index ranks researchers on several metrics, including total citations, and its results reflect the prominence of AI and medical research in current academic discourse [23][26].
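Are the Eligibility Rules "Unequal for Men and Women" in the Young Scientist Award Worth 3 Million Yuan per Person? Yan Ning (颜宁) Explains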
量子位· 2025-08-25 05:54
Core Points
- The winners of the seventh Science Exploration Award have been announced: 50 young scientists, each receiving 3 million yuan in funding over five years [2].
- The award covers ten fields: mathematics and physics, chemistry and new materials, astronomy and earth sciences, life sciences, medical sciences, information electronics, energy and environment, advanced manufacturing, transportation and construction, and frontier intersections [2][39].
- The award aims to encourage original research and has adjusted its evaluation mechanism to focus on the originality of candidates' future research work [40][41].
Group 1: Award Overview
- The Science Exploration Award was established in 2018 by prominent scientists and Tencent founder Ma Huateng, and has become one of the most recognized awards for basic research and cutting-edge technology in China [2][39].
- Each of the 50 awardees will receive 600,000 yuan annually for five years, totaling 3 million yuan per recipient [2].
- This year there were 1,238 applicants, and 13 young scientists were among the winners, the youngest being 32 years old at the time of application [44].
Group 2: Information Electronics Field
- The information electronics field had six awardees, three of whom are young scientists [15].
- Notable winners include Chang Yi of Jilin University, a leading expert in information retrieval and data mining who has published over 100 papers and holds more than 30 patents [19][21].
- Other awardees include Du Bo of Wuhan University, who works on artificial intelligence and computer vision, and Jiang Yugang of Fudan University, who leads major projects in artificial intelligence [25][27].
Group 3: Gender Representation
- This year's award saw a record number of female winners: nine women, or 18% of all awardees [45].
- The award has introduced a "new star" category to attract younger researchers, with separate age limits for male and female candidates intended to promote gender diversity [44].
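The First Video Agent Powered by GPT-5! One Sentence Generates a Commercial-Grade Ad, with Storyboards, Voiceover, and Subtitles All Included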
量子位· 2025-08-25 02:32
Core Viewpoint
- The article discusses the emergence of Video Ocean, described as the world's first video agent integrated with GPT-5, which automates the entire creative process of AI video generation, significantly reducing production time and improving efficiency.
Group 1: Product Features
- Video Ocean can automatically create complete videos, including storyboarding, visuals, voiceovers, and subtitles, transforming the traditional video production process [2][3].
- The platform produces high-quality videos rapidly, cutting the time required from weeks to days or even minutes [5][6].
- It features an automated creative ecosystem that learns and adapts to brand styles and past creations, avoiding the limitations of traditional tools [9][11].
Group 2: Efficiency and Scalability
- Video Ocean raises content production efficiency by up to 10 times, enabling quick responses to market trends and the generation of viral videos [12].
- The platform supports the creation of professional-grade commercial videos from simple commands, catering to diverse business scenarios [13].
- It supports developing original film content from scratch, streamlining the entire production process [14].
Group 3: User Experience
- The platform is designed for ease of use: a single simple input is enough to generate a video, making it accessible to novices and professionals alike [18][21].
- Video Ocean automates the entire video editing process and provides a project replay feature so users can review their creative journey [26][25].
- All generated images are categorized for easy modification, further improving the efficiency of the creative process [25].
Musk Founds New Company "Macrohard" (巨硬): Rebuilding Microsoft's Products with AI
量子位· 2025-08-25 01:12
Core Viewpoint
- Elon Musk has established a new AI software company named "Macrohard" to compete directly with Microsoft, a strategic move in the AI sector and a continuation of his personal rivalry with Bill Gates [2][22][33].
Group 1: Company Overview
- "Macrohard" is a pure AI software company that aims to simulate, and potentially replace, Microsoft's core business functions [7][18].
- The company plans to develop hundreds of specialized AI agents on the Grok platform to handle tasks such as coding, image and video generation, and understanding user interactions [15][20].
- Its AI capabilities are backed by Musk's xAI and the Colossus 2 supercomputer project, which is expected to use millions of NVIDIA GPUs [21][20].
Group 2: Competitive Landscape
- The name "Macrohard" is a play on "Microsoft" (macro versus micro, hard versus soft), underscoring the competition between the two companies [5][12].
- Musk's approach aims to disrupt Microsoft's business model at its root by creating AI-generated alternatives to its products, such as a fully functional version of the Office suite [19][18].
- The rivalry between Musk and Gates has historical roots, with past conflicts over Tesla and differing views on electric vehicles and space exploration [24][28][32].
Group 3: Personal Rivalry
- The animosity between Musk and Gates has escalated from business disagreements to personal attacks, exemplified by Musk's public mockery of Gates [25][29][32].
- The establishment of "Macrohard" is therefore both a business challenge to Microsoft and a continuation of their personal feud [33].
This Turing-Machine Number Is Now Too Big for All the Atoms in the Universe to Hold
量子位· 2025-08-24 04:38
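By Wen Le (闻乐), from Aofeisi (凹非寺). 量子位 | WeChat official account QbitAI
The record for the busy beaver number, which measures the maximum number of steps a Turing machine can run before halting, has been broken!
A mysterious individual has set a new lower bound on the sixth busy beaver number, and the value is unimaginably large: even if a digit were engraved on every atom in the universe, there would not be enough room to hold it.
In other words, it cannot be written out in full in the familiar decimal notation and has to be described with pentation, an extremely complex operation: exponents stacked on exponents stacked on exponents ...
So what kind of mysterious number is this?
A number that probes the limits of Turing machines
The beaver number is, more formally, the busy beaver number BB(n). For example, with n = 5 rules, the goal is to find, among all Turing machines with 5 rules, the one that runs longest before halting; the number of steps it executes before halting is BB(5).
Behind it lies the halting problem, which Turing settled back in 1936: no general-purpose program can ever decide whether a given Turing machine halts after finitely many steps or keeps running forever.
So searching for this number is, in essence, probing the boundary of what computers can solve.
A Turing machine computes by reading and writing 0s and 1s on an infinitely long tape that is divided into many cells, with a read ...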
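The tape-and-rules description above can be made concrete with a small simulator. This is a minimal sketch rather than code from the article: the function name `run_turing_machine`, the rule-table layout, and the `max_steps` cutoff are illustrative choices, and the "rules" of the article are modeled here as machine states. It runs the well-known 2-state busy beaver champion, counting the final halting transition as a step.

```python
from collections import defaultdict


def run_turing_machine(rules, start_state="A", halt_state="H", max_steps=10_000):
    """Simulate a one-tape Turing machine over the symbols 0 and 1.

    `rules` maps (state, symbol_read) -> (symbol_to_write, move, next_state),
    where move is +1 (right) or -1 (left). Returns (steps, ones_on_tape),
    or None if the machine has not halted within `max_steps` steps.
    """
    tape = defaultdict(int)  # unbounded tape; every cell starts as 0
    head, state, steps = 0, start_state, 0
    while state != halt_state:
        if steps >= max_steps:
            return None  # gave up: this says nothing about whether it ever halts
        write, move, state = rules[(state, tape[head])]
        tape[head] = write
        head += move
        steps += 1
    return steps, sum(tape.values())


# The 2-state busy beaver champion (hypothetical variable name).
BB2_CHAMPION = {
    ("A", 0): (1, +1, "B"),
    ("A", 1): (1, -1, "B"),
    ("B", 0): (1, -1, "A"),
    ("B", 1): (1, +1, "H"),
}

print(run_turing_machine(BB2_CHAMPION))  # (6, 4): halts after 6 steps with four 1s
```

The `max_steps` cutoff is the practical face of the halting problem: the simulator can report that a machine has not halted yet, but it can never report that the machine never will.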
Goodbye to "Alchemy Mysticism": Shanghai AI Laboratory Launches OpenDataArena, the First Data Arena for Large Models
量子位· 2025-08-24 04:38
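Contributed by the OpenDataLab team. 量子位 | WeChat official account QbitAI
The importance of data in the AI era goes without saying, but one question remains open: how can the value of that data be quantified precisely, and good data told apart from bad?
To that end, the OpenDataLab team at Shanghai Artificial Intelligence Laboratory, which has long worked in the data field, has officially launched OpenDataArena, an open data arena.
In more detail, faced with massive amounts of SFT (supervised fine-tuning) post-training data, researchers often find themselves in a "black box" predicament: they do not know which data is actually useful, and they struggle to evaluate and compare datasets systematically.
OpenDataArena is an "arena" built for data value, aiming to turn the evaluation of data quality from "mysticism" into "science".
Through a fair, open, and transparent platform, the team hopes to make the first formal attempt at answering the core question of how to verify the value of data.
It provides not only an intuitive data evaluation leaderboard but also a complete, reproducible system for verifying data value: an open-source toolkit that integrates training and evaluation lets different datasets "compete" fairly under identical conditions, with model performance as the final measure of data value.
In addition, multi-dimensional scoring tools give the data a fine-grained "health check", so that data value is no longer a vague "black box".
The platform currently covers 4+ domains, 20+ benchmarks, and 20+ data scoring dimensions, and has processed 1 ...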
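A Nobel Prize-Winning Physics Result Finally Gets a Mathematical Proof After 48 Years! Jun Yin (尹骏) of USTC's Special Class for the Gifted Young Appears Again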
量子位· 2025-08-24 04:38
Core Viewpoint
- Two Chinese scholars have made a significant breakthrough toward proving the Anderson model, a long-standing problem in condensed matter physics that explains how electrons in semiconductor materials pass from a conductive to a non-conductive state [1][2][19].
Group 1: Anderson Model Overview
- The Anderson model, proposed by Philip W. Anderson in 1958, describes how electrons go from moving freely (delocalized) to being trapped (localized) as the disorder in a material increases [10][11][16].
- This phenomenon is crucial for understanding semiconductor materials, which can switch between conductive and non-conductive states and are therefore essential to chip technology [7][8][12].
Group 2: Breakthrough Achievements
- After 16 years of collaboration, Horng-Tzer Yau (姚鸿泽) and Jun Yin produced a mathematical proof for the Anderson model, the most significant progress since the model was proposed [2][32].
- Their research first focused on the one-dimensional case and later extended to two and three dimensions, yielding notable advances in understanding electron behavior in complex matrices [33][35].
Group 3: Methodology and Challenges
- The scholars used random matrix theory to simplify the band matrix involved in the Anderson model, allowing them to prove that electrons remain delocalized when the band width exceeds a certain threshold [27][31].
- The calculations were a major obstacle: simplifying their equations required extensive graphical analysis, which ultimately led to a breakthrough in understanding the conditions for electron localization [30][31].
Group 4: Background of Scholars
- Horng-Tzer Yau, a prominent mathematician, has made substantial contributions to probability, random processes, and quantum mechanics, and has been a professor at Harvard University since 2005 [36][38].
- Jun Yin, a professor at UCLA, has received several prestigious awards for his early-career achievements in physics and mathematics, including the von Neumann Research Prize [47][50].
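For orientation, the two objects named in this summary are commonly written as follows. This is a textbook-style sketch, not the notation of the proof itself: the symbols $\lambda$ (disorder strength), $v_x$ (random on-site potential), $N$ (matrix size), and $W$ (band width) are labels assumed here and do not appear in the article.

```latex
\begin{align*}
% Anderson model on the lattice Z^d: hopping term plus random on-site potential
H_{\text{Anderson}} &= -\Delta + \lambda V, &
(V\psi)(x) &= v_x\,\psi(x), \quad v_x \text{ i.i.d. random variables} \\
% Random band matrix of size N with band width W (one-dimensional case)
H_{\text{band}} &= (h_{xy})_{x,y=1}^{N}, &
h_{xy} &= \overline{h_{yx}}, \quad \mathbb{E}\,h_{xy} = 0, \quad h_{xy} = 0 \ \text{if } |x-y| > W
\end{align*}
```

In this picture, a larger $\lambda$ favors localization, while a larger $W$ lets each site interact with more neighbors and favors delocalization, which is the sense in which the summary speaks of the band width exceeding a threshold.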