量子位
量子位 Is Hiring Editors and Writers
量子位· 2025-12-14 07:12
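By the editorial team, from Aofeisi (凹非寺)
量子位 | WeChat official account QbitAI
The AI boom is still surging. If you have not yet figured out how to take part, why not join 量子位?
We are a content platform built around tracking new developments in AI. After eight years of work, we have top-tier influence, broad and well-regarded industry resources, and one of the best vantage points for observing and learning from this era's biggest trend.
We are currently hiring in three tracks, and we hope you are (or can become) a content expert in one of them:
- AI industry: infrastructure-layer innovation, including chips, AI Infra, and cloud computing;
- AI finance: venture investment and earnings reports in AI, tracking capital flows across the industry chain;
- AI products: progress of AI in applications and hardware devices.
All positions are full-time and based in Zhongguancun, Beijing. Positions at every level of ability are open in all tracks; please apply based on your own background and experience.
Who the positions are for:
- Experienced hires: editor, lead writer, and editor-in-chief levels, matched to ability;
- Campus hires: new graduates; internships are accepted and can convert to full-time roles.
What you gain by joining us:
- A place at the crest of the AI wave: be among the first to encounter and understand the latest AI technologies and products, and build a complete understanding of AI.
- Hands-on use of new AI tools: apply new AI technologies and tools to your work to improve efficiency and creativity.
- Personal influence: by writing exclusive original ...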
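Unifying Visual Modalities and Tasks: Kuaishou Kling and HKUST Teams Release a Video Generation Model to Accelerate Real-World Understanding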
量子位· 2025-12-14 07:12
Core Insights
- The article introduces UnityVideo, a new visual framework developed by research teams from Hong Kong University of Science and Technology, Chinese University of Hong Kong, Tsinghua University, and Kuaishou, which enhances video generation by integrating multiple visual modalities [1][3][4].
Group 1: Model Capabilities
- UnityVideo utilizes unified training across various visual modalities such as depth maps, optical flow, skeletons, and segmentation masks, allowing the model to better understand the physical world and generate more realistic and controllable videos [3][12].
- The model demonstrates zero-shot generalization, enabling it to generate reasonable results for previously unseen objects or scenes [4][16].
- The unified training approach significantly accelerates convergence speed and improves performance in RGB video generation tasks compared to single modality training [15][16].
Group 2: Technical Innovations
- UnityVideo features dynamic task routing, allowing seamless integration of three training paradigms within a single architecture [19].
- A key breakthrough is the dynamic noise scheduling strategy, which randomly selects training modes during iterations, preventing catastrophic forgetting and enabling harmonious coexistence of multiple training objectives [21][22].
- The model incorporates a context learner and a modality-adaptive switcher to effectively distinguish between different modality signals, enhancing its ability to generalize across tasks [27][30].
Group 3: Training Strategy
- UnityVideo employs a two-phase curriculum learning strategy, first training on carefully selected single-person scene data to establish spatial correspondence, followed by introducing all modalities and diverse scene data [33][35].
- The OpenUni dataset, containing 1.3 million multimodal video samples, supports this unified training paradigm, ensuring balanced sampling across modalities [35][36].
Group 4: Performance Results
- UnityVideo outperforms existing models in various tasks, achieving high scores in physical reasoning, controllable generation, and modality estimation [39][41].
- The model's qualitative results demonstrate superior understanding of physical phenomena, such as light refraction in water, and maintains high video quality without common issues like background flickering [41][42].
- In quantitative comparisons, UnityVideo achieves a background consistency score of 97.44% and an aesthetic quality score of 64.12% in text-to-video generation tasks [44].
Group 5: Generalization and Understanding
- The model exhibits strong generalization capabilities, accurately estimating unseen data and overcoming overfitting issues common in specialized models [43][56].
- UnityVideo's design emphasizes the importance of integrating multiple dimensions of perception, akin to human understanding, which enhances its ability to model physical laws and improve overall video generation quality [60][65].
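The summary says UnityVideo randomly selects a training mode at each iteration so that three training paradigms coexist in one model. The sketch below illustrates only that general idea and is not the authors' implementation: the three mode names, the `noise_plan` rule for which stream is noised, and the uniform sampling weights are all assumptions made for illustration.

```python
import random
from collections import Counter

# Hypothetical mode names; the summary only says there are three paradigms.
MODES = ("conditional_generation", "modality_estimation", "joint_generation")


def noise_plan(mode):
    """Say which input streams are noised (and so must be denoised) in a mode."""
    if mode == "conditional_generation":  # auxiliary modality stays clean, RGB is generated
        return {"rgb": True, "aux": False}
    if mode == "modality_estimation":     # RGB stays clean, auxiliary modality is generated
        return {"rgb": False, "aux": True}
    if mode == "joint_generation":        # both streams are generated together
        return {"rgb": True, "aux": True}
    raise ValueError(f"unknown mode: {mode}")


def sample_schedule(num_steps, weights=(1.0, 1.0, 1.0), seed=0):
    """Draw one training mode per iteration, independently at random."""
    rng = random.Random(seed)
    return rng.choices(MODES, weights=weights, k=num_steps)


schedule = sample_schedule(3000)
counts = Counter(schedule)
for mode in MODES:
    print(mode, counts[mode], noise_plan(mode))
```

Because every mode keeps appearing throughout training instead of being trained in separate stages, no objective goes unseen for long, which is the intuition behind the claim that random selection helps avoid catastrophic forgetting.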
OpenAI Suddenly Open-Sources a New Model: 99.9% of Its Weights Are Zero, a New Sparsity Method to Replace MoE
量子位· 2025-12-14 05:17
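By 闻乐 (Wen Le), from Aofeisi (凹非寺)
量子位 | WeChat official account QbitAI
Could the key to stopping AI from talking nonsense really be cutting 99.9% of a large model's connections?
OpenAI has quietly open-sourced a new model with only 0.4B parameters, 99.9% of whose weights are zero. It is the open-source implementation of the Circuit Sparsity technique.
This is a variant of the large language model in which the sparsity of the model's internal connections is deliberately constrained so that its computation can be decomposed and understood. Its purpose is to address the black-box problem of traditional dense Transformers: the internal computational circuits become clearly readable by humans, so we can see how the AI makes decisions and avoid too readily believing its nonsense (doge).
Some have gone so far as to say that this "extreme sparsity plus functional decoupling" approach could spell the end of the currently popular MoE (mixture of experts) models.
So what happens when a Transformer's weights are trained to be almost entirely zero?
Abandoning crude approximation in favor of native sparsity
First, why this model's reasoning can be read like a circuit diagram.
In the traditional large models we normally use, neurons are densely interconnected, the weight matrices are almost entirely nonzero, and information flows in a highly superposed state. It is like a tangle of thread that cannot be pulled apart; nobody can say how the model reaches a given conclusion.
The nonzero weight connections that remain act like the wires in a circuit diagram: information can travel only along fixed paths. In addition, the model uses a mean-ablation (均值屏蔽) pruning method to carve out, for each task, a dedicated ...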
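To make "99.9% of the weights are zero" concrete, here is a minimal sketch of magnitude-based sparsification. It is not OpenAI's training procedure, which the excerpt does not describe; the function name `sparsify` and the rule of keeping the largest-magnitude entries are assumptions chosen purely for illustration, applied to a random weight vector.

```python
import random


def sparsify(weights, keep_fraction):
    """Zero every entry except the largest-magnitude `keep_fraction` of them."""
    k = max(1, round(len(weights) * keep_fraction))
    # Indices of the k entries with the largest absolute value.
    ranked = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    keep = set(ranked[:k])
    return [w if i in keep else 0.0 for i, w in enumerate(weights)]


rng = random.Random(0)
dense = [rng.gauss(0.0, 1.0) for _ in range(10_000)]  # toy "weight matrix", flattened
sparse = sparsify(dense, keep_fraction=0.001)          # keep 0.1% of the weights

nonzero = sum(1 for w in sparse if w != 0.0)
print(f"nonzero weights: {nonzero} of {len(sparse)}")
print(f"sparsity: {1 - nonzero / len(sparse):.3f}")
```

With so few surviving connections, each output can depend on only a handful of inputs, which is what makes it plausible to trace a computation path by hand.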
Paying for Tokens Is Foolish; Users Should Pay for Intelligence | RockAI's Liu Fanping (刘凡平) @ MEET2026
量子位· 2025-12-13 08:30
Core Insights
- The next stage of artificial intelligence (AI) development requires overcoming two major challenges: the Transformer architecture and the backpropagation algorithm [1][7][54]
- The focus should shift from larger models to creating "living" models that possess native memory, autonomous learning, and continuous evolution capabilities [2][4][48]
- This transition signifies a move from centralized cloud computing to decentralized learning, where each device can contribute to knowledge generation [3][5][70]
Group 1: Hardware Awakening
- The concept of "hardware awakening" suggests that devices can learn and adapt in real-time, transforming them from mere tools into active intelligent agents [4][64]
- A multitude of such intelligent agents collaborating in the real world can lead to the emergence of collective intelligence [5][71]
- The current reliance on the Transformer model limits the potential for true intelligence, as it does not facilitate autonomous learning or native memory [21][30][76]
Group 2: Redefining Value
- The future of AI will redefine the value of hardware, moving beyond traditional metrics like memory and processing power to focus on the co-creation of value between users and devices [64][66]
- Users should pay for intelligence rather than token consumption, as the latter is seen as an inefficient model [15][19][21]
- The emergence of devices with autonomous learning capabilities will enhance user experience and privacy, as data remains localized [68][69]
Group 3: Collective Intelligence
- Collective intelligence arises when each device possesses its own intelligence and can learn from the physical world, similar to human collaboration [71][76]
- True intelligence is characterized by the ability to generate knowledge rather than merely disseminating it, which is a limitation of current large models [75][77]
- The path to general artificial intelligence is through collective intelligence rather than the centralized model exemplified by companies like OpenAI [77]
Qiao Liang (乔梁) of 太初元碁 (Tecorigin): AI Algorithms Have Reached the Limit of a Single Chip | MEET2026
量子位· 2025-12-13 06:30
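Compiled by the editorial team from MEET2026
量子位 | WeChat official account QbitAI
As AI technology continues to develop and reach deployment, industry applications demand more computing power by the day. This is now widely accepted.
At the same time, the scale and complexity of the algorithms themselves are multiplying, pushing the whole industry into a more intensive compute cycle. Qiao Liang, co-founder and chief operating officer of 太初元碁, put it this way:
Industry applications demand more computing power by the day. AI needs algorithms that achieve millisecond-level precision, and this is exactly what drives compute demand to grow exponentially.
This means that, as the technology evolves, high-performance computing will run through the entire chain from manufacturing and scientific research to AI deployment, becoming the underlying support for every kind of computing scenario.
At the 量子位 MEET2026 Intelligent Future Conference, Qiao Liang shared his views on building a domestic compute ecosystem, organized around keywords such as the convergence of supercomputing and AI computing (超智融合) and heterogeneous integration:
Today, deploying AI large models of every kind and AI Agents in different fields requires a great deal of computing power. Against this background, the convergence of supercomputing and AI computing has become an industry consensus.
Whether in the iteration of AI algorithms or in the development of traditional scientific computing, future trends point to the same thing: achieving heterogeneous integration through hardware architecture design in general-purpose computing scenarios.
To present Qiao Liang's thinking in full, 量子位 has edited and organized the talk without changing its original meaning. We hope it offers you further inspiration.
The MEET2026 Intelligent Future Conference is organized by 量子位 ...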
量子位 Is Hiring Editors and Writers
量子位· 2025-12-13 04:34
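By the editorial team, from Aofeisi (凹非寺)
量子位 | WeChat official account QbitAI
The AI boom is still surging. If you have not yet figured out how to take part, why not join 量子位?
We are a content platform built around tracking new developments in AI. After eight years of work, we have top-tier influence, broad and well-regarded industry resources, and one of the best vantage points for observing and learning from this era's biggest trend.
We are currently hiring in three tracks, and we hope you are (or can become) a content expert in one of them:
- AI industry: infrastructure-layer innovation, including chips, AI Infra, and cloud computing;
- AI finance: venture investment and earnings reports in AI, tracking capital flows across the industry chain;
- AI products: progress of AI in applications and hardware devices.
All positions are full-time and based in Zhongguancun, Beijing.
Who the positions are for:
- Experienced hires: editor, lead writer, and editor-in-chief levels, matched to ability;
- Campus hires: new graduates; internships are accepted and can convert to full-time roles.
Job details follow. Positions at every level of ability are open in all tracks; please apply based on your own background and experience.
AI industry track
Responsibilities:
- Follow new developments in the AI infrastructure layer, including but not limited to chips, AI Infra, and cloud computing, as well as the moves of key players;
- Write accessible explanations of cutting-edge papers, open-source community work, and technical reports from conferences (Hot Chips, NeurIPS, MLSys);
- Take part in ...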
Toward "Aerospace Embodied Intelligence": Beihang Team Proposes a New Benchmark for Constellation Planning | NeurIPS'25
量子位· 2025-12-13 04:34
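Contributed by the AEOS-Bench & AEOS-Former team
量子位 | WeChat official account QbitAI
△ Figure: demonstration of satellite constellation mission planning
We all know that putting a satellite constellation into orbit is hard, but efficiently planning and scheduling an in-orbit constellation to carry out missions is not simple either.
A satellite constellation is a cooperative network of multiple satellites, with global coverage, rapid response, and high-frequency observation capabilities far beyond those of a single satellite. From the giant communications constellations of the United States to China's "Qianfan" (千帆) constellation, satellite constellations have moved from a science-fiction concept to the core of an industry and become infrastructure for the digital economy.
Operating hundreds of kilometers above the Earth, these constellations quietly support key sectors such as remote sensing, communications, navigation, and weather forecasting. Yet behind every stably operating constellation lies a high-dimensional, dynamic, heavily constrained planning problem. How can dozens of satellites be scheduled, within observation windows only a few minutes long, to form a cooperative observation network and carry out hundreds of tasks, while also responding to sudden demands such as earthquake relief, maritime search and rescue, and forest fires?
AI is becoming the key to this problem. The team of Professor Liu Si (刘偲) at Beihang University has proposed AEOS-Bench, the first large-scale benchmark for real constellation scheduling, and has combined the generalization ability of Transformer models with the specialized requirements of aerospace engineering to train AEOS-Former, a scheduling model with built-in time constraints. Together they set a new technical benchmark for future "AI constellation planning".
As deployed constellations grow ever larger, ...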
A US Video-Generation Veteran Enters the World-Model Race
量子位· 2025-12-13 04:34
Core Insights
- Runway has launched GWM-1, its first general world model, which is built on its latest Gen-4.5 video generation model [1][8]
- GWM-1 comes in three variants, GWM Worlds, GWM Avatars, and GWM Robotics, each designed for different applications [5][12]
Group 1: GWM-1 Overview
- GWM-1 uses an autoregressive architecture that predicts frame by frame based on the content held in its memory of previous frames [9]
- The model supports real-time interactive control, letting users adjust camera angles, modify robot commands, or change the audio [10]
Group 2: GWM Worlds
- GWM Worlds lets users explore a coherent and responsive environment without manually designing each space [13]
- Users provide a static scene as a reference, and the model generates an immersive, unbounded, explorable space in real time [13]
- It keeps scene elements spatially consistent over long sequences of movement, unlike other world models that generate only limited frame sequences [13]
- Users can change the physical rules of the environment through text prompts, which helps train agents for real-world actions [15][16]
- GWM Worlds can also support immersive VR experiences by generating virtual environments in real time [17]
Group 3: GWM Avatars
- GWM Avatars is an audio-driven interactive video generation model that simulates human dialogue with realistic facial expressions and gestures [18][19]
- It can serve as a personalized tutor or improve customer service by creating digital humans that interact naturally [20]
- The model is set to launch with an API for integration into other products and services [22]
Group 4: GWM Robotics
- GWM Robotics works as a learned simulator rather than one programmed with fixed rules, predicting video sequences from robot data [23]
- It generates synthetic training data to augment existing robot datasets without expensive real-world data collection [24]
- It allows policy models to be tested directly without deploying them on physical robots, improving safety and efficiency [26]
- A Python SDK for GWM Robotics has been released, supporting multi-view video generation and long-context sequences for integration into modern robot policy models [29]
Group 5: Gen-4.5 Upgrades
- The latest Gen-4.5 update adds native audio generation and editing, allowing realistic dialogue, sound effects, and background audio [30][31]
- Users can edit existing audio to meet specific needs and use multi-shot editing to apply consistent transformations across video segments [33]
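The autoregressive, interactively controlled generation described for GWM-1 can be pictured with a toy rollout loop. This is only a sketch of the general pattern, not Runway's model or SDK: the "frame" is reduced to a camera position, and `predict_next`, `rollout`, and `memory_size` are hypothetical names standing in for a learned predictor and its bounded context.

```python
from collections import deque


def predict_next(memory, action):
    """Toy stand-in for a learned predictor: shift the last frame by the action."""
    x, y = memory[-1]
    dx, dy = action
    return (x + dx, y + dy)


def rollout(first_frame, actions, memory_size=4):
    """Generate frames one at a time, each conditioned on recent frames and an action."""
    memory = deque([first_frame], maxlen=memory_size)  # bounded memory of past frames
    frames = [first_frame]
    for action in actions:          # one user control input per generated frame
        frame = predict_next(memory, action)
        memory.append(frame)        # the new frame becomes context for the next step
        frames.append(frame)
    return frames


frames = rollout((0, 0), [(1, 0), (1, 0), (0, 1)])
print(frames)  # [(0, 0), (1, 0), (2, 0), (2, 1)]
```

The point of the loop structure is that control inputs can change at every step, which is what allows a user to steer the camera or issue new commands while generation is under way.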
A Half-Century-Old Problem Cracked in 48 Hours! Terence Tao's Team Turns AI Mathematics into a Monster-Slaying Game
量子位· 2025-12-13 04:34
Core Viewpoint
- A collaboration between mathematicians and AI resolved the Erdős 1026 problem, which had stood unsolved for 50 years, in just 48 hours [1][2][3].
Group 1: Problem Overview
- The Erdős 1026 problem was proposed in 1975 and can be framed as a game between two players, Alice and Bob, in which the goal is to determine an extremal value [8][10][12].
- The problem centers on a constant c(n): the largest proportion of coins that Bob can guarantee taking, regardless of how Alice distributes them [10][13].
Group 2: AI's Role in the Solution
- AI tools were crucial to solving the problem quickly; traditional methods might have taken weeks or months to reach a conclusion [3][5].
- AI tools such as Harmonic and AlphaEvolve let the mathematicians automate the construction and proof of key inequalities, recasting the original problem as a computational geometry problem [16][18][22].
Group 3: Collaborative Efforts
- The solution involved several mathematicians working together, with contributions from Boris Alexeev, Koishi Chan, and Lawrence Wu, showing how effective human-AI collaboration can be [17][28][32].
- Combining human insight with AI capabilities is emerging as a new trend in mathematical problem-solving [46].
Group 4: Historical Context and Future Implications
- The Erdős problems, posed by the renowned mathematician Paul Erdős, have been a significant part of mathematical research, and many remain unsolved [39][41].
- AI's growing success on these problems suggests a shift in how mathematical research may be conducted, with AI becoming a standard tool for researchers [41][42].
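Zhu Ning (朱宁) of the Shanghai Advanced Institute of Finance, Shanghai Jiao Tong University: An Economist's View of the Paradigm Shift in Thinking for the AI Era | MEET2026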
量子位· 2025-12-13 02:00
Core Viewpoints
- The concept of scarcity has changed after the emergence of AI, prompting a need for deeper consideration on how to make better choices in the face of this new reality [6][11]
- As AI begins to replace human decision-making, competition may arise between humans and algorithms, as well as among algorithms themselves [6][22]
Economic Implications
- Economics has historically focused on technological progress and its impact on economic principles and human welfare, with fundamental concepts like "what is human?" and "what is production?" undergoing significant changes in the AI era [8][11]
- The traditional view of scarcity, which included time, computational power, and creativity, is being challenged as AI can now perform tasks that previously required significant human effort [11][12]
- AI is expected to contribute to global economic growth by 0.5% to 0.7% annually over the next decade, although this may not be sufficient to support high valuations in tech markets [14][24][25]
Industry Impact
- The nature of work is changing, with both white-collar and blue-collar jobs facing potential replacement by AI, blurring the lines between these categories [31]
- Knowledge-intensive industries, previously thought to be safe from AI disruption, are also at risk as AI capabilities evolve [33]
- Companies are encouraged to focus on how to leverage AI technology to enhance productivity and efficiency rather than seeking industries that are immune to AI [33]
Global Considerations
- There is a significant disparity in access to AI capabilities between high-income and low-income countries, which may exacerbate global wealth distribution issues [28][29]
- The shift towards AI-driven trade will lead to new regulatory and governance challenges, particularly regarding accountability in cross-border transactions [30]
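To put the quoted 0.5% to 0.7% annual contribution in perspective, the arithmetic below compounds it over ten years. This is a back-of-the-envelope sketch, not a calculation from the talk: it assumes the contribution acts as an extra growth rate that compounds yearly, and `cumulative_gain` is a helper name introduced here for illustration.

```python
def cumulative_gain(annual_rate, years):
    """Total extra growth after compounding `annual_rate` for `years` years."""
    return (1.0 + annual_rate) ** years - 1.0


low = cumulative_gain(0.005, 10)   # 0.5% per year for a decade
high = cumulative_gain(0.007, 10)  # 0.7% per year for a decade
print(f"cumulative extra output after 10 years: {low:.2%} to {high:.2%}")
```

Under that assumption the decade-long effect comes to roughly 5% to 7% of additional output, which helps explain the caution that it may not justify very high valuations.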