量子位
Shenzhen University Team Gets Robots to Understand Instructions and Navigate Precisely: Success Rate Up to 72.5%, Inference Efficiency Up 40% | AAAI 2026
量子位· 2025-12-10 04:26
Core Insights
- The article introduces UNeMo, a new framework for vision-language navigation (VLN) developed by a team led by Professor Li Jianqiang of Shenzhen University in collaboration with other institutions [1][4]

Group 1: Framework Overview
- UNeMo pairs a multi-modal world model (MWM) with a hierarchical predictive feedback navigator (HPFN), letting agents predict future visual states and make better-informed decisions [3][11]
- The framework addresses the disconnect between language reasoning and visual navigation that has hampered existing methods [8][9]

Group 2: Performance Metrics
- UNeMo reaches a navigation success rate of 72.5% in unseen environments, outperforming the previous method NavGPT2, which achieved 71% [4][26]
- Resource efficiency is notable: GPU memory usage drops 56%, from 27GB to 12GB, and inference speed improves by 40% [24]

Group 3: Robustness in Complex Scenarios
- UNeMo shows a clear advantage in long-path navigation, with a success-rate increase of 5.6% for paths longer than 7 units, versus a minor increase of 1.2% for shorter paths [28][29]
- This improvement indicates that UNeMo effectively mitigates cumulative errors in long-distance navigation tasks [30]

Group 4: Scalability and Adaptability
- The framework has been tested across various navigation baselines and datasets, demonstrating adaptability and scalability beyond LLM-based systems [31][33]
- UNeMo's collaborative training architecture lets it perform well across diverse task scenarios, enhancing its overall value [34]
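The predict-then-decide loop attributed above to the MWM and HPFN can be sketched in miniature. This is a toy illustration under assumed details: the class-free structure, the linear "world model," and all function names are inventions of this sketch, not UNeMo's actual architecture or API.

```python
import numpy as np

def world_model(obs, action):
    # Stand-in for the multi-modal world model (MWM): predicts the
    # next visual state for a candidate action (here, a toy linear step).
    return obs + 0.1 * action

def instruction_score(pred_obs, goal):
    # Stand-in for hierarchical predictive feedback (HPFN): scores how
    # well a predicted state matches the instruction-derived goal.
    return -np.linalg.norm(pred_obs - goal)

def choose_action(obs, goal, candidates):
    # Core idea: score each candidate action by the future state it
    # would produce, then act on the best prediction.
    scores = [instruction_score(world_model(obs, a), goal) for a in candidates]
    return candidates[int(np.argmax(scores))]

obs = np.zeros(3)
goal = np.array([1.0, 0.0, 0.0])
candidates = [np.array([1.0, 0.0, 0.0]),
              np.array([0.0, 1.0, 0.0]),
              np.array([-1.0, 0.0, 0.0])]
best = choose_action(obs, goal, candidates)
```

The design point is that the agent never commits to an action based on language reasoning alone; every candidate is first "imagined" through the world model and checked against the goal.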
Understand Where China's AI Is Headed in 2025! Companies × Products × People × Solutions: Everything Most Worth Watching Is Here
量子位· 2025-12-10 04:26
Core Insights
- The year 2025 is marked by significant advancements in AI, particularly the emergence of DeepSeek-R1 and the release of the V3.2 series, which encapsulate the year's technological narrative [1]
- The main storyline is the competition between open-source and closed-source AI models over inference efficiency, training paradigms, and cost structures, while world models evolve from theoretical concepts into real products [1]
- 2025 is called the "Agent Year": AI agents shifted from passive responders to proactive planners, driving transformative change across industries [1]

Group 1: AI Development and Trends
- The AI landscape is evolving into an "Agent Internet Era," signaling a shift in how AI technologies are integrated into everyday applications [2]
- AI is becoming critical infrastructure in sectors like healthcare, meteorology, and industry, moving beyond mere plugins to essential components of existing systems [3]
- The line between open-source and closed-source technologies is blurring, with agents, embodied intelligence, and world models overlapping and enabling cross-industry collaboration [3]

Group 2: AI Awards and Recognition
- The "2025 AI Annual List" was unveiled at the MEET2026 Smart Future Conference, recognizing leading companies, promising startups, outstanding products, solutions, and key figures in the AI sector [6][8]
- The selection process covered hundreds of companies and individuals, with results grounded in real data and expert opinion, reflecting the most representative forces in China's AI ecosystem [7][8]
- The awards highlight companies acting as both "wave makers" and "steady navigators," continuously bringing new paradigms, tools, and models to the industry [12][14]

Group 3: Notable Companies and Products
- The "2025 AI Annual Leading Enterprises" list features companies that excel in technology, long-term investment, product implementation, and industry reputation, showcasing diverse approaches to AI [12][18]
- The "2025 AI Annual Outstanding Products" list includes applications that integrate AI into daily communication, search, and creative processes, as well as tools embedded in enterprise workflows [24]
- The "2025 AI Annual Outstanding Solutions" list emphasizes solutions that package cutting-edge algorithms into mature product forms, enhancing real business processes and accelerating AI adoption [30][31]

Group 4: Key Figures in AI
- The "2025 AI Annual Focus Figures" list recognizes entrepreneurs and leaders who have made significant contributions to the AI field, underscoring the human side of technological progress [35][36]
- These individuals are honored for driving product and business growth, advancing scientific research, and fostering collaboration across the industry [35][36]
2-Bit Complex-Valued Models Rival Full Precision! Peking University's General Framework Lets Large Models Run Smoothly on Phones
量子位· 2025-12-10 04:26
Submitted by the Fairy2i team | 量子位 | 公众号 QbitAI

No retraining needed: model compression achieves 2-bit performance rivaling FP16.

Recently, a Peking University team proposed Fairy2i, a general framework for extreme low-bit quantization that works directly on existing pre-trained models.

The framework uses a widely linear representation to losslessly convert real-valued models into complex-valued form, then combines phase-aware quantization with recursive residual quantization, achieving the breakthrough of near-full-precision performance at only 2 bits.

More details below.

Research core: reusing real-valued weights plus recursive residual quantization

As is well known, large models are hard to deploy efficiently on edge devices such as phones and cars because of their enormous parameter storage and compute requirements.

Traditional quantization methods often suffer severe performance degradation when compressing models to extremely low bit-widths (1-2 bits), and struggle to balance compression against accuracy, especially when directly reusing pre-trained weights.

Fairy2i tackles this pain point head-on:

1. Widely linear representation: low-cost lossless inheritance, bridging the real and complex domains

Architecturally, Fairy2i solves the problem of how a real-valued model "transforms" into a complex-valued one, greatly reducing the cost of training.

Unlike approaches such as iFairy that require expensive pre-training from scratch, Fairy2i takes a more ...
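The pipeline described above (pairing real weights into complex numbers, phase-aware quantization, then quantizing the leftover residual) can be sketched as follows. This is a minimal illustration under assumed details: the 4-phase codebook {1, i, -1, -i}, the mean-magnitude scale, and all function names are choices of this sketch, not Fairy2i's published formulation.

```python
import numpy as np

# Assumed 2-bit codebook: the four unit phases {1, i, -1, -i}.
PHASES = np.array([1, 1j, -1, -1j])

def quantize_phase(w):
    """Snap each complex weight to the nearest scaled phase (phase-aware step)."""
    scale = np.mean(np.abs(w))
    # Broadcast each weight against all 4 codewords, pick the closest.
    idx = np.argmin(np.abs(w[:, None] - scale * PHASES), axis=-1)
    return scale * PHASES[idx], idx

def recursive_residual_quantize(w, stages=2):
    """Quantize, then quantize what the first pass missed, for `stages` rounds."""
    approx = np.zeros_like(w)
    codes = []
    for _ in range(stages):
        q, idx = quantize_phase(w - approx)  # quantize the current residual
        approx = approx + q
        codes.append(idx)
    return approx, codes

# "Widely linear" pairing: adjacent real weights form one complex number,
# a lossless reshaping of the pre-trained real-valued weights.
real_w = np.array([0.8, -0.2, -0.5, 0.6, 0.1, 0.9, -0.7, -0.3])
cw = real_w[0::2] + 1j * real_w[1::2]
approx, codes = recursive_residual_quantize(cw, stages=2)
```

Each stage stores only a 2-bit index per weight, so two stages cost 4 bits of codes per complex weight, i.e. 2 bits per original real weight, while the residual pass recovers much of the error the first pass leaves behind.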
Five Updates in Five Days: 可灵AI's Year-End "Full-Throttle" Upgrade Sprint
量子位· 2025-12-10 04:26
Core Viewpoint
- The article highlights 可灵AI's recent advances in generative AI, showcasing multiple new features and models that enhance user experience and creativity in video and image generation [2][15][29]

Group 1: New Product Launches
- 可灵AI launched several new products in quick succession, including the multi-modal video and image creation tool "可灵O1" and the "可灵2.6" model, significantly raising the level of competition in generative AI [2][15]
- "可灵O1" integrates video generation, editing, and modification into a single engine, letting users complete the entire creative process seamlessly [3][6]

Group 2: Technological Innovations
- The "可灵2.6" model introduces a groundbreaking "audio-visual co-generation" capability, producing video together with natural speech, sound effects, and ambient audio, transforming traditional workflows [10][11]
- The latest image generation model, "图像O1," lets users create images from text or from up to 10 uploaded reference images [7]

Group 3: User Engagement and Feedback
- The rapid update cadence reflects 可灵AI's responsiveness to user demands, with the community actively suggesting new features [8][15]
- The company has built a loyal base of over 20,000 enterprise users across industries including film production, advertising, and e-commerce [26][27]

Group 4: Market Position and Future Outlook
- 可灵AI is positioned as a leading player in the domestic video generation model sector, generating excitement with each iteration since its launch in June 2024 [20]
- Its technological and user-engagement momentum suggests a growing market presence across diverse creative fields [25][29]
量子位 Is Hiring Editors and Writers
量子位· 2025-12-10 04:26
Editorial team, from 凹非寺 | 量子位 | 公众号 QbitAI

The AI wave is still surging, but if you don't yet know how to join it... why not come to 量子位?

We are a content platform centered on tracking new advances in AI. After 8 years of accumulation, we have top-tier influence, broad and well-recognized industry resources, and a prime vantage point for observing and learning at the frontier of the era.

We are currently hiring in three directions, and hope you are (or can become) a content expert in one of them:

- AI Industry: infrastructure-layer innovation, including chips, AI Infra, and cloud computing
- AI Finance: venture capital and earnings in the AI field, tracking capital flows along the industry chain
- AI Products: progress in AI applications and hardware devices

All positions are full-time, based in Zhongguancun, Beijing. Roles at every seniority level are open; apply according to your background and experience.

Who we're looking for:
- Experienced hires: editor, lead writer, and managing editor levels, matched to ability
- Campus hires: fresh graduates; internships accepted, with conversion to full-time possible

What you gain by joining us:
- Stand at the crest of the AI wave: be the first to encounter the latest AI technologies and products, and build a complete AI knowledge system
- Master new AI tools: apply new AI technologies and tools in your work to boost efficiency and creativity
- Build personal influence: by writing exclusive original ...
Microsoft in a Panic! AI Product Sales Targets Slashed in Half as Internal Red Alerts Sound
量子位· 2025-12-09 10:44
Core Viewpoint
- Microsoft's AI products face significant market challenges, leading to reduced sales targets and waning user interest due to poor product performance [2][13][38]

Group 1: Sales Performance
- Microsoft has lowered its sales KPIs across AI product departments, particularly for Azure AI, signaling a serious decline in product demand [7][10]
- One Azure sales team targeted 50% growth for the Foundry platform but achieved less than 20%, prompting a revised target of 25% [9][10]
- Another Azure department cut its target from doubling sales to a 50% increase, reflecting widespread struggles across Microsoft's AI lineup [11][13]

Group 2: Strategic Issues
- Microsoft's AI products have failed to meet user needs, leaving integrated AI features in Windows and other applications largely unused [15][22]
- The company's strategy has been criticized for prioritizing low-cost, low-performance products, producing a weak market response [25]
- Reliance on OpenAI for technology and support has become a liability, especially as OpenAI faces its own challenges from competitors like Gemini [26][27][31]

Group 3: Competitive Landscape
- Google's AI products, particularly the Gemini model, are gaining traction and are expected to surpass Microsoft's Copilot in market position [34][35]
- Google's stronger product ecosystem enables quicker adoption of its AI innovations than Microsoft's offerings [37]
- Despite the downturn in AI product sales, Microsoft's overall AI business remains on a growth trajectory, driven mainly by its OpenAI partnership, which is projected to generate roughly $15 billion from cloud services [39][40]
Countdown: 1 Day to Go! See You Tomorrow at MEET2026
量子位· 2025-12-09 10:44
MEET Organizing Committee, from 凹非寺 | 量子位 | 公众号 QbitAI

9:00 AM, December 10: time flies, and the MEET2026 Intelligent Future Conference is tomorrow!

Come ring in the new year with AI. Note the time and place:

- Conference time: December 10, 2025 (Wednesday), 9:00-18:00
- Conference venue: Renaissance Beijing Jinmao Hotel (北京金茂万丽酒店)

Audience registration is still open! We look forward to seeing you in person tomorrow. If you can't attend on-site, you can also watch the livestream online.

Here is the latest full agenda (morning session):

- 09:15-09:20 Opening remarks, founder and CEO of 量子位
- 09:20-09:35 "AI+ Trends", 张亚勤, Dean of the Institute for AI Industry Research at Tsinghua University, academician of the Chinese Academy of Engineering
- 09:35-09:50 "AI Builds Super Agents: Enabling Super Individuals, Super Teams, Super Organizations", 王颖, Vice President of Baidu, head of the Wenku and Netdisk business units
- 09:50-10:05 "The Year of AI Awakening: From the Digital World to the Physical World", 王仲远, President of the Beijing Academy of Artificial Intelligence
- 10:05-10:20 "Hybrid AI: From Cloud to Edge Intelligence", 万卫星, Head of AI Product Technology, Qualcomm China
- 10:20-10:35 "Agentic AI: The Future Is Here"

The conference will also unveil ...
Inside the "Doubao Phone": Core Technology Explorations Open-Sourced Long Ago, Nearly Two Years of GUI Agent Groundwork, "the World's First True AI Phone"
量子位· 2025-12-09 07:37
Core Insights
- The article examines the rapid success and technological foundation of the "Doubao Phone" and its assistant, which has drawn significant market attention for its advanced ability to automate tasks on mobile devices [1][50]

Group 1: Product Overview
- The "Doubao Phone" sold out its initial stock of 30,000 units, with prices in the second-hand market doubling [1]
- The phone's assistant can automate complex cross-application tasks, such as submitting leave requests and booking train tickets [4][5]
- The assistant is built on ByteDance's self-developed UI-TARS model, optimized for mobile use [7][8]

Group 2: Technological Development
- UI-TARS has iterated rapidly: the initial version was released in January 2025, followed by UI-TARS-1.5 and the latest UI-TARS-2, which further enhances the agent's capabilities [11][23][34]
- UI-TARS-2 addresses data scalability and multi-round reinforcement learning, enabling more autonomous interaction with graphical user interfaces [34][35]
- The model has shown superior performance on various benchmarks compared to competitors, including OpenAI's models [27][28]

Group 3: User Experience and Feedback
- Users report high satisfaction with the assistant's task efficiency; one user described it as the "world's first true AI smartphone" [69]
- The assistant uses a dual-mode design, combining rapid responses with deeper reasoning capabilities [60][62]
- Privacy and security concerns have been raised, but the company emphasizes that high-level permissions require user consent [50][51]

Group 4: Market Implications
- The success of the "Doubao Phone" signals a shift toward AI-driven mobile technology, where devices autonomously understand and execute user intentions [85]
- The product's development reflects a broader industry trend of integrating advanced AI into everyday devices, potentially redefining how users interact with phones [86]
稚晖君's 5,000th Robot Rolls Off the Production Line! Hundreds of Millions of Yuan in Orders Just Three Years After Founding
量子位· 2025-12-09 05:39
Core Insights
- The company Zhiyuan has successfully mass-produced its 5,000th general-purpose humanoid robot, showcasing rapid growth in the embodied intelligence sector [1][5][8]
- This production scale is ahead of industry predictions, and 2026 could be a landmark year for mass production in the sector [9][10]

Production and Product Lines
- Zhiyuan's 5,000 robots span three main product lines:
  - The "Expedition" series, 1,742 units, designed for industrial manufacturing and interactive services [13][14]
  - The "Lingxi" series, 1,846 units, aimed at family companionship and entertainment, with advanced navigation and interaction capabilities [16][18]
  - The "Spirit" series, 1,412 units, focused on industrial applications, using a wheeled design for enhanced stability [20]

Market Applications
- The majority of Zhiyuan's robots are deployed in industrial manufacturing, with significant contracts in the automotive and electronics sectors [22][25]
- Notable partnerships include a multi-million-dollar deal with Longqi Technology for precision tablet-assembly tasks and a contract with Jingsheng Electronics for automotive safety component production [27]
- The company has also secured a major procurement project with China Mobile, deploying 200 humanoid robots for customer service [29]

Industry Context
- China's humanoid robot market is projected to reach roughly 5,000 units in sales by 2025, indicating growing demand for such technologies [7][8]
- International competitors are also ramping up: Figure targets an annual capacity of 12,000 units and Tesla close to 10,000, though actual production rates may vary [11][12]
摩尔线程 (Moore Threads) to Unveil Next-Generation GPU Architecture in 10 Days
量子位· 2025-12-09 05:39
Core Viewpoint
- The MUSA Developer Conference (MDC 2025) will be held in Beijing on December 19-20, 2025, focusing on the development of domestic full-function GPUs and breakthroughs in computing power for AI and GPU fields [1][2]

Group 1: Conference Overview
- MDC 2025 gathers global developers, technology leaders, and industry pioneers under the theme "Create, Connect, Converge" to discuss technological self-reliance and industrial upgrading [1]
- The conference will showcase the MUSA technology system and its full-stack capabilities, promoting the integration of GPU technology across industries [1][2]

Group 2: Main Forum Highlights
- The main forum centers on intelligent computing as a core engine of digital transformation, featuring a presentation by Zhang Jianzhong, founder and CEO of Moore Threads, on the new GPU architecture and strategic vision [2]
- The forum will also cover product systems, core technologies, industry solutions, and case studies [2][3]

Group 3: Technical Sessions
- More than 20 technical sub-forums will cover key areas such as intelligent computing, graphics computing, AI infrastructure, and developer tools, aimed at empowering developers and partners [4]
- The conference will pair cutting-edge technologies with industry practice [4]

Group 4: Developer Empowerment
- A "Moore Academy" will be established to support developer growth through systematic technology sharing, resource integration, and talent cultivation, fostering a sustainable domestic GPU application ecosystem [5]

Group 5: Interactive Experience
- A 1,000㎡ immersive "MUSA Carnival" will feature diverse themed exhibition areas spanning advanced technologies and popular application scenarios such as AI models, intelligent manufacturing, and digital twins [6][9]
- Live demonstrations will offer an interactive, hands-on experience of real-world technology-industry integration [7]