A humanoid-robot dark horse emerges from the Harbin Institute of Technology circle: founded less than a year ago, with a full-stack open-source 3 m/s prototype, backed by Xiaomi and SenseTime
量子位· 2026-01-19 07:00
Core Viewpoint
- Roboparty has launched a fully open-source bipedal humanoid robot prototype, "Roboto_Original," aiming to revolutionize humanoid robot development through collaborative innovation and shared resources [2][10].

Group 1: Open Source Initiative
- The open-source release includes not only software code but also hardware schematics, EBOM (engineering bill of materials) lists, supplier information, and a comprehensive knowledge base to facilitate development [5][10].
- The goal is to create a reproducible, verifiable, and modifiable open-source framework, addressing the industry's long-standing pain points of high development barriers and lack of standardization [6][9][10].

Group 2: Technical Specifications
- The "Roboto_Original" prototype reaches a running speed of up to 3 m/s, positioning it among the leading open-source humanoid robots globally [4][24].
- The robot stands 1.2 m tall and weighs 30 kg, with detailed design documents available to lower the barriers to hardware development and replication [12][14].

Group 3: Software and Control
- The project has released the full control code covering core modules for imitation, perception, and navigation, allowing developers to leverage extensive motion capture data [16].
- The AMP control algorithm underpins the robot's walking and running capabilities, ensuring natural movement and stability, which is crucial for real-world applications [26][27].

Group 4: Engineering and Collaboration
- Roboparty has established a knowledge base for hands-on learning in humanoid robotics, focusing on practical issues like walking stability and production costs [21][36].
- The initiative aims to shift the industry from isolated trial-and-error toward collaborative breakthroughs, fostering a community-driven development environment [22][30].

Group 5: Industry Impact and Funding
- The project has secured millions in seed funding from notable investors, indicating strong market interest and validation of its technological approach [29].
- Roboparty aims to reduce development costs by 80%, making humanoid robotics more accessible and scalable across industries [32][31].
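The summary above credits an AMP control algorithm for the robot's gait without spelling out the mechanism. Below is a minimal Python sketch of the standard Adversarial Motion Priors reward shaping (a task reward blended with a discriminator-based style reward); the function names, weights, and the least-squares-GAN style term follow the published AMP formulation and are assumptions, not Roboparty's released code.

```python
# Minimal sketch of AMP-style reward shaping (Adversarial Motion Priors),
# NOT Roboparty's controller: a discriminator trained on motion-capture
# transitions yields a "style" reward that is blended with the task reward
# (e.g. tracking a commanded running velocity). Weights are assumed.

def amp_style_reward(disc_score: float) -> float:
    # Least-squares-GAN form commonly used with AMP: transitions the
    # discriminator scores near +1 ("looks like reference motion") get a
    # style reward near 1; implausible transitions get 0.
    return max(0.0, 1.0 - 0.25 * (disc_score - 1.0) ** 2)

def combined_reward(task_reward: float, disc_score: float,
                    w_task: float = 0.5, w_style: float = 0.5) -> float:
    # The policy maximizes a weighted sum of task progress and motion style.
    return w_task * task_reward + w_style * amp_style_reward(disc_score)

# Example: good velocity tracking (0.9) and a fairly natural-looking
# transition (discriminator score 0.6).
print(combined_reward(0.9, 0.6))  # 0.5 * 0.9 + 0.5 * 0.96 = 0.93
```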
A 45-year-old number theory conjecture is independently proven by GPT-5.2 Pro; Terence Tao: it made no mistakes
量子位· 2026-01-19 07:00
Core Viewpoint
- The article covers OpenAI's GPT-5.2 Pro independently proving problem #281 from the Erdős problem collection, a conjecture that had remained unsolved for 45 years [2][4][5].

Group 1: Proof and Validation
- The proof was verified by Fields Medalist Terence Tao, who described it as "the clearest first-class result contributed by AI to date" [3].
- The proof draws on ergodic theory and combinatorics, specifically leveraging the Birkhoff theorem while avoiding common pitfalls such as improper limit exchanges and quantifier-order errors [9][15][12].
- Tao translated the proof into combinatorial language and confirmed that it is indeed correct [16][17].

Group 2: Alternative Solutions
- Unexpectedly, a user named KoishiChan pointed out that a simpler solution exists, using two theorems established in 1936 and 1966 [18].
- The first is the density convergence theorem proven jointly by Harold Davenport and Paul Erdős in 1936; the second is Rogers' theorem from a 1966 publication [19].
- This raises the question of why Erdős himself did not notice how close the solution was when he posed the problem in 1980 [20].

Group 3: AI's Success Rate and Future Implications
- Following the announcement, various AI models were tested on validating the proof, with Gemini 3 Pro confirming its correctness [24].
- Tao cautioned, however, that the apparent success rate of AI tools on such problems is skewed by reporting bias, with only about 1% to 2% of attempts yielding positive results [30].
- Despite this low success rate, with over 600 unsolved problems remaining in the Erdős collection, AI contributions could still be significant [31].
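For readers unfamiliar with the ergodic-theory ingredient cited above, here is the standard statement of the Birkhoff pointwise ergodic theorem. This is textbook material for reference only; it is not a reproduction of the GPT-5.2 Pro proof, and the precise statement of Erdős problem #281 is deliberately not restated here.

```latex
% Standard statement of the Birkhoff pointwise ergodic theorem.
\begin{theorem}[Birkhoff]
Let $(X,\mathcal{B},\mu)$ be a probability space and $T\colon X\to X$ a
measure-preserving transformation. For every $f\in L^{1}(\mu)$,
\[
  \lim_{N\to\infty}\frac{1}{N}\sum_{n=0}^{N-1} f\bigl(T^{n}x\bigr)
  = \mathbb{E}\bigl[f \mid \mathcal{I}\bigr](x)
  \quad\text{for $\mu$-almost every } x\in X,
\]
where $\mathcal{I}$ is the $\sigma$-algebra of $T$-invariant sets; in
particular, if $T$ is ergodic the limit equals $\int_X f\,d\mu$.
\end{theorem}
```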
A real cheat code! New MIT research: zero architecture changes let large models unlock contexts of tens of millions of tokens
量子位· 2026-01-19 03:48
Core Insights
- The article discusses a new method called Recursive Language Model (RLM) developed by MIT CSAIL for processing long texts, addressing the issue of context decay in large models [1][5][11].
- RLM allows top models like GPT-5 and Qwen-3 to handle super long texts with millions of tokens without modifying their architecture [2][23].

Summary by Sections

Context Decay Issue
- Large models struggle with context decay, where performance declines as the text length increases, leading to a loss of memory for earlier information [5][6].
- Current mainstream solutions include context compression, retrieval-augmented generation (RAG), and architectural optimizations [7][10].

RLM Methodology
- RLM outsources context processing to an interactive Python environment, enabling models to programmatically break down tasks and process them as needed [4][13][15].
- The model initiates a Python REPL environment, storing long prompts as string variables and performing operations like keyword filtering and logical decomposition [14].

Performance Metrics
- RLM has demonstrated the ability to effectively handle over 10 million tokens, significantly surpassing the native context window of models like GPT-5 [16].
- In complex long text tasks, RLM showed substantial improvements, achieving F1 scores of 58.00% and 23.11% for GPT-5 and Qwen-3, respectively, in the OOLONG-Pairs task [16].
- For the BrowseComp-Plus multi-document reasoning task, RLM (GPT-5) achieved a correct rate of 91.33%, outperforming other long text processing methods [16].

Cost Efficiency
- RLM's cost at the 50th percentile is competitive with other long text processing solutions, indicating a favorable cost-performance ratio in most regular task scenarios [19].
- However, at the 95th percentile, RLM's costs can spike due to its dynamic reasoning process, which increases API call frequency based on task complexity [20][21].
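The RLM methodology above describes storing the long prompt as a string variable in a Python REPL and filtering or decomposing it programmatically. The sketch below illustrates that loop under stated assumptions: `call_llm`, the chunk size, and the keyword filter are hypothetical stand-ins, not the MIT CSAIL implementation.

```python
# Minimal sketch of the RLM loop described above, not the MIT CSAIL code:
# the long prompt lives as an ordinary Python string that the root model can
# slice, filter, and hand off to recursive sub-calls. `call_llm` is a
# hypothetical stand-in for whatever model API the REPL actually drives.

def call_llm(prompt: str) -> str:
    raise NotImplementedError("stand-in for a real model API call")

def recursive_answer(long_context: str, question: str,
                     chunk_chars: int = 20_000) -> str:
    # 1. Keyword filtering: keep only chunks that mention terms from the question.
    keywords = [w.lower() for w in question.split() if len(w) > 3]
    chunks = [long_context[i:i + chunk_chars]
              for i in range(0, len(long_context), chunk_chars)]
    relevant = [c for c in chunks if any(k in c.lower() for k in keywords)]

    # 2. Logical decomposition: each relevant chunk is handled by a sub-call,
    #    so no single call ever sees the full multi-million-token context.
    partials = [call_llm(f"Context:\n{c}\n\nExtract facts relevant to: {question}")
                for c in relevant]

    # 3. Final synthesis over the much shorter partial answers.
    notes = "\n".join(partials)
    return call_llm(f"Combine these notes and answer the question.\n"
                    f"Notes:\n{notes}\n\nQuestion: {question}")
```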
Zero-shot & few-shot sweep of 12 industrial and medical datasets: new Siemens × Tencent YouTu research precisely localizes defects, setting a new SOTA in detection accuracy | AAAI 2026
量子位· 2026-01-19 03:48
Core Insights
- The article discusses the development of AdaptCLIP, a universal visual anomaly detection framework that aims to improve performance in industrial quality inspection and medical imaging by leveraging the capabilities of the CLIP model while addressing its limitations in zero-shot and few-shot scenarios [2][4].

Group 1: Challenges in Anomaly Detection
- Traditional models for defect detection require extensive labeled data, making them less effective in real-world scenarios where data is scarce [1][3].
- The core challenge in anomaly detection is the need for models to generalize across domains while accurately identifying subtle anomalies with minimal target-domain data [3][4].

Group 2: AdaptCLIP Framework
- AdaptCLIP introduces a lightweight adaptation approach by adding three adapters to the CLIP model without altering its core structure, enabling it to perform both image-level anomaly classification and pixel-level anomaly segmentation [5][6].
- The framework employs an alternating learning strategy, optimizing visual and textual representations separately to enhance performance in zero-shot anomaly detection [20][21].

Group 3: Key Innovations
- The visual adapter fine-tunes CLIP's output tokens to better align with the anomaly detection task, significantly improving pixel-level localization capabilities [15][18].
- The text adapter eliminates the need for manually designed prompts by learning optimized embeddings for the "normal" and "anomalous" classes, thus reducing dependency on prompt engineering [16][18].

Group 4: Experimental Results
- AdaptCLIP achieved an average image-level AUROC of 86.2% across multiple industrial datasets in zero-shot scenarios, outperforming existing methods [31].
- In medical imaging tasks, AdaptCLIP demonstrated an average pixel-level AUPR of 48.7% and an average image-level AUROC of 90.7%, indicating superior performance compared to other approaches [31][32].

Group 5: Efficiency and Scalability
- The model introduces approximately 0.6 million additional trainable parameters under zero-shot conditions, significantly lower than competing methods that can exceed 10.7 million parameters [32][37].
- AdaptCLIP maintains a reasonable inference time of about 162 ms per image at a resolution of 518x518, balancing detection accuracy with deployment efficiency [32][37].
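As a rough illustration of the "lightweight adapters on a frozen CLIP" idea described above, here is a minimal PyTorch sketch of a residual bottleneck adapter applied to CLIP output tokens. The module name, dimensions, and bottleneck width are assumptions chosen for illustration; this does not reproduce the paper's three-adapter design or its exact parameter count.

```python
# Rough sketch of a residual bottleneck adapter on frozen CLIP tokens,
# not the paper's three-adapter design: dimensions, bottleneck width, and
# the `clip_visual` handle are assumptions for illustration only.
import torch
import torch.nn as nn

class LightweightAdapter(nn.Module):
    def __init__(self, dim: int = 768, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)
        self.act = nn.GELU()

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # Residual adaptation: frozen CLIP features are nudged toward the
        # anomaly-detection task instead of being replaced outright.
        return tokens + self.up(self.act(self.down(tokens)))

# Usage: keep the CLIP backbone frozen and train only the adapter(s).
# for p in clip_visual.parameters():   # clip_visual: frozen CLIP image encoder
#     p.requires_grad = False
adapter = LightweightAdapter()
patch_tokens = torch.randn(1, 197, 768)   # dummy ViT-B/16-style token grid
print(adapter(patch_tokens).shape)        # torch.Size([1, 197, 768])
```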
Fei-Fei Li's World Labs teams up with Guanglun Intelligent (光轮智能): embodied intelligence enters the evaluation-driven era!
量子位· 2026-01-19 03:48
Core Viewpoint
- The collaboration between World Labs, led by Fei-Fei Li, and Guanglun Intelligent, a leading synthetic data company, aims to address the long-standing issue of "scalable evaluation" in the field of embodied intelligence, marking the technology's entry into an evaluation-driven era [1][2][3].

Group 1: Companies Involved
- World Labs was founded by Fei-Fei Li, a prominent figure in AI known for her work on ImageNet and as a former chief AI scientist at Google Cloud [4][5].
- Guanglun Intelligent is recognized as a hot company in the embodied intelligence infrastructure sector, having established a strong partnership with NVIDIA and contributed to the development of simulation systems [54][55].

Group 2: Technological Innovations
- World Labs is set to launch its first product, Marble, by the end of 2025, which can generate high-fidelity 3D worlds from minimal input [8][9].
- Marble aims to provide a visualized world model, allowing users to create and export 3D environments efficiently, serving as a productivity tool for visual-effects and game developers [15][16].

Group 3: Challenges in Evaluation
- The rapid advancement of models in embodied intelligence has outpaced existing benchmarks, creating a need for new evaluation methods [20][22].
- Traditional evaluation methods are inadequate for assessing the capabilities of embodied intelligence, necessitating the use of simulation as a scalable solution [29][30].

Group 4: Strategic Collaboration
- The partnership between World Labs and Guanglun Intelligent is crucial for developing a comprehensive evaluation framework that integrates environment generation and physical interaction [37][49].
- Guanglun Intelligent's role is to provide the necessary physical assets and evaluation loops, ensuring that the simulated environments can support real physical interactions [49][50].

Group 5: Future Directions
- The collaboration marks a pivotal moment for the embodied intelligence sector as it transitions into an evaluation-driven era, with the potential to shape research directions and identify technological bottlenecks [71][72][76].
- The establishment of robust evaluation standards, such as RoboFinals, highlights the industry's shift toward scalable and credible assessment frameworks for advanced robotic models [63][64].
QbitAI (量子位) is hiring editors and writers
量子位· 2026-01-19 03:48
Editorial team, from Aofeisi
QbitAI | WeChat official account QbitAI

The AI boom is still surging, but if you don't yet know how to take part... then why not join QbitAI (量子位)?

We are a content platform centered on tracking new developments in AI. After eight years of accumulation, we have top-tier influence, broad and well-recognized industry resources, and one of the best vantage points for observing and learning at the forefront of the era.

We are currently hiring in three directions, and we hope you are (or can become) a content expert in one of them. All positions are full-time, based in Zhongguancun, Beijing.

Who the positions are open to:

What you gain by joining us:

Position details are below. Roles at every seniority level are open for all positions; you are welcome to apply based on your own background and experience.

AI Industry track
Responsibilities:

AI Industry track: focuses on infrastructure-layer innovation, covering chips, AI Infra, and cloud computing;
AI Finance track: focuses on AI venture capital and earnings reports, tracking capital flows across the industry chain;
AI Product track: focuses on AI progress in applications and hardware devices.

AI Finance & Business track
Responsibilities:
Requirements:

AI Product track
Responsibilities:

Experienced hires: editor, lead writer, and editor-in-chief levels, matched to ability;
Campus hires: new graduates; internships with the option to convert to full-time are accepted.

Stand at the crest of the AI wave: be the first to access and understand the latest AI technologies and products, and build a complete AI knowledge system.
Master new AI tools: apply all kinds of new AI technologies and tools ...
The world's first robot with a 100-jin (50 kg) payload doing real, sustained work comes from Galbot (银河通用)
量子位· 2026-01-19 01:00
Hengyu, reporting from Aofeisi
QbitAI | WeChat official account QbitAI

The world's first embodied-AI robot with a 50 kg payload that does real, autonomous work has already started working in a CATL (宁德时代) factory!

It is Galbot S1, the heavy-payload embodied-AI robot recently released by Galbot (银河通用).

Its dual arms sustain a maximum continuous working payload of 50 kg, breaking through the industry ceiling in one stroke and filling the gap between embodied AI and the hard payload requirements of industrial settings.

With this, embodied AI formally enters a new era of industrial-grade heavy payloads.

Galbot S1 has already reached production-line deployment at leading manufacturers such as CATL and has begun taking on heavy-payload critical steps in advanced manufacturing processes.

This is the first time an embodied-AI robot has entered core production processes in industrial heavy-payload scenarios, a milestone for the industry: Galbot S1 is a replicable, scalable productivity solution, and its technical capabilities have been shown to match real industrial needs.

When leading manufacturers hand core production steps over to embodied-AI systems, it reflects recognition of the technology's reliability and long-term value, and it means that "embodied-AI productivity" has formally merged into the main channel of industrial upgrading.

50 kg dual-arm heavy payload: Galbot S1 breaks through the embodied-AI payload ceiling

In embodied AI, payload capacity has always been a genuine industrial threshold.

Over the past few years the industry has not lacked headline specs, but in real industrial contexts the effective payload of most embodied-AI robots remains under ten kilograms, often relying on single-arm, static, or short-duration oper ...
Musk's largest compute center is complete: the world's first GW-scale supercomputing cluster sets another world record
量子位· 2026-01-18 05:29
Core Viewpoint
- The launch of Colossus 2, the world's first 1GW supercomputing cluster, marks a significant advancement in AI infrastructure, with plans to upgrade to 1.5GW by April and potentially reach 2GW, which could match the power consumption of major U.S. cities [2][12].

Group 1: Colossus 2 Overview
- Colossus 2 is equipped with approximately 200,000 NVIDIA H100/H200 GPUs and around 30,000 NVIDIA GB200 NVL72 GPUs, significantly enhancing its computational power compared to its predecessor, Colossus 1, which was built in just 122 days [9][10].
- The cluster's 1GW capacity can power about 750,000 households, equivalent to the peak power demand of San Francisco [11].
- Once fully operational, Colossus 2 will house 555,000 GPUs, surpassing the GPU counts of Meta, Microsoft, and Google [13][14].

Group 2: Implications for AI Development
- The advancements in Colossus 2 are expected to facilitate the development of Grok 5, which is projected to have around 6 trillion parameters, more than double that of Grok 4 [15][18].
- With the recent $20 billion funding round for xAI, the scaling capabilities for Grok 5 are increasing, leading to larger model parameters and faster training and deployment speeds [18][19].
- The rapid development of AI models is seen as a competitive advantage in the industry, emphasizing that speed is a crucial factor in the AI era [20].

Group 3: Energy Supply Concerns
- The construction of large data centers like Colossus 2 is contributing to a projected annual electricity demand growth of 4.8% over the next decade, which is unprecedented for the U.S. energy system [27].
- The imbalance between rapidly increasing demand and slow supply growth is raising concerns about the stability of the power grid, leading to potential rolling blackouts for 67 million residents in 13 states during extreme weather [5][22][23].
- PJM, the regional transmission organization, is struggling to maintain supply-demand balance and has proposed measures to reduce peak demand from data centers, which have faced opposition from major tech companies [32][34].
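A quick back-of-the-envelope check of the power figures cited above, assuming the headline 1 GW is divided evenly; these are implied averages for illustration, not disclosed specifications.

```python
# Illustrative ratios only, derived from the figures quoted in the summary;
# per-household and per-GPU numbers are implied averages, not disclosed specs.
cluster_power_w = 1e9          # headline 1 GW capacity
households = 750_000           # households the article says 1 GW can power
gpus_full_buildout = 555_000   # GPU count at full build-out

print(cluster_power_w / households)          # ≈ 1,333 W average per household
print(cluster_power_w / gpus_full_buildout)  # ≈ 1,802 W of facility power per GPU
```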
Robots can finally figure out the dishwasher | New UC Berkeley research
量子位· 2026-01-18 05:29
Contributed by the Choice Policy team
QbitAI | WeChat official account QbitAI

Autonomously using a dishwasher in a home kitchen, or wiping a whiteboard in an office while on the move: scenarios humans take for granted are, for humanoid robots, "high-difficulty challenges" that require coordinating joints across the whole body.

Recently, a team at UC Berkeley posted a paper on arXiv titled "Coordinated Humanoid Manipulation with Choice Policies." Through an innovative scheme of "modular teaching + intelligent action selection," it tackles the core problem of whole-body coordination for humanoid robots and paves the way for them to enter real human environments.

The "two dilemmas" keeping humanoid robots out of daily life

Humanoid robots have long carried high expectations of helping humans with everyday work in unstructured environments such as homes and offices, but two key problems have kept them from breaking out of the "laboratory boundary" and reaching real deployment:

Problem 1: Whole-body coordination is hard, and "teaching data" is expensive and difficult to obtain

Long-horizon, continuous tasks such as using a dishwasher or wiping a board while moving require the robot to simultaneously coordinate its head (locating targets), hands (grasping and manipulation), and legs (locomotion and balance), achieving the human-like state of "eyes on the target, hands in place, steady footing."

But the traditional "teleoperation" paradigm requires an operator to simultaneously control dozens of the robot's ...
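The "modular teaching + intelligent action selection" scheme described above is only summarized here, so the following Python sketch shows one generic way such a selection step could rank proposals from separately taught skill modules; the `skills` dictionary and `score` function are hypothetical stand-ins and do not reflect the paper's actual architecture.

```python
# Generic sketch of "modular teaching + action selection", not the Berkeley
# implementation: separately taught skill modules each propose an action and
# a learned scorer decides which proposal the humanoid executes this step.
# `skills` and `score` are hypothetical stand-ins.
from typing import Callable, Dict
import numpy as np

Observation = np.ndarray
Action = np.ndarray

def choose_action(obs: Observation,
                  skills: Dict[str, Callable[[Observation], Action]],
                  score: Callable[[Observation, Action], float]) -> Action:
    # Modules (head tracking, arm manipulation, locomotion, ...) are taught
    # in isolation; at run time the selector only has to rank their proposals.
    proposals = {name: policy(obs) for name, policy in skills.items()}
    best = max(proposals, key=lambda name: score(obs, proposals[name]))
    return proposals[best]
```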
Headhunter Jensen Huang's 2025: executives poached from the giants, Chinese startup teams favored for the hands-on work
量子位· 2026-01-18 05:29
Henry, reporting from Aofeisi
QbitAI | WeChat official account QbitAI

Already number one in the world by market cap, how do you keep climbing?

NVIDIA's answer is simple: poach people, and poach more of them.

Throughout 2025, Jensen Huang has been expanding the management ranks while paying to acquire teams.

From poaching marketing, policy, and HR executives to acquiring startups to bring in technical leads "as a package," a signature playbook of "Huang-style poaching + Huang-style acquisition" is taking shape.

Beyond chips: using hiring to reshape the "second growth curve"

In fiscal year 2025, NVIDIA posted revenue of $130.5 billion, more than double the prior fiscal year, a growth miracle in tech history.

At the same time, NVIDIA is using hiring to reshape its own "second growth curve":

On one hand, it is systematically poaching talent to fill in key capabilities in marketing, policy, research, and organizational management.

On the other hand, it is acquiring startups to bring core technical leads and key software engineers directly into the fold.

In its latest personnel move this year, NVIDIA turned its poaching shovel toward Google.

Reportedly, NVIDIA will hire Google Cloud veteran Alison Wagonfeld as the company's first Chief Marketing Officer (CMO).

Wagonfeld officially took up the role in February this year and will consolidate responsibilities previously scattered across multiple executives, taking full charge of NVIDIA's marketing and communications.

(Note: NVIDIA has never previously had a dedicated Chief Marketing Officer (CM ...