量子位
Musk's brain-computer interface controls a robotic arm by thought! The demonstrator receives a "kiss of steel"; in theory, it could control anything
量子位· 2025-12-03 13:06
Core Viewpoint
- Neuralink is making significant advancements in brain-machine interface technology, focusing on enabling paralyzed patients to regain mobility and enhancing user interaction with technology through thought control [20][29].

Group 1: Technological Developments
- Neuralink has successfully demonstrated thought-controlled robotic arm movements, showcasing the potential for direct brain control over devices [2][3].
- The company is implementing a "dual implant" strategy, adding a second interface to the spinal cord of its first human subject to create a "digital neural bridge" that bypasses damaged biological pathways [8][9].
- Recent improvements have addressed electrode stability, ensuring long-term signal reliability and shifting focus from unidirectional brain-signal reading to complex bidirectional communication between the brain and spinal cord [9].

Group 2: User Experiences and Innovations
- Users of Neuralink's technology are actively exploring applications beyond the official functionality, such as controlling everyday devices [10][11].
- A user named Brad Smith has integrated a consumer webcam with the Neuralink system to overcome physical limitations caused by his condition, demonstrating the adaptability of the technology [12][15].
- Another user, Nathan Copeland, emphasizes the importance of pairing the brain-machine interface with common devices to enhance daily life, illustrating the practical utility of the technology [19].

Group 3: Clinical Progress and Future Directions
- Neuralink completed its first human implant in January 2024, a significant milestone in its clinical trials [21][29].
- The company has received FDA breakthrough device designations for projects aimed at restoring vision and aiding speech, indicating regulatory support for its approach [27][28].
- As of September 2024, Neuralink had completed implants for 12 subjects, with over 15,000 hours of device operation and no major rejection issues, paving the way for broader clinical applications [29].
A new breakthrough in humanoid robot control! Agility and stability without compromise: a single policy lets a humanoid robot do the Ip Man squat and dance | HKU & NVIDIA & Tsinghua
量子位· 2025-12-03 13:06
Contributed by OpenDriveLab
量子位 | 公众号 QbitAI

The Ip Man squat, dancing, running: one policy handles them all!

Recently, a joint research team from the University of Hong Kong, NVIDIA, and Tsinghua University proposed AMS (Agility Meets Stability), a unified whole-body control framework for humanoid robots that, for the first time, achieves both dynamic motion tracking and extreme balance control within a single policy.

Core idea: AMS tackles the unification of dynamic motion and balance control from three key aspects:
1. Heterogeneous data sources: sample directly from the robot's action space to generate scalable balance data, breaking free of the limits of human data and alleviating the long-tail distribution problem.
2. Hybrid reward mechanism: apply balance-prior rewards selectively, providing precise balance guidance without sacrificing agility and resolving the conflict between optimization objectives.
3. Adaptive learning strategy: dynamically adjust sampling probabilities and "teach each motion according to its aptitude," enabling efficient adaptive learning (an illustrative sketch of this sampling idea follows below).

The details are below.

The humanoid robot's "dilemma"

To perform diverse tasks in human environments, a humanoid robot needs two seemingly contradictory abilities at once: agile dynamic motion and precise balance control. Humans, by contrast, achieve this coordination easily and naturally, for example placing an object precisely right after walking dynamically, or using a free limb as temporary support to reach for an object while standing on one leg. For humanoid robots, however, achieving both abilities at the same time is a ...
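The adaptive learning strategy above is described only at a high level. As a loose illustration (not the authors' implementation), here is a minimal Python sketch of how per-motion sampling probabilities could be re-weighted by the policy's current tracking error; the motion names, error values, and softmax weighting are all assumptions.

```python
import numpy as np

# Hypothetical per-motion tracking errors measured during training rollouts.
# A higher error means the policy still struggles with that motion.
tracking_error = {"ip_man_squat": 0.32, "dance": 0.18, "run": 0.07}

def sampling_probs(errors, temperature=0.5):
    """Turn per-motion errors into sampling probabilities.

    Motions the policy handles poorly are sampled more often, so training
    effort adapts to each motion ("teach according to aptitude").
    """
    names = list(errors)
    e = np.array([errors[n] for n in names])
    logits = e / temperature
    p = np.exp(logits - logits.max())
    return dict(zip(names, p / p.sum()))

probs = sampling_probs(tracking_error)
motion = np.random.choice(list(probs), p=list(probs.values()))
print(probs, "->", motion)
```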
A bug has been found in DeepSeek-V3.2: it burns through tokens and its answers may still be wrong; researchers say GRPO's old problems remain unsolved
量子位· 2025-12-03 09:05
Core Viewpoint
- DeepSeek-V3.2 has gained significant attention but has been found to have issues, particularly with token consumption during complex tasks, leading to longer and potentially incorrect answers [1][4][5].

Group 1: Token Consumption Issues
- DeepSeek-V3.2's Speciale version consumes more tokens than its competitors, using 77,000 tokens for certain tasks where Gemini uses only 20,000 [5].
- The model's reliance on the GRPO algorithm has led to a "length bias," in which longer incorrect answers are penalized less, resulting in "long and wrong" responses [10][11].

Group 2: Hidden Biases in the GRPO Algorithm
- The GRPO algorithm has two hidden biases: a length bias and a difficulty bias. The length bias favors longer incorrect answers, while the difficulty bias makes the model focus excessively on overly simple or overly difficult questions, neglecting the medium-difficulty questions that matter most for skill improvement (an illustrative sketch of both effects follows this summary) [10][12].
- Despite attempts to address these biases, the length bias remains a challenge, as acknowledged in DeepSeek's technical report [15][13].

Group 3: Cost and Resource Considerations
- DeepSeek-V3.2's output cost is significantly lower than GPT-5's, at only 1/24 of the price, which may make its token-efficiency issues more acceptable [17].
- The model's 128K context length has not been updated for a long time, which may be related to limited GPU resources [18].
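The exact training loss is not shown in this digest, but both biases described above follow from how GRPO normalizes rewards within a group and (in common implementations) divides the per-response loss by response length. The following numeric sketch uses assumed rewards and lengths purely to illustrate the effect; it is not DeepSeek's actual training code.

```python
import numpy as np

def grpo_advantages(rewards):
    """Group-relative advantages as in GRPO: (r - mean) / std within a group."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + 1e-8)

# One prompt, four sampled answers: reward 1 = correct, 0 = wrong.
rewards = [1, 1, 0, 0]
lengths = [200, 250, 300, 3000]      # tokens; the last wrong answer is very long
adv = grpo_advantages(rewards)

# With per-response length normalization, the penalty spread over each token
# of a wrong answer shrinks as the answer grows longer.
per_token_penalty = adv / np.array(lengths)
print(per_token_penalty)
# The 3000-token wrong answer gets roughly 1/10 the per-token penalty of the
# 300-token wrong answer, so "long and wrong" outputs are discouraged less.

# Difficulty bias: dividing by the group's reward std up-weights prompts whose
# answers are almost all right or almost all wrong (small std), relative to
# medium-difficulty prompts.
easy = grpo_advantages([1, 1, 1, 0])    # small std -> larger-magnitude advantages
medium = grpo_advantages([1, 1, 0, 0])  # larger std -> smaller-magnitude advantages
print(np.abs(easy).max(), np.abs(medium).max())
```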
The young are to be reckoned with! Kaiming He's team releases new work, with a co-first author who is a sophomore in Tsinghua's Yao Class
量子位· 2025-12-03 09:05
Core Viewpoint
- The article discusses Improved MeanFlow (iMF), which addresses key issues in the original MeanFlow (MF) model, improving training stability, guidance flexibility, and architectural efficiency [1].

Group 1: Model Improvements
- iMF reformulates the training objective as a more stable instantaneous-velocity loss and introduces flexible classifier-free guidance (CFG) and efficient in-context conditioning, significantly improving model performance [2][14].
- On the ImageNet 256x256 benchmark, the iMF-XL/2 model achieves an FID of 1.72 at 1-NFE, a 50% improvement over the original MF, demonstrating that single-step generative models can match multi-step diffusion models [2][25].

Group 2: Technical Enhancements
- The core improvement of iMF is the reconstruction of the prediction function, turning training into a standard regression problem [4].
- iMF constructs the loss from the perspective of instantaneous velocity, stabilizing training (a generic sketch of this kind of objective follows this summary) [9][10].
- The model simplifies the input to a single noisy data point and modifies how the prediction function is computed, removing the dependency on external approximations [11][12][13].

Group 3: Flexibility and Efficiency
- iMF internalizes the guidance scale as a learnable condition, letting the model learn average velocity fields under varying guidance strengths and making CFG flexible at inference time [15][16][18].
- The improved in-context conditioning architecture removes the need for the large adaLN-zero mechanism, optimizing model size and efficiency, with iMF-Base reducing parameters by about one third [19][24].

Group 4: Experimental Results
- iMF performs strongly on challenging benchmarks, with iMF-XL/2 achieving an FID of 1.72 at 1-NFE, outperforming many pre-trained multi-step models [26][27].
- At 2-NFE, iMF further narrows the gap between single-step and multi-step diffusion models, reaching an FID of 1.54 [29].
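The summary says iMF turns training into a standard regression on instantaneous velocity but does not give the objective. Purely as a point of reference, here is a minimal rectified-flow-style sketch in Python (PyTorch) of what an instantaneous-velocity regression looks like; the network, dimensions, and linear interpolation path are assumptions, and this is not iMF's actual formulation, which builds on MeanFlow's average-velocity prediction.

```python
import torch
import torch.nn as nn

class TinyVelocityNet(nn.Module):
    """Toy network that predicts a velocity vector from (z_t, t)."""
    def __init__(self, dim=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim + 1, 256), nn.SiLU(), nn.Linear(256, dim))

    def forward(self, z_t, t):
        return self.net(torch.cat([z_t, t], dim=-1))

model = TinyVelocityNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

x1 = torch.randn(32, 64)        # stand-in for data samples
x0 = torch.randn(32, 64)        # noise samples
t = torch.rand(32, 1)           # random times in [0, 1]

z_t = (1 - t) * x0 + t * x1     # point on the linear interpolation path
v_target = x1 - x0              # instantaneous velocity along that path

opt.zero_grad()
loss = ((model(z_t, t) - v_target) ** 2).mean()  # plain regression loss
loss.backward()
opt.step()
print(float(loss))
```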
Flash report: the MEET2026 speaker lineup has been updated again; register soon to attend
量子位· 2025-12-03 02:38
Core Insights
- The MEET2026 Smart Future Conference will focus on the cutting-edge technologies and industry developments that have drawn the most attention throughout the year [1].
- The theme "Symbiosis Without Boundaries, Intelligence to Ignite the Future" emphasizes how AI and smart technologies penetrate industries, disciplines, and scenarios, becoming a core driving force of societal evolution [2].

Group 1: Conference Highlights
- The conference will cover this year's hot topics in tech, including reinforcement learning, multimodal AI, chip computing power, AI across industries, and AI going global [3].
- It will feature the latest collisions between academic frontiers and commercial applications, showcasing leading achievements across infrastructure, models, and products [4].
- The event will also include the authoritative release of the annual AI rankings and the annual AI trends report [5][116].

Group 2: Notable Speakers
- Zhang Yaqin, Dean of Tsinghua University's Institute for AI Industry Research (AIR) and an academician of the Chinese Academy of Engineering, will be a key speaker [11].
- Sun Maosong, Executive Vice Dean of Tsinghua University's Institute for Artificial Intelligence, has led numerous national projects [15].
- Wang Zhongyuan, Director of the Beijing Academy of Artificial Intelligence, has extensive experience in core AI research [19].
- Wang Ying, Vice President of Baidu Group, oversees several key business units, including Baidu Wenku and Baidu Netdisk [24].
- Han Xu, Founder and CEO of WeRide, has led the company to become a leader in autonomous driving technology [28].

Group 3: Annual AI Rankings and Trends
- The "Artificial Intelligence Annual Rankings," initiated by 量子位, have become among the most influential rankings in the AI industry, evaluating companies, products, and individuals across three dimensions [117].
- The "2025 Annual AI Top Ten Trends Report" will analyze ten AI trends that are releasing significant potential, weighing factors such as technological maturity and practical application [118].

Group 4: Event Details
- The MEET2026 Smart Future Conference is scheduled for December 10, 2025, at the Beijing Jinmao Renaissance Hotel, and registration is now open [119].
- The conference aims to attract thousands of tech professionals and millions of online viewers, establishing itself as an annual barometer for the smart technology industry [122].
The No. 1 in cloud computing released 25 new products in 10 minutes! Kimi and MiniMax come to the table for the first time
量子位· 2025-12-03 02:38
Core Insights
- Amazon Web Services (AWS) showcased an unprecedented number of product launches at re:Invent 2025: CEO Matt Garman challenged himself to announce 25 products in 10 minutes and ultimately unveiled 40 new products in just over two hours, emphasizing practicality and addressing challenges in AI applications [1][7][9].

Group 1: AI Computing Power
- AWS has restructured its AI computing supply model around self-developed chips, specifically the Trainium series, which has grown into a multi-billion dollar business with over 1 million chips deployed, outperforming competitors by a factor of four in speed [15][20].
- The latest Trainium3 Ultra Servers, built on a 3nm process, deliver 4.4 times the computing performance and 3.9 times the memory bandwidth of the previous generation [18].
- The upcoming Trainium4 chip promises further gains, including 6 times the FP4 computing performance and 4 times the memory bandwidth, tailored to large-model training needs [20][22].
- AWS introduced AI Factories, which let clients deploy AWS AI infrastructure inside their own data centers, retaining control and security while accessing top-tier AI computing power [23][24].

Group 2: Model Development and Integration
- Amazon Bedrock, AWS's flexible and customizable model platform, now includes the Chinese models Kimi and MiniMax, marking their entry into the global developer ecosystem (a hedged example call follows this summary) [26][28].
- The new Amazon Nova 2 series includes models for different tasks: Nova 2 Light focuses on cost-effectiveness and low latency, Nova 2 Pro excels at complex tasks, and Nova 2 Sonic optimizes real-time voice interaction [30][32].
- Nova Forge introduces the concept of Open Training Models, letting enterprises blend their proprietary data with AWS's training datasets to create specialized models that retain general reasoning capabilities while understanding unique business knowledge [40][41].

Group 3: AI Agents
- AI Agents emerged as a key focus, with Garman stating that the era of AI assistants is being replaced by AI Agents, which will be widely adopted across companies [45][46].
- AWS introduced several new agents, including the Kiro Autonomous Agent for complex development tasks, the AWS Security Agent for proactive security measures, and the AWS DevOps Agent for continuous system monitoring and troubleshooting [50][52][56].
- AWS provides tools such as AWS Transform Custom for code migration and Policy in AgentCore for defining agent behavior, ensuring agents operate within controlled parameters [58][61].

Group 4: Strategic Vision
- AWS's strategy emphasizes practical application of AI technologies, focusing on building comprehensive, secure, and scalable enterprise-level infrastructure rather than on technological breakthroughs alone [68][70].
- The company aims to address computing costs, models' understanding of proprietary knowledge, and the controllability of AI agents through its solutions and partnerships [70].
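Bedrock exposes third-party models behind a single runtime API. As a hedged illustration of what calling one of the newly added models might look like with boto3's Converse API, here is a minimal sketch; the model ID is a hypothetical placeholder, since the real Kimi/MiniMax identifiers are not given here.

```python
import boto3

# Bedrock runtime client; the Converse API gives a uniform chat interface
# across the models hosted on the platform.
client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.converse(
    modelId="example.kimi-k2-placeholder-v1",  # hypothetical ID; check the Bedrock model catalog
    messages=[{"role": "user", "content": [{"text": "Summarize re:Invent 2025 in one sentence."}]}],
    inferenceConfig={"maxTokens": 256, "temperature": 0.3},
)
print(response["output"]["message"]["content"][0]["text"])
```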
A Zhejiang University-linked embodied intelligence company makes another run at the Hong Kong Stock Exchange: focused on industrial scenarios, taking in 1,000,000 RMB a day
量子位· 2025-12-03 02:38
Core Viewpoint
- The article discusses the growth and challenges facing XianGong Intelligent, a company specializing in robotic control systems, as it prepares for an IPO on the Hong Kong Stock Exchange. Despite rising revenue, the company is not yet profitable and has accumulated significant losses over the past three years [1][2][49].

Company Overview
- XianGong Intelligent provides robotic control systems and solutions primarily for industrial applications rather than consumer-oriented robots [3][5].
- The company has built a product matrix of controllers, software, robots, and accessories, aiming to simplify the development and deployment of intelligent robots [6][24].

Financial Performance
- Revenue has grown consistently: 184 million RMB in 2022, 249 million RMB in 2023, and a projected 339 million RMB in 2024, a compound annual growth rate of 35.7% (a quick arithmetic check follows this summary) [34][32].
- Despite revenue growth, the company has reported cumulative losses of 122 million RMB over the past three years [49].

Product and Service Offerings
- The main products include the SRC series controllers, which serve as the robot's "brain," enabling autonomous operation [9][10].
- XianGong Intelligent supports over 1,000 robot models, primarily targeting industrial environments that demand high precision and durability [15][19].
- The software component acts as a central command system for managing robot fleets, improving operational efficiency [12][13].

Market Position
- As of 2024, XianGong Intelligent holds a 23.6% share of the global robotic controller market, ranking first by sales volume [31].
- The company has steadily expanded its customer base, serving over 1,600 integrators and end customers across more than 35 countries [28][30].

Challenges and Risks
- Continuous losses stem from high R&D expenses: 39 million RMB in 2022, 64 million RMB in 2023, and a projected 71 million RMB in 2024 [53].
- Cash flow has been hurt by extended customer payment cycles, leading to a net operating cash flow deficit [65][62].
- Reliance on external suppliers for manufacturing and components poses risks, especially if payment terms lead to supply disruptions [68][70].

Management and Team
- The founding team includes experienced professionals with backgrounds in robotics and AI, contributing to the company's technological progress and strategic direction [72][74][78].
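The 35.7% figure follows directly from the 2022 revenue and the projected 2024 revenue over two compounding years; a quick check:

```python
# CAGR implied by 184M RMB (2022) growing to a projected 339M RMB (2024).
revenue_2022, revenue_2024 = 184e6, 339e6
years = 2
cagr = (revenue_2024 / revenue_2022) ** (1 / years) - 1
print(f"{cagr:.1%}")  # -> 35.7%
```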
量子位 is hiring editors and writers
量子位· 2025-12-03 02:38
Editorial team, from 凹非寺
量子位 | 公众号 QbitAI

The AI boom is still surging, but if you don't yet know how to take part... then why not join 量子位?

We are a content platform centered on tracking the latest progress in AI. After eight years of accumulation, we now have top-tier influence, broad and widely recognized industry resources, and the best vantage point for observing and learning at the cutting edge of the era.

We are currently hiring in three directions, and we hope you are (or can become) a content expert in one of them:
- AI industry: infrastructure-layer innovation, including chips, AI Infra, and cloud computing;
- AI finance: venture investment and earnings reports in the AI field, tracking capital flows across the industry chain;
- AI products: progress in AI applications and hardware devices.

All positions are full-time; the workplace is Zhongguancun, Beijing.

Open to:
- Experienced hires: editor, lead writer, and editor-in-chief levels, matched to your ability;
- Campus hires: fresh graduates; internships are accepted and can convert to full-time.

Join us and you will:
- Stand at the crest of the AI wave: be the first to encounter and understand the latest AI technologies and products, building a complete AI knowledge system.
- Master new AI tools: apply new AI technologies and tools to your work to improve efficiency and creativity.
- Build personal influence: by writing exclusive, original ...

The position details are below; roles at every ability level are open for each direction, and you are welcome to apply based on your background and experience.

AI industry direction, job responsibilities:
The DeepSeek-V3.2 technical report: once again, it's the foreigners who read it most closely
量子位· 2025-12-03 00:11
Core Insights
- The article discusses the launch of two open-source models, DeepSeek-V3.2 and DeepSeek-V3.2-Speciale, which have drawn significant attention in Silicon Valley and signal a shift in the competitive landscape of AI models [2][6].

Group 1: Model Performance
- DeepSeek-V3.2 reaches the highest level among current open-source models, significantly narrowing the gap with top closed-source models [6].
- The standard version performs at a level comparable to GPT-5, while the Speciale version surpasses GPT-5 and competes closely with Gemini-3.0-Pro on mainstream reasoning tasks [7][8].
- DeepSeek-V3.2-Speciale won gold medals in various competitions, demonstrating its advanced capabilities [9].

Group 2: Technical Innovations
- The model uses DSA sparse attention to address the efficiency problems of long contexts, laying the groundwork for subsequent long-sequence reinforcement learning (a generic top-k sparse-attention sketch follows this summary) [14].
- By introducing scalable reinforcement learning and allocating over 10% of pre-training compute to post-training, the model significantly improves general reasoning and agent capabilities [15].
- The Speciale version allows longer reasoning chains, enabling deeper self-correction and exploration and unlocking stronger reasoning without increasing pre-training scale [16][17].

Group 3: Economic Implications
- DeepSeek-V3.2 is roughly 24 times cheaper than GPT-5 and 29 times cheaper than Gemini 3 Pro in output token cost [29][30].
- The cost of generating extensive content with DeepSeek-V3.2 is significantly lower, making it an economically attractive option compared with its competitors [31][32].
- Deploying the model on domestic computing power (e.g., Huawei, Cambricon) could further reduce inference costs, posing a challenge to established players like Google and OpenAI [36].

Group 4: Market Impact
- The success of DeepSeek-V3.2 challenges the notion that open-source models lag behind closed-source ones, pointing to a potential shift in market dynamics [10][26].
- The article argues that the gap between DeepSeek and the top models is now more an economic issue than a technical one: with sufficient resources, open-source models can compete effectively [26].
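DSA sparse attention is mentioned only at a high level here. As a hedged illustration of the general family of techniques, where each query attends to a small selected subset of keys instead of the full context, here is a minimal top-k sparse-attention sketch in Python (PyTorch); the selection rule, shapes, and top_k value are assumptions, not DeepSeek's actual DSA design, which is reported to use a separate lightweight indexer.

```python
import torch

def topk_sparse_attention(q, k, v, top_k=4):
    """Causal attention where each query keeps only its top_k highest-scoring keys."""
    T, d = q.shape
    scores = q @ k.T / d**0.5                              # (T, T) full scores, for clarity only
    causal = torch.tril(torch.ones(T, T, dtype=torch.bool))
    scores = scores.masked_fill(~causal, float("-inf"))

    # Threshold at the top_k-th best score per query and mask out the rest.
    kth = torch.topk(scores, k=min(top_k, T), dim=-1).values[:, -1:]
    sparse_scores = scores.masked_fill(scores < kth, float("-inf"))

    return torch.softmax(sparse_scores, dim=-1) @ v        # (T, d)

q, k, v = (torch.randn(8, 16) for _ in range(3))
print(topk_sparse_attention(q, k, v).shape)  # torch.Size([8, 16])
```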
GPT-5.5 codename "Garlic" leaked! OpenAI sounds a code-red alert and races to finish the new model, which could ship as soon as next week
量子位· 2025-12-03 00:11
Core Insights
- OpenAI is under pressure to improve ChatGPT in response to Google's Gemini 3 Pro, prompting an internal "code red" alert to prioritize resources for the task [2][34].
- Competition has intensified: Google's Gemini series is gaining significant traction, and ChatGPT's traffic and user engagement have declined [10][15].

Group 1: Competitive Landscape
- ChatGPT traffic dropped 6% within a week of the release of Gemini 3 Pro and Nano Banana Pro [10].
- Gemini's monthly active users surged from 450 million in July to 650 million in October, indicating rapid growth and mounting competition [14].
- OpenAI's share of the AI assistant market remains at 70%, but growth is slowing, raising concerns about its competitive edge [15][16].

Group 2: Financial Challenges
- OpenAI is not yet profitable and faces increasing financial pressure, needing to raise roughly $100 billion to sustain operations amid heavy cash burn [19][21].
- Projected ChatGPT revenues are $10 billion this year, $20 billion next year, and $35 billion by 2027, but these figures are insufficient to cover expenses [20][21].
- The company must reach $200 billion in revenue by 2030 to become profitable, a significant challenge given its current financial trajectory [21].

Group 3: Product Development and Strategy
- OpenAI plans to release a new reasoning model next week, internally codenamed "Garlic," which is expected to outperform Gemini 3 in evaluations [5][24].
- Garlic has reportedly made significant strides in pre-training, enabling better performance on coding and reasoning tasks [28].
- OpenAI is also working on additional models, including one named "Shallotpeat," to broaden its product line and respond to competitive pressure [27][31].