Robots take on an outdoor extreme challenge in Hong Kong en masse, and the dogs beat the humans
量子位· 2025-12-08 06:07
Group 1
- The core event of the ATEC2025 offline challenge was to encourage robots to complete tasks autonomously without remote control, showcasing their capabilities in real-world environments [1][8]
- The competition featured various challenges including garbage sorting, autonomous watering, orienteering, and bridge crossing, emphasizing the importance of autonomous operation and limiting human intervention [10][16]
- The event was organized by the Chinese University of Hong Kong and included participation from several prestigious institutions, with a panel of renowned robotics experts serving as judges [8][9]

Group 2
- The competition revealed that quadruped robots (robot dogs) significantly outperformed bipedal humanoid robots in all tasks, particularly in outdoor orienteering, where bipedal robots struggled due to their high center of gravity and fewer points of ground contact [26][27][29]
- Notably, the winning team, Zhejiang University Wongtsai, demonstrated exceptional performance in fully autonomous tasks, earning a prize of $150,000 [25][33]
- The event highlighted the challenges faced by robots in outdoor environments, such as variations in lighting and wind, which can disrupt perception and task execution [37][39][42]

Group 3
- The competition exposed several shortcomings in current robotic capabilities, particularly in multi-step reasoning and environmental adaptation, as robots often struggled to plan subsequent actions after completing a single task [46][56]
- Many teams employed a strategy of decoupling upper-body manipulation from lower-body locomotion, leading to inefficiencies in task execution [50][51]
- The event served as a testing ground for the future of robotics, pushing the industry to rethink how robotic capabilities are measured and improved in real-world scenarios [58][61]
量子位 (QbitAI) is hiring editors and writers
量子位· 2025-12-08 06:07
Editorial team, from Aofeisi
量子位 | WeChat official account QbitAI

The AI boom is still surging, but if you don't yet know how to take part in it... then why not come to 量子位?

We are a content platform centered on tracking new developments in AI. After eight years of accumulation, we have top-tier influence, broad and widely recognized industry resources, and the best vantage point for observing and learning at this moment of the era.

We are currently hiring in three directions, and we hope you are (or can become) a content expert in one of them:
- AI Industry: innovation at the infrastructure layer, including chips, AI Infra, and cloud computing;
- AI Finance: venture capital and earnings reports in the AI field, tracking capital flows along the industry chain;
- AI Products: progress of AI in applications and hardware devices.

All positions are full-time, based in Zhongguancun, Beijing. Roles at every seniority level are open; you are welcome to apply based on your own background and experience.

Who the positions are open to:
- Experienced hires: editors, lead writers, and editors-in-chief at every level, matched to ability;
- Campus hires: fresh graduates; internships with conversion to full-time are accepted.

By joining us, you can:
- Stand at the crest of the AI wave: be the first to encounter and understand the latest AI technologies and products, and build a complete picture of the field;
- Master new AI tools: apply new AI technologies and tools to your work to raise efficiency and creativity;
- Build personal influence: by writing ...
Hassabis: it was DeepMind that discovered Scaling Laws, and there is still no bottleneck in sight
量子位· 2025-12-08 06:07
Core Insights
- The article emphasizes the importance of Scaling Laws in achieving Artificial General Intelligence (AGI) and highlights Google's success with its Gemini 3 model as a validation of this approach [5][19][21]

Group 1: Scaling Laws and AGI
- Scaling Laws were initially discovered by DeepMind, not OpenAI, and have been pivotal in guiding research directions in AI [12][14][18]
- Google DeepMind believes that Scaling Laws are essential for the development of AGI, suggesting that significant data and computational resources are necessary for achieving human-like intelligence [23][24]
- The potential for Scaling Laws to remain relevant for the next 500 years is debated, with some experts expressing skepticism about their long-term viability [10][11]

Group 2: Future AI Developments
- In the next 12 months, AI is expected to advance significantly, particularly in areas such as complete multimodal integration, which allows seamless processing of various data types [27][28][30]
- Breakthroughs in visual intelligence are anticipated, exemplified by Google's Nano Banana Pro, which demonstrates advanced visual understanding [31][32]
- The proliferation of world models is a key focus, with notable projects like Genie 3 enabling interactive video generation [35][36]
- Improvements in the reliability of agent systems are expected, with agents becoming more capable of completing assigned tasks [38][39]

Group 3: Gemini 3 and Its Capabilities
- Gemini 3 aims to be a universal assistant, showcasing personalized depth in responses and the ability to generate commercial-grade games quickly [41][44][45]
- The architecture of Gemini 3 allows it to understand high-level instructions and produce detailed outputs, indicating a significant leap in intelligence and practicality [46]
- Gemini is projected to be used as frequently as smartphones, integrating seamlessly into daily life [47]
NVIDIA's 4B small model beats GPT-5 Pro, at just 1/36 the cost
量子位· 2025-12-08 06:07
Core Insights
- The article highlights the success of NVIDIA's small model, NVARC, which achieved a top score of 27.64% in the ARC-AGI 2 competition, outperforming GPT-5 Pro, which scored 18.3% [2][4]
- NVARC's cost per task is only $0.20, significantly lower than GPT-5 Pro's cost of over $7, making it a cost-effective solution [4]
- The key innovation of NVARC lies in its zero pre-training deep learning method, avoiding biases and data dependencies associated with large-scale pre-trained models [5]

Performance and Methodology
- ARC-AGI 2 is a challenging test that assesses a model's ability to acquire new skills beyond its training data, eliminating overlap with public training datasets [6]
- NVIDIA's strategy involves moving complex reasoning tasks to an offline synthetic data pipeline, allowing for the training of smaller models that can run quickly during evaluation [9][10]
- The NVARC team utilized a large-scale synthetic dataset, creating over 3.2 million augmented samples through a structured pipeline that ensures data quality [18][19]

Technical Innovations
- The NVARC model is based on an improved ARChitects method, utilizing a small parameter model, Qwen3-4B, and simplifying puzzle understanding through dialog templates [19]
- Key to NVARC's success was the implementation of Test-Time Fine-Tuning (TTFT) and LoRA fine-tuning techniques, allowing the model to adapt quickly to new rules for each task [21]
- The decoding phase was optimized with batch processing to address non-deterministic outcomes, and eight data augmentation operations were unified to evaluate candidate solutions [22][23]

Strategic Implications
- The article emphasizes that small models, when optimized for specific tasks, can perform competitively against larger models, highlighting their advantages in cost, speed, adaptability, and domain focus [25]
- The success of NVARC suggests that the right methodologies applied in the right contexts can yield significant value, challenging the notion that larger models are always superior [25]
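For readers who want a concrete picture of the test-time fine-tuning step described above, the sketch below shows the general recipe of attaching a fresh LoRA adapter per task and taking a few gradient steps on that task's demonstration pairs before decoding. It is a minimal sketch assuming the Hugging Face transformers and peft libraries; the checkpoint name, hyperparameters, and the serialize_examples helper are illustrative assumptions, not the NVARC team's actual code.

```python
# Minimal sketch of per-task test-time fine-tuning (TTFT) with LoRA.
# Checkpoint name, hyperparameters, and serialize_examples are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

MODEL_NAME = "Qwen/Qwen3-4B"  # base model reported for NVARC; exact checkpoint assumed

def serialize_examples(task):
    """Hypothetical helper: render a task's demonstration grids as dialog-style text."""
    return [f"INPUT:\n{x}\nOUTPUT:\n{y}" for x, y in task["train_pairs"]]

def ttft_for_task(task, steps=32, lr=1e-4):
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token
    base = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
    # Attach a fresh low-rank adapter for this single task; base weights stay frozen.
    lora_cfg = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"])
    model = get_peft_model(base, lora_cfg)
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)

    batch = tokenizer(serialize_examples(task), return_tensors="pt", padding=True)
    for _ in range(steps):
        out = model(**batch, labels=batch["input_ids"])  # causal LM loss on the demos
        out.loss.backward()
        optimizer.step()
        optimizer.zero_grad()
    return model  # the adapted model is then used to decode candidate solutions
```

In the pipeline described above, decoding would additionally be batched and the eight augmentation views used to score and vote on candidate solutions.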
This Wednesday! 量子位's big event is almost here | MEET2026
量子位· 2025-12-08 06:07
Core Insights
- The MEET2026 Intelligent Future Conference is a significant event in the AI sector, featuring prominent speakers from academia and industry, including Tsinghua University and major tech companies like Baidu and Google Cloud [1][21][39]
- The conference will cover a wide range of topics related to AI, including large language models, embodied intelligence, and cloud computing applications [3][39]
- The event aims to provide practical insights and discussions on the current state and future of AI technology, focusing on real-world applications rather than theoretical concepts [33][34]

Highlights
- Highlight 1: The conference will feature a GenAI dialogue and an Agent roundtable, addressing pressing questions about AI's impact on industries and the evolution of autonomous technologies [5][8][12]
- Highlight 2: Nearly thirty influential guests from academia and industry will participate, discussing the latest advancements and challenges in AI, including insights from Tsinghua University and leading tech firms [17][21]
- Highlight 3: The event will release two important documents: the "2025 AI Top Ten Trends Report" and the "2025 AI Annual List", summarizing key developments and influential figures in the AI landscape [35][39]

Event Details
- The MEET2026 conference is scheduled for December 10, 2025, at the Beijing Jinmao Hotel, focusing on how AI technologies can drive societal progress [37][39]
- The agenda includes various sessions led by industry leaders, covering topics from AI's role in enhancing productivity to the future of AI agents [41][42]
Whoa, 38% of Stanford undergraduates are "disabled"
量子位· 2025-12-08 04:00
Core Viewpoint
- The article discusses the rising percentage of students at elite universities, particularly Stanford, who register as having disabilities, primarily for the purpose of receiving academic accommodations such as extended exam time. This trend raises concerns about the fairness and integrity of educational assessments in higher education institutions [2][11][15]

Group 1: Statistics and Trends
- 38% of Stanford undergraduate students are registered as having disabilities, with 24% receiving academic or housing accommodations during the fall semester [2][3]
- Similar trends are observed at other elite institutions, with over 20% of undergraduates at Brown University and Harvard registered as disabled, and 34% at Amherst College [9]
- The number of students receiving accommodations due to disability at the University of Chicago has tripled over the past eight years, while at the University of California, Berkeley, it has increased more than fivefold over fifteen years [8]

Group 2: Systemic Issues and Exploitation
- The process for obtaining disability certification has become easier, requiring only a basic doctor's note, leading to potential exploitation of the system [5][7]
- Wealthy families have been known to bribe doctors for disability diagnoses for their non-disabled children to gain academic advantages, as highlighted by the 2019 college admissions scandal [12][16]
- The article suggests that the current system disproportionately benefits affluent students while disadvantaging genuinely disabled students from lower-income backgrounds [16][17]

Group 3: Impact on Education and Resources
- The increase in students with disability certifications has led to a significant rise in costs for universities, with Stanford's budget for accommodating these students tripling [25]
- The presence of a large number of students with disability accommodations can skew academic assessments, as those with accommodations often perform better on standardized tests compared to their peers without such accommodations [18][25]
- The trend of seeking disability status has become a social phenomenon among students, leading some to self-diagnose in response to peer pressure [20][21]

Group 4: Perspectives and Discussions
- Some experts argue that the increase in accommodations is a sign of a functioning system that aims to support students with genuine learning disabilities [28]
- There is a call for a reevaluation of educational assessment standards to ensure fairness for all students, regardless of disability status [33]
- The article concludes with a discussion on the implications of these trends for educational equity and the potential need for systemic changes [36]
NVIDIA tears down its own CUDA barrier to entry: a GPU kernel in 15 lines of Python matches the performance of 200 lines of C++
量子位· 2025-12-08 04:00
Core Viewpoint
- NVIDIA's latest CUDA 13.1 release is described as the most significant advancement since CUDA's inception in 2006, introducing a new CUDA Tile programming model that allows developers to write GPU kernels in Python, achieving performance equivalent to 200 lines of CUDA C++ code with just 15 lines [2][3][22]

Group 1: Changes in CUDA Programming
- The traditional CUDA programming model, based on SIMT (Single Instruction Multiple Threads), required developers to manually manage thread indices, thread blocks, shared memory layouts, and thread synchronization, making it complex and demanding [6][7]
- The new CUDA Tile model allows developers to organize data into Tiles and define operations on these Tiles, with the compiler and runtime handling the mapping to GPU threads and Tensor Cores automatically [8][11]
- This shift is likened to the ease of using NumPy in Python, significantly lowering the barrier to entry for GPU programming [9]

Group 2: Components and Optimizations
- NVIDIA has introduced two core components: CUDA Tile IR, a new virtual instruction set that ensures compatibility across different generations of GPUs, and cuTile Python, an interface that enables developers to write GPU kernels directly in Python [11][12]
- The update includes performance optimizations specifically for the Blackwell architecture, focusing on AI algorithms, with plans for future expansion to more architectures and a C++ implementation [14]

Group 3: Industry Implications
- Jim Keller raises concerns that lowering the programming barrier could undermine NVIDIA's competitive advantage, as the Tile programming model is not exclusive to NVIDIA and can be supported by AMD, Intel, and other AI chip manufacturers [15]
- While the new model makes code easier to migrate across NVIDIA's GPU generations, it does not facilitate easy migration to competitors' hardware, which still requires rewriting code [20][21]
- The reduction in programming complexity means that a larger pool of data scientists and AI researchers can now write high-performance GPU code without needing HPC experts for optimization [22][23]
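To make the SIMT-versus-Tile contrast concrete, here is a pedagogical CPU sketch in NumPy of what "define operations on whole tiles and let the compiler handle the hardware mapping" means. It is not the actual cuTile Python API, whose signatures are not shown in the article; it only illustrates the tile abstraction itself, with an arbitrary tile size.

```python
# Pedagogical sketch of tile-level thinking, using NumPy on the CPU.
# NOT the cuTile Python API; tile size and matrix shapes are illustrative choices.
import numpy as np

TILE = 128  # edge length of one tile

def tile_matmul(A, B):
    """Compute C = A @ B one (TILE x TILE) output tile at a time.

    Under SIMT, the programmer reasons about one thread per output element
    (row = blockIdx.y * blockDim.y + threadIdx.y, and so on). Under the tile
    model, the unit of work is a whole sub-matrix, and mapping tiles onto
    threads and Tensor Cores is left to the compiler and runtime.
    """
    M, K = A.shape
    K2, N = B.shape
    assert K == K2
    C = np.zeros((M, N), dtype=A.dtype)
    for i in range(0, M, TILE):
        for j in range(0, N, TILE):
            acc = np.zeros_like(C[i:i + TILE, j:j + TILE])
            for k in range(0, K, TILE):
                acc += A[i:i + TILE, k:k + TILE] @ B[k:k + TILE, j:j + TILE]
            C[i:i + TILE, j:j + TILE] = acc
    return C

A = np.random.rand(512, 256).astype(np.float32)
B = np.random.rand(256, 384).astype(np.float32)
assert np.allclose(tile_matmul(A, B), A @ B, atol=1e-3)
```

The article's claim is that a kernel expressed at this tile granularity, handed to the CUDA Tile compiler, replaces the per-thread index bookkeeping that the 200-line C++ version spells out by hand.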
Fifteen years of working, laid off by big tech four times
量子位· 2025-12-07 11:00
Core Viewpoint
- The article discusses the challenges faced by tech workers, particularly focusing on the story of Lee Givens, who has been laid off multiple times from major tech companies like Microsoft, Meta, and Apple, highlighting the impact of AI on employment in the tech industry [1][30]

Group 1: Lee Givens' Career Journey
- Lee Givens, a seasoned tech worker, has been laid off four times in 15 years, with his most recent layoff from Apple after a brief contract position [1][4]
- Despite his extensive experience, Givens struggled to find a new job for six months, receiving consistent rejections during interviews [2][6]
- Givens' career began at Microsoft at the age of 43, where he was eventually laid off during a significant downsizing of 18,000 employees [8][9]

Group 2: Impact of AI on Employment
- The article notes that over 110,000 tech workers have lost their jobs in 2025 due to layoffs across more than 200 tech companies, with AI being a significant factor in these decisions [30][31]
- Major companies like Amazon and Intel have announced substantial layoffs, with Amazon cutting 14,000 jobs and Intel 24,000 jobs, indicating a trend driven by AI advancements [32][33]
- The rise of AI is leading to structural unemployment, where jobs are permanently lost due to technological advancements, as companies find it more cost-effective to replace human labor with AI [41][51]

Group 3: Changing Job Market Dynamics
- The article highlights a shift in the job market where companies are increasingly looking for employees who understand AI, leading to a reorganization of teams and a preference for tech-savvy workers [59][60]
- There is a growing trend of individuals transitioning from traditional employment to entrepreneurship, with many former tech workers starting their own businesses or engaging in freelance work [70][71]
- The narrative suggests that the future of work will involve leveraging AI tools, allowing individuals to enhance their productivity and creativity, thus creating new opportunities [72][73]
After hands-on testing of Doubao Seedream 4.5, I'm crying for my designer friends
量子位· 2025-12-07 09:00
Xifeng, from Aofeisi
量子位 | WeChat official account QbitAI

Doubao has an upgrade: Volcano Engine has released the image-creation model Doubao-Seedream-4.5.

The new model has three main selling points.

First, it strengthens source-image preservation, keeping the original image's faces, lighting and color tone, and fine details as intact as possible, which makes it well suited to photo editing.

For example, "keep only the person inside the green outline and delete all the other characters":

Or, a step more complex, turn day into night:

Second, it heavily strengthens multi-image composite generation.

In the official demo, given eight reference images and a specified layout, it was asked to generate a picture-storybook cover:

"Fairy-tale storybook cover: a little girl and a small fox stand in front of a glowing cabin in the forest; the moon is huge and dreamlike, and stardust floats around them; firefly lights dot the grass; small white flowers add fine detail; mist creates soft depth; an ornate bronze fairy-tale border surrounds the whole scene; the palette is a clash of blue-violet and warm gold; the characters' facial features stay consistent with the reference images; overall dreamy, gentle, and strongly magical, suitable as a children's picture-book cover."

Convert the English text in an image into handwritten Chinese:

Seedream-4.5 can execute complex instructions precisely, accurately identifying and extracting multiple elements and blending them in naturally:

Likewise, have multiple characters "pose" for a big group photo:

The model can also generate group scenes with no visual inconsistencies:

Conversely, from a single reference image, it can generate six posters in one go, with the aspect ratios changed to 1:1, 2:3, 4:3, 16:9, 1:2, ...
They taught trillion-parameter RL to "run frugally", and cut 90% of the compute along the way
量子位· 2025-12-07 09:00
Core Insights
- The competition focus in AI large models is fundamentally shifting towards Reinforcement Learning (RL) as the next growth engine, with significant advancements in RL training methods [2][3][10]
- The cost of running RL on trillion-parameter models has been prohibitively high, limiting access to only a few companies, but recent breakthroughs have drastically reduced these costs [4][5][11]
- Mind Lab's innovative approach using LoRA for efficient RL training has achieved a 90% reduction in GPU consumption while maintaining performance, marking a paradigm shift in training methodologies [6][18][20]

Group 1: Reinforcement Learning Advancements
- The marginal returns of pre-training are declining, and the industry is actively seeking new growth engines, with RL emerging as a key focus [2][10]
- RL is transitioning from a supplementary role to becoming the main battleground for the evolution of large models, essential for adapting trillion-parameter models to agent tasks [3][10][11]
- Mind Lab's solution involves using LoRA for parameter-efficient adaptation, significantly reducing the computational load of RL training [13][18]

Group 2: Cost and Efficiency
- The cost of running LoRA RL on the Kimi K2 model is only about 10% of traditional full-parameter RL, enabling broader access to RL training [18]
- Training stability has improved, with consistent increases in reward and task success rates during training, avoiding catastrophic failures [19]
- The general capabilities of the models have been preserved while enhancing specific task performance through LoRA RL [20]

Group 3: Technical Challenges and Solutions
- The challenges of running RL on trillion-parameter models include imbalanced routing, communication overhead, and complex parallel layouts [21][24][25]
- Mind Lab's mixed cooperative parallel engine design addresses these challenges by unifying various parallel processing methods, optimizing resource scheduling [26]
- The introduction of truncated importance sampling ratios helps mitigate distribution mismatches during RL training, ensuring effective learning [30]

Group 4: Memory Mechanisms and Real-World Applications
- Mind Lab has developed a new memory mechanism called Memory Diffusion, which mimics human-like "intelligent forgetting" to enhance memory efficiency [42][45]
- This approach allows the model to dynamically compress and retain meaningful experiences while discarding irrelevant information, achieving high accuracy in benchmarks [49]
- The concept of Research-Product Co-Design emphasizes the importance of real-world feedback in training, leading to more effective RL environments [50][54]

Group 5: Future Directions and Industry Impact
- The transition from a pre-training era to an experiential intelligence era is underway, focusing on how intelligence grows in real-world contexts [59][62]
- Mind Lab aims to enhance model learning efficiency and adaptability, positioning itself as a leader in the next generation of AI research [66]
- The team's diverse expertise and commitment to open-source collaboration are expected to accelerate advancements in AI technologies [64][68]
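The "truncated importance sampling ratios" mentioned in Group 3 refer to a general off-policy correction: the probability ratio between the updated policy and the behavior policy that generated the rollouts is capped, so that stale or mismatched samples cannot dominate the gradient. Below is a minimal PyTorch sketch of that idea; the cap value and the exact loss form are illustrative assumptions, not Mind Lab's implementation.

```python
# Minimal sketch of a policy-gradient loss with a truncated importance-sampling
# ratio. Cap value and loss form are illustrative assumptions.
import torch

def truncated_is_policy_loss(logp_new, logp_old, advantages, cap=2.0):
    """Policy loss where the importance ratio is clipped from above.

    logp_new:   log-probs of the sampled actions under the current policy
    logp_old:   log-probs under the behavior policy that produced the rollouts
    advantages: advantage estimates for those actions
    cap:        upper bound on the ratio, limiting the weight of off-policy samples
    """
    ratio = torch.exp(logp_new - logp_old)
    truncated = torch.clamp(ratio, max=cap)  # truncate only from above
    return -(truncated * advantages).mean()

# Toy usage with random tensors standing in for real rollout statistics.
logp_new = torch.randn(8, requires_grad=True)
logp_old = torch.randn(8)
advantages = torch.randn(8)
loss = truncated_is_policy_loss(logp_new, logp_old, advantages)
loss.backward()
```

In the LoRA RL setup the summary describes, only the low-rank adapter weights would be updated by a loss of this kind, with the trillion-parameter base model kept frozen.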