量子位
1,700 OpenClaw Tips, and I Learned Them the Duolingo Way!
量子位· 2026-02-10 05:33
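Jin Lei, reporting from Aofeisi (凹非寺) | QbitAI (量子位 official account)
AI has arrived, and the way we learn has changed with it. Take the recently popular OpenClaw (formerly Clawdbot and Moltbot): there are already more than 1,700 usage tips for it alone. With that many Skills in a single GitHub project, how do you study them so that they actually stick? That's a big problem~ But as we just said, now that AI is here, everything has changed. You can now download this GitHub project as a PDF and feed it straight to an AI: Once you start learning, the tutorial first walks through the basics of OpenClaw, with both text and illustrations: Meanwhile, to ensure accuracy, the course also provides a side-by-side study feature, so you can check each lesson against the corresponding position in the uploaded file as you learn: After each section there is a short quiz. For example, it may ask: when the ClawHub registry filters skills, which types of low-quality content does it exclude? After you answer, the AI gives feedback with an explanation based on the correct answer: Before long, the AI produces a Duolingo-style course on the material: Click in, and the first thing you see is a 10-lesson tutorial, including a knowledge framework and knowledge cards covering the whole of OpenClaw: This way, a flood of new knowledge will be ...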
Huawei Releases the Industry's First Diffusion Language Model Agent, Up to 8x Faster in Some Scenarios!
量子位· 2026-02-10 05:33
Core Insights - The article emphasizes that the evaluation of an Agent's strength has shifted from merely answering questions to its ability to efficiently handle multi-turn reasoning, tool invocation, and complex collaboration with minimal interaction budget [2][3]. Group 1: Research Findings - A recent study by teams from Huawei and several universities demonstrated that switching to a Diffusion Large Language Model (DLLM) significantly enhances Agent performance, achieving over 30% faster execution speed and up to 8 times efficiency in complex tasks compared to traditional Autoregressive (AR) models [3][4]. - The research employed a strict controlled experimental design to ensure that differences in performance were solely attributed to the generation paradigm, revealing that DLLM Agents require fewer interaction rounds and tool calls while maintaining similar accuracy levels [4][5]. Group 2: Performance Metrics - In the BrowseComp-zh benchmark, the DLLM Agent achieved an accuracy of 15.5% with an average of 6.7 tool calls and 13.0 turns used, while the AR Agent had an accuracy of 7.5% with 14.8 tool calls and 1.9 turns used [8]. - The DLLM Agent exhibited a lower invalid action rate of 6.4%, indicating a more efficient execution process [8]. Group 3: Case Study - A specific case study highlighted that the DLLM Agent completed a complex multi-constraint retrieval task in 140.95 seconds, while the AR Agent took 1152.68 seconds, showcasing an 8.18 times speed difference [13][14]. Group 4: Planning and Execution - The DLLM Agent demonstrated superior planning capabilities by identifying key constraints quickly and refining task structures in a two-phase approach, contrasting with the AR Agent's sequential and often error-prone process [16][19]. - The study found that the DLLM's attention mechanism allows for a more coordinated global-to-local decision-making process, leading to faster task completion and fewer detours [28][30]. 
Group 5: Implications for Agent Design - The findings suggest that the generation paradigm fundamentally shapes Agent behavior, indicating that DLLM should not merely replace AR but requires a re-alignment of interfaces and training objectives to fully leverage its potential [24][30].
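The headline speedup can be checked directly from the figures quoted in the summary. The snippet below is only a worked arithmetic sketch: the variable names are illustrative, and the numbers are copied from the case study and the BrowseComp-zh row as reported above, not re-measured.

```python
# Figures quoted in the summary above; variable names are illustrative.
dllm_seconds = 140.95    # DLLM Agent, case-study task
ar_seconds = 1152.68     # AR Agent, same task

# End-to-end speedup is the ratio of wall-clock times.
speedup = ar_seconds / dllm_seconds
print(f"end-to-end speedup: {speedup:.2f}x")

dllm_tool_calls = 6.7    # BrowseComp-zh average, DLLM Agent
ar_tool_calls = 14.8     # BrowseComp-zh average, AR Agent

# How many more tool calls the AR Agent issued on average.
tool_call_ratio = ar_tool_calls / dllm_tool_calls
print(f"tool calls, AR vs DLLM: {tool_call_ratio:.2f}x")
```

The time ratio rounds to the 8.18x quoted for the case study, while the average tool-call ratio on the benchmark is closer to 2.2x, so the "8 times" figure describes one task rather than the benchmark average.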
量子位 (QbitAI) Is Hiring Editors and Writers
量子位· 2026-02-10 05:33
Core Viewpoint - The article emphasizes the ongoing AI boom and invites individuals to join QbitAI (量子位), which focuses on tracking AI advancements and has established itself as a leading content platform in the industry [1]. Group 1: Job Opportunities - The company is hiring in three main directions: AI Industry, AI Finance, and AI Product, with positions available for both experienced professionals and fresh graduates [2][4]. - Positions are open at various levels, including editors, lead writers, and chief editors, with a focus on matching roles to individual capabilities [6]. Group 2: Job Responsibilities - AI Industry Direction: Responsibilities include tracking innovations in infrastructure, such as chips, AI infrastructure, and cloud computing, as well as interpreting technical reports from conferences [6][7]. - AI Finance Direction: Focuses on venture capital, financial reports, and capital movements within the AI industry, requiring strong analytical skills and a passion for interviews [11]. - AI Product Direction: Involves monitoring AI applications and hardware developments, producing in-depth evaluations of AI products, and engaging with industry experts [11]. Group 3: Benefits and Work Environment - Employees will have the opportunity to engage with cutting-edge AI technologies, enhance their work efficiency through new tools, and build personal influence in the AI field [6]. - The company offers competitive salaries and comprehensive benefits, including social insurance, meal allowances, and performance bonuses, along with a dynamic and open team culture [6]. Group 4: Company Growth and Reach - As of 2025, QbitAI had over 2.4 million subscribers on WeChat and more than 7 million users across platforms, with daily readership exceeding 2 million [12].
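Alibaba DAMO Academy Open-Sources an Embodied Brain Foundation Model: 3B Active Parameters Outperform a 72B Model, a Fix for Robots That Forget the Moment They Turn Around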
量子位· 2026-02-10 03:00
Core Viewpoint - The article discusses the launch of RynnBrain, the first embodied brain model with spatiotemporal memory, developed by Alibaba's DAMO Academy, which significantly enhances the ability of embodied robots to understand and interact with the physical world [7][9][76]. Group 1: RynnBrain Model Features - RynnBrain consists of seven models ranging from 2B to 30B parameters, designed to understand both "time" and "space," allowing it to remember past trajectories and predict future actions [7][9]. - It outperforms leading models such as NVIDIA's Cosmos-Reason2 and Google's Gemini Robotics ER 1.5 across 20 benchmarks, achieving 16 state-of-the-art (SOTA) results [7]. - RynnBrain-30B-A3B, the first embodied model built on an MoE architecture, is highly efficient: it activates only 3B parameters while surpassing the performance of a 72B model [10][11]. Group 2: Training and Data Utilization - The model was trained on over 20 million pairs of high-quality data, incorporating various multimodal training datasets to enhance its understanding of physical space [19][20]. - A distinctive part of the training involved generating 1 million pairs of egocentric ("self-centered") OCR question-answer data, enabling the robot to interpret labels and numbers in its environment [21][23]. Group 3: Functional Capabilities - RynnBrain is flexible in both input and output: it can process images and videos of varying resolutions and produce multiple output modalities, such as trajectories and poses [26][28]. - It possesses spatiotemporal memory, allowing it to maintain awareness of object locations and trajectories even after interruptions, which is crucial for long-horizon tasks [34][40]. Group 4: System Architecture and Scalability - The model employs a "big brain-small brain" layered architecture, where RynnBrain handles long-term planning and scene understanding, while a smaller execution layer focuses on motor control [54][56]. 
- This architecture facilitates modular iteration and enhances the model's adaptability to various tasks, such as complex navigation and planning [57][58]. Group 5: Open Source and Industry Impact - DAMO Academy has open-sourced RynnBrain along with the full training code and a new evaluation benchmark, RynnBrain-Bench, which assesses the model's understanding of video sequences and spatial positioning [60][62]. - This initiative aims to lower barriers in the industry by providing a shared infrastructure for understanding physical concepts, improving system efficiency, and fostering healthy competition among teams [66][69].
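To make the "3B active parameters out of 30B" idea concrete, here is a minimal sketch of top-k expert routing, the general mechanism behind mixture-of-experts (MoE) layers. It is not RynnBrain's router: the toy experts, the router scores, and the function names `route_top_k` and `moe_forward` are all assumptions made for illustration.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)          # renormalize over the chosen experts
    return [(i, probs[i] / norm) for i in top]

def moe_forward(x, experts, router_logits, k=2):
    """Weighted sum of the outputs of only the k selected experts."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_logits, k))

# Hypothetical toy experts: each one just scales its input.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
router_logits = [0.1, 2.0, 0.3, 2.0]           # made-up router scores for one token

selected = route_top_k(router_logits, k=2)
output = moe_forward(1.0, experts, router_logits, k=2)
print(selected, output)
```

Only the selected experts run for a given token, which is why an MoE model's per-token compute tracks its active parameter count rather than its total size.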
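0.3B Parameters, 600MB of Memory! Tencent Hunyuan Achieves Industrial-Grade 2-Bit Quantization, with an On-Device Model as Small as a Phone App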
量子位· 2026-02-10 03:00
Core Viewpoint - Tencent Hunyuan has launched a new ultra-small model, HY-1.8B-2Bit, designed for consumer-grade hardware. It is smaller than many common mobile applications, making it suitable for edge deployment [2][13]. Group 1: Model Specifications and Performance - The HY-1.8B-2Bit model has an equivalent parameter count of only 0.3 billion and a memory footprint of just 600MB, making it well suited to deployment on edge devices [1][13]. - The model uses a 2-bit quantization scheme that reduces the equivalent parameter count to one sixth of the original model's while retaining its full cognitive capabilities [2][6]. - Compared to the original-precision model, HY-1.8B-2Bit generates 2-3 times faster on real edge devices, significantly improving user experience [2][6][13]. Group 2: Quantization Techniques - The model employs Quantization Aware Training (QAT) to mitigate the precision loss typically associated with 2-bit quantization, allowing it to approach the performance of full-precision models [6][11]. - The Stretched Elastic Quantization (SEQ) strategy is introduced to address the challenges of low precision, enhancing the model's ability to capture high-dimensional feature distributions [9][11]. - Data optimization strategies have been implemented, increasing the proportion of scientific data and incorporating long-text data to improve the model's overall capabilities [7][8]. Group 3: Training and Deployment - Training HY-1.8B-2Bit required only 10% of the tokens used to train the Bitnet-2B model, demonstrating an efficient route to low-bit model performance [11][12]. - The model is compatible with Arm computing platforms and has been tested on devices such as the MacBook M4 and Dimensity 9500, showing significant improvements in both latency and generation speed compared to the original models [13][14]. 
- Future developments will focus on reinforcement learning and model distillation to further enhance the capabilities of low-bit quantized models, aiming to bridge the performance gap with full-precision models [15].
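As a rough illustration of what "2-bit" means for storage, the sketch below quantizes a few weights onto a 4-level uniform grid and packs four codes into each byte. It is plain post-training uniform quantization, not the QAT or SEQ scheme described above; the function names and sample weights are assumptions, and the final line is a back-of-the-envelope estimate that treats all 1.8 billion weights as stored at exactly 2 bits.

```python
def quantize_2bit(weights):
    """Map floats to 2-bit codes (0..3) on a uniform grid over [min, max]."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 3 or 1.0      # 4 levels -> 3 intervals; avoid a zero scale
    codes = [min(3, max(0, round((w - lo) / scale))) for w in weights]
    return codes, lo, scale

def pack_codes(codes):
    """Pack four 2-bit codes into each byte, lowest bits first."""
    packed = bytearray((len(codes) + 3) // 4)
    for i, c in enumerate(codes):
        packed[i // 4] |= c << (2 * (i % 4))
    return bytes(packed)

def unpack_codes(packed, n):
    """Recover the first n 2-bit codes from the packed bytes."""
    return [(packed[i // 4] >> (2 * (i % 4))) & 0b11 for i in range(n)]

def dequantize(codes, lo, scale):
    """Map codes back to approximate float values."""
    return [lo + c * scale for c in codes]

weights = [-0.9, -0.2, 0.1, 0.4, 0.9, -0.5]   # made-up sample values
codes, lo, scale = quantize_2bit(weights)
packed = pack_codes(codes)
restored = dequantize(unpack_codes(packed, len(weights)), lo, scale)
print(codes, len(packed), restored)

# Weights-only storage estimate: 1.8e9 weights * 2 bits, converted to megabytes.
weight_megabytes = 1.8e9 * 2 / 8 / 1e6
print(weight_megabytes)
```

Under that assumption the weights alone would take about 450MB, somewhat below the reported 600MB footprint. The summary does not break down the difference, so the estimate should be read only as an order-of-magnitude check.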
This Input Method Costs Over 200 Yuan a Month, Yet 100,000 People Are Lining Up to Pay???
量子位· 2026-02-09 12:53
Core Viewpoint - Typeless, an AI voice keyboard, has gained significant popularity with over 100,000 users subscribing despite its high monthly fee of over 200 yuan, primarily due to its unique voice input capabilities [2][3]. Group 1: Product Features - Typeless offers a distinct functionality by converting voice directly into structured text, eliminating filler words and redundancies [11][20]. - The AI can automatically organize spoken content into coherent sections, making it easier for users to manage information [22]. - It supports over 100 languages and operates seamlessly on both computer and mobile platforms, with no typing required [12]. Group 2: Performance Evaluation - The transcription accuracy of Typeless is notably high, effectively filtering out filler words and maintaining the core message of the spoken input [14][18]. - In tests involving complex phrases and homophones, the AI demonstrated a strong ability to capture the intended meaning, although occasional issues with rapid speech were noted [17][18]. - The structured output from the AI is comparable to markdown formats commonly used in professional settings, enhancing usability for tasks like note-taking [24]. Group 3: AI Enhancement Capabilities - Typeless includes an AI refinement feature that improves the clarity and flow of the transcribed text, although its utility may vary based on user needs [25][26]. - The AI's ability to refine text is limited to the current input and does not support historical edits, which may restrict its application in certain scenarios [28][29]. - Despite some limitations, the overall user experience with the AI refinement feature is positive, contributing to a more polished final output [30][32]. Group 4: Market Position and User Demographics - The product is positioned as a versatile tool for individuals who require high transcription accuracy, such as those who drive or have limited typing capabilities [34]. 
- Users have identified various innovative applications for Typeless, including social media content creation and enhancing interactions with AI models like ChatGPT [36][38]. - The subscription price is considered high, suggesting that the product may be more suitable for users with specific needs, such as content creators and professionals who frequently record notes [42].
量子位 (QbitAI) Is Hiring Editors and Writers
量子位· 2026-02-09 12:53
Core Viewpoint - The article emphasizes the ongoing AI boom and invites individuals to join QbitAI (量子位), which focuses on tracking AI advancements and has established itself as a leading content platform in the industry [1]. Group 1: Job Opportunities - The company is hiring in three main directions: AI Industry, AI Finance, and AI Product, with positions available for both experienced professionals and fresh graduates [2][4]. - Positions are open at various levels, including editors, lead writers, and chief editors, with a focus on matching roles to individual capabilities [6]. Group 2: Job Responsibilities - AI Industry Direction: Responsibilities include tracking innovations in infrastructure, such as chips, AI infrastructure, and cloud computing, as well as producing accessible reports on technical conferences and papers [6][7]. - AI Finance Direction: Focuses on venture capital, financial reports, and analyzing capital movements within the AI industry, including interviews with investors and entrepreneurs [11]. - AI Product Direction: Involves monitoring AI applications and hardware developments, writing in-depth product evaluations, and engaging with product experts [11]. Group 3: Benefits and Work Environment - Employees can expect a vibrant team atmosphere, opportunities to build personal influence through original content, and professional mentorship from senior editors [6][11]. - The company offers competitive salaries and comprehensive benefits, including social insurance, meal allowances, and performance bonuses [6]. Group 4: Company Growth and Reach - As of 2025, QbitAI had over 2.4 million subscribers on WeChat and more than 7 million users across platforms, with daily readership exceeding 2 million [12]. - The company is ranked as the top new media outlet in the AI and frontier technology sectors by third-party data platforms [12].
"AI Made Me More Productive, but I'm More Tired"
量子位· 2026-02-09 12:53
Core Viewpoint - The article discusses the phenomenon of "AI fatigue," where increased productivity through AI tools leads to greater stress and exhaustion among developers, rather than the anticipated efficiency gains [1][42]. Group 1: AI's Impact on Productivity - AI has the potential to significantly enhance productivity, allowing tasks that previously took a day to be completed in an hour [9]. - However, this efficiency often results in an increased workload, as developers are expected to handle multiple tasks simultaneously, leading to fragmented attention and higher energy consumption [10][9]. - The shift from a creator role to a quality control role means developers spend more time evaluating and correcting AI-generated outputs, which is more mentally taxing than traditional coding [12][14]. Group 2: Psychological and Emotional Effects - The unpredictability of AI outputs creates anxiety, as developers cannot rely on consistent results, leading to a constant state of alertness [18][20]. - The rapid evolution of AI tools requires continuous learning, which can lead to feelings of inadequacy and pressure to keep up with peers, exacerbating stress levels [23][39]. - Over-reliance on AI can result in cognitive decline, as critical thinking skills may diminish when individuals do not engage in independent problem-solving [33]. Group 3: Strategies for Managing AI Fatigue - The author suggests implementing time limits for AI tasks, distinguishing between thinking and execution time, and accepting that AI outputs do not need to be perfect [43][45]. - Developers are encouraged to focus on foundational concepts rather than chasing every new tool, and to document the efficiency of AI usage to determine when to rely on it [43][45]. - Emphasizing the importance of mental breaks and allowing for downtime can help maintain overall well-being and productivity in the AI-driven work environment [47].
An Operations-Research Add-On for GRPO Lets a 7B Model Rival GPT-4! Li Auto Team Releases a New Multi-Objective Reinforcement Learning Framework | ICASSP 2026
量子位· 2026-02-09 12:53
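Contributed by the HVO-GRPO team | QbitAI (量子位 official account)
Text summarization is a core task in natural language processing (NLP), and evaluating its quality usually means balancing several dimensions: consistency, coherence, fluency, and relevance. In practice, however, developers often end up robbing Peter to pay Paul: improve relevance, and consistency may drop. How can a model reach a perfect "Pareto optimal" balance across multiple objectives? Recently, a study from the Li Auto team, accepted at ICASSP 2026, proposed HyperVolume Optimization (HVO). It is a new multi-objective reinforcement learning (MORL) strategy built on the GRPO framework. Without SFT or a cold start, it lets a 7B-parameter model perform on par with GPT-4 on summarization while producing more concise output.
△ Radar chart comparing HVO performance
Research background
Core pain point: the "imbalance" of multi-objective optimization
Text summarization is a core and challenging task in natural language processing (NLP). To evaluate the quality of generated summaries comprehensively, researchers usually examine multiple dimensions, such as coherence, consistency, fluency, and relevance. However ...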
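The excerpt is cut off before it describes the method, so the following is only a sketch of the general idea, not the paper's formulation. It assumes a single summary is scored on four dimensions in [0, 1] and rewards it with the hypervolume of the box between a reference point and its score vector (the product of the per-dimension margins), then applies the standard GRPO group-relative normalization. The scores and function names are made up for illustration.

```python
import statistics

def hypervolume_reward(scores, reference=0.0):
    """Volume of the box between a reference point and one score vector."""
    volume = 1.0
    for s in scores:
        volume *= max(s - reference, 0.0)
    return volume

def weighted_sum_reward(scores):
    """Equal-weight average, the usual scalarization baseline."""
    return sum(scores) / len(scores)

def group_relative_advantages(rewards):
    """GRPO-style normalization: (reward - group mean) / group std."""
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards) or 1.0   # guard against a zero spread
    return [(r - mean) / std for r in rewards]

# Made-up scores: (consistency, coherence, fluency, relevance).
balanced = (0.7, 0.7, 0.7, 0.7)
lopsided = (1.0, 1.0, 0.7, 0.1)

avg_balanced = weighted_sum_reward(balanced)
avg_lopsided = weighted_sum_reward(lopsided)
hv_balanced = hypervolume_reward(balanced)
hv_lopsided = hypervolume_reward(lopsided)
advantages = group_relative_advantages([hv_balanced, hv_lopsided])

print(avg_balanced, avg_lopsided)   # the averages cannot tell the two apart
print(hv_balanced, hv_lopsided)     # the product penalizes the weak dimension
print(advantages)
```

Both summaries have the same average score, but the product form gives the balanced one a clearly higher reward, which is one way to see why a hypervolume-style objective discourages trading one dimension off against another.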
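Deploy OpenClaw for 0.01 Yuan! Works on Any Device in 4 Steps, with a Fully Graphical Interface That Gets Your Own AI Assistant Running in 10 Minutes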
量子位· 2026-02-09 09:50
Core Viewpoint - OpenClaw, an AI and Agent application, has gained significant popularity since 2026, with over 177,000 stars on GitHub, indicating its rapid growth and acceptance as a "digital employee" capable of performing various tasks [1][3]. Group 1: Deployment Challenges - Users face difficulties in deploying OpenClaw due to the need for technical knowledge, such as understanding command lines and SSH connections, which can be a barrier for non-technical users [6][7][8]. - The installation process is complicated by strict version checks and interactive limitations in the installation scripts, making it challenging for users without technical expertise to troubleshoot issues [10][12][14]. Group 2: Simplified Deployment Solutions - Baidu Intelligent Cloud has introduced a simplified deployment solution that allows users to set up OpenClaw without needing coding skills or extensive technical knowledge, effectively lowering the entry barrier [4][15]. - The deployment process can be completed in as little as ten minutes, with a promotional offer allowing users to experience a lightweight application server for just 0.01 yuan in the first month [16][17]. Group 3: Features and Capabilities - OpenClaw can be integrated with various models, including Baidu's ERNIE, and offers functionalities such as web search and academic search capabilities, enhancing its utility as a digital assistant [44][45]. - New skills have been introduced, including AI-generated presentations and in-depth research capabilities, transforming OpenClaw into a powerful tool for productivity and research [48][49][50]. Group 4: Accessibility and User Experience - The deployment and configuration process has been significantly simplified, allowing users to set up OpenClaw with minimal effort, making it accessible to a broader audience beyond technical experts [53][55]. 
- OpenClaw's capabilities can be utilized across various sectors, such as HR for resume screening, education for research, and operations for data automation, showcasing its versatility as a digital assistant [55].