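LLM hallucination is more than just "talking nonsense"? A new theory dissects the two root causes of hallucination for the first time | ICLR 2026
QbitAI · 2026-03-13 06:10
Contributed by the HALLUGUARD team. QbitAI | WeChat official account QbitAI
A major study from Virginia Tech and other institutions offers the first unified theoretical explanation of why LLMs hallucinate, and why more reasoning can make the output more wrong. In high-stakes settings such as medicine, law, and scientific research, "hallucination" is becoming the last hurdle to deploying large models. You may have already seen these situations: The question is: are these hallucinations really all the same problem?
Hallucination is not one thing but two, and it "evolves"
A research team from Virginia Tech, MIT, Dartmouth College, and other institutions published the paper HALLUGUARD at ICLR 2026, giving the first clear theoretical answer: LLM hallucination does not come from a single source; it results from two kinds of mechanisms acting together and evolving step by step. In real applications, LLM hallucination is far more complicated than you might think.
Two root causes, rigorously separated for the first time:
1. Data-driven hallucination. Source: missing knowledge, bias, and distribution mismatch in the pre-training and fine-tuning stages.
More importantly: real hallucinations are often not "either/or"; the error starts in the data and is then amplified by reasoning.
Theoretical breakthrough: the first "hallucination risk bound", explaining how hallucination arises and is amplified
The paper proposes a new theoretical framework, the Hallucination Risk Bound. It is the first to prove mathematically that: overall hallucination risk = data error + reasoning instability ...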
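The decomposition stated above can be written schematically as follows. The symbols are placeholders chosen here for illustration, not the paper's notation. The excerpt is cut off before the formal statement, so the exact form of the result (including whether it is an equality or an upper bound, as the name "risk bound" would suggest) is not given in this text.

```latex
\[
R_{\text{hallucination}}
\;=\;
\underbrace{\varepsilon_{\text{data}}}_{\text{data error}}
\;+\;
\underbrace{\varepsilon_{\text{reasoning}}}_{\text{reasoning instability}}
\]
```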
I set up an AI newsroom with a 10-yuan "cooked shrimp"!
QbitAI · 2026-03-13 06:10
Core Viewpoint
- The article discusses the launch of ArkClaw, a cloud-based AI editing tool developed by Volcano Engine, which simplifies the process of content creation by automating various editorial tasks through a collaborative AI team [7][8][61].
Group 1: Overview of ArkClaw
- ArkClaw is a SaaS version of OpenClaw that operates directly in the browser, eliminating issues related to local deployment such as power consumption and network interruptions [8][61].
- The service is available at a low cost of 8.91 yuan, with discounts for referrals, making it accessible to a wider audience [8][67].
- ArkClaw integrates seamlessly with Feishu, allowing users to manage documents, spreadsheets, and calendars within a unified ecosystem [21][62].
Group 2: AI Editorial Team Structure
- The AI editorial team consists of six roles: Chief Editor, Reporter, Editor, Proofreader, Formatter, and Operator, each responsible for specific tasks in the content creation process [5][14].
- Each role is equipped with specialized skills to enhance efficiency, such as web search for the Reporter and grammar check for the Proofreader [25][27].
- The system allows for real-time tracking of progress, ensuring transparency in the workflow [30][34].
Group 3: Key Features and Advantages
- ArkClaw offers a zero-threshold experience, enabling users without technical expertise to set up automated workflows within half an hour [62].
- The platform supports multiple large models, allowing users to switch between different AI capabilities based on task requirements [63].
- Enhanced security measures are implemented to protect sensitive data, addressing concerns associated with local deployments [64].
Group 4: Cost Efficiency
- ArkClaw provides a cost-effective solution for continuous operation, requiring only a small monthly fee compared to the high costs associated with local deployments or cloud servers [65].
- The service is designed to be affordable, allowing users to access a full-fledged AI editorial team without significant financial investment [66].
QbitAI is hiring editors and writers
QbitAI · 2026-03-13 06:10
Core Viewpoint
- The article emphasizes the ongoing AI boom and invites individuals to join the company "Quantum Bit" (QbitAI), which focuses on tracking AI advancements and has established itself as a leading content platform in the industry [1].
Group 1: Job Opportunities
- The company is hiring for three main directions: AI Industry, AI Finance, and AI Product, with positions available for both experienced professionals and fresh graduates [2][4].
- Positions are open for various levels, including editors, lead writers, and chief editors, with a focus on matching roles to individual capabilities [6].
Group 2: Job Responsibilities
- **AI Industry Direction**: Responsibilities include tracking innovations in infrastructure, such as chips, AI infrastructure, and cloud computing, as well as producing accessible reports on cutting-edge research and technology conferences [6][7].
- **AI Finance Direction**: Focuses on venture capital, financial reports, and analyzing capital movements within the AI industry, including interviews with investors and entrepreneurs [11].
- **AI Product Direction**: Involves monitoring the application of AI in software and hardware, writing in-depth product evaluations, and engaging with product experts and entrepreneurs [11].
Group 3: Benefits and Work Environment
- Employees can expect a vibrant team atmosphere, opportunities for personal influence through original content creation, and professional mentorship from senior editors [6][11].
- The company offers competitive salaries and comprehensive benefits, including social insurance, meal allowances, performance bonuses, and overtime compensation [6].
Group 4: Company Growth and Reach
- By 2025, Quantum Bit aims to have over 2.4 million subscribers on WeChat and more than 7 million users across all platforms, with a daily reading volume exceeding 2 million [12].
- The company is recognized as the top new media outlet in the AI and frontier technology sectors according to third-party data platforms [12].
Musk poaches two young prodigies from Cursor
QbitAI · 2026-03-13 06:10
Core Viewpoint
- The article discusses the recent hiring of two senior leaders from Cursor by Elon Musk's xAI, highlighting the contrasting dynamics within the company as it faces departures of its co-founders while simultaneously attracting new talent [2][52].
Group 1: New Hires at xAI
- Andrew Milich and Jason Ginsberg, both former senior leaders at Cursor, have joined xAI, reporting directly to Elon Musk [5][7].
- Milich has a long-standing admiration for Musk, having previously interned at SpaceX in 2017, and sees this new role as fulfilling a personal ambition [11][13].
- Ginsberg is motivated by the significant transformation occurring in the software industry, believing that traditional operating systems and applications will soon become obsolete [14][15].
Group 2: Context of Departures and Challenges
- The article notes that while new talent is being welcomed, xAI is also experiencing a significant exodus of its co-founders, with only two remaining after recent departures [52][54].
- The management style at xAI is described as "hardcore," which may contribute to employee burnout, as many who joined with high hopes have left feeling exhausted [56][60].
- The contrasting experiences of new hires and departing employees create a perception of xAI as a "walled city," where the internal culture may be challenging for some [62].
Group 3: Cursor's Market Position
- Cursor, once a leading AI application, is now facing increased competition and a decline in market share, particularly with the emergence of alternatives like Claude Code and Codex [43][45].
- The rapid commercialization of Cursor's product was notable, achieving $100 million in ARR within two years, with projections to reach $1 billion by November 2025 [41][42].
- The departure of Milich and Ginsberg from Cursor reflects the company's struggle to maintain its competitive edge in a rapidly evolving market [46][48].
Fed up with fully automatic "lobsters", I decided to take back the AI steering wheel
QbitAI · 2026-03-13 03:51
Core Viewpoint
- The article discusses the emergence of MorphMind, the world's first controllable AI platform, which allows users to have greater control over AI outputs and interactions, addressing the common frustrations associated with traditional AI systems [2][51].
Group 1: Introduction to MorphMind
- MorphMind transforms AI from a black box into a controllable work system, enabling users to intervene at any time [2].
- The platform allows users to manage a team of AI experts, ensuring transparency and clarity in the workflow [3][4].
Group 2: User Experience and Functionality
- Users can oversee each step of the AI's process, allowing for real-time adjustments and interventions [10][80].
- The platform provides a structured approach to tasks, breaking them down into manageable steps and assigning roles to different AI experts [16][76].
Group 3: Unique Features of MorphMind
- MorphMind emphasizes a collaborative structure where users can continuously engage with the AI, enhancing its understanding and capabilities over time [66][68].
- The system is designed to learn from user feedback, ensuring that the AI becomes more aligned with user preferences and logic [67][70].
Group 4: Comparison with Traditional AI
- Traditional AI systems often operate as a single model generating outputs, leading to a lack of transparency and control for users [92][94].
- MorphMind's approach allows for a more interactive and iterative process, where users can pause, modify, or roll back steps without starting over [80][82].
Group 5: Future Implications
- The platform aims to redefine the relationship between humans and AI, promoting a symbiotic interaction where AI handles repetitive tasks while humans maintain control over critical decisions [109][110].
- MorphMind envisions a future where individual users can manage a team of AI agents, transforming the traditional team-based work structure into a more flexible and efficient model [121][124].
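Chinese PhDs build an embodied-AI unicorn in 4 months! Stanford-born household robot raises another 1.1 billion and starts building a China team
QbitAI · 2026-03-13 03:51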
Core Viewpoint
- Sunday Robotics has successfully raised $165 million in Series B funding, achieving a valuation of $1.15 billion and entering the unicorn club. The company aims to transition its robots from demo stages to real household applications by focusing on deployment and research [1][2][4][19].
Funding and Valuation
- Sunday Robotics completed a $165 million Series B funding round, led by Coatue Management, with participation from Bain Capital Ventures, Tiger Global, and existing investors Benchmark and Conviction Partners [14][18].
- The company's valuation has reached $1.15 billion, marking its entry into the unicorn category [2].
Company Growth and Expansion
- Since its product launch in November, Sunday Robotics has raised approximately $200 million in total funding within just over four months [4][18].
- The team has doubled in size from 35 to 70 employees in recent months to support its rapid expansion [7].
Deployment Plans
- The company plans to initiate a beta testing program for its robot Memo, aiming to deploy it in real household environments for user testing by November 2025 [21][23].
- The beta testing will focus on the robot's performance in complex scenarios, such as interacting with children and pets, and adapting to various household layouts [23][24].
Research and Development
- Sunday Robotics is significantly increasing its investment in the foundational models for its robots, tripling its engineering team and quadrupling its research team [28].
- The company anticipates that its training data will grow fivefold by the end of the year to support faster model iterations [28].
Product Overview
- Memo, the company's first robot, is designed to perform complex household tasks such as cleaning tables, loading dishwashers, folding clothes, and making coffee [30].
- The robot stands approximately 1.7 meters tall and weighs around 77 kilograms, with a maximum height of 2.1 meters [31].
Data Collection Strategy
- Sunday Robotics employs a unique data collection method using Skill Capture Gloves, which allow users to perform household tasks while recording their actions for robot training [37][40].
- The gloves are cost-effective at $400, significantly lowering the barrier for data collection compared to traditional systems [42].
Data Utilization
- The data collected from both the Skill Capture Gloves and real-world robot operations will be used to train the ACT-1 model, which controls the robot's actions [44][47].
- This dual-source data strategy aims to create a feedback loop that enhances the robot's capabilities through continuous learning from real household environments [50].
Founders' Background
- The founders of Sunday Robotics, Tony Zhao and Cheng Chi, have strong academic and industry backgrounds, having studied at Stanford and worked at leading AI companies [52][56].
- Their experience in both academia and industry contributes to the company's innovative approach to robotics and data utilization [51].
Statistics' highest honor returns to a Chinese scholar! Su Weijie: AI needs a new mathematical language
QbitAI · 2026-03-12 09:37
Group 1
- The article highlights that Professor Su Weijie from the University of Pennsylvania received the COPSS Presidents' Award for his significant contributions to AI deployment, privacy protection, and statistical frameworks [1][7][10]
- This award marks the return of a Chinese scholar to the highest honor in statistics after 14 years [2]
- Professor Su believes that statistics will become increasingly important in the AI era, providing a solid theoretical foundation for AI applications [4][6]
Group 2
- Professor Su's work includes formalizing issues like traceability of AI-generated content and aligning human preferences within a rigorous statistical framework [9]
- He proposed a Gaussian differential privacy framework that was applied in the 2020 U.S. Census to enhance the utility of private data [9]
- He introduced a quality ranking mechanism for authors' submissions, which was officially implemented at ICML in 2026 [9]
Group 3
- The article discusses the need for a new mathematical language for AI, as current mathematical frameworks may not adequately describe AI's underlying structures [12][82]
- Professor Su compares the evolution of AI to a "new physics," emphasizing that AI's structure differs fundamentally from classical physics [13][82]
- He invites mathematically trained individuals to contribute to creating a more suitable mathematical framework for AI, which could have a significant impact comparable to classical mechanics or relativity [14][85]
Group 4
- The article addresses the challenges of fully understanding AI's black box nature, suggesting that a complete white-box approach may be unrealistic [5][49]
- It proposes a probabilistic approach to AI behavior, focusing on performance rather than internal mechanisms, which could help manage risks in real-world applications [56][57]
- Professor Su emphasizes the importance of combining evidence from mechanisms and behavioral performance to find a balanced solution in the context of AI [62]
Group 5
- The article discusses privacy protection as a critical area of focus, highlighting the challenges posed by neural networks in maintaining privacy while ensuring model effectiveness [64][65]
- Professor Su suggests a tiered approach to privacy goals, advocating for a balance between privacy and utility in various contexts [70]
- He proposes creating an incentive structure similar to blockchain to transform privacy protection into an intrinsic motivation for companies [73]
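The summary mentions a Gaussian differential privacy (GDP) framework without describing it. As a rough illustration only, the sketch below uses the textbook Gaussian mechanism as it is usually stated under GDP: adding noise with standard deviation Δ/μ to a query of sensitivity Δ gives μ-GDP, and composing μ_1-, ..., μ_k-GDP releases gives sqrt(μ_1² + ... + μ_k²)-GDP. The function names and the numbers are hypothetical, and this is not the procedure used for the 2020 U.S. Census.

```python
import math
import random


def gdp_noise_scale(sensitivity: float, mu: float) -> float:
    """Standard deviation of Gaussian noise giving mu-GDP for a query whose
    output changes by at most `sensitivity` when one record changes."""
    if sensitivity <= 0 or mu <= 0:
        raise ValueError("sensitivity and mu must be positive")
    return sensitivity / mu


def gaussian_mechanism(true_value: float, sensitivity: float, mu: float,
                       rng: random.Random) -> float:
    """Release `true_value` plus calibrated Gaussian noise."""
    return true_value + rng.gauss(0.0, gdp_noise_scale(sensitivity, mu))


def compose_mu(mus: list) -> float:
    """Overall GDP parameter after releasing several results in sequence."""
    return math.sqrt(sum(m * m for m in mus))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only so the demo is repeatable
    # Hypothetical counting query: sensitivity 1, privacy parameter mu = 0.5.
    noisy_count = gaussian_mechanism(1234.0, sensitivity=1.0, mu=0.5, rng=rng)
    print(f"noisy count: {noisy_count:.1f}")
    print(f"overall mu after two releases: {compose_mu([0.3, 0.4]):.2f}")
```

A smaller μ means more noise and stronger privacy, which is the privacy and utility trade-off the summary refers to.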
The internet's first "Lobster" secure deployment guide is here! Produced by 360
QbitAI · 2026-03-12 09:37
Core Viewpoint
- OpenClaw, an AI agent, is gaining popularity and prompting government support for its deployment, but it also presents significant security challenges due to its autonomous capabilities and system resource access [1][3].
Group 1: OpenClaw Overview
- OpenClaw integrates communication software with large language models to autonomously perform complex tasks like file management and data processing [1].
- The tool's ability to call system resources and execute commands autonomously raises new security concerns [1].
Group 2: Security Risks
- The Ministry of Industry and Information Technology has issued security warnings, indicating that even with updates, risks remain due to the agent's decision-making capabilities and complex skill sources [3].
- High system permissions required for OpenClaw can lead to severe consequences, such as data breaches or system control loss if security measures are inadequate [3][4].
Group 3: Expert Insights
- Industry experts, including Zhou Hongyi, emphasize that while OpenClaw has innovative potential, it is still in its early stages, with high usage barriers and stability issues [3][4].
- Zhou compares AI agents to "interns" that require training and strict rules, highlighting the need for caution in their deployment [4].
Group 4: Security Guidelines
- 360 Group has released a security deployment guide for OpenClaw, outlining various risks such as API key leaks and injection attacks [4][6].
- The guide recommends a "controlled first, then efficiency" approach, suggesting the use of containerization and minimal privilege strategies for individual developers and small teams [6].
Group 5: Implementation Strategies
- For enterprise-level applications, a zero-trust security architecture is advised, including security gateways and fine-grained permission management [6][7].
- Integrating operational logs into security platforms can help identify and mitigate high-risk behaviors in real-time [7].
Group 6: Industry Implications
- OpenClaw and similar AI agents are expected to transform industries like cloud computing did, but security capabilities must be developed concurrently to avoid high-risk costs in large-scale applications [7].
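The "minimal privilege" advice for individual developers can be illustrated with a small allowlist gate placed in front of an agent's command execution. This is a sketch under assumptions, not code from the 360 guide and not an OpenClaw API: `Policy`, `is_allowed`, the allowed commands, and the `/workspace` path are all hypothetical names chosen for the example.

```python
import shlex
from dataclasses import dataclass
from pathlib import PurePosixPath

# Characters that would let one request chain or redirect commands in a shell.
SHELL_METACHARACTERS = set(";|&$`<>")


@dataclass(frozen=True)
class Policy:
    """Hypothetical least-privilege policy for an agent's shell-like requests."""
    allowed_commands: frozenset  # executables the agent may invoke
    workspace: PurePosixPath     # the only directory tree it may address


def is_allowed(command_line: str, policy: Policy) -> bool:
    """Return True only if the command and every argument fit the policy."""
    try:
        tokens = shlex.split(command_line)
    except ValueError:  # e.g. an unbalanced quote: refuse instead of guessing
        return False
    if not tokens or tokens[0] not in policy.allowed_commands:
        return False
    for arg in tokens[1:]:
        if SHELL_METACHARACTERS & set(arg):
            return False
        path = PurePosixPath(arg)
        if ".." in path.parts:  # no climbing out of the workspace
            return False
        if path.is_absolute():
            try:
                path.relative_to(policy.workspace)
            except ValueError:  # absolute path outside the workspace
                return False
    return True


# Example policy: read-only tools, confined to one directory tree.
POLICY = Policy(frozenset({"ls", "cat", "grep"}), PurePosixPath("/workspace"))

if __name__ == "__main__":
    for request in ("ls /workspace/drafts", "cat /etc/passwd", "rm -rf /"):
        print(request, "->", "allow" if is_allowed(request, POLICY) else "deny")
```

A gate like this only narrows what the agent can request; it does not inspect option values such as `--output=/some/path`, and it is no substitute for the container isolation the guide also recommends. An approved command should be run as an argument list without a shell.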
A conversation with VAST's Cao Yanpei: 2 seconds is the speed 3D generation should have had all along
QbitAI · 2026-03-12 09:37
Core Viewpoint
- VAST has launched the Tripo P1.0 model, which represents a paradigm shift in AI 3D generation, enabling the rapid creation of high-quality 3D models in just 2 seconds, significantly enhancing the accessibility and efficiency of 3D content creation [11][21][29].
Group 1: Technology and Innovation
- The Smart Mesh feature allows for the rapid generation of detailed 3D models from simple prompts or reference images, achieving results comparable to professional modelers [11][12][22].
- Tripo P1.0 utilizes a novel "overall generation" approach, which models the geometry and topology of 3D shapes probabilistically, bypassing traditional methods that rely on intermediate representations [21][37][39].
- The model's speed and quality improvements stem from its ability to generate 3D assets directly from raw polygonal data, eliminating the need for complex transformations [21][41].
Group 2: Market Position and Funding
- VAST recently secured $50 million in Series A funding, led by Alibaba and Hengxu Capital, which will support its vision of democratizing 3D content creation [15][16].
- The company positions itself as a leader in the global 3D field, claiming to have the best high-fidelity generation model currently available [31][68].
Group 3: User-Generated Content (UGC) and Future Plans
- VAST aims to launch a UGC platform that will allow users with no prior knowledge of 3D modeling to create interactive 3D content easily, similar to a "3D version of TikTok" [25][26].
- The company believes that lowering the barriers to content creation will lead to an explosion of UGC platforms, as seen in other creative fields [49][50].
- Future developments will focus on enhancing the interactivity and functionality of generated models, enabling them to respond to user inputs and environmental changes [58][60].
The year's most closely watched AI rankings are here! Applications open today
QbitAI · 2026-03-12 09:37
Core Insights
- The article discusses the evolution of generative AI in China, highlighting its transition from a "new technology" to an essential tool for businesses, impacting content production, R&D efficiency, marketing methods, team collaboration, and decision-making processes [1]
- The upcoming 2026 China AIGC Industry Summit will recognize outstanding generative AI companies and products based on their performance and feedback over the past year, with results to be announced in May 2026 [1][2]
Selection Criteria for AIGC Companies
- Companies must be based in China or have their main business operations in China [7]
- The primary business should focus on generative AI or have widely applied AI in its core operations [7]
- Companies should have demonstrated outstanding performance in technology/products and commercialization over the past year [7]
Selection Criteria for AIGC Products
- Products must be based on generative AI capabilities [13]
- Products should have mature technology, be market-released, and possess a certain user scale [13]
- Significant technological innovations or functional iterations should have occurred in the past year, influencing industry applications [13]
Evaluation Dimensions for AIGC Companies
- **Technical Dimension**: Focus on the company's technical strength, R&D capabilities, and innovation, including technical achievements, R&D investment, and talent reserves [12]
- **Product Dimension**: Emphasizes the innovation, market adaptability, and user experience of core products, including product innovation, user scale, and user experience [12]
- **Market Dimension**: Evaluates the company's market performance and growth opportunities, including business models, market size, and revenue [12]
- **Potential Dimension**: Assesses the core team's strength and brand potential, including team capabilities, financing progress, and brand influence [12]
Evaluation Dimensions for AIGC Products
- **Product Technical Strength**: Focus on the product's technological advancement, maturity, and efficiency, including technical architecture and outcomes [13]
- **Product Innovation**: Emphasizes the uniqueness and innovation in functionality, experience, and application scenarios [13]
- **Product Performance**: Evaluates user feedback and market performance, including user scale and retention rates [13]
- **Product Potential**: Assesses future development and market expansion potential, including product ecosystem and strategic planning [13]
Event Details
- The registration for the evaluation starts immediately and ends on April 27, with results to be announced at the May summit [14]
- The summit will focus on how to effectively utilize AI, inviting entrepreneurs, developers, and industry veterans to engage in discussions [17]