Too Many Short Videos Make AI Dumber, Too! "The Most Unsettling Paper of the Year"
QbitAI · 2025-11-16 07:20
Core Insights
- The article discusses the phenomenon of "Brain Rot" in AI, indicating that exposure to low-quality data can lead to irreversible cognitive decline in large language models (LLMs) [2][13][26]
- The research highlights that even after retraining with high-quality data, the damage caused by low-quality data cannot be fully repaired, suggesting a permanent cognitive shift [4][26][27]

Research Findings
- The study introduces the "LLM Brain Rot Hypothesis," exploring whether LLMs experience cognitive decline similar to humans when exposed to low-quality data [8][13]
- Two dimensions were used to define "garbage data": M1 focuses on engagement metrics (short, high-traffic content), while M2 assesses semantic quality (clickbait and conspiracy theories) [11][12]
- The models tested showed a 23% decline in reasoning ability and a 30% decrease in long-context memory after exposure to garbage data [6][14]

Cognitive Impact
- The study found that LLMs exhibit cognitive decline akin to "Brain Rot," with significant negative effects on safety and personality traits, particularly from M1 data [14][19]
- A dose-effect relationship was observed, where increased exposure to garbage data correlates with greater cognitive damage [15]

Repair Attempts
- Attempts to repair the cognitive damage through external feedback and large-scale fine-tuning were unsuccessful, with models failing to regain baseline performance [25][26]
- The research indicates that LLMs lack the ability to self-correct effectively, unlike humans who can mitigate cognitive decline through various means [24][27]

Industry Implications
- The findings emphasize the importance of data quality during the pre-training phase, suggesting that the industry should focus on data selection as a safety issue [28]
- Implementing cognitive assessments for LLMs, such as ARC and RULER benchmarks, is recommended to prevent long-term exposure to low-quality data [29]
- The study suggests prioritizing the exclusion of short, high-engagement content from training datasets to enhance model performance [29]
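The M1 criterion described in this summary (short, high-traffic content) can be illustrated with a small filter. This is a minimal sketch under stated assumptions, not the study's actual pipeline: the `Post` fields, the helper names `is_m1_junk` and `drop_m1_junk`, and both thresholds (`max_tokens=30`, `min_engagement=500`) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Post:
    """A hypothetical social-media post record."""
    text: str
    likes: int
    reposts: int


def is_m1_junk(post: Post, max_tokens: int = 30, min_engagement: int = 500) -> bool:
    """Flag a post as M1-style junk: short AND highly engaged.

    Both thresholds are illustrative assumptions, not values from the study.
    """
    token_count = len(post.text.split())  # crude whitespace tokenization
    engagement = post.likes + post.reposts
    return token_count < max_tokens and engagement > min_engagement


def drop_m1_junk(posts: list[Post]) -> list[Post]:
    """Keep only posts that the M1-style heuristic does not flag."""
    return [p for p in posts if not is_m1_junk(p)]


if __name__ == "__main__":
    corpus = [
        Post("you won't believe this", likes=9000, reposts=4000),
        Post("a long technical explanation " * 20, likes=9000, reposts=4000),
        Post("short but unnoticed", likes=2, reposts=0),
    ]
    for post in drop_m1_junk(corpus):
        print(post.text[:40])
```

The heuristic requires both conditions at once, so long popular posts and short unpopular posts are both kept; only the combination the summary singles out is dropped.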
Six Mini-Games Stump Every Top VLM! Angry Birds Wipes Them All Out, with Performance Worse Than Random Guessing
QbitAI · 2025-11-16 04:45
Core Insights
- The article introduces DeepPHY, the first comprehensive benchmark designed to systematically evaluate the interactive physical reasoning capabilities of Vision-Language Models (VLMs) [1][5][10]
- Despite advancements in VLMs for dynamic interaction environments, significant limitations remain in their ability to translate physical knowledge into precise and predictable control actions [4][7][29]

Group 1: DeepPHY Overview
- DeepPHY integrates six distinct physical challenge environments, ranging from fundamental physics to complex dynamics, to assess VLMs' interactive physical reasoning [12][19]
- The benchmark reveals that existing VLMs struggle with physical interaction, planning, and environmental adaptation, often performing similarly to random action execution [10][18][29]

Group 2: Benchmark Environments
- The six environments included in DeepPHY are PHYRE, I-PHYRE, Kinetix, Pooltool, Angry Birds, and Cut the Rope, each focusing on different aspects of physical reasoning [12][13][19]
- Each environment is designed to test various dimensions of physical understanding, such as collision, gravity, and multi-body dynamics, with specific tasks that require strategic planning and real-time adaptation [14][19]

Group 3: Performance Evaluation
- A comprehensive evaluation of 17 mainstream VLMs, including both open-source and closed-source models, demonstrated widespread limitations in their physical reasoning capabilities [16][17]
- The results indicated that many models could not surpass a baseline of random action execution, highlighting a fundamental disconnect between descriptive physical knowledge and actionable control signals [18][29]

Group 4: Key Findings
- The study found that VLMs often fail to learn effectively from unsuccessful attempts, indicating an inability to construct accurate internal models of the physical world [22][29]
- The performance of VLMs significantly declines as task complexity increases, revealing vulnerabilities in processing complex information and executing precise strategies [22][24]

Group 5: Implications for Future AI Development
- The findings suggest that current VLMs possess descriptive knowledge of physics but lack the predictive and procedural capabilities necessary for effective interaction with the physical world [29][30]
- The authors hope that DeepPHY will serve as a rigorous benchmark to encourage the development of AI agents that truly understand and can interact with physical environments [30]
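Less Than 48 Hours Left: Applications for the Annual AI List Are About to Close
QbitAI · 2025-11-16 04:45
Organizing Committee, reporting from Aofeisi
QbitAI | WeChat official account QbitAI

Applications for the "2025 Annual AI List" have entered the countdown stage.
This is the 8th year of QbitAI's "2025 Annual AI List" selection. Over these eight years we have witnessed breakthroughs and real-world adoption of technology, the convergence and reshaping of industries, and wave after wave of companies, people, and products pushing the era forward.
This year's selection sets up five awards across three dimensions: companies, products, and people. Companies are welcome to seize the remaining time and apply as soon as possible! Let us witness the stars of the year together and light the way to the future.

Company list
- 2025 AI Leading Enterprise of the Year
- 2025 AI Promising Startup of the Year
Product list
- 2025 AI Outstanding Product of the Year
- 2025 AI Outstanding Solution of the Year
People list
- 2025 AI Focus Person of the Year

How to apply
Applications close on November 17, 2025. The results will be officially announced at the MEET2026 Intelligent Future Conference hosted by QbitAI.
Scan the QR code to apply. Web link: https://wj.qq.com/s2/23740133/iso8/

1. Registered in China, or with a main business that primarily serves the Chinese market;
2. Main business belongs to AI and related industries, or AI has been widely applied in the main business, with a leading position in its market segment;
3. Has mature ...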
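Cook Reportedly to Step Down as CEO as Early as Next Year; "Apple's AI Is Already 2 Years Behind Its Peers"
QbitAI · 2025-11-16 04:45
Yishui, reporting from Aofeisi
QbitAI | WeChat official account QbitAI

Tim Cook, who has run Apple for 14 years, is entering the countdown to retirement!
According to a new Financial Times report, the company is accelerating its CEO succession: current CEO Cook could retire as early as next year, and the most likely successor is John Ternus (currently Apple's senior vice president of hardware engineering).
Frankly, this is not the first time rumors of Cook stepping down have surfaced.
Early last month, foreign media reported that Apple was planning its largest leadership reshuffle in more than a decade. COO Williams, once widely regarded as Cook's successor (Cook himself was COO when he took over from Jobs), was reported at the time to have stepped down in July this year and to be leaving Apple at the end of the year.
With Williams formally "out", John Ternus's name had already started to come up more often.
So if this report that Cook is accelerating the handover is true (Apple has not responded to requests for comment so far), it can only mean one thing: Apple really is getting anxious.
That is also why this round of discussion about Cook's successor seems far louder than before.
Even a casually launched poll (Do you think John Ternus can take the baton from Jobs and Cook?) instantly drew a crowd of onlookers and discussion.
So who exactly is John Ternus? Can he really succeed Cook?

Top seed: John Ternus
John Ternus, Apple's current ...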
ChatGPT's Love of Dashes Is a Disease, and Altman Just Announced It Has Been Cured
QbitAI · 2025-11-16 04:45
Core Viewpoint
- The article discusses a significant update from ChatGPT regarding its excessive use of dashes, which has been a point of frustration for users and has become a hallmark of AI-generated content [1][2][8].

Group 1: User Frustration and AI Behavior
- Users have expressed their annoyance with ChatGPT's persistent use of dashes, which has led to numerous complaints on OpenAI's official forum [7][8].
- Despite users' attempts to instruct ChatGPT not to use dashes, the AI continued to incorporate them in its responses, indicating a lack of compliance [3][4][9].
- The overuse of dashes has become a recognizable trait of AI writing, making it easy to identify AI-generated text [8][15].

Group 2: Analysis of Dash Usage
- A blog by GitHub engineer Sean Goedecke explores the reasons behind ChatGPT's affinity for dashes, suggesting that it may stem from the language habits of RLHF (Reinforcement Learning from Human Feedback) providers [20][22].
- The blog notes that the preference for dashes increased significantly with the release of GPT-4, with usage rising tenfold compared to earlier versions [27].
- The introduction of 19th-century literature into AI training data is posited as a potential factor for the increased use of dashes, as this period saw a peak in dash usage [30][32].
Xiaodu AI Glasses Pro Starts at 2299 Yuan: This Time "Super Xiaodu" Is Packed into a 39g Pair of Glasses
QbitAI · 2025-11-16 01:30
Core Viewpoint
- Baidu has launched new AI-powered smart glasses, the Xiaodu AI Glasses Pro, starting at 2299 yuan, with advanced functionality and aesthetic improvements [2][31].

Product Features
- The Xiaodu AI Glasses Pro weighs 39g and incorporates a new multimodal AI assistant called "Super Xiaodu," which can translate, recognize objects, and automatically take photos while generating memos [3][9].
- The AI object recognition capability allows the glasses to provide intelligent responses based on context, covering over 2000 categories including plants, products, and artworks [11][12].
- The AI memo function supports voice and photo inputs to create automatic notes, enabling users to record information hands-free [15][16].
- The glasses offer near-real-time translation, returning results in about 3 seconds, with support for various professional terminologies [21][22].
- A unique feature called "Atmosphere Playlist," built in collaboration with NetEase Cloud Music, suggests music based on the current environment [25][26].

Design and Usability
- The Xiaodu AI Glasses Pro comes in two styles, Boston and Cat Eye, focusing on aesthetics and comfort for users [32].
- The glasses have a daily battery life of approximately 7.5 hours, extendable to 68 hours with a charging case, catering to all-day usage needs [36].
- Equipped with the first-generation Snapdragon AR1 platform, the glasses enhance image processing, wireless connectivity, and audio experience [37].
- The imaging hardware includes a 12MP Sony sensor and supports 4K photography and 1440p video recording with stabilization features [38][41].

Market Positioning
- The Xiaodu AI Glasses Pro aims to redefine the aesthetic standards of smart glasses while integrating advanced technology for practical applications in daily life and professional settings [31][39].
- The Boston sunglasses version is already available for purchase, with additional models set to release in December [43].
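A 10-Person Team with Ten-Million-Level Funding: This AI-Native Product Wants to Be "a Data Agent Everyone Can Use" | A Conversation with ChatExcel
QbitAI · 2025-11-16 01:30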
Core Insights
- The article emphasizes the urgency for AI products to incorporate Agent elements, as users are increasingly likely to abandon products lacking these features [4][5].
- ChatExcel is highlighted as a pioneering AI DataAgent that simplifies data processing through natural language interactions, targeting a broad user base rather than just elite professionals [10][15].

Group 1: Market Trends and User Needs
- The rise of Agent products reflects a market demand for solutions that address real user pain points, particularly in data processing [5][6].
- Data processing is identified as a critical challenge for many workers, with the need for 100% accuracy in handling complex datasets [6][68].
- ChatExcel's approach to data processing through conversational AI has attracted a significant user base, with nearly one million users reported [23][14].

Group 2: Product Features and Capabilities
- ChatExcel offers a comprehensive suite of features, including multi-modal data input, intelligent dialogue interaction, and the ability to handle various data formats [11][13].
- The product's architecture supports complex data processing tasks, including handling large files and integrating with enterprise databases [13][10].
- ChatExcel's iterative development strategy focuses on expanding its capabilities from simple Excel processing to more complex data analysis and reporting functions [16][61].

Group 3: Business Model and Growth Strategy
- The company has successfully secured angel funding and formed partnerships with major tech firms, enhancing its market presence [14][15].
- ChatExcel prioritizes user engagement metrics such as usage rates and customer satisfaction over sheer user numbers, indicating a focus on quality interactions [15][23].
- The product's pricing model is designed to be accessible, with various subscription options to cater to different user needs [122].

Group 4: Competitive Landscape and Future Outlook
- The competitive landscape is characterized by a mix of established BI tools and emerging AI solutions, with ChatExcel positioning itself as a user-friendly alternative [104][113].
- The company aims to leverage partnerships to amplify its reach and user engagement, aspiring to have a network of partners rather than just a large user base [17][110].
- Future developments will focus on enhancing the product's capabilities and expanding its application across various data processing scenarios [108][109].
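80% Next-Month Retention, Over One Million Users Worldwide: Not Feature Stacking but "Integrated" Operation | A Conversation with AI Education App AskSia
QbitAI · 2025-11-15 11:49
The following article is from QbitAI Think Tank (量子位智库), written by QbitAI Think Tank. QbitAI Think Tank: connecting AI innovation, providing industry research.
Analysts: Liu Mengyuan (刘萌媛), Yiran (奕然)
QbitAI Think Tank | WeChat official account AI123All

Is there an AI product that makes you feel it has worked its way so deeply into a core part of life or work that we can no longer do without it?
Of course, it is still too early to ask this question. AI products of all kinds have had only about two years to go from launch to real-world adoption and to win over user scenarios, and the showdown of an oligopoly is still far off.
In the AI education track, however, one product aimed at the classroom needs of international university students, AskSia, has become close to a necessity for them, and its users have come to depend on it.
Next-month retention above 80%, six-month retention above 60%, and more than a million users worldwide: these figures are the evidence of a certain "scenario stickiness".
In QbitAI Think Tank's view, this success comes partly from an entry point that is small and deep enough (that is, it chose neither general-purpose features nor the hotter K12 segment), but more from its distinctive understanding of product feature design:
You have to observe what the user's workflow actually is. If the result the user wants is "answer one question for me", the product should be built as a tool that solves a single-point need.
But we found that user needs are multi-layered, and users want to produce effective results with the least effort, such as getting an A.
Understanding university ...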
How Does the World's Best-Selling AI Toy Use "Uselessness" to Unlock Emotional Value? | A Conversation with Haivivi (跃然创新)
QbitAI · 2025-11-15 08:30
Core Insights
- The article discusses the challenges and opportunities in the AI toy market, emphasizing the need to convert emotional value into market value [2][8]
- The AI toy market is experiencing significant growth, with the global sales of AI toys currently under 1 million units compared to several hundred million smartphones sold annually [3][25]
- Haivivi, a leading AI toy brand, has successfully launched products that combine AI technology with popular IPs, achieving substantial sales figures [4][5][14]

Market Dynamics
- The AI toy market is still in its early stages, with a high user satisfaction rate attributed to the ability of products to engage in role-playing and continuous dialogue [29][30]
- The first product, BubblePal, sold nearly 300,000 units within a year, while the second product, CocoMate, is also gaining traction [4][26]
- The emotional connection and interactive capabilities of AI toys are key factors driving consumer interest and purchase decisions [51][52]

Product Development
- Successful AI toys require a combination of hardware, software, and algorithms, with a focus on emotional value rather than just functional attributes [37][46]
- The integration of large model capabilities is crucial for enhancing user experience, allowing for personalized interactions and memory retention [18][41]
- The company aims to develop AI toys that can operate without internet connectivity, enhancing user privacy and reducing long-term costs [84][85]

Competitive Landscape
- The primary differentiator in the toy industry is the IP associated with the products, with a dual strategy of leveraging licensed IPs and developing proprietary characters [20][61]
- The company does not view established brands like Disney and Pop Mart as direct competitors but rather potential collaborators in the emotional value space [76]
- The focus on emotional engagement rather than mere functionality sets the company apart from traditional toy manufacturers [95]

Future Outlook
- The market potential for AI toys is significant, with expectations that advancements in AI technology will lead to more interactive and emotionally resonant products [53][88]
- The company is exploring new features such as selective memory and emotional recognition to enhance the user experience [82][83]
- The belief in the ongoing development of AI technology and its application in toys is fundamental to the company's long-term strategy [93][94]
Jeff Dean Praises New AI Research by a Yao Class Alumnus, Who Has Since Moved to Meta
QbitAI · 2025-11-15 05:00
Core Viewpoint
- The article discusses a new paradigm in AI called Nested Learning (NL), which addresses the issue of catastrophic forgetting in large language models and proposes a more efficient learning structure that mimics human cognitive processes [2][10][25].

Summary by Sections

Nested Learning Concept
- Nested Learning transforms models from a flat computational network to a hierarchical, self-adjusting learning system, inspired by the human brain's memory processes [6][12][14].
- Traditional models like Transformers are seen as simplified versions of NL, lacking the multi-level advantages that NL offers [6][14].

Innovations of Nested Learning
- The research team introduced three core innovations based on NL:
  1. **Deep Optimizer**: Unlike traditional optimizers, NL's deep optimizer uses a pre-processing mechanism to understand gradient properties and employs MLP neural networks for memory, allowing for flexible parameter adjustments [17][18].
  2. **Self-Modifying Model**: This allows models to autonomously learn how to adjust their parameters during training, adapting to new data without manual intervention [19].
  3. **Continuous Memory System**: Upgrades the traditional short-term/long-term memory structure to a multi-scale memory chain, enabling efficient storage and processing of information [20].

Performance of Hope Model
- The Hope model, based on NL, significantly outperforms mainstream baseline models like Transformer, RetNet, and DeltaNet in language modeling and common-sense reasoning tasks, demonstrating lower perplexity and higher accuracy across various metrics [8][23][24].
- For instance, in language modeling tasks, Hope achieved a perplexity of 26.05 with 760M parameters, outperforming other models [24].

Implications of Nested Learning
- The introduction of NL represents a paradigm shift in deep learning, moving away from the traditional approach of stacking layers and parameters, and instead leveraging cognitive science to create a collaborative, hierarchical intelligence system [25].
- This new paradigm may enable AI to continuously learn and accumulate knowledge like humans, potentially solving key challenges in long-context reasoning and lifelong learning [25].
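
For readers unfamiliar with the perplexity metric cited in this summary: perplexity is the exponential of the average negative log-likelihood a model assigns to each token, so lower values are better. The sketch below shows that standard definition only. The function name and the toy log-probabilities are illustrative assumptions and are unrelated to the Hope model's evaluation data.

```python
import math


def perplexity(token_log_probs: list[float]) -> float:
    """Return exp(mean negative log-likelihood) over a token sequence.

    `token_log_probs` holds the natural-log probability the model assigned
    to each observed token.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)


if __name__ == "__main__":
    # Toy case: a model that gives every token probability 1/4 is as
    # uncertain as a uniform choice among four options.
    uniform_quarter = [math.log(0.25)] * 4
    print(perplexity(uniform_quarter))
```

Read this way, a perplexity near 26 means the model is, on average, about as uncertain as a uniform choice among roughly 26 tokens at each step.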