机器之心

Tsinghua dropout who slept on a floor at Stanford: young Chinese founder challenges Meta with AI social networking, raises tens of millions of dollars
机器之心· 2025-08-26 04:11
Report from the 机器之心 editorial team. Building smarter, more capable social networking. Conventional wisdom says that overseas, social apps are Meta's territory. But one young developer from China refused to accept that: Intent, the AI-native instant-messaging tool he built, has been widely praised. The founder, Brandon Chen, has packed a lot into his young career: he dropped out of Tsinghua, jumped from a biology major into social-software development, moved to the US alone without speaking English, and slept on a floor at Stanford for a semester (why, he never said). According to reports, Intent has already raised tens of millions of dollars in funding. Back to the tool itself, here is a walkthrough. At first glance it looks like any messaging app in the WeChat mold, but look closely and the differences emerge. Call the three chat participants A, B, and C. A asks: do you have a group photo of the three of us from last night? B replies: I don't, I only took one of me and you, and uploads a photo of A and B together. C answers: no problem, AI can merge our photos, and sends a flattering photo of herself. Then comes the magic moment: the AI correctly captures the users' intent and merges the two photos into one, with no visible trace of compositing in the final result. You can keep making requests, such as restyling the photo in a Pixar look. Just like that, you no longer have to bother ...
Turning a video "flaw" into a security advantage: Ant Digital's new breakthrough, the proactive video verification system RollingEvidence
机器之心· 2025-08-26 04:11
Recently, the paper "RollingEvidence: Autoregressive Video Evidence via Rolling Shutter Effect", completed independently by the Ant Digital Technologies AIoT team, was accepted at USENIX Security 2025, a top academic venue in network security. The paper proposes an innovative proactive trusted-video forensics system: it exploits the camera's rolling-shutter effect to embed a high-dimensional physical watermark in video, then combines AI techniques with a probabilistic model for precise verification, effectively resisting attacks such as deepfakes and video tampering. Compared with traditional passive detection techniques, the system improves markedly in both detection accuracy and security protection. About the venue: first held in 1990 and now running for over three decades, USENIX Security ranks alongside IEEE S&P, ACM CCS, and NDSS as one of the four top academic conferences in information security, and is a Class A conference recommended by the China Computer Federation (CCF); this year's acceptance rate was 17.1%, and the accepted papers reflect the international state of the art in network security research. In an era when deepfakes and video tampering are increasingly rampant, the boundary of authenticity is being challenged constantly. In response, the Ant Digital AIoT team has proposed a breakthrough innovation, RollingEvidence ...
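The physical channel the paper builds on can be sketched in a few lines: a rolling-shutter sensor reads out image rows at successive instants, so a light source flickering faster than the frame rate leaves spatial stripes in each frame. The sketch below is a minimal illustration of that effect only; the 1 kHz flicker rate and 50 µs row-readout time are assumed numbers, and this is not the paper's watermarking algorithm.

```python
import math

def rolling_shutter_rows(flicker_hz, rows=480, row_readout_s=50e-6, frame_start_s=0.0):
    """Sample a flickering light source row by row, as a rolling-shutter
    sensor would: row r is read out at frame_start + r * row_readout.
    Returns per-row brightness values in [0, 1]."""
    out = []
    for r in range(rows):
        t = frame_start_s + r * row_readout_s
        out.append(0.5 + 0.5 * math.sin(2 * math.pi * flicker_hz * t))
    return out

rows = rolling_shutter_rows(flicker_hz=1000.0)
# A 1 kHz flicker sampled at 50 µs per row repeats every 20 rows,
# turning a temporal signal into spatial stripes a verifier can decode.
```

Because the stripe pattern depends on the exact capture timing, a signal modulated this way is hard for a forger to reproduce after the fact, which is the intuition behind using it as evidence.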
Heated debate! DeepSeek V3.1 hit by a mysterious "极" character bug: is the model malfunctioning?
机器之心· 2025-08-26 04:11
Report from the 机器之心 editorial team. Editor: Panda. Why would this advanced AI suddenly become so "attached" to a single Chinese character? DeepSeek's latest V3.1 model had been live for less than a week when a bizarre bug set off heated community discussion: whether the task is writing code or organizing a physics exam paper, the model keeps inexplicably inserting the character "极" into its output, and even its own repair attempts are not immune. Last Wednesday, DeepSeek open-sourced a new base model, not the much-anticipated V4 but V3.1-Base; earlier still, DeepSeek-V3.1 had already gone live on the web, in the app, and in the mini-program. After roughly a week of real-world testing, users found an exasperating problem: some of the model's output tokens are randomly replaced with "极". Specifically, Zhihu user Fun10165 reported that when she called the Volcengine-hosted DeepSeek V3.1 to help organize a physics exam paper, stray "极" characters kept appearing in the output; the same problem later showed up when testing DeepSeek-V3.1 in Trae (image source: Zhihu @Fun10165). Interestingly, she also tried calling the official API to fix the problem, and the same bug appeared during the fix itself. She said: ...
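A bug like this can be surfaced cheaply by comparing character frequencies in model output against a baseline. The sketch below is a hypothetical detector, not anything DeepSeek or the users actually ran; the sample string and the baseline probability for "极" are invented for illustration.

```python
from collections import Counter

def anomalous_chars(text, baseline_freq, min_count=3, ratio=10.0):
    """Flag characters whose observed frequency in `text` exceeds
    `ratio` times their expected baseline probability. Characters
    absent from `baseline_freq` get a small floor probability."""
    counts = Counter(text)
    n = max(len(text), 1)
    flagged = {}
    for ch, c in counts.items():
        expected = baseline_freq.get(ch, 1e-4)
        if c >= min_count and (c / n) > ratio * expected:
            flagged[ch] = c
    return flagged

# Hypothetical model output with stray "极" insertions:
sample = "质点在极光滑水平面上做匀速极直线运动,速度为极5 m/s。"
print(anomalous_chars(sample, {"极": 1e-3}))  # → {'极': 3}
```

A real monitoring setup would estimate the baseline from a large reference corpus rather than hand-setting it, but the frequency-ratio idea is the same.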
How much "foul language" has ChatGPT actually learned? Tsinghua team proposes the first technique for governing Chinese corpus pollution in large language models
机器之心· 2025-08-25 23:38
Core Viewpoint - The research highlights that the Chinese vocabulary of advanced ChatGPT models is contaminated with 46.6% polluted tokens, primarily related to pornography and gambling, which significantly affects the model's performance [3][6][41]. Group 1: Research Findings - The study identifies that the Chinese vocabulary of models like GPT-4o/o1/o3/4.5/4.1/o4-mini contains a high level of pollution, with specific examples of contaminated tokens including terms related to adult content and online gambling [3][6][12]. - A total of 1659 Chinese long tokens were analyzed, revealing that 773 tokens (46.6%) are polluted, with 219 tokens (13.2%) specifically related to adult content [13][14]. - The performance of ChatGPT models drops significantly when polluted tokens are input, with approximately 50% loss in interpretation and repetition tasks [17][18]. Group 2: Pollution Detection and Analysis - The research team developed a model to automatically detect polluted Chinese tokens, achieving a recognition accuracy of 97.3% [23]. - The study also proposes a pollution tracking scheme that estimates training data pollution based on vocabulary contamination, providing a lightweight solution for data governance [29][35]. - The analysis of open-source pre-training corpora revealed that polluted tokens cluster at the beginning and end of certain web pages, leading to misinterpretation by the models [19][21]. Group 3: Future Implications - The research raises questions about whether the presence of polluted data is entirely detrimental, suggesting that a moderate amount of harmful data might help in distinguishing harmful representations in models [37][40]. - The findings aim to provide a systematic approach for addressing the governance of large language model training data, potentially influencing future model training practices [41].
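The team's detector is a trained model reaching 97.3% accuracy. As a rough illustration of the underlying task only, a naive keyword heuristic over long Chinese tokens might look like the sketch below; the vocabulary fragment and the lexicon are invented for the example and are not the paper's method.

```python
import re

# Matches CJK Unified Ideographs; used to find "long Chinese" tokens.
CJK = re.compile(r"[\u4e00-\u9fff]")

def long_chinese_tokens(vocab, min_cjk=3):
    """Return vocabulary entries containing at least `min_cjk` CJK characters."""
    return [t for t in vocab if len(CJK.findall(t)) >= min_cjk]

def flag_polluted(tokens, lexicon):
    """Flag candidate tokens containing any keyword from a pollution lexicon."""
    return [t for t in tokens if any(k in t for k in lexicon)]

# Hypothetical vocabulary fragment (two gambling-style tokens, two clean ones):
vocab = ["大发快三", "中国科学院", "天天中彩票", "自然语言处理"]
lexicon = {"彩票", "快三"}
candidates = long_chinese_tokens(vocab)
print(flag_polluted(candidates, lexicon))  # → ['大发快三', '天天中彩票']
```

The paper's contribution is precisely that such brittle keyword rules are insufficient, which is why a learned classifier plus corpus-tracing scheme is needed.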
Just now, Musk takes OpenAI and Apple to court: lawsuit alleges ChatGPT monopolizes the iPhone while his own Grok is suppressed
机器之心· 2025-08-25 23:38
Report from the 机器之心 editorial team. Under the agreement the two companies reached, iPhone users have "no reason" to download third-party AI apps, and when enabling Apple Intelligence, Apple "forces" them to use ChatGPT as the default chatbot app. On Monday local time, Musk opened fire on OpenAI and Apple. According to multiple foreign media reports, Musk's xAI filed suit accusing the two of illegal monopolization through ChatGPT and the Apple App Store. In a post, Musk said his own Grok has one million reviews and a 4.9 rating, yet Apple still refuses to include Grok in any of its rankings. xAI alleges that OpenAI and Apple, by striking a deal to build ChatGPT into the iPhone, have stifled competition in the AI industry; Apple's App Store is further accused of "deprioritizing" rival chatbots and "super apps", including Grok and X. Our own search found that in the App Store's latest free-app chart, ChatGPT sits in first place, while xAI and X rank 31st and 36th respectively. As xAI puts it in the complaint, "Apple and OpenAI have locked up the market and been able to maintain their monopolies, blocking rivals like ...
Speed always wins: Shanghai AI Lab's 82-page survey walks you through the appeal of efficient LLM architectures
机器之心· 2025-08-25 09:10
Core Insights - The article discusses the advancements and challenges in large language models (LLMs), emphasizing their transformative impact on human-computer interaction and the need for efficient architectures to overcome high training and inference costs [2][3][8]. Group 1: LLM Architecture and Efficiency - The efficiency of LLMs is primarily attributed to the Transformer architecture, which, despite its breakthroughs, faces challenges due to its O(N^2) complexity in long sequence tasks [3][4]. - Recent innovations in Transformer architecture have emerged, but a comprehensive review summarizing these advancements has been lacking [4][5]. - A collaborative effort by Shanghai AI Lab and several institutions has resulted in a survey of over 440 papers, focusing on the latest progress in efficient LLM architectures [5][6]. Group 2: Categories of Efficient Architectures - The survey categorizes efficient LLM architectures into seven types, including linear sequence modeling, sparse sequence modeling, efficient full attention, sparse expert models, mixed model architectures, diffusion language models, and applications to other modalities [6][8]. - Linear sequence modeling aims to reduce attention training and inference complexity without incurring KV cache overhead [6][8]. - Sparse sequence modeling leverages the inherent sparsity of attention maps to accelerate computation [21][22]. Group 3: Innovations in Attention Mechanisms - Efficient full attention methods optimize memory access and KV storage while maintaining complete attention [22][23]. - Sparse expert models enhance model capacity without proportionally increasing computational costs through conditional activation of experts [27][28]. - Mixed architectures find a balance between linear/sparse attention and full attention, optimizing both efficiency and performance [35][36]. 
Group 4: Applications and Future Directions - Diffusion language models represent a novel approach by applying diffusion models from visual tasks to language generation, significantly improving generation speed [38][39]. - Efficient architectures are being applied across various modalities, including vision and audio, demonstrating their versatility and effectiveness [44][45]. - The overarching goal is to achieve substantial acceleration in AI development, akin to the phrase "Speed Always Wins," suggesting a focus on efficiency in training and deploying powerful models [45].
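The complexity gap between full and linear attention comes down to associativity: reordering the matrix product lets a fixed-size (key, value) summary be computed once, instead of forming the N x N score matrix. Below is a minimal NumPy sketch of that contrast, using an arbitrary illustrative feature map rather than any specific method from the survey.

```python
import numpy as np

def softmax_attention(Q, K, V):
    """Standard attention: materializes the N x N score matrix, O(N^2)."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ V

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0) + 1e-6):
    """Kernelized attention: phi(Q) @ (phi(K)^T V) computes the d x d
    summary phi(K)^T V once, so cost is O(N) in sequence length.
    The feature map phi here is an arbitrary positive choice."""
    Qp, Kp = phi(Q), phi(K)
    kv = Kp.T @ V                 # d x d summary, independent of N
    z = Qp @ Kp.sum(axis=0)       # per-query normalizer
    return (Qp @ kv) / z[:, None]

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(8, 4)) for _ in range(3))
out = linear_attention(Q, K, V)
print(out.shape)  # (8, 4)
```

The two functions are not numerically equivalent; the survey's point is that well-chosen feature maps and gating recover most of full attention's quality at a fraction of the cost.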
The world's top fifteen open-source large models are all Chinese
机器之心· 2025-08-25 09:10
Core Viewpoint - The article highlights the significant emergence of domestic open-source large language models (LLMs) in China, with all top-ranked models on the Design Arena leaderboard being Chinese [1][3]. Group 1: Overview of Design Arena - Design Arena is the largest crowdsourced AI-generated design benchmark platform, utilizing a user evaluation system based on Elo Rating, similar to chess scoring [2]. - The platform allows users to vote on which of two model-generated responses is better, creating a dynamic ranking system that reflects real user experiences [2]. Group 2: Rankings of Open-Source Models - The top 15 open-source models on Design Arena are all from China, with DeepSeek-R1-0528 ranked first, followed by Zhipu's GLM 4.5 and Alibaba's Qwen 3 Coder 480B [4][5]. - The ranking details show that DeepSeek has 5 models, Alibaba has 6 models, and Zhipu has 3 models in the top 15 [6][7]. Group 3: Recent Developments in Open-Source Models - Recently, domestic AI companies have been actively releasing new open-source LLMs, with 33 models launched by various firms including Alibaba and Tencent [7]. - A total of 19 leading open-source model laboratories in China have been identified, showcasing a diverse range of contributors to the open-source AI landscape [9]. Group 4: Impact on AI Research and Development - The rise of open-source models like DeepSeek and Qwen is shifting the focus of application companies towards model tuning and optimization, accelerating the deployment of AI technologies [10]. - The article suggests that the increasing prominence of Chinese AI models may reshape the global AI landscape, with a potential shift towards open-source as a standard in advanced model development [10].
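The Elo update behind one pairwise vote can be sketched in a few lines; the K-factor of 32 is a conventional choice, and Design Arena's exact parameters are not specified in the article.

```python
def elo_update(ra, rb, score_a, k=32.0):
    """One pairwise Elo update: score_a is 1.0 if A's response wins the
    vote, 0.0 if it loses, 0.5 for a tie. Returns the new ratings."""
    expected_a = 1.0 / (1.0 + 10 ** ((rb - ra) / 400.0))
    ra_new = ra + k * (score_a - expected_a)
    rb_new = rb + k * ((1.0 - score_a) - (1.0 - expected_a))
    return ra_new, rb_new

# Two models start equal; the first wins a head-to-head vote:
print(elo_update(1000.0, 1000.0, 1.0))  # → (1016.0, 984.0)
```

Because each update depends on the current gap between the two ratings, upsets against higher-rated models move the leaderboard more than expected wins, which is what makes the ranking self-correcting under crowdsourced voting.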
Breaking the long-video generation bottleneck: Nanjing University and TeleAI unveil MMPL, a new AI generation paradigm for seamless single-take creativity
机器之心· 2025-08-25 06:08
Xiang Xunzhi is a PhD student in the R&L group at Nanjing University, advised by Associate Professor Fan Qi; his research focuses on AIGC topics such as image/video generation and world models. Have you ever been drawn in by the stunning opening of an AI-generated video, only to be let down seconds later by color drift, blurry frames, and broken pacing? Current AI long-video generation commonly "starts high and finishes low": the first few seconds dazzle, then quality plunges and details fall apart; add to that the inefficiency of serial frame-by-frame generation, where waits of several hours make real-time preview all but impossible. This industry-wide problem now has a breakthrough answer. Nanjing University, together with TeleAI, has introduced a new paradigm for autoregressive long-video generation, Macro-from-Micro Planning (MMPL), which redefines the AI video creation workflow. Inspired by the film industry's practice of storyboards plus parallel shooting by multiple units, MMPL pioneers a two-level "macro planning, micro execution" generation architecture: first plan globally, unifying the whole video's narrative arc and visual consistency at the macro level to keep the plot coherent and the style uniform; then refine the details, splitting the long video into multiple short segments and filling in per-frame detail through a parallelized generation pipeline, greatly boosting speed ... The results are exciting: MMPL is not just a technical upgrade but an important step toward an "AI director", letting machines not only "shoot shots" but also "tell a story".
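The "macro plan, micro execute" idea can be illustrated with a toy pipeline: a global plan fixes segment boundaries and a shared style, then a thread pool fills segments in parallel rather than serially. This is a shape-of-the-idea sketch only, not MMPL's actual generator; all names and parameters below are invented.

```python
from concurrent.futures import ThreadPoolExecutor

def macro_plan(total_frames, segment_len, style):
    """Macro level: decide segment boundaries and shared style up front,
    so every segment is generated against the same global plan."""
    return [{"start": s, "end": min(s + segment_len, total_frames), "style": style}
            for s in range(0, total_frames, segment_len)]

def render_segment(seg):
    """Micro level: stand-in for a per-segment generator conditioned on the plan."""
    return [f"{seg['style']}:frame{i}" for i in range(seg["start"], seg["end"])]

plan = macro_plan(total_frames=10, segment_len=4, style="pixar")
with ThreadPoolExecutor() as pool:
    # pool.map preserves plan order, so segments concatenate correctly.
    video = [f for frames in pool.map(render_segment, plan) for f in frames]
print(len(video))  # → 10
```

The key property the sketch shows is that once the plan is fixed, segments no longer depend on each other frame by frame, which is what makes parallel filling, and hence the large speedup, possible.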
Over 970,000: Yoshua Bengio becomes the most-cited scholar in history, Kaiming He enters the all-time top five
机器之心· 2025-08-25 06:08
Core Insights - The article highlights the prominence of AI as the hottest research direction globally, with Yoshua Bengio being the most cited scientist ever, accumulating a total citation count of 973,655 and 698,008 citations in the last five years [1][3]. Group 1: Citation Rankings - The AD Scientific Index ranks 2,626,749 scientists from 221 countries and 24,576 institutions based on total citation counts and recent citation indices [3]. - Yoshua Bengio's work on Generative Adversarial Networks (GANs) has surpassed 100,000 citations, outpacing his co-authored paper "Deep Learning," which also exceeds 100,000 citations [3][4]. - Geoffrey Hinton, a pioneer in AI, ranks second with over 950,000 total citations and more than 570,000 citations in the last five years [4][5]. Group 2: Notable Papers and Their Impact - The paper "AlexNet," co-authored by Hinton, Krizhevsky, and Sutskever, has received over 180,000 citations, marking a significant breakthrough in deep learning for computer vision [5][6]. - Kaiming He’s paper "Deep Residual Learning for Image Recognition" has over 290,000 citations, establishing ResNet as a foundational model in modern deep learning [10][11]. - The article notes that ResNet is recognized as the most cited paper of the 21st century, with citation counts ranging from 103,756 to 254,074 across various databases [11]. Group 3: Broader Implications - The high citation counts of these influential papers indicate their lasting impact on the academic community and their role in shaping future research directions in AI and related fields [17].
Just announced: the 2025 Science Exploration Award winners, including Fudan's Jiang Yugang and Tsinghua's Wu Jiamin
机器之心· 2025-08-25 04:13
Core Viewpoint - The 2025 "Science Exploration Award" highlights the importance of encouraging originality and rewarding young scientists in China, with a focus on innovative research in various fields [2][4]. Group 1: Award Overview - The "Science Exploration Award" was established in 2018 by prominent scientists and Tencent's founder, aimed at supporting young researchers in mainland China and Hong Kong [2]. - This year, the evaluation mechanism emphasizes originality, with questions focusing on the uniqueness and innovation of the applicants' work during the final review [4]. - A total of 50 young scientists were selected from 1,238 applicants, including 13 young scientists under the age of 35 for males and 38 for females, with 6 being from the post-90s generation [4]. Group 2: Award Recipients in Information Electronics - The award recipients in the information electronics category include: - Chang Yi from Jilin University, known for his extensive research and numerous publications in computer science [6][8]. - Du Bo from Wuhan University, recognized for his contributions to artificial intelligence and computer vision [10][12]. - Jiang Yugang from Fudan University, a leader in multimedia information processing and artificial intelligence [13]. - Li Wei from the Chinese Academy of Sciences, specializing in micro-nano photonics and materials [14][16]. - Liao Qing from Harbin Institute of Technology (Shenzhen), focusing on data mining and artificial intelligence [19]. - Wu Jiamin from Tsinghua University, involved in optical computing and photonic intelligent computing [20].