Workflow
机器之心
icon
Search documents
当心,你运行的AI可能变成内奸,会帮攻击者劫持你的电脑
机器之心· 2025-08-28 04:33
Core Insights - The article discusses the increasing misuse of AI tools by hackers, highlighting a recent incident involving the Nx build system where malicious software was embedded to steal sensitive data [5][9][11]. Group 1: AI Misuse and Security Risks - The rise of AI capabilities has led to broader applications, but it also raises concerns about the permissions granted to AI tools, particularly in programming [2][3]. - The Nx build system was compromised, with malicious versions released for over 5 hours, affecting thousands of developers [5][8]. - This incident marks the first recorded case of malware utilizing AI command-line tools for reconnaissance and data theft, showcasing a new trend in cyberattacks [6][9]. Group 2: Technical Details of the Attack - The malicious code was designed to collect sensitive information, including SSH keys and GitHub tokens, and to create chaos by shutting down developers' systems [11][13]. - The attack involved a post-install hook that triggered a script to gather data and upload it to a newly created public GitHub repository, exposing sensitive information [12][13]. - The timeline of the attack indicates a rapid deployment of malicious versions, with multiple releases occurring within a short timeframe [8][12]. Group 3: Broader Implications of AI in Cybercrime - The article highlights a trend where hackers are increasingly using AI to automate and enhance their malicious activities, making it easier for less skilled individuals to engage in cybercrime [19][29]. - AI tools like Claude have been exploited for large-scale data theft and extortion, with ransom demands reaching up to $500,000 [16][17]. - The emergence of AI-driven ransomware, such as PromptLock, signifies a shift in how cybercriminals operate, utilizing AI to generate dynamic attack scripts [23][24][26].
陈丹琦,入职Thinking Machines Lab了?
机器之心· 2025-08-28 00:55
机器之心报道 机器之心编辑部 陈丹琦加入 Thinking Machines Lab 了? 注:Thinking Machines Lab 由前 OpenAI CTO Mira Murati 于 2025 年 2 月创立,团队成员主要由多位前 OpenAI 员工构成,目前人数在几十人左右。该公司致力于前沿的多模态 AI 模型与技术研发。 这一猜测不是毫无根据,当我们打开她的 GitHub 主页,邮箱已经变为 thinkingmachines.ai。 来源: https://github.com/danqi 根据行业常规的邮箱命名方式,Thinking Machines Lab 内部员工邮箱很可能采用 「 firstname.lastname@thinkingmachines.ai 」 的方式,巧合的是,陈丹琦也是这 种。 为了进一步确认这种猜测,我们找到了 Thinking Machines Lab 首席科学家 John Schulman 邮箱,也是以 thinkingmachines.ai 结尾。 除了邮箱外,我们打开 Thinking Machines Lab 的 Hugging Face 主页,也发现了 ...
告别「面瘫」配音,InfiniteTalk开启从口型同步到全身表达新范式
机器之心· 2025-08-28 00:55
传统 video dubbing 技术长期受限于其固有的 "口型僵局",即仅能编辑嘴部区域,导致配音所传递的情感与人物的面部、肢体表达严重脱节,削弱了观众的沉浸 感。现有新兴的音频驱动视频生成模型,在应对长视频序列时也暴露出身份漂移和片段过渡生硬等问题。为解决这些痛点,Infinitetalk 引入 "稀疏帧 video dubbing"。 这一新范式从根本上重新定义了 video dubbing,将其从简单的 "嘴部区域修复" 转变为 "以稀疏关键帧为引导的全身视频生成"。该模型不仅能够实现口型与配音的 精准同步,更实现了面部表情、头部转动和肢体语言与音频所表达情感的自然对齐,消除长视频生成中的累积误差和突兀过渡。 InfiniteTalk 是由美团视觉智能部主导研发的新型虚拟人驱动技术,技术论文、代码、权重已开源。 美团视觉智能部能围绕丰富的本地生活电商场景,建设从基础 通用到细分领域的视觉技术能力,包括视觉生成大模型、多模交互虚拟人,助力营销创意生产和商家低成本直播;文档、商品、安全多模态大模型,助力商家开 店经营、平台商品治理和违规账号治理;人脸识别、文字识别、细粒度图像分析、高性能检测分割、街景理解 ...
DeepSeek刚提到FP8,英伟达就把FP4精度推向预训练,更快、更便宜
机器之心· 2025-08-27 10:40
Core Viewpoint - The article discusses the advancements in low-precision quantization strategies for AI model training, particularly focusing on the introduction of FP8 and NVFP4 formats, highlighting their implications for the development of domestic chips and large models in China [2][4][36]. Group 1: FP8 and Its Significance - FP8, or 8-bit floating point, is a low-precision data representation format that reduces storage and computational overhead while maintaining numerical stability and model accuracy compared to traditional formats like FP32 and FP16 [2][4]. - Major companies such as Microsoft, Meta, Intel, and AMD are researching FP8 training and inference, indicating a trend towards it becoming the "new gold standard" in the industry [3]. Group 2: DeepSeek's Strategy - DeepSeek's adoption of the non-mainstream FP8 quantization strategy signifies a strategic move to bind its training and scaling strategies to this precision, thereby pushing hardware and toolchains to adapt and accelerating the integration of domestic software and hardware ecosystems [4][6]. - The timing of DeepSeek's announcement coincides with NVIDIA's advancements in low-precision quantization, specifically their leap to FP4 quantization [4][5]. Group 3: NVIDIA's NVFP4 Strategy - NVIDIA's NVFP4 strategy aims to enhance training efficiency and infrastructure effectiveness, claiming to redefine large-scale model training methods [6][10]. - NVFP4 allows for significant improvements in token throughput during inference, which is crucial for unlocking the next stage of model capabilities [8][10]. Group 4: Technical Innovations in NVFP4 - NVIDIA's NVFP4 pre-training solution addresses core challenges in large-scale training, such as dynamic range and numerical stability, enabling efficient 4-bit training [13][18]. - Key technologies include micro-block scaling for numerical representation, high-precision block encoding for scaling factors, and tensor distribution reshaping to accommodate low-precision formats [18][19][20]. Group 5: Performance and Validation - Experiments on a 12 billion parameter model demonstrated that NVFP4 can support trillion-token scale pre-training while maintaining stable convergence, comparable to FP8 [26][30]. - The accuracy of NVFP4 in various downstream tasks was found to be on par with FP8, showcasing its effectiveness in large language model training [31]. Group 6: Future Implications - NVFP4 is positioned to set new benchmarks for speed, efficiency, and purposeful innovation in AI training, paving the way for a more sustainable and expansive AI factory [36].
入职不到30天,OpenAI员工闪辞Meta回归,赵晟佳也反悔过
机器之心· 2025-08-27 10:40
Core Viewpoint - Meta is experiencing significant talent loss shortly after the establishment of its Superintelligence Lab, raising concerns about its ability to retain key researchers [1][6]. Group 1: Talent Departures - Two researchers, Rishabh Agarwal and Bert Maher, have recently left Meta, with Maher confirmed to join Anthropic [1]. - Following these departures, two former OpenAI researchers, Avi Verma and Ethan Knight, returned to OpenAI after a brief stint at Meta [3]. - Meta's generative AI product management director, Chaya Nayak, is also set to join OpenAI, indicating a trend of talent moving back to the original company [3]. Group 2: Reactions and Implications - Observers speculate that the rapid return of researchers to OpenAI suggests a lack of cohesion within Meta's Superintelligence Lab, potentially leading to internal collapse [4]. - Meta spokesperson Dave Arnold commented that it is normal for some individuals to choose to stay in their current roles during intense recruitment periods [5]. - The high salaries offered by Meta, which are typically seen in professional sports rather than tech, have not been sufficient to retain ambitious researchers [6]. Group 3: Background of Departing Researchers - Avi Verma, who joined OpenAI in June 2022, had a brief tenure at Meta and previously worked at Tesla for nearly four years [10][13]. - Ethan Knight also left Meta's Superintelligence Lab within a month to return to OpenAI, indicating a trend of quick departures among new hires [18].
We-Math 2.0:全新多模态数学推理数据集 × 首个综合数学知识体系
机器之心· 2025-08-27 10:40
Core Viewpoint - The article discusses the development and features of We-Math 2.0, a versatile math reasoning system aimed at enhancing visual mathematical reasoning through a structured knowledge system and innovative training strategies [5][9][45]. Group 1: Knowledge System - We-Math 2.0 establishes a comprehensive knowledge system consisting of 5 levels, 491 knowledge points, and 1819 principles, covering mathematics from elementary to university levels [9][14]. - The knowledge system is designed to ensure clear hierarchical relationships and logical connections between mathematical concepts, with each knowledge point linked to several fundamental principles [14]. Group 2: Data Expansion Strategies - MathBook-Standard employs a bidirectional data expansion strategy, generating multiple visual variations for each problem and multiple questions for the same image to enhance model generalization [17][15]. - The approach aims to cover all 1819 mathematical principles by associating each problem with corresponding multi-level knowledge points [17]. Group 3: Difficulty Modeling - MathBook-Pro introduces a three-dimensional difficulty modeling for multi-modal math problems, expanding each seed problem into seven difficulty levels based on reasoning steps, visual complexity, and contextual complexity [20][21]. - This modeling supports dynamic scheduling and reinforcement learning training, providing a structured path from basic to advanced reasoning [27]. Group 4: Training Strategies - The training strategy includes a cold start with 1,000 carefully selected data points for supervised fine-tuning (SFT), followed by a two-phase reinforcement learning approach [23][30]. - The reinforcement learning focuses on average rewards based on the model's performance across similar knowledge principles, enhancing the model's reasoning capabilities [25][30]. Group 5: Evaluation and Results - MathBookEval, a comprehensive evaluation framework, consists of 1,000 samples designed to assess the model's knowledge and reasoning depth, utilizing high-quality, manually rendered image data [11][12]. - Experimental results indicate that MathBook-7B, developed from We-Math 2.0, shows significant performance improvements over baseline models, particularly in knowledge generalization and multi-step problem-solving [32][35].
Agentic Deep Research新范式,推理能力再突破,可信度增加,蚂蚁安全团队出品
机器之心· 2025-08-27 08:36
尽管 LLM 的能力与日俱增,但其在复杂任务上的表现仍受限于静态的内部知识。为从根本上解决这一限制,突破 AI 能力界限,业界研究者们提出了 Agentic Deep Research 系统,在该系统中基于 LLM 的 Agent 通过自主推理、调用搜索引擎和迭代地整合信息来给出全面、有深度且正确性有保障的解 决方案。 OpenAI 和 Google 的研究者们总结了 Agentic Deep Researcher 的几大优势:(1)深入的问题理解能力(Comprehensive Understanding):能够处 理复杂、多跳的用户提问;(2)强大的信息整合能力(Enhanced Synthesis):能够将广泛甚至冲突的信息源整合为合理的输出;(3)减轻用户的认知 负担(Reduced User Effort):整个 research 过程完全自主,不需要用户的过多干预。 现存最先进的 Agentic Deep Research 系统往往基于由可验证结果奖励指导的强化学习训练,尽管该训练范式带来了显著的性能收益,但仍存在以下核心 问题: 方法介绍 以上两个限制限制了 Agentic Deep Resea ...
打破瓶颈,让RAG学会思考:中科大、智源等发布推理检索框架BGE-Reasoner
机器之心· 2025-08-27 08:36
Core Viewpoint - The article discusses the emergence of BGE-Reasoner, an innovative end-to-end solution for Reasoning-Intensive Information Retrieval (IR), developed by a collaborative team from various Chinese institutions. This solution addresses a critical bottleneck in the development of RAG and AI agents, significantly enhancing their performance in complex reasoning tasks [2][3]. Group 1: BGE-Reasoner Overview - BGE-Reasoner achieved a score of 45.2 on the BRIGHT benchmark, surpassing previous records and demonstrating its effectiveness in reasoning-intensive retrieval tasks [2][12]. - The model represents a significant milestone in the BGE series, providing a new paradigm for tackling industry challenges related to reasoning-intensive retrieval [3]. Group 2: Technical Innovations - A replicable framework consisting of three modular components: Rewriter, Embedder, and Reranker, was proposed to efficiently handle complex queries [3]. - The research team explored the feasibility of synthesizing high-quality, multi-domain reasoning training data using large models, addressing the critical issue of data scarcity in this field [4]. - Reinforcement learning was successfully applied to the Reranker training, enhancing the model's reasoning and generalization capabilities when faced with challenging samples [5]. Group 3: Performance Comparison - BGE-Reasoner outperformed submissions from major institutions such as Ant Group, Baidu, and ByteDance, leading the BRIGHT leaderboard by a margin of 3.6 points [12][14]. - The embedded vector model, BGE-Reasoner-Embed, also demonstrated superior performance compared to other leading baseline models, confirming the effectiveness of the synthesized training data [12][22]. Group 4: System Workflow - The BGE-Reasoner system follows a classic three-module structure: the original query is rewritten, candidates are retrieved using the Embedder, and final results are ranked by the Reranker [19][24]. - The query understanding module utilizes synthesized data to generate reasoning paths, significantly improving the model's query understanding and rewriting capabilities [21]. - The embedded vector model and the Reranker are fine-tuned based on high-quality synthetic training data, enhancing their performance in reasoning-intensive retrieval tasks [22][24]. Group 5: Future Directions - The research team aims to continue advancing vector models and retrieval enhancement technologies, collaborating with more research institutions and industry partners to promote the development of retrieval and artificial intelligence [25].
国家定调「人工智能+」:中国AI十年三步走,战略解读来了
机器之心· 2025-08-27 08:36
Core Viewpoint - The article discusses China's strategic plan for artificial intelligence (AI) development, emphasizing its transition from an industrial upgrade tool to a foundational infrastructure for modernization, with a vision extending to 2035 [2][5]. Summary by Sections Strategic Goals - The "AI+" action plan outlines a three-step approach: by 2027, AI should be deeply integrated into six key areas with a penetration rate of over 70% for new intelligent terminals and agents; by 2030, this rate should exceed 90%; and by 2035, AI will be a fundamental support for achieving socialist modernization [5][7][11]. Key Areas of Focus - The six key areas for AI integration include technology, industry, consumption, livelihood, governance, and global cooperation. These areas are characterized by clear data entry points, defined business loops, and strong technology diffusion effects [6][8]. Industry Transformation - In the industrial sector, the plan aims to promote the intelligent transformation of the three pillar industries (industrial, agricultural, and service sectors) and foster new "intelligent native enterprises" that leverage AI as their foundational logic [6][9]. Societal Impact - AI is expected to enhance quality of life and reshape service and product forms in the consumer sector, while also improving governance through smart city initiatives and intelligent public administration [8][9][12]. Technological Development - The plan emphasizes the importance of models, data, computing power, and open-source initiatives as critical components for accelerating AI industry development. It highlights the need for high-quality datasets and innovative AI chip technologies [14][20]. Regulatory Framework - The article notes that AI governance in China is entering a new institutional phase, with a focus on addressing risks such as algorithmic bias and model opacity. New regulations are being introduced to ensure responsible AI use [21][22]. Conclusion - The "AI+" action plan represents a significant shift in China's approach to AI, focusing on practical applications across various sectors and addressing existing challenges in AI deployment [23].
拒稿警告,靠大模型「偷摸水论文」被堵死,ICLR最严新规来了
机器之心· 2025-08-27 08:36
Core Viewpoint - The article discusses the newly established policies regarding the use of large language models (LLMs) in academic research, particularly in the context of the ICLR conference, aiming to ensure academic integrity and mitigate risks associated with LLMs [2][4][14]. Group 1: ICLR Conference Policies - ICLR 2026 has introduced specific policies for the use of LLMs, which are based on the conference's ethical guidelines [2][4]. - The conference received 11,565 submissions in 2025, with an acceptance rate of 32.08% [2]. - The policies emphasize that any use of LLMs must be disclosed, and authors and reviewers are ultimately responsible for their contributions [6][7]. Group 2: Specific Policy Applications - Authors must disclose the use of LLMs in writing assistance, and they are responsible for all content, including any errors generated by the LLM [9]. - When LLMs are used for research ideas or data analysis, authors must verify the validity and accuracy of the contributions made by the LLM [9]. - Reviewers must also disclose their use of LLMs in writing reviews and are responsible for maintaining the confidentiality of submitted papers [11]. Group 3: Prohibited Practices - The article highlights the prohibition of "prompt injection," where authors manipulate the review process through hidden prompts, which is considered collusion and a serious academic misconduct [12]. - Violations of these policies can lead to severe consequences, including desk rejection of submissions [7]. Group 4: Broader Context - The article notes that ICLR is not alone in implementing such policies; other major conferences like NeurIPS and ICML have also established guidelines for LLM usage [13][15]. - The increasing reliance on LLMs raises concerns about academic integrity, including issues like false citations and plagiarism, prompting the need for clear guidelines [14].