机器之心
Defining AI for Science 2.0: At WAIC, Fudan University and the Shanghai Academy of AI for Science Answer with Open Collaboration, Scientist-Centric Design, and a "Partner"
机器之心· 2025-07-31 05:11
This year's World Artificial Intelligence Conference (WAIC) was extraordinarily lively; some booths reportedly grew so crowded at times that even staff could barely get in. Beyond the many robots and consumer devices that went viral, another field deserves attention: AI for Science (AI4S). Reported by 机器之心. Editor: +0. At this year's conference, AI for Science was elevated to a new strategic level as one of ten core directions, with a dedicated forum and multiple cross-disciplinary sessions. This is no accident: ever since AlphaFold solved, with astonishing efficiency, a problem that had long troubled the biology community, AI for Science has proven that it is not a fantasy of the future but a present-day force reshaping the foundations of science. The 「星河启智」 AI for Science Open Collaboration Forum, co-hosted by Fudan University and the Shanghai Academy of AI for Science (上智院), offered a window onto the field's transformation. Academician Jin Li's appeal sketched the grand blueprint of "what to do," while Qi Yuan, distinguished professor at Fudan University, president of the Academy, and founder of Infinite Light Years (无限光年), laid out the concrete path of "how to do it." In his technical talk at the launch of the 星河启智 open platform for AI for Science, he defined the field's current progress as the "AI for Science 2.0 era": an era centered on domain scientists, in which AI evolves into a "partner" that understands their intent and collaborates seamlessly. When powerful compute, frontier algorithms, and concrete scientific needs intertwine, where will the future lead? From 「超级科 ...
Just In: Zuckerberg's Open Letter Says Meta Will Not Open-Source All Its Models
机器之心· 2025-07-31 01:24
Core Viewpoint
- Meta's CEO Mark Zuckerberg is aggressively recruiting top AI researchers from competitors and is sharing his vision for superintelligence, indicating that significant advancements in AI development are imminent [2][3][12]

AI Development and Strategy
- Meta has observed signs of self-improvement in its AI systems, although progress is currently slow; the development of superintelligence is seen as approaching [2][7]
- The company is shifting its approach to releasing AI models, emphasizing the need to balance the benefits of superintelligence with potential safety risks, including a cautious approach to open-sourcing [3][11]
- Zuckerberg has previously indicated that if the functionality of AI models changes significantly, Meta may reconsider its commitment to open-sourcing [4][5]

Competitive Landscape
- Meta's Llama series of open models is positioned as a key differentiator against competitors such as OpenAI and Google DeepMind; the goal is to create open-source AI models as effective as closed-source alternatives [3][6]
- Competitors keep their models closed-source largely for greater control over monetization; Meta's business model, primarily reliant on internet advertising, allows a different approach [6]

Vision for Superintelligence
- Zuckerberg envisions a future in which superintelligence enhances human capabilities, enabling individuals to pursue their personal goals and aspirations [9][10]
- The company believes personal superintelligence will empower individuals, in contrast with views advocating centralized control over superintelligence [10][11]

Future Investments and Expectations
- Meta plans to invest up to $72 billion in AI infrastructure in 2025, indicating a strong commitment to building the resources needed for superintelligence [12]
- Following the announcement, Meta's stock price rose significantly, reflecting positive market sentiment toward the company's AI strategy [12]
Welding the Fingerprint into the Frequency Domain: A Hardcore Scheme for Fine-Tuning-Resistant Neural Network Fingerprints
机器之心· 2025-07-31 01:24
The paper's first author is Tang Ling, a second-year PhD student in Zhang Quanshi's group. Today's topic is a hardcore technique: how to engrave an indelible "ID card" into a neural network. With plagiarism disputes over large models mounting, the topic is timely. Neural network fingerprinting uses intrinsic, fingerprint-like information inside a network as an identity marker to determine a model's ownership and provenance. Traditional methods all play at "sticking on labels": injecting various artificial fingerprints into the model. The problem is that fine-tuning is like plastic surgery on the model: once the parameters move, the "whole face" changes, and the fingerprint is smeared. Facing the threat of fine-tuning, existing schemes keep patching; we instead return to theory and ask: does a neural network inherently possess features that are robust to fine-tuning? If such an intrinsic feature exists and is used as the network's fingerprint, then the fingerprint stays invariant no matter how the parameters are fine-tuned. From this perspective, prior exploration has been limited; no one had theoretically proven the existence of features inside a neural network that are naturally robust to fine-tuning. Theoretical framework: we prove that specific frequency components, obtained by applying an extended discrete Fourier transform (not the conventional Fourier transform) to the convolution kernel W, remain stable during training. We therefore use these specific frequency components as a fine-tuning-robust fingerprint of the network. First, we find that the network's forward propagation in the time domain can be written as vector multiplication in the frequency domain. Specifically, ...
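The claim that time-domain forward propagation becomes vector multiplication in the frequency domain is, at its core, the convolution theorem. Below is a minimal numpy sketch of that standard theorem for a 1-D circular convolution; it illustrates the underlying identity only and is not the paper's extended DFT construction.

```python
import numpy as np

# Convolution theorem sketch: circular convolution in the time domain
# equals element-wise multiplication in the frequency domain.
rng = np.random.default_rng(0)
x = rng.standard_normal(8)   # input signal
w = rng.standard_normal(8)   # convolution kernel (same length for circular conv)

# Frequency-domain route: multiply spectra, then inverse transform
via_fft = np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(w)))

# Direct circular convolution for comparison
direct = np.array(
    [sum(x[j] * w[(i - j) % 8] for j in range(8)) for i in range(8)]
)

assert np.allclose(via_fft, direct)  # the two computations agree
```

The equality is exact up to floating-point error, which is why stable frequency components of a kernel translate directly into stable multiplicative factors in the forward pass.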
Just In: DeepSeek Liang Wenfeng's NSA Paper and Peking University's Yang Yaodong Team Win ACL 2025 Best Paper Awards
机器之心· 2025-07-30 16:25
Group 1
- The ACL conference is a premier event in computational linguistics and natural language processing; its 63rd edition is scheduled for July 27 to August 1, 2025, in Vienna, Austria [2]
- Total submissions reached a record high of over 8,000 this year, up from 4,407 last year, with acceptance rates of 20.3% for main conference papers and 16.7% for Findings [3]
- Over half of the first authors of submitted papers are from China (51.3%), up sharply from last year's 30.6%; the second-largest group of authors comes from the United States at 14.0% [4]

Group 2
- Four best papers were awarded: two to teams led by DeepSeek's Liang Wenfeng and Peking University's Yang Yaodong, and two to teams from CISPA Helmholtz Center for Information Security & TCS Research & Microsoft, and from Stanford University & Cornell Tech [6][10]
- The first best paper develops a theory of response sampling in large language models (LLMs), highlighting ethical concerns arising from biases in decision-making processes influenced by LLMs [11][15]
- The second best paper focuses on algorithmic fairness, introducing a framework that emphasizes group discrimination awareness in specific contexts and demonstrating that existing bias mitigation strategies may be counterproductive [16][19]

Group 3
- The third best paper reveals a structural inertia mechanism in large models that resists alignment during fine-tuning, indicating that robust alignment is harder to achieve than previously thought [24][25]
- The fourth best paper presents a new hardware-aligned and natively trainable sparse attention mechanism that significantly improves the efficiency of long-context modeling in LLMs [31][40]

Group 4
- A total of 26 outstanding papers were recognized, covering topics such as multilingual summarization, hate speech analysis, and the evaluation of large language models [42]
- The best demo paper went to OLMoTrace, a system capable of tracing language model outputs back to trillions of training tokens [46][48]

Group 5
- ACL 2025 also presented two Test-of-Time awards, honoring foundational papers from 2000 and 2015 that have significantly influenced the field [65][73]
- Kathy McKeown received the Lifetime Achievement Award for her extensive contributions to natural language processing over 43 years [86][90]
- Julia B. Hirschberg received the Distinguished Service Award for her long-standing service to the ACL and contributions to the field [96][98]
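To make the sparse-attention idea behind the fourth best paper concrete: the generic trick is to score only a subset of keys per query instead of all n keys, cutting the O(n²) cost of full attention. The numpy sketch below shows a causal sliding-window variant of that generic idea; it is an assumed illustration, not the NSA paper's actual algorithm.

```python
import numpy as np

# Sliding-window sparse attention sketch: each query attends only to the
# last `window` keys (causal), so cost is O(n * window) instead of O(n^2).
def sliding_window_attention(q, k, v, window=4):
    n, d = q.shape
    out = np.zeros_like(v)
    for i in range(n):
        lo = max(0, i - window + 1)              # start of the local window
        scores = q[i] @ k[lo:i + 1].T / np.sqrt(d)
        weights = np.exp(scores - scores.max())  # numerically stable softmax
        weights /= weights.sum()
        out[i] = weights @ v[lo:i + 1]
    return out

rng = np.random.default_rng(0)
n, d = 16, 8
q, k, v = (rng.standard_normal((n, d)) for _ in range(3))
out = sliding_window_attention(q, k, v)
```

Real hardware-aligned designs additionally organize the selected keys into contiguous blocks so that the sparse pattern maps onto efficient dense kernels, rather than looping per query as this sketch does.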
Salvation for the Photo-Editing-Challenged: Doubao Image Editing Model 3.0 Launches, Handling Add, Delete, Modify, and Replace from a Single Chat Box
机器之心· 2025-07-30 05:13
Core Viewpoint
- The article discusses the launch of the SeedEdit 3.0 image editing model by Volcano Engine, highlighting its advanced capabilities in personalized image editing and the growing market demand for intelligent editing tools [2][3][85]

Group 1: Product Features
- SeedEdit 3.0 offers three main advantages: stronger instruction adherence, better subject retention, and higher generation quality, excelling particularly in portrait editing, background changes, and lighting adjustments [5][68]
- The model can perform complex tasks such as removing unwanted elements from images while preserving the integrity of the background and other subjects [18][21]
- Users can change text, backgrounds, and styles with simple prompts, making the tool accessible to non-professionals [25][34]

Group 2: Technical Innovations
- SeedEdit 3.0 is built on the Seedream 3.0 model and addresses challenges in image structure, semantic consistency, and detail preservation [66][77]
- The model uses a multi-stage training strategy, combining diverse data sources with advanced optimization techniques to improve performance [74][78]
- It achieves an 8-fold speedup in inference, cutting processing time from roughly 64 seconds to 8 seconds and significantly improving the user experience [80]

Group 3: Market Implications
- SeedEdit 3.0 marks a significant shift in the image editing landscape toward automation and creativity, letting users without specialized skills engage in advanced image creation [86]
- Its potential applications extend beyond personal use to film production, advertising, media, e-commerce, and gaming, improving content production efficiency for businesses [87]
SPIRAL: Zero-Sum Game Self-Play as a "Free Lunch" for Language Model Reasoning Training
机器之心· 2025-07-30 05:13
Core Insights
- The research introduces SPIRAL, a framework that uses self-play in zero-sum games to enhance reasoning capabilities in language models without relying on human supervision [3][33]
- Competitive self-play yields significant reasoning gains, evidenced by an 8.7% increase in mathematical reasoning ability and an 18.1 percentage point improvement on the Minerva Math benchmark [7][30]

Group 1: Research Background
- The collaborative research involves institutions such as the National University of Singapore and A*STAR, focusing on scalable autonomous agents capable of intelligent decision-making in unknown environments [1]
- The success of models such as OpenAI's o1 and DeepSeek-R1 highlights the potential of reinforcement learning to enhance reasoning in language models [2]

Group 2: SPIRAL Framework
- SPIRAL uses self-play in zero-sum games to autonomously discover and reinforce generalizable reasoning patterns, eliminating the need for manually designed reward functions and expert supervision [3][6]
- The framework runs a distributed online multi-agent reinforcement learning system to fine-tune large language models across various two-player zero-sum games [24]

Group 3: Game-Based Training
- The research identifies three games with distinct cognitive demands (TicTacToe, Kuhn Poker, and Simple Negotiation) as effective training environments for reasoning skills [12][11]
- The self-play mechanism provides adaptive difficulty adjustment, ensuring continuous evolution of the model's capabilities [11]

Group 4: Transfer of Skills
- Reasoning patterns developed in games transfer to mathematical problem-solving, with skills such as expected-value calculation and case analysis showing significant migration rates [18][19]
- Multi-game training produces synergistic effects, improving performance in unfamiliar games compared with single-game specialists [21]

Group 5: Technical Innovations
- Role-Aware Advantage Estimation (RAE) prevents "thinking collapse," ensuring stable gradient updates and consistent reasoning generation throughout training [26][28]
- SPIRAL remains effective even on strong models, with notable performance improvements on established benchmarks [30]

Group 6: Practical Implications
- SPIRAL offers a novel approach for researchers and engineers who want to enhance model reasoning without large volumes of high-quality reasoning data [35]
- The findings suggest pre-trained models already contain diverse reasoning patterns, and reinforcement learning can identify and strengthen those that truly generalize [35]

Group 7: Limitations and Future Directions
- Despite its successes, SPIRAL still requires carefully designed game environments and substantial computational resources [38]
- Future research may explore hybrid game types and meta-game learning to cultivate more comprehensive reasoning abilities [37]
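The per-role idea behind RAE can be sketched minimally: keep one running return baseline per player role, and compute each role's advantage against its own baseline rather than a shared one. The class and names below are illustrative assumptions, not the SPIRAL codebase.

```python
# Minimal sketch of role-aware advantage estimation: one exponential
# moving-average baseline per role, so opposite-signed zero-sum returns
# do not cancel into a single misleading shared baseline.
class RoleAwareAdvantage:
    def __init__(self, roles, beta=0.95):
        self.baseline = {r: 0.0 for r in roles}  # per-role EMA baseline
        self.beta = beta

    def advantage(self, role, episode_return):
        adv = episode_return - self.baseline[role]
        # update only this role's baseline
        self.baseline[role] = (
            self.beta * self.baseline[role] + (1 - self.beta) * episode_return
        )
        return adv

est = RoleAwareAdvantage(roles=["player_0", "player_1"])
# In a zero-sum game the two roles receive opposite returns
a0 = est.advantage("player_0", +1.0)   # advantage relative to player_0's baseline
a1 = est.advantage("player_1", -1.0)   # advantage relative to player_1's baseline
```

Separating the baselines keeps each role's policy-gradient update centered on that role's own expected return, which is the variance-reduction property the article credits with preventing "thinking collapse."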
Offering a Sky-High $1 Billion, Zuckerberg's Poaching Bid at Mira's Startup Is Flatly Rejected: "We're Not Short of Money"
机器之心· 2025-07-30 05:13
Editors: Du Wei, Chen Chen. Who is Meta's next target? With July nearly over, the poaching spree by Meta Superintelligence Labs shows no sign of letting up. Today, according to a Wired column, Zuckerberg has set his sights on Thinking Machines Lab, the company founded by former OpenAI CTO Mira Murati. Just about two weeks ago, the AI startup closed a $2 billion seed round led by a16z, with Nvidia, AMD, and others participating. Of the roughly 50-person startup, Meta approached at least a dozen employees with offers. According to a source familiar with the negotiations, one package totaled more than $1 billion over multiple years; multiple sources also confirmed that other offers totaled $200 million to $500 million over four years, with Meta guaranteeing $50 million to $100 million in the first year alone. So far, however, not a single Thinking Machines Lab employee has accepted. Reported by 机器之心. Meta communications director Andy Stone disputed the report in a response to Wired, saying, "We only made offers to Thinking Machines Lab's ...
When Intelligence Becomes the Primary Means of Production, Silicon-Based Economics Ignites "AI + Finance"
机器之心· 2025-07-30 05:13
Core Viewpoint
- The article discusses the concept of "Silicon Economics" proposed by Shao Yilei, which signifies a paradigm shift in the global economic system from carbon-based to silicon-based structures driven by artificial intelligence, large models, computing power, data, and chips [1][3][10]

Group 1: Concept of Silicon Economics
- Silicon Economics is characterized by the reconstruction of production materials: intelligence replaces energy as the primary production resource, necessitating accelerated development of silicon-based infrastructure [3][10]
- Labor dynamics will change, with human labor's share declining and the use of labor robots or AI agents increasing [3][10]
- Trade patterns will be redefined around which countries export or import intelligence, how those exports are priced, and which currencies settle the transactions [3][10]

Group 2: Implications of Silicon Economics
- The mission of Silicon Economics is to provide a systematic cognitive map and policy framework for new productive forces and the global intelligent economy [10][11]
- The "Power Triangle" of the silicon world comprises the rights to extract intelligence, set prices, and settle transactions; only China and the U.S. can currently provide stable supplies of computing power, data, and algorithms [11][14]
- The article highlights a critical transitional phase from "pure carbon-based" to "silicon-carbon hybrid," presenting both opportunities and challenges [11][14]

Group 3: Future Predictions
- In the next 500 days, AI is expected to transform the world significantly, with algorithms dominating global productivity and AI potentially lifting GDP growth from 3% to 10% [13][14]
- Stablecoins are anticipated to be linked with intelligence, with a Chinese yuan stablecoin possibly leading a new order anchored in algorithms [13][14]
- Silicon Economics is projected to become the global standard for AI economic capability, promoting algorithmic governance and competition for intelligent sovereignty [13][14]

Group 4: Development of Intelligent Financial Systems
- The Shanghai Artificial Intelligence Finance Institute (SAIFS) has developed a comprehensive "Intelligent Financial New Prototype System" to address real demands of the financial industry [16][24]
- The system includes the SmithRM financial reasoning model and the Silicon Fin financial intelligence engine, which integrates data, models, intelligent agents, and application scenarios [16][24]
- The financial intelligence engine is designed to enhance data perception, algorithm evolution, computing-power metabolism, and internal control immunity, aiming for collaborative growth with the market [24]
AI Assistants Amid the Technology Sprint: How Many Paper-Thin Barriers Remain Before a True Jarvis?
机器之心· 2025-07-30 01:30
Core Viewpoint
- The article discusses the limitations of current AI Assistants, which function primarily as conversational agents, and argues that the next generation must evolve toward actionable intelligence: multi-modal interaction, real-time responsiveness, and cross-system execution capabilities [1]

Group 1: Limitations of Current AI Assistants
- Current AI Assistants remain in the "dialogue" phase and are far from true "universal agents" [2]
- Development challenges concentrate in four dimensions: intelligent planning and invocation, system latency and collaboration, interaction memory and anthropomorphism, and business models and implementation paths [2]
- Different technical paths are being explored, including general frameworks based on foundational models and scenario-specific closed-loop systems [2][4]

Group 2: Technical Pathways for AI Assistants
- One core approach builds a long-term, cyclical, and generalizable task framework spanning the entire process from goal understanding to task completion [3]
- The Manus framework exemplifies this approach, combining multi-step task planning with toolchains, with the LLM acting as a control center [4]
- MetaGPT emphasizes components such as code execution, memory management, and system calls to achieve cross-tool and cross-system scheduling [4]

Group 3: Scenario-Specific Approaches
- Another technical path advocates deep exploration within fixed scenarios, focusing on short-term task execution [4]
- Genspark, for instance, automates PPT generation by integrating multi-modal capabilities and deep reasoning modules [4]
- The scenario-specific approach is more stable and easier to deploy but struggles with non-structured tasks and domain transfer [4][5]

Group 4: Future Directions and Innovations
- The Browser-Use approach enhances agent capabilities by letting agents interact with web interfaces like humans [6]
- Open Computer Agent can simulate mouse and keyboard operations for tasks such as flight booking and web registration [6]
- No-Code Agent Builders are emerging as a recommended path for the next generation of AI Assistants, enabling non-technical users to create and deploy workflows [7]

Group 5: System Optimization Challenges
- AI Assistants must optimize for low-latency voice interaction, full-duplex voice capabilities, and the integration of hardware/system actions with application data and tool invocation [8]
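The "LLM as control center" loop described above (goal understanding, step planning, tool invocation, observation feedback) can be sketched with stand-in stubs. The planner and tool names here are hypothetical illustrations, not the Manus or MetaGPT APIs.

```python
# Minimal agent loop sketch: plan a step, invoke a tool, feed the
# observation back into the next planning round, repeat until done.
def stub_planner(goal, history):
    # A real system would query an LLM here; this stub issues one tool
    # call and then finishes using the last observation.
    if not history:
        return {"tool": "search", "args": {"query": goal}}
    return {"tool": "finish", "args": {"answer": history[-1]}}

TOOLS = {"search": lambda query: f"results for: {query}"}

def agent_loop(goal, max_steps=5):
    history = []
    for _ in range(max_steps):
        step = stub_planner(goal, history)
        if step["tool"] == "finish":
            return step["args"]["answer"]
        observation = TOOLS[step["tool"]](**step["args"])
        history.append(observation)  # observation feeds the next plan
    return None  # step budget exhausted

print(agent_loop("book a flight"))  # → results for: book a flight
```

The scenario-specific systems in Group 3 effectively hard-code this loop for one domain, while the general frameworks in Group 2 try to make the planner and tool registry open-ended.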
Overnight, Qwen Updates Again: Runs on a 3090, 3B Activated Parameters Rival GPT-4o
机器之心· 2025-07-30 00:48
Core Insights
- The article discusses the release of the new AI model Qwen3-30B-A3B-Instruct-2507, which shows significant improvements in performance and efficiency over its predecessor and other industry models [1][2]

Performance Improvements
- The new model operates in a non-thinking mode, activating only 3 billion parameters while achieving performance comparable to leading closed-source models such as Google's Gemini 2.5-Flash and OpenAI's GPT-4o [2][4]
- Performance metrics show substantial gains, with AIME25 scores rising from 21.6 to 61.3 and Arena-Hard v2 scores from 24.8 to 69.0 [3][4]

Benchmark Comparisons
- In benchmark tests, Qwen3-30B-A3B-Instruct-2507 matches or surpasses models such as DeepSeek-V3-0324 across various categories [4][10]
- The model averages 62.8 on knowledge benchmarks, with scores such as MMLU-Pro at 78.4 and GPQA at 70.4 [10]

Enhanced Capabilities
- The model shows significant advances in general capabilities, including instruction following, logical reasoning, text comprehension, mathematics, science, programming, and tool usage [13][27]
- It has improved multilingual knowledge coverage and generates higher-quality text aligned with user preferences [13][27]

Open Source and Accessibility
- The model has been open-sourced and is available on platforms such as HuggingFace and QwenChat, broadening access and community support [16][17]
- Users report successful deployments on consumer-grade GPUs such as the RTX 3090, highlighting its accessibility [24][23]

Contextual Understanding
- Long-context understanding has been enhanced to support up to 256K tokens, a significant improvement for complex tasks [28][27]

Industry Impact
- Rapid advances in model efficiency and performance are a significant industry trend, with the Qwen team setting a high bar for future development [7][35]
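The "30B total, 3B activated" naming follows from top-k expert routing in a mixture-of-experts layer: only a few experts run per token, so the activated parameter count is far below the total. The arithmetic below uses assumed round numbers for illustration, not Qwen's published configuration.

```python
# Illustrative MoE parameter accounting (all figures in billions).
# Shared parameters (attention, embeddings) always run; only top_k of the
# n_experts feed-forward experts run for any given token.
def moe_params(shared, expert_size, n_experts, top_k):
    total = shared + expert_size * n_experts
    activated = shared + expert_size * top_k
    return total, activated

# Assumed config: ~1.3B shared, 128 experts of ~0.225B each, 8 routed per token.
total, activated = moe_params(shared=1.3, expert_size=0.225, n_experts=128, top_k=8)
print(round(total, 1), round(activated, 1))  # → 30.1 3.1
```

This is why such a model can run on a single consumer GPU: memory must hold all ~30B weights, but per-token compute scales with the ~3B activated parameters, closer to a small dense model.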