Reward Models Can Scale Too! Shanghai AI Lab Breaks Through a Reinforcement Learning Bottleneck with a New Policy Discriminative Learning Paradigm
量子位· 2025-07-11 04:00
Core Viewpoint
- The article introduces Policy Discriminative Learning (POLAR), a new reward modeling paradigm that strengthens the post-training phase of large language models (LLMs) and addresses the limitations of traditional reward models in reinforcement learning [1][3][4].

Group 1: Challenges in Reward Modeling
- The design and training of reward models have been a bottleneck in improving post-training effectiveness and model capabilities [2].
- Traditional reward models lack systematic pre-training and scaling methods, which prevents them from improving as computational resources grow [2].

Group 2: Introduction of POLAR
- POLAR decouples reward modeling from absolute preferences, enabling efficient scaling and adaptation to varied, customized needs based on reference answers [3][5].
- POLAR can assign different scores to model outputs under different reference styles without retraining the reward model [7].

Group 3: Training Methodology of POLAR
- POLAR uses a two-stage process, pre-training followed by preference fine-tuning, with a contrastive learning objective that measures the distance between the policy being trained and the target policy [21][22].
- The pre-training phase relies on large amounts of automatically synthesized data, allowing significant scalability [22][23].

Group 4: Performance and Scaling Effects
- POLAR exhibits scaling effects: validation loss decreases in a power-law relationship as model parameters and compute increase [28][29].
- In preference evaluation experiments, POLAR outperforms state-of-the-art reward models, with especially large gains on STEM-related tasks [32][34].
- POLAR's ability to learn subtle distinctions between policy models improves the generalization of reward signals in real-world applications [35].
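POLAR's core mechanic, scoring a candidate output by its closeness to a reference answer rather than by an absolute preference, can be illustrated with a toy scorer. The bag-of-words cosine similarity below is only a stand-in for POLAR's learned discriminator, and all names and examples are illustrative, not taken from the paper:

```python
from collections import Counter
import math

def cosine_sim(a: Counter, b: Counter) -> float:
    # Cosine similarity between two bag-of-words count vectors.
    common = set(a) & set(b)
    dot = sum(a[t] * b[t] for t in common)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def reference_reward(candidate: str, reference: str) -> float:
    # Toy stand-in for the learned discriminator: reward is the
    # similarity of the candidate to the reference answer, so the same
    # scorer adapts to different reference styles without retraining.
    return cosine_sim(Counter(candidate.lower().split()),
                      Counter(reference.lower().split()))

formal_ref = "the derivative of x squared is two x"
casual_ref = "easy: just bring the 2 down, so you get 2x"
candidate = "the derivative of x squared equals two x"

# The same candidate scores differently under different reference styles.
print(reference_reward(candidate, formal_ref))
print(reference_reward(candidate, casual_ref))
```

Swapping the reference string changes the reward without touching the scorer, which is the adaptability the summary describes; the real system learns this distance rather than computing word overlap.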
Yes, LeCun Will Report to 28-Year-Old Alexandr Wang! Exclusive Inside Details on Meta's New AI Team
机器之心· 2025-07-11 02:43
Core Viewpoint
- Meta's aggressive recruitment strategy in the AI sector has raised questions about its sustainability and its potential impact on company culture and performance [2][24].

Group 1: Recruitment and Team Structure
- Meta has made headlines by offering exorbitant pay packages, reportedly up to $200 million for key talent, to attract AI experts from competitors such as OpenAI and Apple [3][4].
- The newly formed Meta Superintelligence Labs (MSL), led by Alexandr Wang, is a focal point of interest regarding its operational structure and research direction [5].
- A significant internal restructuring is under way, with senior executives allowed to recruit their own teams, which may lead to internal competition and integration challenges [21][22].

Group 2: Internal Dynamics and Culture
- Concerns have been raised about the impact of these changes on Meta's corporate culture, with reports of a "fear culture" emerging amid performance evaluations and ongoing layoffs [24].
- A lack of clear vision and strategic confusion have been noted, particularly within the Llama team, where many employees are unclear about the company's goals [24].
- The retention rate of top talent recruited from other companies is low, pointing to potential issues with employee satisfaction and organizational stability [24].

Group 3: Research Focus and Distinctions
- The Fundamental AI Research (FAIR) division operates independently of the GenAI and MSL teams, focusing on long-term foundational research rather than product development [8][16].
- The Llama team, initially part of FAIR, was moved to the GenAI product group after the success of Llama 1, highlighting the distinction between exploratory research and product-oriented development [15][16].
- The controversy around the Llama 4 model, including allegations of benchmark "ranking cheating," has raised questions about Meta's technical reputation and credibility in the AI field [24].
July 19, See You in Beijing! Let's Discuss the Hottest ACL 2025 Research
机器之心· 2025-07-10 08:35
Core Insights
- The AI field remains an exciting area in 2025, with a steady stream of research releases from major tech companies and institutions [1]
- The pace of technological advances in AI is overwhelming, with new models and paradigms emerging almost weekly [3][4]
- Developers and researchers are increasingly turning to conferences and academic exchanges to stay current with cutting-edge research [5]

Event Overview
- The ACL conference, a flagship event in the NLP field, received over 8,000 submissions this year, a record high [6]
- ACL 2025 will take place from July 27 to August 1 in Vienna, Austria, featuring keynote speeches, paper presentations, roundtable discussions, and poster sessions [6][7]
- The event aims to provide a platform for domestic AI talent, with a full schedule of presentations and discussions announced [6]

Keynote Speakers and Topics
- The keynote "Trends and Outlook for ACL 2025" will be delivered by Che Wanxiang, a professor at Harbin Institute of Technology [9][17]
- Liu Pengfei of Shanghai Jiao Tong University will present on "Reinforcement Learning and Complex Reasoning in Large Models" [11][19]

Paper Presentations
- Papers will cover topics such as the intrinsic self-correction of large language models and inference acceleration for large language models [9][12]
- The event will also feature poster sessions and opportunities for industry engagement [21]
Should Book Editors Change Careers While They Still Can?
Hu Xiu· 2025-07-10 07:47
Core Viewpoint
- The publishing industry is undergoing an unprecedented paradigm shift driven by generative artificial intelligence, leading to a decline in traditional reading and publishing practices [2][3][5].

Group 1: Industry Transformation
- The traditional roles of book editors and readers are diminishing as AI tools become more prevalent in content creation and consumption [4][5][8].
- The emergence of large language models has transformed access to knowledge, making it easy to obtain information without traditional reading [6][7].
- The industry's declining performance is attributed not only to economic cycles but to a fundamental loss of its habitat, as many people simply no longer buy books [8][9].

Group 2: Changes in Consumer Behavior
- Readers increasingly rely on AI for information, eroding the traditional book-reading culture [4][8].
- The shift is evident as students and readers prefer quick AI-generated summaries and analyses over in-depth reading [4][6][7].

Group 3: Internal Industry Dynamics
- Inside publishing houses, reliance on AI tools for content creation, editing, and marketing is growing, producing a sense of self-dissolution among professionals [9][10][11].
- Fear of being replaced by AI is widespread among publishing professionals as their core skills become less relevant against advanced AI capabilities [12][13].

Group 4: Market Challenges
- Traditional methods of promoting and selling books are losing effectiveness as the market shifts toward short-form content and AI-generated material [16][17][18].
- The industry now competes for attention in an environment where AI produces content at unprecedented speed, fundamentally changing content-marketing dynamics [18][19].

Group 5: Future Outlook
- The industry faces a critical juncture: professionals must adapt to the new reality of AI integration and reevaluate their skills and roles [21][22][23].
- There is a pressing need for industry professionals to identify qualities AI cannot replicate, such as deep insight and personal connections with authors [23][24].
- The overall sentiment suggests publishing may be heading toward a niche existence, more a cultural symbol than a mass-market force [14][15].
Musk's xAI Releases Grok 4: 100x the Training Compute, Doubling the Runner-Up on Multiple Benchmarks
Feng Huang Wang· 2025-07-10 06:20
Core Insights
- xAI has launched its latest large language model, Grok 4, which shows significant performance gains over its predecessor, Grok 3, backed by a 100-fold increase in training compute [1]
- Grok 4 solved 25% of problems on the "Humanity's Last Exam" benchmark, while the multi-agent version, Grok 4 Heavy, exceeded 50% [1]
- The company is focusing on enhancing multi-modal understanding and has released an API for Grok 4 that supports a 256K context length [2]

Model Performance
- Grok 4 demonstrates superior reasoning in standardized tests, including GPQA and AIME, and achieved a perfect score on the LiveCodeBench test [2]
- The model integrates tool use directly into its training process, improving reliability on complex tasks [2]

Commercialization Efforts
- xAI has introduced a subscription service, SuperGrok Heavy, giving users access to both Grok 4 and Grok 4 Heavy [3]
- The company plans to develop a dedicated coding model and to begin training a video generation model on more than 100,000 H200 GPUs in the coming weeks [3]
- The release of Grok 4 marks a significant step in the competitive landscape of large language models, particularly in reasoning and multi-agent collaboration [3]
Musk Launches Grok 4: Challenging GPT-5 as the Chief Scientist Quits on the Eve of Launch
Feng Huang Wang· 2025-07-10 05:31
Core Viewpoint
- Elon Musk officially launched Grok 4, the latest language model from his xAI team, amid controversies including the resignation of xAI's chief scientist and earlier incidents of the model generating racist content [1][2]

Group 1: Model Features and Capabilities
- Grok 4 brings significant upgrades, including multi-modal processing of text and images, with potential future support for video [2]
- The release introduces Grok 4 Code for writing and debugging code, and enhanced voice interaction for a more natural conversational experience [2]
- Grok 4 will use a tool called DeepSearch for real-time internet searches, integrating data from the X platform to provide up-to-date information [2]
- A distinctive feature is Grok 4's enhanced understanding of internet culture, slang, and memes, aiming to be a more relatable AI assistant [2]

Group 2: Market Position and Challenges
- Despite its capabilities, Grok 4 faces a credibility crisis: previous versions produced biased content, raising concerns about xAI's commitment to product safety and testing [2]
- Musk positions xAI as a challenger to what he calls "woke" AI models such as ChatGPT and Gemini, yet he has remained largely silent on the current controversies [2]
- In contrast to competitors like OpenAI and Google, which prioritize reliability and safety, xAI takes a more avant-garde approach with fewer restrictions, a risk the market has yet to evaluate [3]
A Diffusion Language Model Writes Code, 10x Faster Than Autoregression
量子位· 2025-07-10 03:19
Core Viewpoint
- The article covers the launch of Mercury, a commercial-grade large language model based on diffusion technology that generates code significantly faster than traditional models.

Group 1: Model Innovation
- Mercury breaks the limits of autoregressive models by predicting all tokens at once, boosting generation speed [2]
- The model allows dynamic error correction during the generation process, offering greater flexibility than traditional models [4][20]
- Despite using diffusion technology, Mercury retains the Transformer architecture, so efficient training and inference optimizations can be reused [6][7]

Group 2: Performance Metrics
- Mercury's code generation can be up to 10 times faster than traditional tools, significantly shortening development cycles [8]
- On H100 GPUs, Mercury achieves a throughput of 1,109 tokens per second, showing efficient use of hardware [9][13]
- In benchmark tests, Mercury Coder Mini and Small achieved response times of 0.25 and 0.31 seconds, respectively, outperforming many competitors [16]

Group 3: Error Correction and Flexibility
- The model incorporates a real-time error-correction module that detects and fixes logical flaws in code during the denoising steps [21]
- Mercury integrates abstract syntax trees (AST) of languages such as Python and Java to minimize syntax errors [22]

Group 4: Development Team
- Inception Labs, Mercury's developer, is a team of experts from institutions including Stanford and UCLA, focused on improving model performance with diffusion technology [29][34]
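The parallel decoding idea behind diffusion generation, starting from a fully masked output and committing tokens over a few denoising passes instead of one token per forward pass, can be sketched with a toy simulation. The "denoiser" below simply peeks at a fixed target string with a confidence coin-flip; it illustrates the shape of the decoding loop, not Mercury's actual model:

```python
import random

random.seed(0)  # deterministic toy run
TARGET = list("print('hello')")
MASK = "_"

def denoise_step(seq, confidence):
    # Toy "denoiser": propose the correct token at every masked
    # position at once, committing only positions whose (simulated)
    # prediction confidence clears the threshold.
    out = list(seq)
    for i, tok in enumerate(seq):
        if tok == MASK and random.random() < confidence:
            out[i] = TARGET[i]
    return out

# Diffusion-style decoding: all positions start masked and are refined
# in parallel over a few denoising steps, rather than left-to-right.
seq = [MASK] * len(TARGET)
steps = 0
while MASK in seq:
    seq = denoise_step(seq, confidence=0.5)
    steps += 1

print("".join(seq), "decoded in", steps, "denoising steps",
      "(an autoregressive model would need", len(TARGET), "steps)")
```

Because every masked position is filled in parallel, the number of denoising passes grows roughly logarithmically with sequence length here, which is the intuition behind the throughput gains the article reports.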
Cold Water on "World Models" Too? Eric Xing and Co-authors Expose Five Major Flaws and Propose a New Paradigm
机器之心· 2025-07-09 07:10
Machine Heart report. Editors: Zenan, +0

Today's world models deserve scrutiny.

We know that large language models (LLMs) produce output by predicting the next word of a conversation, and the resulting conversational, reasoning, and even creative abilities are approaching human-level intelligence. Yet models like ChatGPT still show a visible gap from true AGI. If we could perfectly simulate every possible future in an environment, could we create truly powerful AI? Consider humans: unlike ChatGPT, human competence is composed of distinct concrete skills and deep, complex capabilities.

Figure: a case of simulated reasoning, in which a (possibly self-interested) person helps someone crying by mentally simulating multiple possible outcomes.

Humans can perform a broad range of complex tasks, all on the same cognitive architecture of the human brain. Could an AI system likewise accomplish all of these tasks?

Paper: Critiques of World Models
Paper link: https://arxiv.org/abs/2507.05169

The researchers identify five key aspects of building and training world models: 1) identifying and preparing training data that contains information about the target world; 2) adopting a general representation space for latent world states, whose meaning may be richer than the directly observed data; 3) designing architectures that can reason effectively over those representations; 4) choosing objective functions that correctly guide model training; ...
Still Worried About AI Data? Zhang Wentao and Academician Weinan E's Team Releases a Data-centric AI System
机器之心· 2025-07-08 09:41
In recent years, large-model development has been led mainly by big tech companies, whose core advantage lies in massive, high-quality data resources. However, these companies generally do not release their raw data or data-processing tools, making it hard for academia to catch up on the construction and optimization of training data for large models.

Although many datasets have been open-sourced in recent years, academia still faces numerous challenges in data preparation for large models. Today, the cleaning and construction of training data largely depends on individual research teams working in isolation, with no systematic, efficient tool support. Existing data-processing tools such as Hadoop and Spark mostly offer operators geared toward traditional methods and have yet to effectively integrate intelligent operators built on the latest large language models (LLMs), providing limited support for building training data for advanced models.

To address this, Zhang Wentao and academician Weinan E's team proposed DataFlow, a data-centric AI system. It implements more than 100 data-governance operators based on rules, local large models, or large-model APIs, and builds 8 preset data-processing pipelines on top of them, covering mainstream data-governance needs such as cleaning, augmenting, and evaluating large-scale noisy data (e.g., PDF documents, plain text, low-quality Q&A data, crawled data); synthesizing strong-reasoning data with chains of thought; and RAG data extraction and synthesis. Users can flexibly compose existing operators and develop new operators ...
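The operator-and-pipeline pattern described above can be sketched generically. This is not DataFlow's actual API; the operator names and record schema are hypothetical, and the "LLM-based" operator is mocked with a simple heuristic where a real system would call a local model or a model API:

```python
from typing import Callable, Optional

Record = dict
Operator = Callable[[Record], Optional[Record]]  # returning None drops the record

def strip_whitespace(rec: Record) -> Optional[Record]:
    # Rule-based cleaning operator.
    rec["text"] = rec["text"].strip()
    return rec

def drop_short(min_len: int) -> Operator:
    # Rule-based filter operator, parameterized by a length threshold.
    def op(rec: Record) -> Optional[Record]:
        return rec if len(rec["text"]) >= min_len else None
    return op

def mock_llm_quality_score(rec: Record) -> Optional[Record]:
    # Stand-in for an LLM-based operator: scores lexical diversity.
    # A real pipeline would query a local LLM or an API here.
    rec["quality"] = min(1.0, len(set(rec["text"].split())) / 10)
    return rec

def run_pipeline(records, operators):
    # Apply operators in order; a None result filters the record out.
    for rec in records:
        for op in operators:
            rec = op(rec)
            if rec is None:
                break
        else:
            yield rec

raw = [{"text": "  short  "},
       {"text": "a longer crawled paragraph worth keeping  "}]
pipeline = [strip_whitespace, drop_short(10), mock_llm_quality_score]
cleaned = list(run_pipeline(raw, pipeline))
print(len(cleaned), "record(s) kept:", cleaned)
```

Composing rule-based and model-backed operators behind one uniform interface is what lets preset pipelines be mixed and extended, which is the design the paragraph attributes to DataFlow.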
U.S. Tech Giants Vie for Big Pentagon Contracts, Looking to AI for Revenue | Enterprise Services International Watch
Tai Mei Ti APP· 2025-07-08 03:43
Core Insights
- OpenAI signed a $200 million contract with the U.S. Department of Defense to provide AI tools for addressing critical national security challenges [2]
- Competition for government contracts in AI and cloud computing has intensified, with major tech companies vying for lucrative deals [2][3]
- The U.S. government is increasingly integrating AI into military operations, with significant investments planned for the coming years [10][12]

Government Contracts and Collaborations
- OpenAI's Department of Defense contract is part of a broader trend in which tech companies such as Palantir and Snowflake secure government contracts to expand their AI capabilities [2][3]
- Palantir has seen substantial revenue growth, with 60% of its income derived from government contracts, including a significant contract for Project Maven [2]
- Snowflake obtained a $1 billion provisional authorization from the Department of Defense, allowing all military branches to use its enhanced data capabilities [3]

Major Cloud Providers and AI Integration
- The Department of Defense awarded the $9 billion Joint Warfighting Cloud Capability (JWCC) contract to major cloud providers including Amazon, Google, Microsoft, and Oracle [4]
- Microsoft has been a key government partner, integrating OpenAI's GPT-4 model into various government agencies [4]
- Oracle also provides cloud services to the military, aiming to simplify cloud management and reduce costs [10]

Economic Implications of AI
- The economic benefits of AI are under scrutiny, with predictions that generative AI could add $7 trillion to global GDP over the next decade [7]
- Some experts argue, however, that AI's immediate economic impact may be overstated, since many tasks still require human intervention and expertise [8][9]

Shifts in Corporate Policies
- Major tech companies are revising their policies on military applications of AI, with OpenAI and Google removing restrictions on the use of their technologies for military purposes [11][12]
- This shift signals tech companies' deepening involvement in military operations and reflects the growing importance of AI to national security [12]