The latest efficient inference framework for search agents: 3x throughput, latency cut to 1/5, with no loss in answer quality | Nankai & UIUC research
量子位· 2025-05-29 01:08
Contributed by the SearchAgent-X team | QbitAI (official account QbitAI)

Search agents driven by large language models (LLMs) solve complex tasks by dynamically decomposing problems and interleaving "thinking" (reasoning) with "looking things up" (retrieval), and they have shown remarkable capability. AI keeps getting smarter, but if it responds slowly and runs inefficiently, it still struggles to meet our needs. Behind this deep interaction, however, lie significant efficiency pain points: when handling complex tasks, slow or inaccurate retrieval drags down the entire pipeline.

Researchers from Nankai University and the University of Illinois Urbana-Champaign dissected these efficiency bottlenecks and proposed an efficient inference framework called SearchAgent-X. In practice, SearchAgent-X achieves a 1.3x to 3.4x improvement in throughput and cuts latency to between 1/1.7 and 1/5 of the original, without sacrificing final answer quality.

Two key efficiency bottlenecks in search agents

The researchers found that the seemingly simple retrieval step hides two key factors that constrain efficiency.

Retrieval precision: a subtle balance, not "the higher the better." Intuitively, the more accurate the retrieval, the higher the quality of information the LLM receives, and so the higher the efficiency should be. In practice the relationship is non-monotonic: if precision is too low, the LLM needs more rounds of retrieval and reasoning to compensate, and total time increases; if precision is too high, retrieval itself consumes substantial compute and slows the whole system down. Research shows that system throughput, as approximate retrieval ...
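The precision-versus-latency tension described above can be reproduced with any approximate-nearest-neighbor index that exposes an accuracy knob. Below is a minimal sketch, not the SearchAgent-X implementation, assuming a faiss HNSW index over synthetic embeddings; the dimensions, corpus size, and efSearch values are illustrative assumptions, and the only point is that pushing retrieval recall higher costs steadily more per-query time.

```python
# Minimal sketch (not the SearchAgent-X code): sweep an approximate-retrieval
# accuracy knob and watch the latency side of the trade-off.
# Assumes faiss-cpu and numpy are installed; the corpus and queries are synthetic.
import time
import numpy as np
import faiss

d = 384                                    # embedding dimension (illustrative)
rng = np.random.default_rng(0)
corpus = rng.random((50_000, d), dtype="float32")
queries = rng.random((100, d), dtype="float32")

index = faiss.IndexHNSWFlat(d, 32)         # approximate HNSW index
index.add(corpus)

exact = faiss.IndexFlatL2(d)               # exact search, used only as a recall reference
exact.add(corpus)
_, gt = exact.search(queries, 10)

for ef in (8, 32, 128, 512):               # higher efSearch -> higher recall, more compute
    index.hnsw.efSearch = ef
    t0 = time.perf_counter()
    _, approx = index.search(queries, 10)
    per_query_ms = (time.perf_counter() - t0) / len(queries) * 1e3
    recall = np.mean([len(set(a) & set(g)) / 10 for a, g in zip(approx, gt)])
    print(f"efSearch={ef:4d}  recall@10={recall:.2f}  latency/query={per_query_ms:.2f} ms")
```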
XPeng Motors-W (09868): class-leading intelligent driver assistance, pricing beats expectations
Changjiang Securities· 2025-05-28 23:30
Investment Rating
- The investment rating for the company is "Buy" and is maintained [6].

Core Views
- On May 28, 2025, the company launched the MONA M03 MAX version, which includes two models: the 502 Long Range Max priced at 129,800 yuan and the 600 Ultra Long Range Max priced at 139,800 yuan. These models feature the full-version AI Tianji system and Turing driving assistance, achieving the strongest urban intelligent driving assistance capabilities in their class. The company is expected to accelerate sales due to a strong new vehicle cycle, channel transformation, and enhanced marketing systems. Financial performance is anticipated to improve continuously due to scale enhancement, cost reduction from platforms and technologies, and the expansion of software profitability models alongside ongoing international growth [2][4][9].

Summary by Sections

Event Description
- The MONA M03 MAX version was officially launched on May 28, 2025, featuring two models with prices of 129,800 yuan and 139,800 yuan, equipped with advanced AI systems and driving assistance technologies [4].

Sales and Financial Projections
- The expected delivery volume for Q2 2025 is between 102,000 and 108,000 units, representing year-on-year growth of 237.7% to 257.5%. Projected revenue for this period is between 17.5 billion and 18.7 billion yuan, reflecting a year-on-year increase of 115.7% to 130.5%. The company anticipates a strong new vehicle cycle with multiple new models set to launch, which is expected to enhance sales further [6][9].

Competitive Advantage
- The MONA M03 Max is the first in its class to feature dual Orin-X chips, providing a computing power of 508 TOPS, significantly surpassing competitors. The intelligent driving capabilities are designed to adapt to driver styles, allowing for seamless control transfer between the driver and the vehicle [9].

Future Outlook
- The company expects to achieve a single-quarter profit turnaround by Q4 2025, with overall positive cash flow for the year. The anticipated revenue for 2025 is projected to reach 99.1 billion yuan, corresponding to a price-to-sales ratio of 1.3X, indicating a significant improvement in financial performance as the company enters a new vehicle cycle [9].
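As a quick sanity check on the guidance quoted above, the delivery and revenue growth ranges should imply a consistent prior-year base; the short sketch below only does that arithmetic, and the implied Q2 2024 figures are derived here rather than quoted from the report.

```python
# Back-of-the-envelope check of the guidance quoted above (illustrative only;
# the implied prior-year bases are derived here, not stated in the report).
low_units, high_units = 102_000, 108_000
low_growth, high_growth = 2.377, 2.575               # +237.7% .. +257.5% year on year

base_low = low_units / (1 + low_growth)              # implied Q2 2024 deliveries
base_high = high_units / (1 + high_growth)
print(f"implied Q2 2024 deliveries: {base_low:,.0f} .. {base_high:,.0f}")

low_rev, high_rev = 17.5, 18.7                       # billion yuan
rev_growth_low, rev_growth_high = 1.157, 1.305       # +115.7% .. +130.5% year on year
print(f"implied Q2 2024 revenue: "
      f"{low_rev / (1 + rev_growth_low):.1f} .. "
      f"{high_rev / (1 + rev_growth_high):.1f} billion yuan")
```

Both ends of each range back out to essentially the same prior-year base (roughly 30,200 deliveries and about 8.1 billion yuan), so the quoted ranges are at least internally consistent.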
Three top AI technologists share a rare stage to discuss the biggest "Rashomon" in the AI industry
36Kr· 2025-05-28 11:59
Core Insights
- The AI industry is currently experiencing a significant debate over the effectiveness of pre-training models versus first principles, with notable figures like Ilya from OpenAI suggesting that pre-training has reached its limits [1][2]
- The shift from a consensus-driven approach to exploring non-consensus methods is evident, as companies and researchers seek innovative solutions in AI [6][7]

Group 1: Industry Trends
- The AI landscape is witnessing a transition from a focus on pre-training to exploring alternative methodologies, with companies like Sand.AI and NLP LAB leading the charge in applying multi-modal architectures to language and video models [3][4]
- The emergence of new models, such as Dream 7B, demonstrates the potential of applying diffusion models to language tasks, outperforming larger models like DeepSeek V3 [3][4]
- The consensus around pre-training is being challenged, with some experts arguing that it is not yet over, as there remains untapped data that could enhance model performance [38][39]

Group 2: Company Perspectives
- Alibaba's Qwen team, led by Lin Junyang, has faced criticism for being conservative, yet they emphasize that their extensive experimentation has led to valuable insights, ultimately reaffirming the effectiveness of the Transformer architecture [5][15]
- The exploration of Mixture of Experts (MoE) models is ongoing, with the team recognizing the potential for scalability while also addressing the challenges of training stability [16][20] (a minimal routing sketch follows after this summary)
- The industry is increasingly focused on optimizing model efficiency and effectiveness, with a particular interest in achieving a balance between model size and performance [19][22]

Group 3: Technical Innovations
- The integration of different model architectures, such as using diffusion models for language generation, reflects a broader trend of innovation in AI [3][4]
- The challenges of training models with long sequences and the need for effective optimization strategies are critical areas of focus for researchers [21][22]
- The potential for future breakthroughs lies in leveraging increased computational power to revisit previously unviable techniques, suggesting a cycle of innovation driven by advancements in hardware [40][41]
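For readers unfamiliar with the Mixture-of-Experts idea debated above, here is a minimal routing sketch: a small router scores experts per token, the top-k experts run, and their outputs are mixed. The layer sizes, expert count, and top-k value are illustrative assumptions and do not reflect Qwen's actual configuration.

```python
# Minimal Mixture-of-Experts (MoE) layer sketch: a router picks top-k experts per
# token and mixes their outputs. Dimensions and counts are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoE(nn.Module):
    def __init__(self, d_model=64, d_ff=256, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                        # x: (tokens, d_model)
        logits = self.router(x)                  # (tokens, n_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)     # normalize over the chosen experts only
        out = torch.zeros_like(x)
        for slot in range(self.top_k):           # send each token to its chosen experts
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

tokens = torch.randn(16, 64)
print(TinyMoE()(tokens).shape)                   # torch.Size([16, 64])
```

The loop over experts is written for clarity; production MoE kernels batch tokens per expert instead, which is where the training-stability and load-balancing challenges mentioned above come in.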
LLM + RL under fire: deliberately wrong rewards still significantly boost math benchmarks, and the AI community is in an uproar
机器之心· 2025-05-28 08:09
Report by 机器之心 | Editors: 泽南, +0

After all this training, what have we actually been training? This may be the "funniest" paper of the year: once it came out, every large language model (LLM) + reinforcement learning (RL) result had to be questioned for whether it means anything at all.

This Tuesday, a paper from the University of Washington, the Allen Institute for AI, and Berkeley set the AI community ablaze. The authors push back against the reinforcement learning recipes currently popular in the large-model field. They found that training Qwen2.5-Math-7B with spurious rewards can still improve its MATH-500 score: random rewards improve it by 21% and incorrect rewards by 25%, while ground-truth rewards improve it by 28.8%.

What is going on here? Do large-model training tricks actually work? The authors wrote a blog post explaining the work:

Questioning the conventional wisdom on reinforcement learning with verifiable rewards (RLVR)

Recently, reinforcement learning with verifiable rewards (RLVR) has become the standard method for enhancing the reasoning abilities of large language models (LLMs). The conventional view holds that high-quality supervision signals are essential for effective RLVR training. Recent work challenges this assumption, showing that RLVR training on a single sample, or on unsupervised samples, can still yield significant gains on Qwen-Math models. But we cannot help asking: where does the training signal in one-shot or unsupervised RLVR come from? To provide meaningful RLVR ...
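To make the comparison above concrete, the sketch below shows the three reward variants in the shape an RLVR trainer typically expects: a function of the model's completion and the reference answer. The function names and the boxed-answer extraction are assumptions for illustration and are not taken from the paper's code.

```python
# Illustrative reward variants for an RLVR-style trainer (assumed interface:
# reward(completion, reference) -> float). Not the paper's implementation.
import random
import re

def extract_answer(completion: str) -> str:
    """Pull the last \\boxed{...} answer out of a math completion, if any."""
    matches = re.findall(r"\\boxed\{([^}]*)\}", completion)
    return matches[-1].strip() if matches else ""

def ground_truth_reward(completion: str, reference: str) -> float:
    # "Real" verifiable reward: 1 only when the extracted answer matches the reference.
    return 1.0 if extract_answer(completion) == reference.strip() else 0.0

def random_reward(completion: str, reference: str) -> float:
    # Spurious reward: a coin flip, carrying no information about correctness.
    return float(random.random() < 0.5)

def incorrect_reward(completion: str, reference: str) -> float:
    # Spurious reward: rewards only answers that are present but wrong.
    ans = extract_answer(completion)
    return 1.0 if ans and ans != reference.strip() else 0.0

completion = "... so the result is \\boxed{42}."
for fn in (ground_truth_reward, random_reward, incorrect_reward):
    print(fn.__name__, fn(completion, "42"))
```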
Peking University publishes a new paper in Cell
生物世界· 2025-05-28 07:30
Written by 王聪 | Edited by 王多鱼 | Layout by 水成文

A universal strategy for precisely controlling protein activation inside living animals is essential for gain-of-function studies of protein function in the in vivo environment.

On May 27, 2025, the team of Peng Chen (陈鹏) at Peking University's College of Chemistry and Molecular Engineering and the Peking-Tsinghua Center for Life Sciences, in collaboration with the team of Chu Wang (王初), published a research paper in the top international journal Cell titled: Machine-learning-assisted universal protein activation in living mice.

With the assistance of artificial intelligence (AI), the study developed a proximity-activation strategy named CAGE-Prox vivo for activating proteins on demand and modulating protein-protein interactions in living animals, providing a universal platform for time-resolved biology studies and on-demand therapeutic intervention under in vivo conditions.

Gain-of-function studies of biomolecules with high selectivity and high temporal resolution are both advantageous and essential, helping to dissect diverse biological processes and reveal disease pathology. Extensive protein engineering research has been carried out to manipulate ...
Jeff Dean: AI will replace junior engineers within a year; netizens: "Altman only sells pie in the sky, but when Jeff says it, it's lethal"
AI前线· 2025-05-28 05:17
Authors | Tina, 核子可乐

Recently, legendary Google engineer Jeff Dean made a bold prediction in an interview: within a year, we will have AI systems that can run 24/7 and operate at the level of a "junior engineer."

Jeff Dean is a legendary figure in modern computing who has driven many of Google's breakthroughs in large-scale distributed systems and artificial intelligence. He is not only a co-founder of the Google Brain project, but also drove the creation of key systems such as MapReduce, Bigtable, Spanner, and TensorFlow. He has led Google AI since 2018, and became Google's Chief Scientist in 2023 after the merger of DeepMind and Google Brain. From contributing to the BERT paper and leading TPU development to driving the evolution of Google's foundational AI infrastructure, Dean has witnessed and taken part in nearly every key milestone of Google's AI development.

As one of the most influential figures in the tech world, Jeff Dean's remarks quickly sparked heated discussion across the industry. Although quite a few industry figures, including Sam Altman, have voiced similar views before, Jeff Dean's words clearly carry a different weight. As one netizen put it: compared with Sam Altman, who is always "selling" some concept, Je ...
Tencent AI: six months of full-throttle acceleration
雷峰网· 2025-05-27 13:15
" 从团队重构到业务狂飙,腾讯AI 驶入快车道。 " 作者丨胡敏 编辑丨周蕾 "不宜操之过急,还是要修炼好内功。" 去年,各家都在大张旗鼓地讲AI故事,然而腾讯却保持一贯的低调。这种状态,也让各种质疑声音接踵而 来,市场不乏有"腾讯可能会在这一波AI浪潮中掉队"的观点。 但这种预判并未维持多久,到今年,从腾讯元宝、QQ浏览器、ima并入CSIG事业部,再到腾讯混元大模 型相关团队的组织架构变革后,似乎腾讯AI的产业落地开始走上了快车道。 "今年上半年腾讯的AI战略落地速度远超我的预期。"一名二级市场分析师曾对雷峰网说道,今年第一季度 腾讯的资本开支达274.8亿元,同比增长91%。这预示着腾讯正在加速集聚资源、排兵布阵投入AI攻坚 战。 5月21日,腾讯云在北京举办了AI产业峰会,会上,腾讯集团高级执行副总裁、云与智慧产业事业群CEO 汤道生,腾讯云副总裁、腾讯混元大模型技术负责人王迪,以及腾讯云副总裁、腾讯云智能负责人、优图 实验室负责人吴运声几位AI落地的核心人物都出现在了现场,并且透露了不少有关腾讯上半年AI落地的进 度。 腾讯集团高级执行副总裁、云与智慧产业事业群CEO 汤道生 这场峰会让人进一步描摹 ...
If AI can create enough resources, is work still necessary?
Hu Xiu· 2025-05-27 06:32
Core Viewpoint
- The research indicates that AI language models primarily impact white-collar jobs, particularly those that are procedural and highly regulated, while blue-collar jobs that focus on human services are less affected [2][62].

Group 1: AI Exposure and Job Impact
- The study utilized AI exposure metrics to analyze 1.25 million job postings from January 2018 to May 2024, calculating the AI language model exposure for various occupations [1][41].
- High AI exposure jobs are predominantly white-collar positions, such as accounting, auditing, editing, sales engineering, and computer programming, while low exposure jobs include cleaners, movers, restaurant cooks, and dishwashers [58][62].
- A negative correlation was observed between AI exposure and the number of new job openings, indicating that as AI exposure increases, the number of new positions decreases [63][71].

Group 2: Labor Market Trends
- The average AI language model exposure in the Chinese labor market has shown a declining trend from January 2018 to May 2024, suggesting a reduction in AI-related job postings [66][68].
- The research highlights that while AI exposure is increasing in certain sectors, the overall labor market appears stable, masking underlying shifts in employment dynamics [36][37].

Group 3: Skills Demand Changes
- The demand for certain skills is changing, with a decline in the need for communication skills due to the proficiency of AI language models in this area [75][76].
- Conversely, there is an increasing demand for skills such as professionalism, management ability, self-motivation, problem-solving, and collaboration, as AI serves as an assistant rather than a complete replacement [78][81].

Group 4: Future Work Landscape
- The impact of technological advancement on jobs is characterized by a "double-edged sword" effect, where some jobs are destroyed while new ones are created [85][86].
- The labor market is experiencing polarization, with increasing demand for both high-skill and low-skill jobs, while middle-skill jobs are declining [103][104].
- The trend towards task-based work is also emerging, where simple, repetitive tasks are in high demand due to the limitations of automation [106][109].
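The exposure-versus-postings relationship summarized in Group 1 above boils down to a correlation between an occupation-level exposure score and the change in new postings. The sketch below shows that computation on toy data; the column names and numbers are assumptions, and the study's own exposure metric and 1.25-million-posting corpus are not reproduced here.

```python
# Toy illustration of the analysis shape described above: relate occupation-level
# AI-exposure scores to the change in new job postings. Data and columns are invented.
import pandas as pd

df = pd.DataFrame({
    "occupation": ["accountant", "editor", "programmer", "cleaner", "cook"],
    "ai_exposure": [0.82, 0.78, 0.74, 0.08, 0.11],        # share of tasks an LLM could handle
    "postings_2018": [12000, 4000, 25000, 9000, 15000],
    "postings_2024": [8000, 2500, 21000, 9500, 16000],
})

df["posting_change"] = (df["postings_2024"] - df["postings_2018"]) / df["postings_2018"]
corr = df["ai_exposure"].corr(df["posting_change"])        # Pearson correlation
print(df[["occupation", "ai_exposure", "posting_change"]])
print(f"exposure vs. posting-change correlation: {corr:.2f}")
```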
Formal proofs and large models: co-creating a verifiable future for AI mathematics | QbitAI livestream
量子位· 2025-05-27 03:53
Core Viewpoint
- The article discusses the advancements in AI's ability to solve mathematical problems, highlighting the competitive landscape among various teams and projects in this domain [1][2].

Group 1: AI Developments
- Recent releases such as DeepSeek Prover V2, Terence Tao's AI math livestream, and Google's AlphaEvolve indicate significant progress in AI's mathematical capabilities [1].
- The FormalMATH benchmark test has gained attention for evaluating AI's performance in automated theorem proving [2].

Group 2: Upcoming Events
- A livestream event is scheduled for May 29 at 20:00, featuring discussions on the frontier exploration of formal proofs by large language models, with participation from various project teams [2][4].
- Notable speakers include researchers and experts from institutions such as the University of Edinburgh and The Chinese University of Hong Kong, as well as contributors from the 2077AI initiative [3][4].

Group 3: Community Engagement
- The article encourages community interaction through comments and participation in AI discussions, promoting a collaborative environment for sharing insights and developments in AI [4][5].
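For context on what benchmarks like FormalMATH ask a model to produce, here is a minimal machine-checkable statement and proof in Lean 4 (assuming Mathlib is available); the theorem itself is illustrative and is not drawn from the benchmark.

```lean
import Mathlib

-- A minimal example of a formally verified statement: the sum of two squares of
-- real numbers is nonnegative. Automated theorem proving benchmarks ask models
-- to produce proof terms or tactic scripts like this that the Lean kernel checks.
theorem sq_add_sq_nonneg (a b : ℝ) : 0 ≤ a ^ 2 + b ^ 2 :=
  add_nonneg (sq_nonneg a) (sq_nonneg b)
```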
9x inference speedup for diffusion language models! Shanghai Jiao Tong University: KV Cache is not a trick exclusive to autoregressive models
量子位· 2025-05-27 03:53
Contributed by the EPIC Lab team | QbitAI (official account QbitAI)

This is the first training-free method for accelerating the inference of diffusion-based Large Language Models (dLLMs). The EPIC Lab team at Shanghai Jiao Tong University proposes dLLM-Cache, an efficient, training-free, plug-and-play inference caching mechanism. Its core idea is that, within a multi-step denoising process, features that change little between adjacent timesteps are reused and only the features that change substantially are recomputed, greatly reducing computation while preserving the original generation quality.

Figure 1: speed and quality comparison of different dLLMs with and without dLLM-Cache.

dLLM-Cache has several important highlights:

1. Training-free and plug-and-play. dLLM-Cache works entirely at inference time, with no modification of model parameters and no retraining. It delivers up to a 9.1x inference speedup with no loss in output quality.

2. General across mainstream dLLM architectures, such as LLaDA and Dream, as well as multimodal models like LLaDA-V, MMaDA, and Dimple.

3. During inference, it is the first to identify that the Transformer intermediate-layer features of the prompt portion (Key, ...
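The reuse-or-recompute decision described above can be illustrated in a few lines. The sketch below is not the dLLM-Cache code: the drift threshold, the cosine-change criterion, and the single linear layer standing in for a Transformer sub-layer are all assumptions made for illustration.

```python
# Toy sketch of adaptive feature caching across denoising steps (not dLLM-Cache itself):
# reuse cached sub-layer outputs for tokens whose inputs barely changed, recompute the rest.
import torch

torch.set_grad_enabled(False)               # inference-only demo
torch.manual_seed(0)
proj = torch.nn.Linear(64, 64)              # stand-in for one Transformer sub-layer

def cached_step(hidden, prev_hidden, cache, threshold=0.02):
    """Recompute outputs only for tokens whose inputs drifted; reuse the cache elsewhere."""
    if cache is None:
        return proj(hidden)
    drift = 1 - torch.cosine_similarity(hidden, prev_hidden, dim=-1)   # per-token input change
    refresh = drift >= threshold
    out = cache.clone()
    if refresh.any():
        out[refresh] = proj(hidden[refresh])                           # recompute only these tokens
    print(f"reused {int((~refresh).sum())}/{refresh.numel()} token features")
    return out

cache, prev_hidden = None, None
hidden = torch.randn(16, 64)                # 16 tokens, 64-dim hidden states
for step in range(4):                       # toy multi-step denoising loop
    cache = cached_step(hidden, prev_hidden, cache)
    prev_hidden = hidden
    hidden = hidden + 0.01 * torch.randn_like(hidden)   # small per-step change
```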