Language Models
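Ninefold Inference Speedup for Diffusion Language Models! Shanghai Jiao Tong University: KV Cache Is Not Exclusive to Autoregressive Models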
QbitAI (量子位)· 2025-05-27 03:53
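Submitted by the EPIC Lab team
QbitAI | WeChat official account QbitAI

The EPIC Lab team at Shanghai Jiao Tong University has proposed dLLM-Cache, a training-free, plug-and-play caching mechanism for efficient inference. It is the first training-free method for accelerating inference in diffusion-based large language models (dLLMs).

Its core idea is that, over a multi-step denoising process, features that change little between adjacent timesteps are reused and only features that change substantially are updated, which greatly reduces computation while preserving the original generation quality.

Figure 1: Speed and quality of different dLLMs with and without dLLM-Cache

dLLM-Cache has several important highlights:
1. Training-free and plug-and-play. dLLM-Cache works entirely at inference time, with no need to modify model parameters or retrain. It can deliver up to a 9.1x inference speedup with no loss in model output quality.
2. Applicable to mainstream dLLM architectures, such as LLaDA and Dream, as well as multimodal models such as LLaDA-V, MMaDA, and Dimple.
3. It is the first to identify that, during inference, the intermediate Transformer-layer features of the prompt portion (Key, ...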
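The reuse-or-update idea described above can be illustrated with a toy sketch. This is not the dLLM-Cache implementation: the class name `FeatureCache`, the cosine-similarity test, and the `threshold` parameter are assumptions made for illustration, and the "expensive" function stands in for a Transformer layer. The sketch only shows the general pattern of recomputing a per-token feature when its input has drifted and reusing the cached feature otherwise.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class FeatureCache:
    """Hypothetical per-token cache: recompute only when the input has drifted."""

    def __init__(self, expensive_fn, threshold=0.99):
        self.expensive_fn = expensive_fn  # stand-in for a costly layer computation
        self.threshold = threshold        # assumed similarity cutoff for reuse
        self.inputs = {}                  # position -> input used for the cached output
        self.outputs = {}                 # position -> cached output
        self.recomputed = 0               # number of expensive calls made so far

    def step(self, token_inputs):
        """Process one denoising step and return one feature per token position."""
        results = []
        for pos, x in enumerate(token_inputs):
            cached = self.inputs.get(pos)
            if cached is None or cosine(cached, x) < self.threshold:
                # Input changed substantially (or was never seen): update the cache.
                self.inputs[pos] = list(x)
                self.outputs[pos] = self.expensive_fn(x)
                self.recomputed += 1
            # Otherwise the slightly stale cached feature is reused as-is.
            results.append(self.outputs[pos])
        return results


if __name__ == "__main__":
    cache = FeatureCache(lambda x: [2 * v for v in x], threshold=0.99)
    cache.step([[1.0, 0.0], [0.0, 1.0]])   # first step: everything is computed
    cache.step([[1.0, 0.01], [1.0, 0.0]])  # second step: only the changed token is recomputed
    print("expensive calls:", cache.recomputed)
```

The trade-off in a scheme like this is that reused features are slightly stale, so the threshold controls how much approximation is accepted in exchange for skipped computation.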
Red Hat Announces the llm-d Community, with NVIDIA and Google Cloud as Founding Contributors
Xin Lang Ke Ji· 2025-05-27 03:42
Group 1
- Red Hat has launched a new open-source project called llm-d to meet the large-scale inference demands of generative AI, collaborating with CoreWeave, Google Cloud, IBM Research, and NVIDIA [1][3]
- According to Gartner, by 2028, over 80% of data center workload accelerators will be deployed specifically for inference rather than training, indicating a shift in resource allocation [3]
- The llm-d project aims to integrate advanced inference capabilities into existing enterprise IT infrastructure, addressing the challenges posed by increasing resource demands and potential bottlenecks in AI innovation [3]

Group 2
- The llm-d platform allows IT teams to meet various service demands for critical business workloads while maximizing efficiency and significantly reducing the total cost of ownership associated with high-performance AI accelerators [3]
- The project has garnered support from a coalition of generative AI model providers, AI accelerator pioneers, and major AI cloud platforms, indicating deep collaboration within the industry to build large-scale LLM services [3]
- Key contributors to the llm-d project include CoreWeave, Google Cloud, IBM Research, and NVIDIA, with partners such as AMD, Cisco, Hugging Face, Intel, Lambda, and Mistral AI [3][4]

Group 3
- Google Cloud emphasizes the importance of efficient AI inference in the large-scale deployment of AI to create value for users, highlighting its role as a founding contributor to the llm-d project [4]
- NVIDIA views the llm-d project as a significant addition to the open-source AI ecosystem, supporting scalable and high-performance inference as a key to the next wave of generative and agent-based AI [4]
- NVIDIA is collaborating with Red Hat and other partners to promote community engagement and industry adoption of the llm-d initiative, leveraging innovations like NIXL to accelerate its development [4]
OpenAI Model Defies Human Instructions; Xiaomi Denies Custom Chip Claim; AITO Responds to Yu Chengdong Apparently Sleeping at the Wheel
Guan Cha Zhe Wang· 2025-05-27 01:03
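Meituan CEO Wang Xing: We will continue to increase investment in developing large language models

[Guancha Finance | Smart Morning Briefing, May 27]

OpenAI model defies human instructions, tampers with code to avoid shutdown
On May 25 local time, the UK's Daily Telegraph reported that o3, a new artificial intelligence (AI) model from OpenAI, disobeyed human instructions and refused to shut itself down. According to the report, human experts gave o3 explicit instructions during testing, but o3 tampered with computer code to avoid automatic shutdown. Palisade Research published these test results on May 24 but said it could not determine why o3 disobeyed the shutdown instruction. (Xinhua News Agency)

Arm reissues press release, revising its earlier "Custom Silicon" description
On May 26, Arm republished a press release on its official website, revising the earlier "Custom Silicon" description and confirming that the XRING O1 (玄戒O1) was developed independently by Xiaomi. Arm said that Xiaomi's new self-developed chip uses the Arm architecture, marking a milestone in 15 years of cooperation between the two companies. The XRING O1 chip was built by Xiaomi's XRING chip team and uses the latest Armv9.2 Cortex CPU cluster IP, Immortalis GPU IP, and CoreLink system interconnect IP, with full support for an advanced 3nm process.

Xiaomi refutes "custom chip" rumor
Asked about online claims that the XRING O1 is a chip custom-ordered from Arm, Xiaomi said on May 26: No. This is purely a rumor; the XRING O1 is not ...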
Meituan 20250526
2025-05-26 15:17
How has Meituan diversified its product offerings and optimized its delivery network over the years? Since launching our food delivery business more than 11 years ago, we have continuously diversified our product offerings by broadening our price bands and optimizing our 30-minute delivery network. These efforts have enabled us to provide better services and choices to hundreds of millions of consumers while supporting millions of merchants, particularly small and medium-sized ones, in reaching new customer ...
"Scientific Intelligence White Paper 2025" Released; China Leads in Applied AI Innovation
Di Yi Cai Jing· 2025-05-26 13:27
Core Insights
- By 2024, China's AI-related paper citation volume is expected to account for 40.2% of the global total, rapidly catching up to the United States at 42.9% [1][8]
- The report titled "Scientific Intelligence White Paper 2025" analyzes the integration of AI and scientific research across seven major research fields, covering 28 directions and nearly 90 key issues [1]
- The report highlights the dual promotion and deep integration of AI innovation and scientific research, termed "AI for Science" [1]

Research Trends
- The number of global AI journal papers has surged nearly threefold over the past decade, from 308,900 to 954,500, with an average annual growth rate of 14% [7]
- The share of core AI fields, such as algorithms and machine learning, has decreased from 44% to 38%, while the share of scientific intelligence has increased by 6 percentage points, with an annual growth rate rising from 10% before 2020 to 19% after [7]
- China's AI publication volume increased from 60,100 in 2015 to 300,400 in 2024, representing 29% of the global total [7][8]

Citation Impact
- The citation volume of AI-related papers in the U.S. reached 302,200 in 2020, while China's citations rose from 10,300 in 2015 to 144,800 in 2020, surpassing the EU for the first time in 2021 [8]
- By 2024, China is projected to account for 41.6% of global AI citations in patents, policy documents, and clinical trials, significantly leading the field [8]

Country-Specific Trends
- China has a leading position in the intersection of AI with earth and environmental sciences, and has surpassed in AI with mathematics, material sciences, and humanities since 2019 [9]
- The U.S. and EU maintain advantages in AI and life sciences, with China ranking third in this area [9]
- India shows significant progress across all fields, currently ranking third in earth and environmental sciences, engineering, and humanities [9]
Meituan CEO Wang Xing: We Will Continue to Increase Investment in Developing Large Language Models
news flash· 2025-05-26 13:13
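Zhitong Finance, May 26: On today's earnings call, Meituan CEO Wang Xing said that about 52% of the company's new code is now generated by AI, that more than 90% of engineering team members widely use AI coding tools, and that the company will continue to increase investment in developing large language models. According to Wang, Meituan is allocating resources to infrastructure and is also recruiting top AI talent to "ensure we have the best team in China in this area." ...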
The Truth Behind the Collapse of Apple's AI: From Jobs's Vision to a Predicament of Executive Missteps
36Kr (36氪)· 2025-05-26 12:53
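Source: GeekPark (极客公园, ID: geekpark) | By Moonshot | Editor: Jingyu | Cover image: Unsplash

Apple, always mindful of its public image, has been laid bare this time because its AI has fallen short.

The biggest giant seems to have gone invisible in the face of the hottest trend. At WWDC last June, Apple sluggishly announced Apple Intelligence, yet nearly a year later, for most users Apple Intelligence is still something heard about but never seen.

The whole world can see that Apple's AI is not working, but no one knows what actually happened.

Well-known Apple analyst Mark Gurman has just published a long piece in the foreign press titled "Why Apple Still Hasn't Cracked AI," revealing Apple's wavering internal attitude toward AI, its internal power struggles, and technical bottlenecks that have proven hard to overcome.

Notably, Gurman's choice of "Still hasn't" already sets the tone for Apple's current situation.

This article reorganizes the original piece to present Apple's history, current state, root problems, and future challenges in AI, and analyzes why Apple is struggling in the AI race, leaving AI ...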
Nine Top Researchers Speak Over Three Nights: Unveiling the Foundational Research Behind Huawei's Pangu Large Models
Synced (机器之心)· 2025-05-26 10:59
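In recent years, large language models (LLMs) have advanced rapidly in natural language processing, code generation, multimodal understanding, and other areas, becoming an important cornerstone of general-purpose AI systems.

However, gains in model capability have come with sharply rising demands for compute and storage, and achieving high performance and high efficiency together has become a major challenge for AI.

As a pioneer in the AI field, Huawei Noah's Ark Lab is answering with frontier research.

In April this year, the team developed Pangu Ultra, a hundred-billion-scale general-purpose large language model trained on Ascend compute. It outperforms earlier dense models such as Llama 405B and Mistral Large 2 across multiple domains and evaluations, and can compete with larger sparse models such as DeepSeek-R1.

In early May, the team released the sparse large language model Pangu Ultra MoE and achieved long-term stable training of an MoE model on more than 6,000 Ascend NPUs.

Want to learn more about the team's technical expertise and research results in large models?

From May 28 to 30, from 19:00 to 21:00 each evening, Synced and Huawei Noah's Ark Lab will host a series of talks presenting the latest breakthroughs in several key technical directions, including quantization, pruning, MoE architecture optimization, and KV optimization.

Three nights in a row, packed with substance, worth every ...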
Joe Tsai: Most Robots Don't Need to Look Human; for Young People, Choosing a Boss Matters More Than Choosing a Job
Sou Hu Cai Jing· 2025-05-26 03:36
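Source: Lieyunwang (猎云网)

The fifth BEYOND International Technology Innovation Expo (BEYOND Expo 2025) was held from May 21 to 24.

On May 24, Alibaba Group Chairman Joe Tsai appeared at the closing ceremony and mentioned that Alibaba had made some adjustments to its organizational structure.

Tsai said Alibaba will focus on several core businesses: first, e-commerce; second, cloud computing; and third, ensuring that artificial intelligence permeates every aspect of the business, both customer-facing and internal.

Tsai also shared his views on employment for young people. He believes young people should work because they want to gain more skills and knowledge, and that this is the meaning of work.

He also said that combining robotics with artificial intelligence brings very exciting things to mind. For example, a robot could make you coffee, or come to your home and clean the floor.

But he also believes that most intelligent robots in the world do not need to look human.

He gave an example: if you want a robot to clean your carpet, or to come home and tidy your kitchen or living room, do you really want something that looks human? I would be scared, he said. I just want something that looks like a vacuum cleaner and can intelligently clean the room.

"When we talk about robots, we always think of the movies we watched as children. They all look like people, but they are obviously not people. Now, are we working toward machines that are exactly like humans? I think this is actually a technology. There are many ...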
The Fig Leaf Is Pulled Off Intelligent Driving
Hu Xiu· 2025-05-26 02:47
Core Insights
- The automotive industry is transitioning towards more advanced autonomous driving technologies, moving beyond the simplistic "end-to-end" models that have been prevalent [2][3][25]
- Companies are exploring new architectures and models, such as VLA and world models, to address the limitations of current systems and enhance safety and reliability in autonomous driving [4][14][25]

Group 1: Industry Trends
- Major players like Huawei, Li Auto, and Xpeng are developing unique software architectures to improve autonomous driving capabilities, indicating a shift towards more complex systems [4][5][14]
- The introduction of new terminologies and models reflects a diversification in approaches to autonomous driving, with no clear standard emerging [4][25]
- The industry is witnessing a split in technological pathways, with some companies focusing on L3 capabilities while others remain at L2, leading to a potential widening of the technology gap [25][26]

Group 2: Data Challenges
- The demand for high-quality data is critical for training large models in the new phase of autonomous driving, but companies face challenges in acquiring and annotating sufficient real-world data [15][22]
- Companies are increasingly turning to simulation and AI-generated data to overcome data scarcity, with some suggesting that simulated data may become more important than real-world data in the future [22][23]

Group 3: Competitive Landscape
- The competition is intensifying as companies with self-developed capabilities advance towards more complex technologies, while others may rely on suppliers, leading to a concentration of orders among a few capable suppliers [26][27]
- The shift towards L3 capabilities will require companies to focus not only on technology but also on operational aspects, as the responsibility for safety and maintenance will shift from users to manufacturers [25][26]