Large Language Models

How Can Large-Model Agents Break Through the Bottleneck of Large-Scale Application? The Key Is Agentic ROI
机器之心· 2025-05-30 04:16
Core Viewpoint
- The main barrier to the usability of large language model agents (LLM Agents) is not model capability but "Agentic ROI," which has not yet reached the threshold for widespread practical application [1][3][4].

Group 1: Agentic ROI Concept
- Agentic ROI (Agentic Return on Investment) is a key metric measuring the ratio of "information yield" to "usage cost" for LLM Agents in real-world scenarios [4].
- Usability is achieved only when information quality exceeds a certain threshold and the ratio of time and cost saved by the agent is sufficiently high [4][5].

Group 2: Current Application Landscape
- Most LLM Agents are currently applied in scenarios with high human task-time cost, such as research and programming, where human labor is intensive and efficiency gains are therefore large [7].
- In everyday, high-demand applications such as e-commerce and personal assistants, tasks are simpler, so the marginal value of LLM Agents is lower; they may even add interaction costs and delays, resulting in low Agentic ROI [7].

Group 3: Development Trajectory
- The development path of LLM Agents follows a "zigzag" pattern: first scaling up to raise information quality, then scaling down to cut time and cost while maintaining quality [9].
- The evolution of foundation models such as the OpenAI series illustrates this zigzag trend: larger models bring significant performance gains, while subsequent smaller models maintain performance at lower inference cost and latency [9].

Group 4: Scaling Up Information Quality
- Pre-training scaling expands model size, data volume, and compute to strengthen foundational language understanding and reasoning [11].
- Post-training scaling, including supervised fine-tuning and reinforcement learning, aligns the agent with human needs and values, relying on extensive interaction data for continuous learning [12].
- Test-time scaling focuses on building a world model that supports multimodal interaction and can handle complex tasks while reflecting real-world uncertainty [13].

Group 5: Ensuring Robustness and Security
- Robustness and security are crucial for information quality: preventing exploitation of reward mechanisms and safeguarding against data contamination and feedback manipulation [16].

Group 6: Scaling Down to Reduce Time and Cost
- Memory mechanisms let agents skip redundant computation by reusing past knowledge, improving processing speed [18].
- Model compression can significantly reduce compute requirements and inference latency without compromising performance [18].
- Optimized reasoning strategies and infrastructure further improve the efficiency and responsiveness of LLM Agents [18].

Group 7: Cost Management
- Reducing interaction time by having agents proactively infer user intent lowers cognitive burden and improves user experience [19].
- Managing operational costs is essential in large-scale deployments, through optimized context management and controlled inference complexity [19].
- Agentic ROI provides a framework for evaluating the real usability of LLM Agents, shifting focus from raw model performance to practical benefit and end-to-end efficiency [19].
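The Agentic ROI metric described above can be sketched as a toy calculation. This is a minimal illustration under assumed definitions: the function name, arguments, and the specific ratio (quality-weighted time saved over monetary cost) are this sketch's assumptions, not the paper's exact formula.

```python
def agentic_roi(info_quality: float, human_task_time: float,
                agent_task_time: float, interaction_time: float,
                expense: float) -> float:
    """Toy Agentic ROI: quality-weighted time saved per unit of cost.

    info_quality     -- 0..1 score for the information the agent delivers
    human_task_time  -- minutes a human would need without the agent
    agent_task_time  -- minutes the agent runs
    interaction_time -- minutes the user spends prompting and verifying
    expense          -- monetary cost of the agent run (assumed unit)
    """
    time_saved = human_task_time - (agent_task_time + interaction_time)
    if time_saved <= 0:
        return 0.0  # the agent costs more time than it saves
    return info_quality * time_saved / max(expense, 1e-9)

# Research task: hours of human work replaced, so ROI is high.
research = agentic_roi(0.9, 120.0, 5.0, 3.0, 1.0)
# Simple shopping query: interaction overhead eats the saving, ROI is zero.
shopping = agentic_roi(0.7, 3.0, 1.0, 2.0, 0.1)
print(research, shopping)
```

The contrast mirrors the article's point: agents shine where human task time is large (research, programming) and struggle where the task was already quick.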
After Google, NVIDIA Enters Diffusion LLMs: Fast-dLLM Pushes Inference Speed Up 27.6×
机器之心· 2025-05-30 03:28
Core Viewpoint
- The article discusses a breakthrough in inference speed for diffusion language models: Fast-dLLM, a training-free acceleration approach that enhances the practical applicability of large language models (LLMs) [2][20].

Group 1: Core Technology
- Fast-dLLM employs a Block-Wise KV Cache mechanism, achieving over 90% activation reuse and significantly improving computational efficiency for long-sequence inference [6][12].
- The Confidence-Aware Parallel Decoding method decodes multiple tokens in parallel while preserving token dependencies, filtering tokens by confidence to keep generation coherent [9][13].
- A dual-cache strategy caches both prefix and suffix attention activations simultaneously, reducing redundant computation and improving performance [12].

Group 2: Performance Breakthrough
- Fast-dLLM achieves a 27.6× end-to-end speedup on long-text generation tasks, reducing single-step latency from 0.26 s to 0.09 s and overall time from 266 s to 12 s [18].
- Accuracy loss on mainstream benchmarks stays under 2%, demonstrating that quality is maintained while speed improves [19].

Group 3: Application Value
- Fast-dLLM's zero training cost makes it an ideal inference-optimization tool, allowing quick integration into existing systems without altering model architecture or training processes [20].
- It is compatible with various existing models such as LLaDA and Dream, delivering significant throughput gains at competitive accuracy [21].

Group 4: Summary and Outlook
- Fast-dLLM represents a significant advance in diffusion-model inference efficiency while ensuring stable generation quality, paving the way for broader use in real-time interaction and long-text generation [23].
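The confidence-aware parallel decoding idea above can be illustrated with a small sketch. This is not Fast-dLLM's actual implementation: the function names, the 0.9 threshold, and the toy predictor are assumptions; the sketch only shows the commit-above-threshold loop the summary describes.

```python
import numpy as np

def confidence_parallel_decode(predict_fn, seq, mask_id, threshold=0.9,
                               max_steps=50):
    """Sketch: each step predicts all masked positions at once and commits
    only tokens whose confidence clears the threshold; the rest stay
    masked for the next iteration."""
    seq = list(seq)
    for _ in range(max_steps):
        masked = [i for i, t in enumerate(seq) if t == mask_id]
        if not masked:
            break  # fully decoded
        probs = predict_fn(seq)  # (len(seq), vocab) per-position confidences
        committed = False
        for i in masked:
            tok = int(np.argmax(probs[i]))
            if probs[i][tok] >= threshold:
                seq[i] = tok
                committed = True
        if not committed:
            # guarantee progress: commit the single most confident token
            i = max(masked, key=lambda j: float(probs[j].max()))
            seq[i] = int(np.argmax(probs[i]))
    return seq

# Toy predictor (assumption): 97% confident position i should be token i%3+1.
def toy_predict(seq):
    out = np.full((len(seq), 4), 0.01)
    for i in range(len(seq)):
        out[i, i % 3 + 1] = 0.97
    return out

print(confidence_parallel_decode(toy_predict, [0, 0, 0, 0], mask_id=0))
```

With a confident predictor, all four positions commit in a single parallel step; low-confidence positions would instead roll over to later iterations, which is how dependency-sensitive tokens stay coherent.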
2025 International Humanoid Robot Skills Competition Held; Industry Calls for Rational Tolerance of Robotics' "Growing Pains"
Zheng Quan Shi Bao Wang· 2025-05-29 14:07
Themed "Embodied Intelligence, the Future Is Here," the 2025 Zhangjiang Embodied Intelligence Developers Conference and 2025 International Humanoid Robot Skills Competition were held on May 29 in Zhangjiang, Pudong, Shanghai. The competition comprises 5 major tracks covering 28 high-difficulty scenarios, contested by more than 60 top teams and participants from home and abroad. It aims to showcase humanoid robots' ability to solve real problems and to land in real application scenarios, helping the robotics industry advance toward machines that "can see, can speak, and can think."

Jiang Lei, chief scientist of the National and Local Co-Built Humanoid Robot Innovation Center, told a Securities Times reporter that Shanghai focuses on practical applications of humanoid robots in manufacturing and service scenarios, stressing "solving real problems." The competition sets tasks in real scenarios such as supermarkets, pharmacies, and industrial manufacturing, using scenarios to drive the industry back toward practicality. Jiang said the public should view the current state of the robotics industry with tolerance: the industry has been developing rapidly for only about three years, like a "three-year-old child," and needs more time to grow.

Contestants: view the industry's pace of development rationally

That day, more than 60 top teams and participants from home and abroad, including Shanghai Kepler's K2 "Bumblebee" team, Beijing Institute of Technology's 急行智学 team, and Tsinghua University's Zijing team, competed fiercely across 9 venues simultaneously.

The reporter learned that all events in this competition stem from real enterprise needs, with each track reproducing a real application scenario, so the overall difficulty is high. As a result, robots failing to complete a task ...
Linear-MoE: An Open-Source Practice Where Linear Attention Meets Mixture-of-Experts
机器之心· 2025-05-29 11:38
The rise of linear sequence modeling

In recent years, with the boom in large language models, efficient architectures intended to replace the Transformer, together with their pre-training, have become a research hotspot in the large-model field. Work falls mainly into two parts: linear sequence modeling (such as Linear Attention, SSM, and Linear RNN) and Mixture-of-Experts (MoE). Each has made substantial progress on its own, but their combination has rarely been studied, and an open-source implementation of a combined Linear-MoE architecture has been entirely missing.

Notably, the recently well-received MiniMax-01 model (using Lightning Attention with MoE) and Tencent's Hunyuan TurboS model (using Mamba2 with MoE) both belong to the Linear-MoE family.

Linear-MoE, the latest result from a Shanghai AI Laboratory team, is the first systematic, efficient combination of linear sequence modeling with MoE. The team has open-sourced a complete technical framework covering both Modeling and Training, with support for hybrid inter-layer architectures, providing valuable tools and experience for developing the next generation of foundation-model architectures.

Over the past two years, linear sequence modeling has advanced markedly; its core advantages are linear-time training and constant-memory inference. These models fall into three main categories: linear ...
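The linear-time training and constant-memory inference mentioned above follow from reassociating the attention product. A minimal sketch, assuming a simple positive feature map (real linear-attention variants differ in kernel choice and add gating or decay):

```python
import numpy as np

def linear_attention(Q, K, V, eps=1e-6):
    """O(n * d * d_v) attention: phi(Q) @ (phi(K).T @ V) instead of the
    O(n^2) softmax(Q K^T) V. The (d, d_v) summary phi(K).T @ V has
    constant size regardless of sequence length."""
    phi = lambda x: np.maximum(x, 0.0) + 1e-3  # positive feature map (assumed)
    Qf, Kf = phi(Q), phi(K)
    kv = Kf.T @ V               # (d, d_v) running summary of keys and values
    z = Kf.sum(axis=0)          # (d,) normalizer accumulator
    return (Qf @ kv) / (Qf @ z + eps)[:, None]

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(8, 4)) for _ in range(3))
out = linear_attention(Q, K, V)
print(out.shape)  # (8, 4)
```

Because matrix multiplication is associative, this equals the quadratic form phi(Q) phi(K)^T V with row normalization, but the key/value summary can be updated incrementally token by token, which is the source of constant-memory inference.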
Rethinking the Boundaries and Potential of Agents: AI Transformation Interview Series
36Kr· 2025-05-29 10:53
2025 has been hailed as the "Year of the Agent": from enterprise AI assistants to personal planning tools, agents of every kind are springing up. Yet despite high market enthusiasm, there is still no unified definition of the Agent: is it the "next-generation App," or closer to an "intelligent collaborator"? Most people still treat it as an upgrade of traditional tools, but its real transformative potential may go far beyond that.

In this wave of Agent exploration, AI Native companies are trying to break out of the traditional framework and redefine its boundaries. No longer confined to the role of "efficiency tool," they are exploring the Agent's deeper value in business insight, creative generation, and organizational change.

In this interview, Dr. Fan Ling, founder of Tezign, shares his distinctive view of the Agent: using large language models to simulate real user behavior, so that AI not only answers questions but actively builds user profiles, drives decision processes, and even exposes blind spots in human thinking. This innovation challenges our understanding of the Agent and heralds a new model of human-machine collaboration.

[Core Insights]

What most distinguishes Atypica.ai from traditional Agents?

Fan Ling: Traditionally, researchers have tackled such complex problems mainly through simulation. Earlier simulations focused on group behavior, studying the overall trends of a population the way one studies a colony of mice. With large language models, we can now study and simulate individual behavior much better. That is why we named the product "Aty ...
Rethinking the Boundaries and Potential of Agents | AI Transformation Interview Series
腾讯研究院· 2025-05-29 09:28
[Core Insights]

What most distinguishes Atypica.ai from traditional Agents?

Xu Siyan:

Product innovation: Compared with traditional AI, Atypica.ai's innovation is simulating real people: using large language models to study typical users, with multiple AI assistants collaborating to conduct large-scale user interviews efficiently and at low cost.

Divergence-first model: A divergence-first model at the reasoning layer, suited to handling the non-consensus ...
XPeng Motors-W (09868): Class-Leading Intelligent Assisted Driving, Pricing Beats Expectations
Changjiang Securities· 2025-05-28 23:30
Investment Rating
- The investment rating for the company is "Buy" and is maintained [6].

Core Views
- On May 28, 2025, the company launched the MONA M03 MAX version in two models: the 502 Long Range Max at 129,800 yuan and the 600 Ultra Long Range Max at 139,800 yuan. Both feature the full-version AI Tianji system and Turing driving assistance, achieving the strongest urban intelligent driving assistance in their class. Sales are expected to accelerate on a strong new-vehicle cycle, channel transformation, and an enhanced marketing system; financial performance should improve continuously on scale gains, platform- and technology-driven cost reduction, expanding software monetization, and ongoing international growth [2][4][9].

Summary by Sections

Event Description
- The MONA M03 MAX version was officially launched on May 28, 2025, in two models priced at 129,800 and 139,800 yuan, equipped with advanced AI systems and driving-assistance technology [4].

Sales and Financial Projections
- Expected Q2 2025 deliveries are 102,000 to 108,000 units, year-on-year growth of 237.7% to 257.5%. Projected revenue for the period is 17.5 to 18.7 billion yuan, a year-on-year increase of 115.7% to 130.5%. A strong new-vehicle cycle with multiple upcoming launches is expected to lift sales further [6][9].

Competitive Advantage
- The MONA M03 Max is the first in its class with dual Orin-X chips, providing 508 TOPS of computing power, significantly surpassing competitors. Its intelligent driving adapts to driver style and allows seamless control handover between driver and vehicle [9].

Future Outlook
- The company expects a single-quarter profit turnaround by Q4 2025 and positive full-year cash flow. Anticipated 2025 revenue is 99.1 billion yuan, corresponding to a 1.3× price-to-sales ratio, indicating significant financial improvement as the company enters a new vehicle cycle [9].
Jeff Dean: AI Will Replace Junior Engineers Within a Year. Netizens: "Altman Only Sells Promises; When Jeff Says It, It's for Real"
AI前线· 2025-05-28 05:17
Authors | Tina, 核子可乐

Recently, legendary Google engineer Jeff Dean boldly predicted in an interview that within a year we will have AI systems that run 24/7 and match the capabilities of a "junior engineer."

Jeff Dean is a legend of modern computing who led many of Google's breakthroughs in large-scale distributed systems and artificial intelligence. He co-founded the Google Brain project and drove the creation of key systems such as MapReduce, Bigtable, Spanner, and TensorFlow; he has led Google AI since 2018 and became Google's Chief Scientist in 2023 after DeepMind merged with Google Brain. From co-authoring the BERT paper and leading TPU development to advancing Google's core AI infrastructure, Dean has witnessed and shaped nearly every key milestone in Google's AI development.

As one of the most influential figures in tech, Jeff Dean's remarks quickly sparked heated discussion in the industry. Although many insiders, including Sam Altman, have voiced similar views before, Jeff Dean's words clearly carry different weight. As one netizen put it: compared with Sam Altman, who is always "selling" some concept, Je ...
Tencent AI: Six Months of Flat-Out Acceleration
雷峰网· 2025-05-27 13:15
Core Viewpoint
- Tencent's AI strategy has accelerated significantly over the past six months, with substantial investment and organizational restructuring driving rapid advances in AI model capability and product application [2][19][26].

Group 1: AI Model Development
- Tencent's Hunyuan language model TurboS has ranked among the top eight models globally, with improvements in reasoning, coding, and mathematics capabilities [5][6].
- The TurboS model shows a 10% gain in reasoning, a 24% improvement in coding, and a 39% increase in competition-mathematics scores [6][8].
- The Hunyuan T1 model has also improved, with an 8% gain in competition mathematics and common-sense question answering [7].

Group 2: Multi-Modal Technology Breakthroughs
- Tencent has made significant advances in multimodal generation, achieving "millisecond-level" image generation and over 95% accuracy on the GenEval benchmark [8].
- The company has introduced a game visual-generation model that improves game-art design efficiency severalfold [9].

Group 3: Productization and Application
- Tencent focuses on tools that integrate AI capabilities into customer scenarios rather than offering raw models alone [11][12].
- The Tencent Cloud intelligent-agent development platform has been upgraded to support multi-agent collaboration and zero-code development, making it easier for enterprises to implement AI solutions [12][13].

Group 4: Knowledge Base and Intelligent Agents
- Tencent emphasizes the importance of knowledge bases for AI applications, as they help collect and categorize enterprise knowledge efficiently [17][18].
- The knowledge-management product Tencent Lexiang has been upgraded to better serve enterprise needs, yielding significant efficiency gains for clients such as Ecovacs [18].

Group 5: Acceleration Factors
- The rapid development of Tencent's AI capabilities is attributed to the success of the DeepSeek model, which catalyzed resource mobilization within the company [21][22].
- Organizational restructuring established new departments focused on large language models and multimodal models, enhancing research and product-development efficiency [22][24].
Concord Healthcare (02453) - Voluntary Announcement on the Official Release of Its Proton Therapy Large Model
2025-05-27 09:37
Hong Kong Exchanges and Clearing Limited and The Stock Exchange of Hong Kong Limited take no responsibility for the contents of this announcement, make no representation as to its accuracy or completeness, and expressly disclaim any liability whatsoever for any loss howsoever arising from or in reliance upon the whole or any part of the contents of this announcement.

CONCORD HEALTHCARE GROUP CO., LTD.
美中嘉和醫學技術發展集團股份有限公司
(a joint stock company incorporated in the People's Republic of China)
(Stock code: 2453)

Voluntary Announcement
Official Release of the Proton Therapy Large Model

This announcement is published voluntarily by the board of directors (the "Board") of Concord Healthcare Group Co., Ltd. (the "Company").

The Company has made important progress in precision oncology diagnosis and treatment technology: its self-developed large language model for the proton-therapy vertical has been officially released and successfully deployed at Guangzhou Taihe Cancer Hospital. Since proton therapy began at Guangzhou Taihe Cancer Hospital, multiple high-quality patient treatments have been completed, demonstrating outstanding advantages of precise treatment, significant efficacy, and reduced side effects.

Shareholders and potential investors of the Company are advised to exercise caution when dealing in the shares of the Company.

By order of the Board
Concord Healthcare Group Co., Ltd.
Yang Jianyu
Chairman and Executive Director

Beijing, China, May 27, 2025

As at the date of this announcement, the Board comprises (i) executive directors Dr. Yang Jianyu, Ms. Fu Xiao and Mr. Chang Liang; (ii) non-executive directors ...