Large Language Models
Starting at RMB 119,800: Xpeng MONA M03 Adds Four New Variants in Upgraded Launch
Beijing Business Today · 2025-06-04 04:11
Beijing Business Today (reporter Liu Xiaomeng): On May 28, the Xpeng MONA M03 went on sale in upgraded form with four new variants: the MONA M03 502 Long Range Max, MONA M03 600 Ultra Long Range Max, MONA M03 515 Long Range Plus, and MONA M03 620 Ultra Long Range Plus, at official guide prices of RMB 119,800 to 139,800. As the step-up product in the MONA line, this update concentrates upgrades in intelligent driving, the cabin system, and exterior options, strengthening Xpeng's smart-feature competitiveness in the pure-electric market. The new car also debuts a world-first human-machine co-driving feature that avoids forced takeovers during assisted driving, making the handoff between driver and system smoother while keeping the driver in control. Parking has been enhanced as well, with support for extremely narrow spaces, autonomous exit from a parking space, and space-to-space route planning across scenarios, emphasizing everyday usability of the smart experience. In the cabin, the MONA M03 Max debuts version 5.7.0 of the Tianji (天玑) system, adding more than 300 features with voice-control coverage above 90%. Built on XGPT, Xpeng's in-house large language model, the cabin supports complex interactions such as reasoning, encyclopedia queries, and multi-turn dialogue, with voice response times within 0.9 seconds. The system is compatible with major phone brands, further extending the in-car ecosystem. On styling and comfort, the new car offers three new factory colors (星暮紫, 微月灰, 星雨青) and adds two new wheel designs, ...
Surpassing GPT-4o: New Framework from a Chinese Team Lifts Qwen's Cross-Domain Reasoning by 10%, Setting Records on 12 Benchmarks
QbitAI (量子位) · 2025-06-04 00:17
Submitted by the General-Reasoner team | QbitAI

A new reinforcement-learning method has given Qwen a major performance boost, overtaking GPT-4o. A team of Chinese researchers from the University of Waterloo, TikTok Singapore, and M-A-P has proposed a new training framework: General-Reasoner. It raises the cross-domain reasoning accuracy of Qwen-series models by nearly 10% and even surpasses GPT-4o on several benchmarks. The figure above shows General-Reasoner significantly improving the base models' reasoning across multiple cross-domain evaluations.

Reinforcement learning (RL) is currently viewed as a key lever for improving model reasoning. Among RL approaches, Zero-RL methods, which train the base model directly, have shown strong results on structured tasks such as math and programming. The problem is that these methods are often confined to data-rich domains with cleanly structured answers; when facing broader fields such as physics, finance, or the humanities and social sciences, models struggle to generalize. How did the research team tackle these reasoning challenges?

Key innovations over existing methods: current Zero-RL frameworks such as SimpleRL typically focus on single-domain data and use simple rule-based answer verification, which has the following shortcomings: narrow data, mostly math-competition or coding tasks, limiting generalization; rigid verification, able to recognize only clearly structured answers and unable to flexibly handle diverse ...
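The rigid-verification shortcoming the article cites is easy to see in a toy example. The sketch below (an illustration, not the paper's actual verifier) shows how an exact-match rule verifier accepts cleanly structured answers but rejects a semantically equivalent free-form answer of the kind that appears outside math and coding domains:

```python
def rule_based_verify(prediction: str, reference: str) -> bool:
    """Toy rule-based verifier: exact match after trimming and lowercasing.

    Works for structured answers like a bare number, but cannot judge
    semantic equivalence of free-form answers.
    """
    return prediction.strip().lower() == reference.strip().lower()

# A structured math answer passes:
print(rule_based_verify(" 42 ", "42"))
# An equivalent free-form answer (e.g. from a finance question) fails:
print(rule_based_verify("roughly 3.5 percent", "3.5%"))
```

This is the gap that motivates more flexible, model-based answer verification for cross-domain RL training.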
ICBC Credit Suisse's Ma Lina: Two Directions for Positioning Along the Core AI Theme
Brokerage China (券商中国) · 2025-06-03 23:15
Core Viewpoint
- The article highlights the AI investment trend led by DeepSeek since 2025, with a focus on public funds, particularly the upcoming launch of the 工银科技先锋混合发起式基金 managed by Ma Lina, which targets high-quality companies in the AI industry chain [1][2].

Investment Focus
- The new fund will concentrate on two main areas: AI infrastructure and AI semiconductors, as well as AI applications, reflecting the current technological trends driven by large language models [2][8].
- Ma Lina's investment strategy involves identifying companies that benefit from industry trends, focusing on those with strong earnings growth, valuation flexibility, and competitive barriers [5][6].

Fund Management Background
- Ma Lina has an academic background in microelectronics and computer science from Peking University and has been with 工银瑞信基金 for 10 years, specializing in technology-sector research and investment [3].
- The 工银科技先锋 fund is her latest move along the AI industry chain, differing from her previous fund, 工银新兴制造, in having a broader investment scope that includes more AI applications [3][4].

Market Trends and Predictions
- The current AI investment wave is characterized by the development of large language models, with significant advances in AI applications expected over the next 3-5 years as model capabilities improve and costs fall [4][8].
- China's hardware infrastructure is catching up and the model-development gap with the US is narrowing, suggesting a potential edge for domestic applications given the large internet market and a well-established robotics industry [8][9].
The "Queen of the Internet's" AI Report in Charts: Unprecedented AI Adoption, Inference Costs Down 99.7%
36Kr · 2025-06-03 12:14
After five years out of the spotlight, the legendary venture capitalist Mary Meeker, known as the "Queen of the Internet," has published a 340-page AI Trends report. The document, dubbed the "AI bible" by the industry, uses the word "unprecedented" 51 times to declare that the AI revolution has entered an irreversible explosive phase and that humanity stands at the threshold of a technological singularity. In the report, Meeker uses a wealth of charts to depict the explosive growth of AI in development speed, breadth of application, capital investment, and scale of use, and questions whether the "cash-burning model" of AI giants such as OpenAI can be sustained. Below is a chart-by-chart reading of the report's core content.

User adoption of AI is unprecedented

The report shows that the hallmark of the AI era is the surge in the AI user base. Unlike the Internet 1.0 revolution, whose technology started in the United States and then spread steadily worldwide, ChatGPT arrived on the world stage at once, growing simultaneously across most of the globe. Floating-point operations, the basic unit for measuring compute, began growing markedly faster after 2010, at an annual rate of 360%. Taking US computing-related patent grants as an example, the first acceleration came in 1995, marking the start of the internet era; from 2004, growth slowed, signaling that the internet era's development was also decelerating. After ChatGPT's release in 2022, patent counts exploded again, even more sharply than the 1995 surge ...
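The cited 360% annual growth rate in floating-point operations compounds faster than intuition suggests; a toy calculation (illustrative only, not a figure from the report):

```python
# Compounding at a 360% annual growth rate: each year's total is
# (1 + 3.6) = 4.6x the previous year's.
growth_rate = 3.6
total = 1.0
for year in range(1, 6):
    total *= 1 + growth_rate
    print(f"year {year}: {total:.1f}x the starting level")
```

After five years of such growth, compute is roughly two thousand times the starting level, which is why the report treats the post-2010 inflection as a regime change rather than a trend continuation.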
"Developers Who Hand-Write Code Without Cursor or ChatGPT: Are They Crazy?"
36Kr · 2025-06-03 08:53
Core Viewpoint
- The article discusses the contrasting perspectives on AI, particularly large language models (LLMs), in software development, highlighting the divide between supporters and skeptics [3][10][26].

Group 1: Supporters' Perspective
- Supporters argue that AI tools have significantly improved efficiency in software development, citing examples such as Kenton Varda from Cloudflare completing in days a project that would have taken weeks or months without AI assistance [7].
- The use of AI in programming is seen as a major technological breakthrough, with the potential to transform the development process and lower barriers to entry for new developers [2][12].
- AI tools can handle repetitive coding tasks, allowing developers to focus on more complex problems and enhancing overall productivity [13][15].

Group 2: Skeptics' Perspective
- Skeptics believe that AI is overhyped and that many developers still prefer traditional coding methods, viewing reliance on AI as a sign of incompetence [4][8].
- Concerns are raised about the quality of AI-generated code, with some experienced developers dismissing it as "garbage" and expressing reluctance to use AI tools [8][21].
- The debate over AI's role in programming has sparked extensive discussion online, indicating a significant divide in the developer community [6][10].

Group 3: The Role of AI in Programming
- While AI can assist in coding, it is crucial for developers to understand the generated code to ensure quality and reliability [16][17].
- AI's ability to automate mundane tasks frees developers from repetitive work, allowing them to engage in more meaningful and creative aspects of software development [23][25].
- The emergence of asynchronous AI agents represents a new frontier in programming, enabling developers to explore multiple solutions simultaneously and improve workflow efficiency [31][32].
Major Report Download | Generative AI 2025: As DeepSeek Disrupts the Industry, What Opportunities in a Nearly $2 Trillion Market?
Bloomberg (彭博) · 2025-06-03 06:30
This article is excerpted from Bloomberg Intelligence's "2025 Generative AI Outlook" on the Bloomberg Terminal; terminal users can run {NSN SWJ7Y1DWX2PS0 } to read it. If you are not yet a terminal user, contact us via "Read the original" at the end of this article to arrange a product demo.

Bloomberg Intelligence: 2025 Generative AI Outlook

Applications of generative artificial intelligence (AI) and large language models (LLMs) have permeated every part of the technology sector and are developing rapidly; by 2032 this market is expected to generate roughly $1.8 trillion in revenue. Bloomberg Intelligence believes that as reasoning models backed by chain-of-thought and reinforcement learning gain favor, LLM applications may expand from text-based search to the analysis of images, audio, and video. Beyond existing use cases such as LLM-powered contract review and customer-service chatbots, integrated writing and coding assistants, along with tools that generate images and video from text and voice prompts, will also drive deployment of generative AI agents on both the consumer and enterprise side. Since DeepSeek's debut, most LLM companies have focused on improving model efficiency to enable inference at scale.

Core topics:
- Inference may overtake training sooner than expected: inference spending could exceed training spending at least three years earlier than our previous forecast.
- The gap between large language models is narrowing: OpenAI's GPT, Google's Gemini, Meta's Llama, Anthro ...
Can Chains of Thought "Skip Frames"? Zhejiang University Team Proposes CoT-Bridge, Significantly Boosting Math Reasoning
Synced (机器之心) · 2025-06-03 06:26
As large language models (LLMs) advance at speed, Chain-of-Thought (CoT) prompting has become a key paradigm for improving complex reasoning, shining especially on structured tasks such as math and logic.

The co-first authors of this paper are Xu Haolei and Yan Yuchen. Xu Haolei is a first-year master's student at Zhejiang University whose research focuses on large-model reasoning and interpretability; Yan Yuchen is a third-year PhD student at Zhejiang University focusing on large-model reasoning and agents. The corresponding authors are Professor Lu Weiming and researcher Shen Yongliang of Zhejiang University.

But have you noticed that even carefully constructed CoT data can contain "leaping" reasoning that skips key intermediate steps? To a human expert those steps may be "obvious," but to a model they can be an unbridgeable gap.

To address this, Zhejiang University, together with Microsoft Research Asia and the Chinese University of Hong Kong, proposed the Thought Leap Bridge task and developed CoT-Bridge, a method for repairing chains of thought. Experiments show it significantly improves reasoning accuracy on several math and logic tasks and can be embedded as a plug-and-play module in pipelines such as knowledge distillation and reinforcement learning.

CoT is not Coherent-of-Thought: how do thought leaps break the reasoning chain? CoT was designed to make large models "think step by step" like a human, yet the research team ...
April Game Revenue Up More Than 20% YoY; Game ETF (516010) Rises Over 3%
National Business Daily · 2025-06-03 03:01
On the news front, Gamma Data reports that China's game market reached RMB 27.351 billion in April 2025, up 21.93% year on year, with mobile games up 28.41% and overseas revenue up 9.62%.

Industry institutions say that continued progress in AI is likely to lift the gaming sector. Gaming is a relatively mature field for AI applications, and whether new gameplay can emerge from combining games with large language models is one of the industry's potential growth drivers. In scriptwriting, for example, a large language model can be given an outline to draft a script and then further guidance to refine it, which may produce new forms of gameplay. In the future, large language models might directly give in-game characters independent personalities, letting them carry out their own actions and behaviors inside the game world, in effect creating a virtual world within the virtual world.

中信建投 notes that DeepSeek R1's deep-reasoning capability is world-leading: on the AIME2024 math test and the LiveCodeBench coding test, R1 surpasses both o3 and Gemi ...

Note: short-term moves and historical performance of indices/funds are for reference only and do not predict future results. Market views change with market conditions and constitute neither investment advice nor commitments. The indices mentioned are for reference only and are not investment advice or a forecast or guarantee of fund performance. When purchasing fund products, choose one that matches your risk tolerance. Funds carry risk; invest with caution. (Source: National Business Daily)
Unmasking "Pseudo-Forgetting" in Large Models: PolyU-Led Team Says If the Structure Hasn't Changed, Nothing Was Forgotten
QbitAI (量子位) · 2025-06-01 03:40
Submitted by the Machine Unlearning team | QbitAI

Sensitive information exposed during training is often "memorized" by models, drawing wide concern. In recent years the capabilities of large language models (LLMs) have advanced by leaps and bounds, but the accompanying privacy risks have gradually surfaced. Against this backdrop, machine unlearning has emerged, with the goal of selectively erasing specific knowledge without harming overall capability.

A research team from Hong Kong Polytechnic University, Carnegie Mellon University, and UC Santa Cruz built a set of representation-space diagnostic tools to systematically distinguish "reversible forgetting" from "catastrophic irreversible forgetting," and revealed for the first time the pattern of representation-structure change behind forgetting: true forgetting appears only when multiple network layers undergo coordinated, large-magnitude perturbations. By contrast, small updates in highly sensitive regions (such as the output logits) can sharply reduce accuracy or raise perplexity while the model's internal representation structure remains intact.

The researchers have packaged this into a unified representation-layer analysis toolbox that supports diagnosing internal changes in LLMs during unlearning, relearning, fine-tuning, and similar processes.

True forgetting is structural erasure, not behavioral suppression. The researchers argue: "If a model merely 'forgets' at the token-output level while its internal structure is almost unchanged, ...
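The excerpt does not reproduce the team's exact diagnostics, but the structure-vs-behavior distinction can be sketched with a standard representation-similarity measure. The example below uses linear CKA (an assumption on my part, chosen as one common structural metric, not necessarily the paper's) on synthetic hidden states: a tiny perturbation leaves the structure nearly intact, while a large uncoordinated change destroys it:

```python
import numpy as np

def linear_cka(X: np.ndarray, Y: np.ndarray) -> float:
    """Linear CKA similarity between two representation matrices
    (samples x features); 1.0 means identical structure up to rotation/scale."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    num = np.linalg.norm(Yc.T @ Xc, "fro") ** 2
    den = np.linalg.norm(Xc.T @ Xc, "fro") * np.linalg.norm(Yc.T @ Yc, "fro")
    return float(num / den)

rng = np.random.default_rng(0)
reps_before = rng.normal(size=(100, 64))  # synthetic hidden states

# "Pseudo-forgetting": a tiny update barely moves the internal structure,
# even though it could still flip many token-level predictions.
reps_light = reps_before + 0.01 * rng.normal(size=reps_before.shape)

# Irreversible forgetting: large, uncoordinated change wipes out the structure.
reps_heavy = rng.normal(size=(100, 64))

print(linear_cka(reps_before, reps_light))  # close to 1.0
print(linear_cka(reps_before, reps_heavy))  # much lower
```

Under this view, a drop in accuracy or a rise in perplexity alone says little; it is the structural similarity score that separates suppression from erasure.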
How Large-Model Agents Can Break Through the Bottleneck to Large-Scale Application: The Key Is Agentic ROI
Synced (机器之心) · 2025-05-30 04:16
Core Viewpoint
- The main barrier to the usability of large language model agents (LLM Agents) is not the capability of the models but rather "Agentic ROI," which has not yet reached a practical threshold for widespread application [1][3][4].

Group 1: Agentic ROI Concept
- Agentic ROI (Agentic Return on Investment) is a key metric that measures the ratio of "information yield" to "usage cost" for LLM Agents in real-world scenarios [4].
- Usability is achieved only when the quality of information exceeds a certain threshold and the time and cost saved by the agent are sufficiently large [4][5].

Group 2: Current Application Landscape
- Most LLM Agents are currently applied in scenarios with high human task-time cost, such as research and programming, where labor is intensive and efficiency gains are large [7].
- In everyday, high-demand applications such as e-commerce and personal assistants, tasks are simpler, so the marginal value of LLM Agents is lower; added interaction costs and delays result in low Agentic ROI [7].

Group 3: Development Trajectory
- LLM Agents follow a "zigzag" development path: first scaling up to raise information quality, then scaling down to cut time and cost while maintaining quality [9].
- The evolution of foundational models, such as the OpenAI series, illustrates this zigzag trend, with significant performance gains in larger models followed by smaller models that preserve performance while reducing inference cost and latency [9].

Group 4: Scaling Up Information Quality
- Pre-training scaling expands model size, data volume, and computational resources to strengthen foundational capabilities in language understanding and reasoning [11].
- Post-training scaling, including supervised fine-tuning and reinforcement learning, aligns the agent's behavior with human needs and values, relying on extensive interaction data for continuous learning [12].
- Test-time scaling focuses on building a world model that supports multimodal interaction and can handle complex tasks while reflecting real-world uncertainty [13].

Group 5: Ensuring Robustness and Security
- Ensuring the robustness and security of LLM Agents is crucial for information quality: preventing exploitation of reward mechanisms and safeguarding against data contamination and feedback manipulation [16].

Group 6: Scaling Down to Reduce Time and Cost
- Memory mechanisms allow agents to skip redundant computation by reusing past knowledge, improving processing speed [18].
- Model compression techniques can significantly reduce computational resources and inference latency without compromising performance [18].
- Optimizing reasoning strategies and infrastructure further enhances the efficiency and responsiveness of LLM Agents [18].

Group 7: Cost Management
- Reducing interaction time by having agents proactively infer user intent lowers cognitive burden and improves user experience [19].
- Operational costs must be managed effectively in large-scale deployments by optimizing context management and controlling inference complexity [19].
- Agentic ROI serves as a framework for evaluating the real usability of LLM Agents, shifting the focus from raw model performance to practical benefit and overall efficiency [19].
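The article defines Agentic ROI only loosely, as the ratio of information yield to usage cost. The sketch below is a hypothetical formalization (the variable names, units, and weighting are assumptions, not taken from the paper) that reproduces the article's contrast between research-style tasks and everyday queries:

```python
def agentic_roi(info_quality: float, human_time_saved: float,
                agent_time: float, interaction_time: float,
                expense: float) -> float:
    """Hypothetical Agentic ROI: information yield per unit of usage cost.

    info_quality     -- quality of the delivered information (0..1)
    human_time_saved -- hours of human labor the agent replaces
    agent_time       -- wall-clock hours the agent runs
    interaction_time -- hours the user spends prompting and supervising
    expense          -- monetary cost, converted to hour-equivalents
    """
    information_yield = info_quality * human_time_saved
    usage_cost = agent_time + interaction_time + expense
    return information_yield / usage_cost

# A research task (large human time cost) vs. a simple shopping query:
research = agentic_roi(0.8, 40.0, 2.0, 1.0, 1.0)
shopping = agentic_roi(0.9, 0.1, 0.05, 0.05, 0.01)
print(f"research task ROI: {research:.2f}")
print(f"shopping query ROI: {shopping:.2f}")
```

Even with slightly higher information quality, the shopping query's ROI is far lower because there is little human time to save, which mirrors the article's point that agents currently pay off mainly where human task time is large.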