Large Language Models
CICC: AI Decade Outlook (27): Crossing the Boundary of "Forgetting": The Three-Layer Architecture of Model Memory and Industry Opportunities
中金· 2026-02-24 14:20
Securities Research Report, 2026.02.11: AI Decade Outlook (27): Crossing the Boundary of "Forgetting": The Three-Layer Architecture of Model Memory and Industry Opportunities. Analysts: Yu Zhonghai (SAC license no. S0080518070011, SFC CE Ref: BOP246); Han Rui (SAC license no. S0080523070010, SFC CE Ref: BXD683, rui.han@cicc.com.cn); Wang Zhihao (SAC license no. S0080522050001, SFC CE Ref: BSS168, zhihao3.wang@cicc.com.cn). [Chart: relative performance (%), 2025-02 to 2026-01, CSI 300 vs. CICC Software & Services index] Investment View: The evolution of large models is, at heart, a history of fighting "forgetting". While we marvel at models' reasoning ability, we tend to overlook a critical weakness: in an architecture with no memory retention, every pass a model makes over historical information is, in effect, an expensive "repeated computation". This brute-force approach of throwing compute at forgetting is running into the physical limits of the memory wall and the context window. We believe that from 2026 onward, the main battleground of AI infra will add "model mem ...
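The "repeated computation" the report describes can be made concrete with a toy token count: without any memory retention, every conversation turn re-processes the full history, so cumulative prefill work grows quadratically with the number of turns, while retained state only pays for newly appended tokens. A minimal illustrative sketch (the function and the turn lengths are my assumptions, not from the report):

```python
def tokens_processed(turn_lengths, cache=False):
    """Total tokens run through the model across a conversation.

    Without retained memory, each turn re-reads the entire history;
    with it, only the newly appended tokens are processed.
    """
    total, history = 0, 0
    for n in turn_lengths:
        history += n
        total += n if cache else history
    return total

turns = [100] * 10                                 # ten turns of 100 tokens each
no_cache = tokens_processed(turns)                 # 100 + 200 + ... + 1000 = 5500
with_cache = tokens_processed(turns, cache=True)   # 10 * 100 = 1000
```

The gap (5500 vs. 1000 tokens for just ten short turns) is the "expensive repeated computation" that memory-retention architectures aim to eliminate.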
Stealing the Spotlight in the Spring Festival Season: Can Humanoid Robots "Show Their Stuff" in the Year of the Horse?
Shang Hai Zheng Quan Bao· 2026-02-23 18:37
At the start of the Bingwu Year of the Horse, Shanghai Securities News launches its "SSE Focus" column, training its gaze on the crossroads of financial change and surveying the front lines of industrial transformation. Focusing on major global financial events, we are not content to report "what happened"; we dig into "why it happened" and "where it will take us". We follow shifts in business logic, track leaps in technology trends, analyze market investment currents, and watch the forces of a changing era quietly redraw the picture of everyday life. Opening note: Here you will find a compass for investment and a whistle for risk; the sharpness of news and the warmth of the humanities. "SSE Focus" hopes to be a lighthouse in a complex financial world, illuminating the truths and logic that deserve to be seen. Let us, through in-depth reporting, grasp what matters and glimpse the future. Reporter: Sun Xiaocheng. In this Year of the Horse Spring Festival season, humanoid robots have taken center stage. On screens large and small, 宇树科技 (Unitree), 松延动力, 魔法原子, and 银河通用 took turns performing, each showing off its own tricks. At ancient-town temple fairs, robots took up brushes to write "福" (good fortune) characters and threw punches and struck poses, becoming a new sight amid the holiday crowds. Humanoid robots are shedding their conceptual wrapping and turning into a tangible new species, stepping further into public view. Yet outside the spotlight, doubts persist: beyond performing, can they actually do real work? How will technical bottlenecks be broken? Will capital keep paying? Is society ready to accept them? Industries always advance amid a weave of doubt and expectation. At the start of the Year of the Horse ...
Zhipu's Apology Letter: Traffic After the GLM-5 Launch Exceeded Expectations, and Capacity Expansion Failed to Keep Up
Xin Lang Cai Jing· 2026-02-22 01:25
Core Viewpoint
- The company issued an apology regarding the GLM-5 release, acknowledging issues with transparency and a slow rollout, and has implemented a tiered access strategy for users [1][12].
Group 1: User Experience and Access
- GLM-5 has a parameter scale more than double that of GLM-4.7 and is designed for complex tasks; under the tiered usage strategy, GLM-4.7 is recommended for simple tasks, and GLM-5 usage is metered at 3x during peak times and 2x during off-peak times [2][13].
- The rollout of GLM-5 is staggered: Max users have full access, Pro users may be throttled during peak times, and Lite users will be opened up gradually after the holiday [4][16].
- The company has shortened the dashboard refresh interval from one hour to ten minutes to give users more timely information [3][14].
Group 2: Refund and Compensation Policy
- A refund policy has been established for affected Lite and Pro users, allowing self-initiated refunds on the principle of covering all charges from January 1, 2026, to the present [6][18].
- Old users (subscribed before January 1, 2026, with subscriptions still valid after February 12) will be refunded all amounts from January 1, 2026, to the present plus the remaining days, while new users (subscribed after January 1, 2026, and valid after February 12) will receive a full refund for the current subscription period [6][18].
- A refund window will open within a week after the holiday and close on March 6, 2026, and users will receive a 15-day extension of their subscription period as compensation [18][7].
Group 3: Upgrade Rollback
- Users who inadvertently upgraded from old to new packages between February 12 and 16 will be offered a one-click rollback to their previous package, with the same tier and cycle rights [19].
- Any price differences resulting from the rollback will be absorbed by the company, ensuring that users are not financially disadvantaged [20].
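The tiered metering described above (3x during peak hours, 2x off-peak) can be sketched as a small quota function. The function name and the assumption that usage is metered per unit consumed are illustrative; the summary states only the multipliers:

```python
def glm5_usage_cost(units, peak):
    """Effective quota consumed by GLM-5 calls under the tiered policy
    described in the apology: 3x during peak hours, 2x off-peak.
    (Function name and per-unit metering are illustrative assumptions.)"""
    multiplier = 3 if peak else 2
    return units * multiplier

glm5_usage_cost(1000, peak=True)    # 1000 units of GLM-5 count as 3000 against quota
glm5_usage_cost(1000, peak=False)   # the same usage off-peak counts as 2000
```

Seen this way, routing simple tasks to GLM-4.7 (which carries no multiplier in the stated policy) is the cheaper default, which is exactly what the tiered strategy recommends.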
A New King of Coding Is Crowned! Gemini 3.1 Pro Routs Claude and GPT, Taking First Place in 12 Benchmarks!
AI前线· 2026-02-20 02:43
Core Insights
- Google has launched Gemini 3.1 Pro, a significant upgrade that strengthens reasoning capabilities and targets practical applications across fields including development tools and enterprise services [2][4][20].
Technical Overview
- Gemini 3.1 Pro uses a mixture-of-experts architecture, activating only a portion of its parameters for each prompt response, and supports inputs of up to 1 million tokens and outputs of up to 64,000 tokens [2].
- The model achieved a verified score of 77.1% on the ARC-AGI-2 abstract-reasoning puzzles, indicating a substantial improvement in abstract reasoning and adaptability to new problems [9][12].
- Its predecessor, Gemini 3 Pro, scored 31.1% on the same test, so Gemini 3.1 Pro more than doubled its reasoning performance in just three months [16][12].
Benchmark Performance
- Gemini 3.1 Pro ranks first in 12 of 16 benchmark tests, outperforming competitors such as Claude Opus 4.6 and GPT-5.2 in categories including academic reasoning and coding tasks [17][18].
- In the MCP Atlas test, which evaluates AI models' ability to execute tasks using third-party services, Gemini 3.1 Pro scored 69.2%, ahead of Claude Sonnet 4.6 [17].
User Accessibility
- The model is rolling out to developers, enterprise users, and consumers through platforms including Google AI Studio, Vertex AI, and the Gemini App [7][24].
- Gemini 3.1 Pro is available to developers for free, a strategic move by Google to broaden access to advanced AI capabilities [15][24].
Practical Applications
- The model is designed for complex tasks that require advanced reasoning, such as generating dynamic SVG animations for websites and building modern personal portfolio sites around literary themes [20][21][22].
- It bridges the gap between complex APIs and user-friendly design, for example by creating real-time dashboards and immersive experiences [23].
Industry Implications
- The release of Gemini 3.1 Pro signals a shift in the AI landscape toward practical task completion and stability rather than merely increasing model size [27][30].
- The rapid iteration and deployment of Gemini 3.1 Pro reflect Google's response to competitive pressure in the AI market, emphasizing reasoning capability and operational efficiency [28][30].
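The mixture-of-experts design mentioned above is why only a portion of the model's parameters is active per response: a gate scores all experts for each token but routes the token to only the top few. Gemini's internals are not public, so the toy top-k gate below is a generic sketch with invented sizes, not Google's implementation:

```python
import math

def top_k_gate(logits, k=2):
    """Pick the k highest-scoring experts and renormalize their softmax
    weights; all other experts stay inactive for this token, so only a
    fraction of total parameters does any work."""
    chosen = sorted(range(len(logits)), key=lambda i: -logits[i])[:k]
    weights = {i: math.exp(logits[i]) for i in chosen}
    z = sum(weights.values())
    return {i: w / z for i, w in weights.items()}

# Four hypothetical experts; the gate activates only the two best-scoring ones.
gate = top_k_gate([0.1, 2.0, -1.0, 1.5], k=2)   # experts 1 and 3 are active
```

With k experts active out of n, per-token compute scales with k/n of the expert parameters, which is how a very large total parameter count can coexist with manageable inference cost.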
One Model to Unify All Offline Tasks! Microsoft Rebuilds the "Reasoning Brain" of Ads Recommendation with a 671B Model
Sou Hu Cai Jing· 2026-02-18 05:37
Submitted by the AdNanny team. 量子位 | WeChat official account QbitAI. Microsoft has put a single 671B "reasoning hub" in charge of the grunt work of its advertising system, with performance that comprehensively outclasses its predecessors. Industrial-scale ads recommendation systems face a paradoxical status quo: even as the reasoning ability of general-purpose large language models (LLMs) has reached new heights, the pursuit of millisecond-level response times means LLMs usually cannot be deployed online directly; instead, hundreds of "small models" pile up on the offline side: one for relevance labeling, one for user profiling, and so on. Paradigm shift: from a "forest of models" to a "centralized intelligence hub". The modern ads recommendation stack relies on a large number of offline tasks, such as query-ad relevance labeling, user profile generation, keyword expansion, and creative optimization. These offline tasks typically supply features, data, and labels to the online models, and engineers fine-tune a dedicated BERT or small LLM for each subtask. This "one task, one model" regime has many pain points, such as: Knowledge silos: although the tasks share advertising-domain knowledge and underlying semantics, under fragmented models that knowledge is learned over and over, reinventing the wheel at very low efficiency. Performance bottlenecks: constrained by cost, the per-task models are usually small and prone to "misreading" long-tail traffic and complex semantics. Their decisions are also black boxes whose outputs come with no explanation; when a model gets something wrong, engineers cannot trace the cause and human reviewers have nowhere to start. Soaring maintenance costs: ...
Xinghe Wenjie Files a Patent for an LLM-Based Method for Monitoring Users' Psychological State, Effectively Improving the Accuracy of Psychological-State Assessment
Jin Rong Jie· 2026-02-17 04:27
The patent abstract shows that the invention discloses an LLM-based method, system, and terminal for monitoring users' psychological-state data. The method comprises: acquiring the user's speech signal and performing preprocessing and speech-to-text conversion to obtain a speech text sequence; acquiring the user's facial image and performing feature extraction and expression classification through a preset neural network model to obtain an expression classification result; acquiring the user's video stream and performing bounding-box prediction and loss-function computation to obtain behavioral keypoints; fusing the speech text sequence, expression classification result, and behavioral keypoints into a fusion vector, and comparing that fusion vector against preset psychological-dimension prompt templates to obtain a comparison result for the user's psychological state. By fusing speech signals, facial images, and video streams from multiple sources and analyzing them in depth with a large language model, the invention effectively improves the accuracy of psychological-state assessment. Tianyancha records show that Xinghe Wenjie (Changchun) Digital Technology Co., Ltd., founded in 2025 and based in Changchun, is primarily engaged in software and information technology services, with registered capital of RMB 2 million; Tianyancha big-data analysis shows the company holds 2 patent records. According to the China National Intellectual Property Administration, the company has applied for a patent titled "An LLM-based method for users' psychological state ... Disclaimer: Markets carry risk; invest with caution. This article was generated by AI from third-party data, is for reference only, and does not constitute personal investment advice.
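The patented pipeline reduces to three per-modality outputs fused into one vector and compared against prompt templates. The sketch below is schematic only: concatenation as the fusion step and cosine similarity as the comparison are my illustrative choices, since the abstract specifies neither, and all names and vectors are invented:

```python
import math

def fuse(speech_vec, expression_vec, keypoint_vec):
    """Fuse the three modality features into one vector (here: concatenation)."""
    return speech_vec + expression_vec + keypoint_vec   # list concatenation

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def match_state(fusion_vec, templates):
    """Return the psychological-dimension template closest to the fusion vector."""
    return max(templates, key=lambda name: cosine(fusion_vec, templates[name]))

# Hypothetical per-modality feature vectors and two template dimensions.
v = fuse([1.0, 0.0], [0.5], [0.0, 1.0])                 # 5-dim fusion vector
templates = {"calm": [1, 0, 0.5, 0, 1], "anxious": [0, 1, 0, 1, 0]}
match_state(v, templates)   # -> "calm"
```

In the patent the comparison is mediated by a large language model rather than a plain similarity score; this sketch only shows the fuse-then-compare data flow.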
One Model to Unify All Offline Tasks! Microsoft Rebuilds the "Reasoning Brain" of Ads Recommendation with a 671B Model
量子位· 2026-02-17 03:58
Submitted by the AdNanny team. 量子位 | WeChat official account QbitAI. Microsoft has put a single 671B "reasoning hub" in charge of the grunt work of its advertising system, with performance that comprehensively outclasses its predecessors. Industrial-scale ads recommendation systems face a paradoxical status quo: even as the reasoning ability of general-purpose large language models (LLMs) has reached new heights, the pursuit of millisecond-level response times means LLMs usually cannot be deployed online directly; instead, hundreds of "small models" pile up on the offline side: one for relevance labeling, one for user profiling, and so on. This "forest of models" paradigm is gradually becoming an obstacle to evolution: knowledge is fragmented across models, operations costs are high, and decision-making is a black box. Paradigm shift: from a "forest of models" to a "centralized intelligence hub". The modern ads recommendation stack relies on a large number of offline tasks, such as query-ad relevance labeling, user profile generation, keyword expansion, and creative optimization. These offline tasks typically supply features, data, and labels to the online models, and engineers fine-tune a dedicated BERT or small LLM for each subtask. This "one task, one model" regime has many pain points. Recently, Microsoft's Bing Ads and DKI teams published the paper "AdNanny: One Reasoning LLM for All Offline Ads Recommendation Tasks", announcing that, based on DeepSeek-R1 6...
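The "one reasoning LLM for all tasks" idea replaces a fleet of per-task fine-tuned models with one shared model plus per-task instructions. The paper's actual prompting scheme is not reproduced here; the dispatcher below is a generic sketch with invented task names, prompts, and a placeholder model identifier:

```python
# Hypothetical per-task instructions; in a "model forest" each of these
# would instead be a separately fine-tuned BERT or small LLM.
TASK_PROMPTS = {
    "relevance": "Rate how relevant this ad is to the query (0-4) and explain why.",
    "profile":   "Summarize this user's interests from their recent activity.",
    "keywords":  "Expand this seed keyword into related advertiser keywords.",
}

def build_request(task, payload):
    """Route every offline task to the same reasoning model by swapping
    only the instruction, instead of maintaining one model per task."""
    if task not in TASK_PROMPTS:
        raise ValueError(f"unknown task: {task}")
    return {"model": "shared-reasoning-llm",        # one hub for all tasks
            "prompt": f"{TASK_PROMPTS[task]}\n\nInput: {payload}"}

req = build_request("relevance", "query='running shoes' ad='Nike Air Zoom'")
```

One practical consequence the article highlights falls straight out of this shape: because every task hits the same reasoning model, domain knowledge is shared rather than relearned per task, and the model's explanations give engineers something to trace when an output is wrong.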
Today's Top 10 Financial News | February 16, 2026
Xin Lang Cai Jing· 2026-02-16 11:41
Group 1: Israel-Iran Negotiations
- Israeli Prime Minister Netanyahu sets a clear red line regarding Iran's nuclear facilities, stating that any agreement must include the complete dismantling of Iran's nuclear infrastructure, not just a halt to uranium enrichment [1]
- Netanyahu expresses skepticism about the upcoming US-Iran negotiations, emphasizing that Iran must not retain any uranium enrichment capabilities [1]
- He also demands the complete disarmament of Hamas, estimating that they still possess around 60,000 rifles, which must be surrendered [1]
Group 2: Market Updates
- The Hong Kong Hang Seng Index closed up 0.52%, with notable gains in MINIMAX-WP, which rose 24.56%, and in other companies such as Old Puhua Gold and Luoyang Molybdenum [4][12]
- An adjustment to the Hang Seng Index will increase the number of constituent stocks from 88 to 90, with companies such as CATL and Luoyang Molybdenum benefiting from the change [7][16]
- The US stock market will be closed for Presidents' Day, affecting trading schedules for various futures and commodities [2][10]
Group 3: Technology Developments
- Alibaba has launched two new models, Qwen3.5-Plus and Qwen3.5-397B-A17B, with the former boasting 397 billion parameters and significantly improved performance compared to previous models [3][11]
- The API pricing for Qwen3.5-Plus is set at 0.8 yuan per million tokens, only 1/18th of the price of Gemini 3 Pro, indicating a competitive edge in the market [3][11]
Group 4: Warner Bros. Negotiations
- Warner Bros. is considering restarting sale negotiations with Paramount after receiving a revised acquisition offer, which may lead to renewed competition with Netflix [8][17]
- Paramount's revised terms include covering a $2.8 billion fee if Warner Bros. terminates its agreement with Netflix, underscoring its commitment to a swift regulatory approval process [8][17]
Alibaba Releases Qwen3.5
财联社· 2026-02-16 10:43
Core Insights
- Alibaba has launched two new models on the chat.qwen.ai platform: Qwen3.5-Plus and Qwen3.5-397B-A17B [1]
- Qwen3.5-Plus is positioned as the latest large language model in the Qwen3.5 series, while Qwen3.5-397B-A17B is the flagship model of the open-source Qwen3.5 series [1]
- Both models support text and multimodal tasks [1]
Alibaba Releases Qwen3.5: Performance Rivals Gemini 3, Token Price Only 1/18 as Much
Xin Lang Cai Jing· 2026-02-16 09:13
Core Insights
- Alibaba has launched the new-generation large model Qwen3.5-Plus, claiming it rivals Gemini 3 Pro and is the strongest open-source model globally [1][4]
- The Qwen3.5-Plus model features a total of 397 billion parameters with only 17 billion activated, outperforming the trillion-parameter Qwen3-Max model while cutting deployment memory usage by 60% and significantly improving inference efficiency [1][4]
- The API pricing for Qwen3.5-Plus is set at 0.8 yuan per million tokens, only 1/18th of the cost of Gemini 3 Pro [1][4]
Model Architecture and Performance
- Qwen3.5 represents a generational leap from pure text models to native multimodal models, using a mixed-token pre-training approach that includes visual and text data [1][4]
- The model was trained with a substantial increase in multilingual, STEM, and reasoning data, giving it denser world knowledge and reasoning logic [1][4]
- Qwen3.5 achieves top-tier performance with less than 40% of the parameters of the Qwen3-Max model, excelling in inference, programming, and agent-intelligence evaluations [1][4]
Benchmark Performance
- In the MMLU-Pro knowledge-reasoning evaluation, Qwen3.5 scored 87.8, surpassing GPT-5.2 [2][5]
- The model scored 88.4 on the PhD-level GPQA assessment, outperforming Claude 4.5 [2][5]
- Qwen3.5 set a record of 76.5 on the instruction-following IFBench, and also exceeded Gemini 3 Pro and GPT-5.2 in various agent evaluations [2][5]
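The pricing claim above is easy to check with arithmetic: at 0.8 yuan per million tokens and a 1/18 ratio, the implied Gemini 3 Pro price is 0.8 x 18 = 14.4 yuan per million tokens. A quick sketch (the monthly token workload is invented for illustration; only the 0.8-yuan price and the 1/18 ratio come from the article):

```python
QWEN_PLUS_YUAN_PER_M = 0.8           # stated Qwen3.5-Plus API price
GEMINI3_PRO_YUAN_PER_M = 0.8 * 18    # implied by the 1/18 claim: 14.4

def api_cost(tokens, yuan_per_million):
    """Cost in yuan for a given token volume at a per-million-token price."""
    return tokens / 1_000_000 * yuan_per_million

monthly_tokens = 50_000_000          # hypothetical monthly workload
qwen_cost = api_cost(monthly_tokens, QWEN_PLUS_YUAN_PER_M)       # ~40 yuan
gemini_cost = api_cost(monthly_tokens, GEMINI3_PRO_YUAN_PER_M)   # ~720 yuan
```

At any fixed workload the ratio between the two bills is the stated 18x, which is the "competitive edge" the coverage emphasizes.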