Tencent Research Institute

Tencent Research Institute AI Digest 20250919
Tencent Research Institute · 2025-09-18 16:01
Generative AI
I. Huawei's Ascend AI chips: a four-year, five-product roadmap with self-developed HBM
1. Huawei announced a four-year roadmap of five Ascend AI chips, including the Ascend 950PR in Q1 2026, the Ascend 950DT in Q4 2026, the Ascend 960 in Q4 2027, and the Ascend 970 in Q4 2028;
2. The new series supports low-precision data formats: the Ascend 950PR delivers 1 PFLOPS at FP8/MXFP8/HiF8 precision and 2 PFLOPS at MXFP4, and uses Huawei's self-developed HiBL 1.0 memory;
3. Huawei also unveiled what it describes as the world's most powerful compute supernode and cluster, including the Atlas 950 SuperPoD, which supports full, non-blocking interconnection of 8,192 cards, and the Atlas 960 SuperCluster, which can scale to one million cards.
https://mp.weixin.qq.com/s/dJGuwC2Fd4kSI_c47kjtYg
II. OpenAI tops the ICPC programming contest with a perfect score; Gemini also takes gold
1. OpenAI solved all 12 problems of the ICPC 2025 programming contest within 5 hours, a result equivalent to first place among human contestants, achieved with GPT-5 working together with an experimental reasoning model;
2. Google's Gemini 2.5 Deep Think solved 10 problems in a total of 677 minutes, a gold-medal-level performance, ...
Tencent Research Institute AI Digest 20250918
Tencent Research Institute · 2025-09-17 16:01
Generative AI
I. Fei-Fei Li's new spatial-intelligence result: is 3D world generation entering an era of unlimited exploration?
1. World Labs, Fei-Fei Li's startup, released Marble, a spatial-intelligence model that can generate persistent, large-scale 3D worlds from just a single image or a text prompt;
2. Compared with earlier products, the 3D worlds Marble generates are larger, more stylistically varied, and geometrically cleaner, and they support free-viewpoint navigation in the browser;
3. Users can export generated worlds as Gaussian point clouds and integrate them into Three.js for efficient rendering on desktop, mobile devices, and VR headsets; a whitelisted beta is now open.
https://mp.weixin.qq.com/s/-hw_l9Pk72IIify0WUYZJA
II. Are agents entering the payments era? Google and 60+ industry giants announce an AI payment protocol
1. Google, together with more than 60 institutions including American Express, PayPal, and Mastercard, launched the Agent Payments Protocol (AP2), aiming to create a secure standards framework for AI-agent payments;
2. The two companies are using "reinforcement-learning environments" (simulated enterprise applications) to train AI models to operate professional software such as Salesforce, Zendesk, and Cerner;
3. They may also hire domain experts to demonstrate task execution, training AI into a "virtual coworker" and opening up a new revenue channel.
https://mp.weixin.q ...
Industrial Digitalization Employment Survey: about 60 million industrial-digitalization jobs nationwide, concentrated in micro and small market entities
Tencent Research Institute · 2025-09-17 09:44
Digital Economy
The digital economy is a new economic form based on the internet and driven by a new generation of general-purpose technologies such as cloud computing, blockchain, and artificial intelligence. It comprises digital industrialization (the core digital-technology sectors: telecommunications, the internet, IT equipment manufacturing, software and information-technology services, and so on) and industrial digitalization (the new models, new business forms, and new industry sectors created when traditional industries combine with digital technology).

Industrial Digitalization Employment Survey (Q2 2025)
Digital employment can likewise be divided into digital industrialization and industrial digitalization, and statistics on the latter are relatively weak. Since Q3 2024, Tencent Research Institute, together with 企鹅有调 (Tencent's online survey platform) and WeBank's enterprise finance division, has run the Industrial Digitalization Employment Survey, using online questionnaires and multi-channel sampling to estimate industrial-digitalization employment.

According to the estimates, industrial-digitalization employment nationwide totaled 61.951 million at the end of 2024, or 8.4% of all employed persons. The figure has declined continuously since then, falling to 60.009 million by the end of Q2 2025, of which enterprises accounted for 20.831 million digitalized jobs and individually owned businesses for 39.177 million. Wholesale and retail is the largest industry by digitalized employment, with 25.138 million jobs in Q2 2025, or 41.1% of all industrial-digitalization positions; culture and entertainment has the highest digital-employment penetration, at 29.8% in Q2 2025.

Overall, for most traditional industries the part that has "gone online" is still only a thin layer ...
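A quick consistency check on the figures above (editor's arithmetic on the reported numbers, not taken from the report itself):

```latex
% Enterprise-created plus individually-owned-business jobs should roughly
% reconstruct the Q2 2025 total; the small gap is rounding:
20.831\ \mathrm{M} + 39.177\ \mathrm{M} = 60.008\ \mathrm{M} \approx 60.009\ \mathrm{M}
% The end-2024 total and its 8.4\% share imply a national employment base of
61.951\ \mathrm{M} \div 0.084 \approx 737\ \mathrm{M}\ \text{employed persons}
```

The implied base of roughly 737 million is in line with China's officially reported employed population of roughly 730-740 million, so the reported share hangs together internally.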
Tencent Research Institute AI Digest 20250917
Tencent Research Institute · 2025-09-16 16:01
Generative AI
I. OpenAI releases GPT-5-Codex: it can work independently for more than 7 hours at a stretch
1. OpenAI released GPT-5-Codex, optimized specifically for agentic coding; it can work autonomously for more than 7 hours continuously, is live across all Codex surfaces, and is integrated with the ChatGPT account system;
2. The model beats GPT-5 (high) on both the SWE-bench Verified and code-refactoring benchmarks, and dynamically adjusts its thinking time according to task complexity;
3. GPT-5-Codex can also review code and proactively find vulnerabilities; within two and a half hours of launch it already accounted for 40% of all Codex traffic, supports a range of tool calls, and an API release is planned.
https://mp.weixin.qq.com/s/f6z ...
II. Tencent Hunyuan 3D upgrades character generation and launches Hunyuan 3D Studio
1. The new model is specially optimized for character generation, enabling fine-grained facial resculpting, eliminating the "abstract face" problem, and greatly improving the realism and appeal of generated characters, reaching figurine-level quality;
2. A Tencent Cloud API and the professional-grade Hunyuan 3D Studio workbench launched at the same time, covering the seven core stages of the 3D pipeline; the model has become one of the most popular open-source 3D models worldwide, with more than 2.6 million downloads.
https://mp.weixin.qq.com/s/XzJIt8glOd82pVs_YXjf6w
III. Kunlun Wanwei launches an "Agent Studio" feature, a personal music studio ...
Tencent's Dowson Tong: fully opening up AI capabilities to power industry growth
Tencent Research Institute · 2025-09-16 06:43
Dowson Tong, Senior Executive Vice President of Tencent and CEO of the Cloud and Smart Industries Group

On September 16, at the 2025 Tencent Global Digital Ecosystem Summit, Dowson Tong, Tencent's Senior Executive Vice President and CEO of the Cloud and Smart Industries Group, said that "seeking industrial efficiency from intelligence and revenue scale from globalization" has become the two core drivers of enterprise growth. Tencent will build "intelligence" and "globalization" as two efficiency engines to help enterprises grow steadily and sustainably.

On the intelligence side, Tencent Cloud officially released a panoramic map of its agent strategy, fully opening up its AI capabilities and its strongest consumer-facing and business-facing scenarios. Through three upgrades (agent solutions, SaaS + AI, and large-model technology), it aims to unlock enterprises' potential for innovation. "Tencent will stay 'people-centric' and build 'AI that is genuinely useful', letting AI serve the people in each scenario: meeting their needs, raising work efficiency, improving the interaction experience, and even providing emotional value," Tong said.

According to the company, AI has become Tencent's "new business gene". A little over a year after launch, Tencent Yuanbao is now among the top three AI-native apps in China by DAU, and the number of questions users ask Yuanbao each day has reached a full month's volume from the start of the year; the ima knowledge base has surpassed 100 million files; QQ Browser's AI monthly active users have grown 17.8-fold since April. AI has also helped Tencent's advertising, gaming, and other businesses achieve double-digit growth.

On the globalization side, Tencent Cloud will work along three directions, infrastructure, technology products, and service capabilities, to help enterprises take root locally and expand ...
Tencent Research Institute AI Digest 20250916
Tencent Research Institute · 2025-09-15 16:01
Generative AI
I. Google Gemini tops the App Store free chart on the strength of Nano Banana
1. Google Gemini overtook ChatGPT to top the App Store free chart thanks to its viral Nano Banana image-editing feature;
2. Gemini has become a full AI toolset, including Canvas, Veo 3 video generation, Storybook storyboards, Deep Research, and more;
3. The Google AI suite also includes the NotebookLM knowledge base (up to 300 uploaded files), Flow video generation (1080p HD), AI Mode search, and the Gemini CLI local assistant.
https://mp.weixin.qq.com/s/gdSkrm95Mq1RORe-sIoK4A
II. Musk's fastest AI model: 75 tokens per second, 10x faster than the standard version
1. xAI released Grok 4 Fast, which generates up to 75 tokens per second, 10 times faster than the standard version, giving it a clear edge in real-time interaction;
2. User tests show the new model is both accurate and strikingly fast on coding problems and middle-school math, solving LeetCode problems in under 2 seconds;
3. Despite its speed lead, Grok 4 Fast still trades off some accuracy and is better suited to simple queries or ...
Faith in AI is driving economic growth
Tencent Research Institute · 2025-09-15 08:31
Group 1
- The article discusses the lagging effect of productivity in relation to the adoption of AI as a general-purpose technology, highlighting that significant improvements in productivity take time to materialize after the technology is commercialized [3][6][10]
- Historical examples show that technologies like the steam engine, generator, and computer took many years after their invention and commercialization to noticeably enhance productivity [3][5]
- Current productivity growth rates in the EU and the US are below historical averages, with EU labor productivity declining by 0.6% in 2023 and expected to grow by only 0.4% in 2024, while US productivity growth since 2020 averages 1.8%, below the long-term average of 2.2% [6][10]

Group 2
- AI adoption rates are still low, with the EU's enterprise AI adoption rate averaging 13.5% and the US at 9.2%, indicating that AI's impact on economic growth will not be significant in the short term [7][10]
- Despite the low profitability of AI model companies, there is a high expectation for future returns, leading to increased capital expenditures among major internet companies in the US and China [11][13]
- In 2024, capital expenditures for major US internet companies are projected to reach $245 billion, significantly contributing to GDP growth, with AI data center spending surpassing consumer spending for the first time [13][15]

Group 3
- The article draws parallels between the current AI wave and historical technological expectations, suggesting that belief in AI's potential is driving economic growth more than the technology itself [18][19]
- The discussion extends to nuclear fusion as a future energy source, with significant investments being made in fusion technology, indicating a similar pattern of high expectations and investment as seen with AI [20][24]
- The article concludes by highlighting the dichotomy of belief in technological advancements, questioning whether the current AI and nuclear fusion trends will fulfill their promises or follow historical patterns of delayed realization [27][29]
Tencent Research Institute AI Digest 20250915
Tencent Research Institute · 2025-09-14 16:01
Group 1
- OpenAI and Microsoft have released a non-binding cooperation memorandum addressing key issues such as cloud service hosting, intellectual property ownership, and AGI control, but the final cooperation agreement is still pending [1]
- OpenAI plans to establish a public benefit corporation (PBC) with a valuation exceeding $100 billion, where a non-profit organization will hold equity and maintain control, becoming one of the most resource-rich charitable organizations globally [1]
- OpenAI faces significant cost pressures, expecting to burn through $115 billion before 2029, with $100 billion needed for server leasing in 2030, leaving little room for error in the coming years [1]

Group 2
- Utopai, the world's first AI-native film studio founded by a former Google X team, has generated $110 million in revenue from two film projects and secured a spot at the Cannes Film Festival [2]
- Utopai has overcome three major challenges in AI video generation: consistency, controllability, and narrative continuity, achieving millisecond-level lip-sync precision with 3D data training [2]
- The company positions itself as a content + AI provider rather than a pure tool supplier, receiving support from top Hollywood resources, including an Oscar-nominated screenwriter for the film "Cortes" [2]

Group 3
- MiniMax has launched its new music generation model, Music 1.5, capable of creating complete songs up to 4 minutes long, featuring strong control, natural-sounding vocals, rich arrangements, and clear song structure [3]
- The model supports customizable music features across "16 styles × 11 emotions × 10 scenes," enabling the generation of different vocal tones and the inclusion of Chinese traditional instruments [3]
- MiniMax's multi-modal self-developed capabilities are now available to global developers via API, applicable in various scenarios such as professional music creation, film and game scoring, and brand-specific audio content [3]

Group 4
- Meituan's first AI Agent product, "Xiao Mei," has entered public testing, allowing users to order coffee, find restaurants, and plan breakfast menus through natural language commands, significantly simplifying the ordering process [4]
- "Xiao Mei" is based on Meituan's self-developed Longcat model (with 560 billion total parameters), capable of fully automating the selection-to-payment process based on user preferences and location [4]
- Despite the advancements, the AI Agent currently has limitations, such as handling complex ambiguous requests and lacking voice response capabilities, with plans for future optimization in personalization and proactive service [4]

Group 5
- Xiaohongshu's audio technology team has released the next-generation dialogue synthesis model, FireRedTTS-2, addressing issues like poor flexibility, frequent pronunciation errors, unstable speaker switching, and unnatural prosody [5][6]
- The model has been trained on millions of hours of voice data, supporting sentence-by-sentence generation and multi-speaker tone switching, capable of mimicking voice tones and speaking habits from a single audio sample [6]
- FireRedTTS-2 has achieved industry-leading levels in both subjective and objective evaluations, supporting multiple languages including Chinese, English, and Japanese, and serves as an industrial-grade solution for AI podcasting and dialogue synthesis applications [6]

Group 6
- Bilibili has open-sourced its new zero-shot voice synthesis model, IndexTTS2, addressing industry pain points by achieving millisecond-level precise duration control for AI dubbing [7]
- The model employs a "universal and compatible autoregressive architecture for voice duration control," achieving a duration error rate of 0.02%, and utilizes a two-stage training strategy to decouple emotion and speaker identity [7]
- The system consists of three core modules: T2S (text to semantics), S2M (semantics to mel-spectrogram), and the BigVGANv2 vocoder, allowing for emotional control in a straightforward manner, with significant implications for cross-language industry applications [7]

Group 7
- Meta AI has released the MobileLLM-R1 series of small parameter-efficient models, including sizes of 140M, 360M, and 950M, optimized for mathematics, programming, and scientific questions [8]
- The largest 950M model was pre-trained using approximately 2 trillion high-quality tokens (with a total training volume of less than 5 trillion), achieving performance comparable to or better than the Qwen3 0.6B model trained on 36 trillion tokens [8]
- The model outperforms Olmo 1.24B by five times and SmolLM2 1.7B by two times on the MATH benchmark, demonstrating high token efficiency and cost-effectiveness, setting a new benchmark among fully open-source models [8]

Group 8
- An AI agent named "Gauss" completed a mathematical challenge that took Terence Tao's team 18 months to solve, formalizing the strong prime number theorem (PNT) in Lean in just three weeks (a toy Lean sketch follows after this list) [9]
- Developed by a company founded by Christian Szegedy, an author of the ICML'25 Test of Time Award paper, Gauss generated approximately 25,000 lines of Lean code, including thousands of theorems and definitions [9]
- Gauss can assist top mathematicians in formal verification, breaking through core challenges in complex analysis, with plans to increase the total amount of formalized code by 100 to 1,000 times in the next 12 months [9]

Group 9
- Sequoia Capital USA has interpreted the new AI landscape following the release of GPT-5 by OpenAI, which allows for a more natural interaction resembling conversations with a PhD-level expert, incorporating "thinking" capabilities and a unified model to reduce hallucinations [10][11]
- Other players have also launched strategic new products ahead of the release, including Anthropic's Claude Opus 4.1 targeting high-risk enterprise scenarios and Google's Gemini 2.5 Deep Think and Genie 3 enhancing reasoning and simulation capabilities [10][11]
- The new AI landscape has been reshaped, with OpenAI dominating both open and closed AI ecosystems, Anthropic focusing on enterprise-level precision and stability, and Google emphasizing long-term foundational research [11]

Group 10
- DeepMind's science lead, Pushmeet Kohli, revealed that the team targets three types of problems: transformative challenges, those recognized as unsolvable in 5-10 years, and those that DeepMind is confident it can quickly tackle [12]
- The team has successfully transferred capabilities from specialized models like AlphaProof to the Gemini general model, achieving International Mathematical Olympiad gold medal levels with DeepThink [12]
- The future goal is to create a "scientific API" that allows global scientists to share AI capabilities, lowering research barriers and enabling ordinary individuals to contribute to Nobel-level achievements [12]
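For readers unfamiliar with formalization (see Group 8 above), the sketch below is a purely illustrative toy: a short Lean 4 / Mathlib statement of the classical infinitude of primes, which is far simpler than the strong prime number theorem and is not taken from Gauss's output. It only shows the general shape of the machine-checked theorems Gauss generates at 25,000-line scale.

```lean
-- Toy illustration only: a short formal number-theory statement in Lean 4 with
-- Mathlib. It restates the infinitude of primes, NOT the strong PNT, and is not
-- part of Gauss's output; it simply shows what a machine-checked theorem looks like.
import Mathlib

/-- For every natural number `n` there is a prime `p` with `n ≤ p`. -/
theorem primes_are_unbounded (n : ℕ) : ∃ p, n ≤ p ∧ p.Prime :=
  Nat.exists_infinite_primes n
```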
Tencent Research Institute: Weekly Top 50 AI Keywords
Tencent Research Institute · 2025-09-13 02:33
Fifty keywords each week to keep track of AI developments across the board. Click a keyword to view a summary of the corresponding news.

| Category | Top Keyword | Entity |
| --- | --- | --- |
| Chips | Rubin CPX GPU | NVIDIA |
| Chips | AI5 and AI6 | Tesla |
| Models | Qwen3-Max-Preview | Alibaba |
| Models | rStar2-Agent | Microsoft |
| Models | ERNIE X1.1 (文心大模型X1.1) | Baidu |
| Models | checkpoint-engine | Kimi |
| Applications | Service restrictions | Meta, Anthropic |
| Applications | AI Key | iPhone |
| Applications | Hunyuan Game 2.0 (混元游戏2.0) | Tencent |
| Applications | Robix | ByteDance |
| Applications | AR + AI glasses | Rokid |
| Applications | REFRAG framework | Meta |
| Applications | GPT-5 hackathon | OpenAI |
| Applications | AI film | OpenAI |
| Applications | Large-scale 3D reconstruction | HKUST |
| Applications | Reference-based image generation | Vidu |
| Applications | Qwen3-ASR-Flash | Alibaba |
...
Why has GPT-5 stopped "making things up"? OpenAI's new paper explains it
Tencent Research Institute · 2025-09-12 08:58
Core Viewpoint
- The article discusses the advancements and challenges of OpenAI's GPT-5, particularly focusing on the significant reduction in hallucination rates compared to previous models, while also highlighting the underlying mechanisms and implications of these changes [5][6][25].

Group 1: Hallucination Rates and Mechanisms
- GPT-5 has a hallucination rate that is approximately 45% lower than GPT-4 and about 80% lower than OpenAI's earlier models [6].
- The reduction in hallucination rates is attributed to enhanced reinforcement learning techniques that allow models to refine their reasoning processes and recognize their errors [8][9].
- The paper published by OpenAI indicates that hallucinations are an inevitable byproduct of the statistical learning nature of language models, making it more challenging to generate reliable information than to assess its reliability [12][16].

Group 2: Theoretical Framework
- OpenAI introduces a theoretical "Is-It-Valid" (IIV) judgment mechanism that determines the validity of generated sentences based on their internal probabilities [13].
- The model's tendency to generate plausible-sounding but incorrect information is exacerbated by data sparsity, complexity, and noise in training data [14][16].
- The mathematical conclusion presented in the paper suggests that the error rate of generative models is at least double that of the IIV judgment errors, indicating a compounding effect of judgment mistakes on hallucinations (restated as a formula after this list) [15][16].

Group 3: Post-Training Challenges
- Post-training processes have not effectively mitigated hallucinations, as current evaluation metrics tend to reward models for providing confident but potentially incorrect answers [18][24].
- The article critiques the binary scoring systems used in mainstream AI evaluations, which penalize uncertainty and discourage models from expressing "I don't know" [21][24].
- The reinforcement learning processes that utilize binary reward paths may inadvertently promote overconfidence in models, leading to increased hallucination rates [27][29].

Group 4: Future Directions and Solutions
- The article suggests that introducing a penalty-based scoring mechanism during post-training could help models better calibrate their confidence levels and reduce hallucinations (see the scoring sketch after this list) [33].
- A shift from a score-optimization focus to a truth-oriented approach is proposed as a potential solution to the hallucination problem [34].
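Two of the points above can be made more concrete. First, the Group 2 claim that generation errors compound on top of judgment errors is, informally, a lower bound of the following shape (a loose restatement of the relation described above, dropping the lower-order terms the paper carries):

```latex
% Informal restatement: the generative error rate is bounded below by roughly
% twice the Is-It-Valid (IIV) misclassification rate, up to lower-order terms.
\mathrm{err}_{\mathrm{generate}} \;\gtrsim\; 2 \cdot \mathrm{err}_{\mathrm{IIV}}
```

Second, the Group 4 proposal of penalty-based scoring can be illustrated with a small sketch. The rubric below (+1 for a correct answer, a negative penalty for a wrong one, 0 for abstaining) is an assumed example rather than the paper's exact scheme; it shows why such scoring makes "I don't know" the rational choice below a confidence threshold of penalty / (1 + penalty), whereas binary 0/1 scoring always rewards guessing.

```python
def expected_score(p_correct: float, penalty: float) -> float:
    """Expected score of answering, given the model's own confidence p_correct,
    under an assumed rubric paying +1 for a correct answer, -penalty for a wrong
    one, and 0 for abstaining ("I don't know")."""
    return p_correct - (1.0 - p_correct) * penalty


def should_answer(p_correct: float, penalty: float) -> bool:
    """Answering beats abstaining only when confidence exceeds penalty / (1 + penalty)."""
    return expected_score(p_correct, penalty) > 0.0


if __name__ == "__main__":
    # Under binary 0/1 scoring (penalty = 0) any nonzero confidence makes guessing
    # worthwhile; with penalty = 3 the break-even confidence rises to 0.75, so a
    # 60%-confident guess becomes a losing move and abstention is the calibrated choice.
    for p in (0.6, 0.9):
        print(f"confidence={p}: answer under 0/1 scoring -> {should_answer(p, 0.0)}, "
              f"under penalty=3 -> {should_answer(p, 3.0)}")
```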