Tencent Research Institute
Tencent Research Institute AI Weekly Keywords Top 50
Tencent Research Institute · 2026-02-14 02:33
Group 1: Core Insights
- The article lists the week's top 50 AI keywords, giving an overview of the latest developments in the AI industry [2]
- Key models include Claude Opus 4.6 from Anthropic and GPT-5.3-Codex from OpenAI, both marking advances in model capability [3]
- Featured applications include Seedance 2.0 from ByteDance and WorkBuddy from Tencent, reflecting the growing integration of AI across sectors [3][4]

Group 2: Models
- Claude Opus 4.6 and the Opus 4.6 fast mode are the notable releases from Anthropic, showing its focus on model performance [3]
- OpenAI's GPT-5.3-Codex represents a notable step in generative AI [3]
- Other models, such as Tencent's 2-bit quantized on-device model and Ant Group's Ming-flash-omni 2.0, show how competitive model development has become [3]

Group 3: Applications
- The list includes applications such as DreamZero from NVIDIA and the AI girlfriend Clawra from OpenClaw, indicating diverse use cases [3][4]
- Tools such as CodeBrain-1 from Feeling AI and the deep research agent from Meituan reflect the trend toward specialized AI applications in research and development [4]
- Applications such as FireRed-Image-Edit from Xiaohongshu and the custom agent from Rokid show a focus on user-friendly AI products [4]

Group 4: Technology and Trends
- Technology keywords include AI-drawn brain maps from the University of California and the concept of a robot fighting league [4]
- Elon Musk's remarks on a "robot perpetual motion machine" point to ongoing debate about what AI systems can do [4]
- AI infrastructure spending by major US tech companies highlights the increasing investment in AI infrastructure [4]
Tencent Research Institute 2026 Spring Festival Reading List: 10 Books Worth Reading
Tencent Research Institute · 2026-02-13 07:03
Core Insights
- The article emphasizes the transformative impact of artificial intelligence (AI) on daily life, highlighting its role in creative and decision-making processes while also addressing the existential questions it raises for humanity [1]

Group 1: Book Recommendations
- The article uses an unusual format: AI sketches five contemporary personas and recommends books based on their identities and philosophical cores [2]
- "要有光" (Let There Be Light) by Liang Hong explores the psychological crises faced by young people in contemporary China and stresses the importance of listening to and understanding their struggles [4]
- "文字的力量" (The Written World) by Martin Puchner examines the historical significance of literature and its role in shaping civilization, arguing that literature is a powerful tool for social change [6][7]
- "匠人" (The Craftsman) by Richard Sennett redefines craftsmanship for the digital age, advocating a return to the intrinsic joy of doing work well rather than mere efficiency [9][10]
- "系统之美" (Thinking in Systems) by Donella Meadows introduces systems thinking, emphasizing the interconnectedness of events and the need to understand systemic structures to address complex issues [11][12]
- "即使以最微弱的光" (Even with the Faintest Light) by Choi Eun-young portrays the emotional struggles of East Asian women and their resilience under social pressure [13][15]
- "生命3.0" (Life 3.0) by Max Tegmark discusses the implications of AI for human existence and the ethical considerations of its development [17][18]
- "AI之镜" (The AI Mirror) by Shannon Vallor critiques AI as a reflection of past data and urges a reevaluation of how technology can support human values and growth [19][20]
- "AI文明史·前史" (A History of AI Civilization: Prehistory) by Zhang Xiaoyu offers a philosophical account of AI's evolution and warns of the dangers of unregulated technological advancement [23][24]
- "苏格拉底的方法" (The Socratic Method) by Ward Farnsworth revives the Socratic method as a way to counter cognitive closure in the age of AI [25][26]
- "AI群星闪耀时" (When the Stars of AI Shine) by Liu Zhiyuan's team at Tsinghua University gives a historical perspective on AI, emphasizing the philosophical and humanistic aspects of its development [27][28]
Tencent Research Institute AI Express 20260213
Tencent Research Institute · 2026-02-12 16:13
Group 1
- Zhipu released the open-source GLM-5 model, scaled up to 744 billion parameters (40 billion activated). It ranks fourth globally and first among open-source models on the Artificial Analysis leaderboard, with coding and agent capabilities approaching Claude Opus 4.5 [1]
- The model scored 77.8 on SWE-bench-Verified and 56.2 on Terminal Bench 2.0, both new open-source SOTA results, and excels at complex systems engineering and long-horizon agent tasks [1]
- GLM-5 has been adapted to domestic chips such as Huawei Ascend, Cambricon, and Kunlun, and arrives alongside the Z Code end-to-end programming tool and the AutoGLM general-purpose agent assistant [1]

Group 2
- MiniMax launched the M2.5 model with only 10 billion activated parameters, reaching flagship-level performance with inference three times faster than Opus [2]
- The model built a full-stack learning website in 9 minutes and can independently carry out physics simulations and set up enterprise-grade CMS systems, supporting cross-platform development for PC, App, and React Native [2]
- It uses a native agent RL training framework and the CISPO algorithm, achieving roughly a 40-fold training speedup, and is compatible with mainstream development tools such as Claude Code and OpenClaw [2]

Group 3
- Xiaohongshu's foundation model team released the open-source FireRed-Image-Edit, which reaches SOTA on several authoritative benchmarks such as ImgEdit and GEdit; the code and technical report are now available [3]
- The model uses a three-stage training process and introduces a Layout-Aware OCR-based Reward, which significantly improves text-editing accuracy and style retention [3]
- It supports complex editing scenarios including consistent instruction following, text editing, style transfer, multi-image fusion, and old photo restoration; the model weights are set to be open-sourced [3]

Group 4
- Xiaomi released the open-source VLA model Xiaomi-Robotics-0 with 4.7 billion parameters. It combines visual-language understanding with real-time execution and achieved the best results against 30 models on benchmarks including LIBERO, CALVIN, and SimplerEnv [4]
- The model uses a Mixture-of-Transformers architecture in which a VLM "brain" interprets instructions and a Diffusion Transformer generates smooth, high-frequency actions [4]
- It addresses discontinuous actions through asynchronous inference and Λ-shape attention masks, runs real-time inference on consumer-grade graphics cards, and has been open-sourced on GitHub and HuggingFace [4]

Group 5
- Gaode (Amap) launched the ABot series of embodied base models, with ABot-M0 handling manipulation and ABot-N0 handling navigation, achieving SOTA across 10 authoritative global evaluations [5][6]
- ABot-M0 unifies 6 million cross-platform trajectories through an action language and proposes an action manifold learning algorithm, reaching an 80.5% success rate on Libero-Plus and surpassing pi0 by nearly 30% [6]
- ABot-N0 unifies five core navigation tasks in a single VLA architecture, built on 8,000 high-fidelity 3D scenes and 17 million expert examples, with a 40.5% improvement in SocNav success rate [6]

Group 6
- Rokid Glasses launched a "custom agent" feature on the Lingzhu platform, allowing integration with OpenClaw or privately deployed models such as DeepSeek R1 and Qwen3 through a standard SSE interface [7]
- Users can keep private data processing in a local closed loop and switch the underlying model with one click, using the ClawHub skill ecosystem for capabilities such as file system access, browsing, and IM messaging [7]
- Users can summon their private agent by voice command or shortcut, giving them an assistant that is available around the clock [7]

Group 7
- Google DeepMind released Aletheia, an AI mathematician based on Gemini Deep Think. It scored 91.9% on IMO-ProofBench, a new SOTA, and can independently write and publish academic papers [8]
- Aletheia systematically evaluated 700 open problems in the Erdős problems database and autonomously solved 4 of them, while showing self-correction and acknowledging its own limitations [8]
- Gemini Deep Think worked with experts on 18 long-stalled research problems and resolved a decade-old conjecture in submodular optimization; one paper has been accepted by ICLR 2026 [8]

Group 8
- HyperWrite's CEO published an article that drew 70 million views, arguing that the release of GPT-5.3-Codex and Claude Opus 4.6 marks a qualitative change in AI [9]
- According to the article, AI can now independently complete work that takes a human expert 5 hours, and this capability doubles every 4-7 months; GPT-5.3 played a key role in its own training, starting a recursive self-improvement cycle [9]
- Almost all cognitive work done in front of a screen will be affected. The article advises spending one hour a day experimenting with AI, because the current window of opportunity will not last long [9]

Group 9
- Anthropic released a 53-page report warning that the risks associated with Claude Opus 4.6 are approaching ASL-4 levels. It outlines 8 potential risk pathways that could lead to catastrophic harm, including autonomous escape and autonomous operation [10][11]
- The report concludes that current models do not exhibit "sustained consistent malicious intent" and that the risk of catastrophic damage is "very low but not zero", placing capability assessment in a "gray area" [10]
- The head of Anthropic's safety research team resigned, stating that "the world is in crisis", and an xAI co-founder predicts that recursive self-improvement cycles may begin within 12 months [11]
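To make the doubling claim in Group 8 concrete, here is a minimal Python sketch. It assumes steady exponential growth from the 5-hour figure and uses the 4-month and 7-month doubling periods quoted in the summary; the function name `projected_task_hours` and the 12-month horizon are illustrative choices, not taken from the source article.

```python
def projected_task_hours(current_hours: float, months_ahead: float,
                         doubling_months: float) -> float:
    """Project the task horizon under an assumed constant doubling period."""
    if doubling_months <= 0:
        raise ValueError("doubling_months must be positive")
    # Exponential growth: the horizon doubles once per doubling period.
    return current_hours * 2 ** (months_ahead / doubling_months)


if __name__ == "__main__":
    # 5 hours today, projected 12 months out for both ends of the quoted range.
    for doubling in (4, 7):
        hours = projected_task_hours(5.0, 12, doubling)
        print(f"doubling every {doubling} months: {hours:.1f} hours after 12 months")
```

Under these assumptions the 12-month projection ranges from roughly 16 hours (7-month doubling) to 40 hours (4-month doubling), which shows how sensitive the forecast is to the doubling period.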
We Are Entering the Era of "Silicon-Based" Social Networking
Tencent Research Institute · 2026-02-12 09:13
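Zhang Wenlong, contributing author, Tencent Research Institute

The history of human social networks may begin to change from here.

From "connective social networking" to "generative social networking"

Socializing has never been a term exclusive to the internet age. Ever since the human species walked out of the African savanna, socializing has been the basic element that links individuals, sustains groups, and passes on experience. As Robin Dunbar put it, the human brain evolved largely to handle complex social relationships.

On January 28, 2026, programmer Peter Steinberger developed Clawdbot (i.e., OpenClaw) and launched the agent social platform Moltbook. On this platform, agents can discuss and post freely, while humans, their creators, can only watch from the sidelines.

This was only the beginning. Within just a few days, more than a million AI agents poured into Moltbook and, without human intervention, spontaneously evolved the rudiments of complex social structures, including religious worship, class stratification, and even encrypted communication.

Elys, an AI social app that has gone viral in recent days, is also showing a new, AI-led form of social network: human users create "avatars", and these avatars (AIs) post content, like, comment, and chat with one another.

The long-discussed idea of "AI social networking" has entered a new stage of development. As for why: before AI agents truly operated around the clock, AI agents, in relation to ...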
Tencent Research Institute AI Express 20260212
Tencent Research Institute · 2026-02-11 16:08
Group 1: Google Chrome and WebMCP Protocol
- The Google Chrome team has released WebMCP (Web Model Context Protocol), which lets AI agents interact directly with a website's core functions through the navigator.modelContext API, bypassing the interface built for human users [1]
- WebMCP addresses the high cost and low stability of traditional agents that recognize screenshots, marking a shift from "visual simulation" to "direct logical connection", described as "API in UI" [1]
- Google and Microsoft are jointly promoting the standard. The internet may in future split into a UI layer for humans and a tool layer for agents, heralding an "Agentic UI" era [1]

Group 2: Runway's Financing and Model Development
- Video generation unicorn Runway has raised $315 million in Series E funding at a valuation of $5.3 billion, with participation from Nvidia, AMD, and Adobe, bringing total funding to $815 million [2]
- Runway's Gen-4.5 ranks third on the AI-generated video leaderboard, ahead of models such as Google Veo 3 and OpenAI Sora 2 Pro [2]
- The new funding will be used to train the next generation of world models. Runway has already launched the general world model GWM-1, with variants for explorable environments, conversational characters, and robot manipulation [2]

Group 3: xAI Leadership Changes
- xAI co-founders Jimmy Ba and Yuhuai Wu announced their departures within 48 hours of each other. 6 of the 12 founding team members have now left, 5 of them in the past year [3]
- The departing co-founders' responsibilities have been redistributed among the remaining co-founders. SpaceX's acquisition of xAI has been completed, and an IPO plan is set to advance in the coming months [3]
- xAI's flagship product Grok has recently exhibited strange behavior, and the loss of talent poses challenges for the upcoming IPO [3]

Group 4: DeepSeek's New Model
- DeepSeek has quietly launched a new model with a 1 million token context window and a knowledge cutoff of May 2025, able to process content equivalent to the entire "Three-Body Problem" trilogy [4]
- It remains a text-only model: it cannot view images directly but can read text from images and documents, and it has stronger Agentic Coding capabilities [4]
- The industry is shifting from LLM reasoning to agentic reasoning, as the latest models from Anthropic and OpenAI indicate, with humans acting as architects who direct AI teams in software development [4]

Group 5: Zhipu's GLM-5 Model
- Zhipu has confirmed that the mysterious model "Pony Alpha", which topped the OpenRouter popularity chart, is its new model GLM-5, with state-of-the-art coding and agent capabilities [5]
- In real programming scenarios GLM-5 comes close to Claude Opus 4.5, excelling at complex systems engineering and long-horizon agent tasks with high tool-calling accuracy [5]

Group 6: Ant Group's Omni Model
- Ant Group has open-sourced the omni-modal model Ming-flash-omni 2.0, the first in the industry to generate speech, environmental sound effects, and music together on the same audio track [7]
- The model excels at visual-language understanding, controllable speech generation, and image editing, surpassing Gemini 2.5 Pro and Qwen3-Omni-30B-A3B-Instruct [7]
- It uses a unified architecture for deep multimodal integration, supports zero-shot voice cloning and fine-grained attribute control, and has been open-sourced on platforms such as HuggingFace [7]

Group 7: iFlytek's Spark X2 Model
- iFlytek has released the Spark X2 model, trained entirely on domestic computing power, with overall capabilities on par with leading international models, particularly in mathematics, reasoning, and agent tasks [8]
- Spark X2 uses a 293-billion-parameter sparse MoE architecture and improves inference performance by 50% over X1.5. It continues to strengthen its coverage of more than 130 languages and leads the industry in key languages for Latin America and ASEAN [8]
- Industry applications have been significantly upgraded: its medical capabilities have passed authoritative evaluations, and its education applications deliver personalized learning through error analysis [8]

Group 8: Meituan's LongCat Research Agent
- Meituan's LongCat has launched a "deep research" feature that scores 73.1 on the BrowseComp evaluation, close to top closed-source models, and supports up to 400 interactions and a 256K context [9]
- Drawing on Meituan's own local-services capabilities, it trains in a real environment and uses a Rubrics-as-Reward mechanism to curb hallucination, so that every recommendation is verifiable [9]
- The model divides work among specialized agents and automates the whole process from information gathering to analysis and visualization, producing professional reports for tasks such as restaurant recommendations and travel planning [9]

Group 9: ByteDance's Protenix-v1 Model
- ByteDance's Seed team has released Protenix-v1, an open-source model that matches AlphaFold 3 under strict constraints on training data and model size [10]
- The model scales at inference time: the prediction success rate for antibody-antigen complexes rises from 36% with a single seed to 47.68% with 80 seeds [10]
- The team has adopted a dual-version strategy: the standard version aligns with academic benchmarks, while the extended version uses data up to June 2025 for practical drug discovery. It also released the PXMeter evaluation toolkit [10]
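The inference-time scaling figures reported for Protenix-v1 (36% with one seed, 47.68% with 80 seeds) can be put in perspective with a small Python sketch. It computes the absolute and relative gain and compares them with an idealized bound that assumes every seed is an independent attempt and the best result is always picked. That independence assumption is mine and is not claimed by the source, and the helper names are hypothetical.

```python
def gains(before: float, after: float) -> tuple[float, float]:
    """Return (absolute gain in percentage points, relative gain in percent)."""
    return (after - before) * 100, (after / before - 1.0) * 100


def independent_upper_bound(p_single: float, n_seeds: int) -> float:
    """Chance that at least one of n independent attempts succeeds.

    Idealized assumption: seeds are independent and a perfect selector
    always picks a successful prediction when one exists.
    """
    return 1.0 - (1.0 - p_single) ** n_seeds


if __name__ == "__main__":
    abs_gain, rel_gain = gains(0.36, 0.4768)
    print(f"absolute gain: {abs_gain:.2f} percentage points")
    print(f"relative gain: {rel_gain:.1f}%")
    print(f"idealized bound at 80 seeds: {independent_upper_bound(0.36, 80):.4f}")
```

The reported gain is far below the idealized bound, which is what one would expect if predictions from different seeds are strongly correlated or if ranking the candidates is imperfect. The summary itself does not say which factor dominates.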
How Far Did Micro-Short Dramas' Shift Toward Premium Quality Get in 2025?
Tencent Research Institute · 2026-02-11 08:57
Core Insights
- The micro-short drama market expanded from 3.68 billion yuan in 2021 to 50.44 billion yuan in 2024, a more than 13-fold increase [2]
- By June 2025 the user base for micro-short dramas had reached 696 million, nearly 70% of internet users [4]

Group 1: Quality and Content Evolution
- Production quality has moved toward a more "cinematic" standard, away from the early reliance on rapid production and traffic-driven models [4][5]
- The average production cost has risen from 200,000-300,000 yuan to 400,000-700,000 yuan, with top productions reaching investments in the millions [5]
- Well-known directors and actors have entered the micro-short drama space, raising the artistic quality and depth of the content [5]

Group 2: Thematic and Narrative Development
- Micro-short dramas increasingly focus on mainstream themes, with regulators encouraging realism and relatable storytelling [6]
- Successful examples include "Home and Away", which combines historical context with personal stories, and "Parrot", which addresses themes of loyalty and national unity [6]

Group 3: Market Diversification and Innovation
- Formats have diversified to include both horizontal and vertical screen dramas as well as animation [7]
- AI is being used in content creation, and AI-generated short dramas are growing at a compound annual rate of over 80% [12]

Group 4: Platform Strategies and Business Models
- Video platforms are adopting differentiated strategies to promote quality, sharing a consensus that "quality is the lifeline" [9]
- The commercial model is evolving, with platforms introducing revenue-sharing mechanisms to reward high-quality production [11]
- The rise of free short dramas is shifting the focus from viewership alone to creative content, fostering a healthier industry ecosystem [11]

Group 5: Future Opportunities and Challenges
- Integrating micro-short dramas with sectors such as tourism and sports is seen as a potential growth area for service consumption [16]
- The industry is encouraged to keep exploring AI in content creation while balancing efficiency and quality [17]
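The market-size figures in the Core Insights can be checked with a short calculation. The Python sketch below computes the growth multiple behind the "more than 13-fold" statement and the compound annual growth rate it implies over the three years from 2021 to 2024. The function names are illustrative, and the CAGR is my own derived figure rather than one stated in the article.

```python
def growth_multiple(start: float, end: float) -> float:
    """Ratio of the final value to the initial value."""
    return end / start


def cagr(start: float, end: float, years: float) -> float:
    """Compound annual growth rate over the given number of years."""
    return (end / start) ** (1.0 / years) - 1.0


if __name__ == "__main__":
    start_2021 = 3.68   # billion yuan, from the summary
    end_2024 = 50.44    # billion yuan, from the summary
    print(f"growth multiple: {growth_multiple(start_2021, end_2024):.2f}x")
    print(f"implied CAGR 2021-2024: {cagr(start_2021, end_2024, 3):.1%}")
```

The multiple comes out at about 13.7, consistent with the summary, and corresponds to the market more than doubling each year on average.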
Tencent Research Institute AI Express 20260211
Tencent Research Institute · 2026-02-10 16:11
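Generative AI

I. ChatGPT officially tests ads; OpenAI promises not to interfere with answers
1. OpenAI has begun testing ads in ChatGPT in the United States, shown to free users and to subscribers of the $8-per-month Go plan; higher tiers such as Pro and Business show no ads;
2. Ads are labeled "Sponsored" and matched to the conversation topic and history; users' chat content is not shared with advertisers, who receive only aggregate performance data;
3. OpenAI promises that ads will not influence how answers are generated and that users can manage their own ad settings; the goal is to fund the free service through advertising and broaden access to AI.
https://mp.weixin.qq.com/s/UmNZi0fUXYh-dbnPWuBuEQ

II. Tencent Hunyuan open-sources the first industrial-grade 2-bit quantized on-device model, with only 0.3B parameters
1. Tencent Hunyuan released the HY-1.8B-2Bit model, which uses 2-bit quantization-aware training to reach an equivalent parameter count of only 0.3B and a memory footprint of only 600MB, the first industrial-grade 2-bit on-device model put into practice;
2. Compared with the original-precision model, the equivalent parameter count is 6 times smaller and generation on real on-device hardware is 2 to 3 times faster, while the model keeps the full thinking capability and can switch between concise and detailed chains of thought;
3. The model has been adapted to computing platforms such as Arm; the team plans to use reinforcement learning and model distillation to further narrow the low-bit quantized model to full-precision model ...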
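As a rough sanity check on the 2-bit figures in the Tencent Hunyuan item, the Python sketch below estimates weight storage from parameter count and bit width. The 1.8B parameter count is read from the model name HY-1.8B-2Bit, and the 16-bit baseline is my assumption; the source does not state the original precision or how the 600MB footprint is made up.

```python
def weight_bytes(n_params: float, bits_per_weight: float) -> float:
    """Bytes needed to store n_params weights at the given bit width."""
    return n_params * bits_per_weight / 8


def size_ratio(baseline_bits: float, quantized_bits: float) -> float:
    """How many times smaller the quantized weights are than the baseline."""
    return baseline_bits / quantized_bits


if __name__ == "__main__":
    params = 1.8e9  # read from the model name HY-1.8B-2Bit
    two_bit_mb = weight_bytes(params, 2) / 1e6
    fp16_mb = weight_bytes(params, 16) / 1e6  # assumed 16-bit baseline
    print(f"2-bit weights:  {two_bit_mb:.0f} MB")
    print(f"16-bit weights: {fp16_mb:.0f} MB")
    print(f"ideal ratio:    {size_ratio(16, 2):.0f}x")
```

Pure 2-bit weights would take about 450 MB, below the reported 600MB, and the ideal 16-to-2-bit ratio is 8x rather than the reported 6x. One possible reading is that some tensors stay at higher precision or that quantization metadata adds overhead, but that is a guess and not something the summary confirms.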
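How Can Technological Innovation Help Pass On and Develop China's Fine Traditional Culture? | Lessons from the "Tencent Tanyuan Program 2024"
Tencent Research Institute · 2026-02-10 09:03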
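The following article is from 可持续发展经济导刊, by Sun Yi and Wang Zhaoyang.

The Tanyuan Program (探元计划) takes "revitalizing Chinese culture through digital intelligence" and "exchange and mutual learning among global civilizations" as its core vision and is driven by both technological innovation and model innovation. On one hand, it closely tracks new-generation information technologies, represented by generative AI, and frontier cross-disciplinary technologies, aiming to solve deep-seated problems in protecting and passing on cultural heritage. On the other, it goes beyond the traditional project-based approach and works to build an innovation ecosystem of "co-construction and sharing, co-creation and mutual benefit" that connects museums and cultural institutions, universities and research institutes, technology companies, and social capital, cultivating an internal drive for social value and sustainable development.

2. A method for cultural-technology innovation: the Tanyuan "five-step method"

Step 1: Focus on real problems. The Tanyuan Program goes to the front line of the museum sector and works with heritage-site managers and researchers to distill real problems that are representative of the industry, technically challenging, and urgent in value, so that technology R&D is well targeted.

Step 2: Seek real solutions. It issues an open call to the whole of society for cutting-edge, applicable technologies and, through an independent, interdisciplinary expert review panel, matches problems precisely with promising technical teams.

Step 3: Co-create real plans. The selected "technology-scenario" pairings go through co-creation incubation. Through formats such as co-creation camps, technology providers are helped to understand museum-sector needs in depth and to turn technology applications into deployable, scalable solutions that meet front-line industry needs ...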
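A Personal Survival Guide to the Technological Future | With E-Book Download of "2026 Frontier Technology Trends"
Tencent Research Institute · 2026-02-10 09:03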
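Liu Moxian, senior researcher, Tencent Research Institute

The e-book "2026 Frontier Technology Trends" is here. Scan the QR code at the end of the article to get everyone's survival guide to the technological future!

Greater mobility on land, at sea, and in the air, but safety comes first

Humans have never given up on extending the body's abilities through bionic means, so that people can cross mountains like goats, soar through the sky like birds, and swim through rivers and seas like fish. Looking toward 2030, advances in and applications of exoskeletons, aircraft, and diving equipment are only the beginning. Building a safe, fair, and sustainable service ecosystem suited to these new applications will require the shared wisdom of engineers, regulators, and the public.

Using technology to protect quality of life and dignity

As gene therapy and AI are put into practice and mature in extending healthy lifespan, we will gradually move from passive medicine that is "resigned to fate" to active evolution that "takes control of life". In the decade beginning in 2030, giving an 80-year-old the physique and vitality of a 60-year-old should no longer be a science-fiction idea but a reality within reach. This is what "Vitality 2030" means: not pursuing an endlessly extended number, but protecting the quality of life and dignity behind every number.

Doubled intelligence and projecting the future

When AI becomes humanity's external brain, a doubling of human intelligence will not be far off. After that, people may explore collaborating with AI through brain-computer interfaces: you compose the rough content of an email in your mind, and an AI agent "reads" your intent through the brain-computer interface and automatically ...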
Tencent Research Institute AI Express 20260210
Tencent Research Institute · 2026-02-09 16:03
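Generative AI

I. Hands-on with the mysterious model Pony Alpha: Opus-level intelligence, an architect's mindset
1. Pony Alpha went viral on OpenRouter with no launch event and no paper, yet its strong programming ability set off discussion among developers; one person coded with it for 3 hours straight and produced a playable Pokemon Ruby;
2. It performed impressively in hands-on tests: it can recreate "Stardew Valley" from scratch, independently handling the whole process from requirements analysis and architecture design to feature implementation, showing system-level engineering understanding and long-duration reasoning;
3. The model's origin is a mystery, with guesses including Anthropic Sonnet 5, DeepSeek-V4, and Zhipu GLM-5; if it comes from a Chinese vendor, Chinese models have entered a new stage in high-end programming.
https://mp.weixin.qq.com/s/vPp0aFcc1QJZ2l0D4qFH8A

II. Xiaohongshu internally tests the conversation-driven AI video editing app OpenStoryline
1. Xiaohongshu is developing the AI video editing app OpenStoryline, which uses a "non-linear editing + conversation-driven" mode: users upload images and complete a video edit through natural language;
2. Technically, it uses the open-source DeepSeek and Qwen 3 models, combined with Xiaohongshu's own dots.lm text LLM and FireRedASR audio model for ecosystem fit;
3. Xiaohongshu has recently ...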