Tencent Research Institute (腾讯研究院)

Game Music Is Moving to Center Stage | 浪潮论坛 (Wave Forum) Cross-Disciplinary Dialogue
Tencent Research Institute · 2025-07-03 09:49
Core Viewpoint
- Game music accounts for less than 5% of production budgets yet carries 30% of the narrative function. It is gaining attention from the mainstream music industry, as shown by the Grammy Awards introducing a dedicated video game score category in 2023 [1][2][3]

Group 1: Development and Evolution of Game Music
- The development of game music is closely tied to technological advances: early sound-quality limitations eased significantly after CD media arrived around 1994, allowing for richer audio experiences [4][5]
- Despite its growth, game music remains somewhat marginalized within the broader music discourse, yet its impact on players' mental engagement is profound, suggesting it should occupy a more central role [5][6]

Group 2: Industry Insights and Changes
- The Chinese game music industry is evolving, with aspirations to "catch up" to more developed markets, as exemplified by projects like "Black Myth: Wukong," which aims to involve musicians more deeply in the creative process [6][11]
- The number of professionals in the game music sector has grown from a handful to potentially over a thousand, indicating significant growth in the industry [11][12]

Group 3: Creative Collaboration and Challenges
- Successful game music creation requires close collaboration between music producers and game developers, and building personal relationships is important for creative synergy [29][30]
- The dynamic nature of game music allows it to serve both as standalone work and as an integral component of the gaming experience, which is part of its unique appeal [25][26]

Group 4: Cultural and Artistic Expression
- Game music is inclusive of many musical styles, allowing composers to explore and integrate diverse influences, which can deepen the emotional connection players have with games [18][20]
- The industry is moving toward a more collaborative model in which musicians are encouraged to participate actively in the creative process rather than merely serving as external contributors [16][30]

Group 5: Future Directions and Opportunities
- There is growing recognition of the need to avoid over-labeling game music, since labels can create psychological barriers for artists and limit their willingness to engage with the medium [64][65]
- Game music has significant potential to enhance the value of game IPs, with high-quality compositions contributing to broader marketing and cultural outreach efforts [61][62]
Tencent Research Institute AI Express 20250703
Tencent Research Institute · 2025-07-02 15:52
Group 1
- Anysphere, the developer of Cursor, has hired away two key figures behind Claude Code, Boris Cherny and Cat Wu, despite the two companies' close partnership [1]
- Anthropic's annualized revenue has reached $4 billion at a valuation of $61.5 billion, and its Claude model is regarded as the best programming model [1]
- Anysphere's revenue doubled within three months to an annualized $500 million, at a valuation of $9.9 billion, intensifying competition in the AI programming market [1]

Group 2
- Zhipu has released the open-source GLM-4.1V-Thinking visual reasoning model, which outperforms a 72B model with 8x its parameter count on 18 authoritative evaluations [2]
- The architecture integrates a ViT visual encoder, an MLP adapter, and a GLM language decoder, with 2D-RoPE and 3D-RoPE positional encodings to enhance processing capabilities [2]
- Training consists of four stages: multi-modal pre-training, long-context continued training, supervised fine-tuning, and reinforcement learning with curriculum sampling, which significantly improves logical reasoning [2]

Group 3
- Sakana AI has introduced the Adaptive Branching Monte Carlo Tree Search (AB-MCTS) algorithm, which strengthens large-model reasoning through flexible search in two directions, depth and breadth [3]
- The Multi-LLM AB-MCTS system lets several frontier models (Gemini 2.5 Pro, o4-mini, DeepSeek-R1-0528) collaborate, achieving a 30% performance improvement on the ARC-AGI-2 benchmark [3]
- The algorithm dynamically selects the best model for each problem, so that collective intelligence can exceed the limits of individual models; the underlying framework, TreeQuest, has been open-sourced [3]

Group 4
- HeyGen has launched a "product placement" feature that generates realistic promotional videos from an uploaded character avatar and product images; a video of Elon Musk promoting Labubu is a notable example [4]
- Founded by two alumni of Tongji University, HeyGen is valued at $500 million, with annual revenue nearing $80 million and expected to surpass $100 million [5]
- Compared with competitors such as Topview, HeyGen excels in the naturalness of model expressions and in lip-sync accuracy, and offers unlimited short-video production for $29 per month [5]

Group 5
- Baidu has made its most significant overhaul in nearly a decade, upgrading its search box to an AI smart box that supports ultra-long text while retaining the traditional search mode [6]
- The new "Bai Kan" feature changes how search results are displayed, prioritizing the most useful rich-media content such as video explanations and intelligent summaries [6]
- Search has evolved from simple information retrieval to task delivery: users can obtain ratings, locations, and travel plans directly, and can even book a taxi or buy a package with one click [6]

Group 6
- Microsoft has released the MAI-DxO medical AI system, which reaches 85.5% accuracy, about four times the accuracy of professional doctors with 10 years of experience [7]
- MAI-DxO simulates a real medical team's sequential diagnostic process through collaboration among five virtual doctor roles [7]
- The system offers five diagnostic modes for different scenarios and comes with a sequential diagnosis benchmark, SDBench, featuring 304 challenging diagnostic cases [7]

Group 7
- Baidu has launched its self-developed multi-modal generative model MuseSteamer and the "Hui Xiang" platform, supporting high-quality video generation at 720p to 1080p and setting a new record on the VBench-I2V video generation leaderboard [8]
- The model comes in four versions: Lite (720p, fast), Turbo (720p, excellent character motion), Pro (1080p, cinematic quality), and Voice (automatically generates sound effects and dialogue), catering to different creative needs [8]
- Key technical highlights include precise understanding of Chinese semantics, a structured video description language, cinematic motion aesthetics, and integrated audio-video generation; the model is already used in advertising creative work and short-drama production [8]

Group 8
- Cloudflare has introduced the experimental "Pay Per Crawl" feature, which lets websites allow, charge, or block AI crawlers, giving content creators bargaining power over their content [10]
- The data show a wide gap between AI crawlers and traditional search engines: Google returns one click for every 6-7 crawls, while OpenAI needs 1,500 crawls and Anthropic 73,300 crawls per click, disrupting the existing ecosystem balance [10]
- The feature implements fee control through the HTTP 402 status code and digital-signature authentication. It is currently in beta and could shift monetization for internet content creators from "advertising monetization" to "content licensing monetization" [10]

Group 9
- Chai Discovery, backed by OpenAI, has launched the Chai-2 multi-modal generative model, achieving a 16% hit rate in de novo antibody design, more than 100 times better than the previous SOTA [11]
- Chai-2 found effective antibodies for 26 of 52 test targets (50%) within a single 24-well plate (≤20 designs) and can generate various formats, including scFv antibodies, VHH domains, and mini-binding sites [11]
- The model uses a controllable, model-driven framework that shortens the development cycle from months to two weeks, and it achieved a 68% success rate in wet-lab experiments for micro-protein design, potentially unlocking drug development capabilities beyond traditional technologies [11]

Group 10
- The New Yorker argues that AI teaches humans to write "good" articles but causes truly good articles to disappear [12]
- The article points out that AI is reconstructing culture with the logic of the "average," leading to standardization and a loss of uniqueness in writing; MIT experiments showed significantly reduced brain activity among students who used ChatGPT for writing [12]
- Research indicates that AI leads to cultural homogenization: Cornell University experiments confirmed that the AI-assisted writing of users from India and the US converges toward a "Western paradigm," with common references to pizza and Christmas [12]
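
As a rough illustration of the allow / charge / block logic described for Pay Per Crawl in Group 8 above, the following minimal sketch shows how a site might choose a response status for an AI crawler. It is written under stated assumptions and is not Cloudflare's implementation: `CrawlPolicy`, `respond_to_crawler`, and the `offered_max_price` parameter are hypothetical names, the use of 403 for blocked crawlers is an assumption, and the digital-signature authentication step mentioned in the digest is omitted. The only element taken from the source is the use of HTTP 402 when payment is required.

```python
from dataclasses import dataclass
from http import HTTPStatus
from typing import Optional


@dataclass
class CrawlPolicy:
    """Hypothetical per-site policy; the field names are illustrative only."""
    action: str              # "allow", "charge", or "block"
    price_usd: float = 0.0   # price per crawl, used only when action == "charge"


def respond_to_crawler(policy: CrawlPolicy,
                       offered_max_price: Optional[float]) -> HTTPStatus:
    """Return the status a site might send to an already-authenticated AI crawler."""
    if policy.action == "block":
        return HTTPStatus.FORBIDDEN          # assumption: blocked crawlers get 403
    if policy.action == "allow":
        return HTTPStatus.OK
    # "charge": serve content only if the crawler agreed to pay at least the price.
    if offered_max_price is not None and offered_max_price >= policy.price_usd:
        return HTTPStatus.OK
    return HTTPStatus.PAYMENT_REQUIRED       # HTTP 402, as named in the digest


# Crawl-to-click ratios quoted in the digest (Google's "6-7" is taken as 6.5 here).
crawls_per_click = {"Google": 6.5, "OpenAI": 1500, "Anthropic": 73300}
relative_to_google = {
    name: ratio / crawls_per_click["Google"]
    for name, ratio in crawls_per_click.items()
}
```

The `relative_to_google` dictionary simply restates the quoted ratios as multiples of Google's figure, which makes the imbalance the digest describes easier to compare.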
The New Yorker's Latest: AI Teaches Humans to Write "Good" Articles, but Makes Truly Good Articles Disappear
Tencent Research Institute · 2025-07-02 09:01
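Compiled by Wuji and Helen, special contributors to Tencent Technology. This article is reprinted from "Tencent Technology" (腾讯科技).

The New Yorker recently argued that AI is not only changing how we write but is also quietly reshaping the structure of our thinking: sacrificing originality in the name of "efficiency," and homogenizing the style and content of expression in the name of "intelligence."

As we rely more and more often on AI tools such as ChatGPT for all kinds of creative tasks, are we losing the diversity, depth, and desire for expression that belong to humans?

AI is reconstructing culture with the logic of the "average": language models trained on massive data are naturally inclined to repeat, imitate, and compress rather than to question, subvert, and invent. What they deliver is not sparks of thought but acceptable output that "looks okay": safe, standardized expression with the edges sanded off. This automatically generated mediocrity is both comfortable and dangerous, because it lowers the threshold for originality and also lowers our expectations of it.

When everyone can write a "decent" article, truly good articles become hard to produce. This AI-driven "mediocrity revolution" calls for rational reflection more than for enthusiasm about the technology.

The full article follows:

Last year, MIT ran an experiment that recruited more than 50 students from several universities in the Boston area, divided them into three groups, and asked them to write an argumentative essay based on an SAT writing prompt: "Must our achievements benefit others in order to make us truly happy?"

The first group could rely only on their own brains; the second group could use Google Search ...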
Tencent Research Institute AI Express 20250702
Tencent Research Institute · 2025-07-01 16:38
Group 1: Chinese Chip Industry
- Domestic chip companies are racing to go public: nearly 10 firms, including Moore Threads and Muxi, have entered the IPO process, showing revenue growth but continued losses [1]
- The Chinese AI chip market is projected to reach 350 billion RMB, which could in theory accommodate 35 GPU companies with annual revenues of 10 billion RMB each, but limited production capacity is a common challenge for the industry [1]
- Domestic GPU manufacturers face limited foundry capacity and an underdeveloped ecosystem, so they need to differentiate in B-end AI applications or C-end graphics [1]

Group 2: Meta's AI Initiatives
- Meta has established the "Super Intelligence Lab" (MSL) to bring together foundational AI research, large language model development, and AI product teams, led by newly appointed Chief AI Officer Alexandr Wang [2]
- The lab has recruited 11 top AI researchers from OpenAI, Anthropic, and Google, more than half of them Chinese, including core contributors to GPT-4o and Gemini [2]
- Meta plans to invest tens of billions of dollars in AI infrastructure, model training, and talent acquisition over the next few years, aiming to launch a next-generation model that surpasses the Llama series within a year [2]

Group 3: Microsoft's GitHub Copilot Chat
- Microsoft has open-sourced GitHub Copilot Chat, which features powerful AI-agent programming automation; CEO Satya Nadella made the announcement [3]
- Key features include an agent programming mode, human-machine collaboration, code completion, natural-language interaction, and intelligent custom operations; it can execute multi-step coding tasks and handle errors automatically [3]
- It supports the Model Context Protocol (MCP) for third-party integration while letting users keep control over the AI agent, and it quickly gained 1,200 stars on GitHub after release [3]

Group 4: AI Assistant Upgrades
- Tencent's AI assistant, Yuanbao, has launched an upgrade that produces document summaries with visual elements, extracting key information and intelligently matching it with the original images [4][5]
- The feature is based on the DeepSeek model and applies to many scenarios, including industry reports, foreign-language materials, public-account articles, and installation manuals [5]
- Usage is straightforward: users switch to the DeepSeek model, upload a file or paste a link, and the system automatically generates an illustrated summary, with one-click export to Tencent Docs [5]

Group 5: AI Achievements at Shanghai Jiao Tong University
- The AI team at Shanghai Jiao Tong University has developed an agent, ML-Master, that achieves a 29.3% medal rate, topping OpenAI's MLE-bench ahead of Microsoft and OpenAI and reaching Kaggle Master level [6]
- The innovation is a mechanism for deep integration of exploration and reasoning, using multi-trajectory exploration, controllable reasoning, and adaptive memory to address core AI4AI challenges [6]
- The agent made effective submissions on 93.3% of 75 real machine learning tasks, doubled computational efficiency, and led across all difficulty levels [6]

Group 6: Huawei's Open Source Project
- Huawei has launched the Omni-Infer open-source project, an "inference framework + acceleration suite" that is compatible with mainstream frameworks such as vLLM and supports Ascend hardware platforms [7]
- The framework features an xPyD scheduling system, a load balancer, MoE model optimization, intelligent resource allocation, and enhanced attention mechanisms, enabling PD (prefill/decode) separated deployment and system-level QPM optimization [7]
- Several institutions, including Beijing Zhiyuan Research Institute and Shanghai AI Laboratory, have joined the collaboration, and the project uses an open community governance model for transparent decision-making [7]

Group 7: Amazon's AI Strategy
- AWS CEO Matt Garman detailed Amazon's AI strategy, noting that the AI business has generated tens of billions in revenue and that inference workloads are expected to exceed training workloads, potentially accounting for 80-90% of AI workloads in the future [11]
- AWS is working with Anthropic to build the largest AI training cluster in history (Project Rainier), deploying Trainium2 processors that are five times more powerful than previous generations, while maintaining its partnership with NVIDIA for P6 instances [11]
- AWS believes that reducing AI costs requires chip innovation, software optimization, and algorithm improvements together. It is actively expanding data centers and plans to launch a "European Sovereign Cloud" to address data sovereignty issues [11]

Group 8: Peter Thiel's Views on AI
- Peter Thiel maintains his "technological stagnation" thesis: since the 1970s, breakthroughs have occurred only in the digital realm, while progress in the physical world (transportation, energy, medicine) has slowed, threatening social stability [12]
- He advocates radical disruption of the status quo, supported Trump as a way to break the deadlock, and stresses the need to take more risks in fields like biotechnology and nuclear energy to overcome an excessively regulatory culture [12]
- Thiel is cautious about AI: he recognizes it as the only field making significant progress but questions whether it can truly end stagnation, arguing that its real value lies in solving physical-world problems [12]
How Do We Communicate with Aliens?
Tencent Research Institute · 2025-07-01 08:24
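The following article is from 追问nextquestion, by 追问. Account tagline: research means constantly exploring the boundaries of questions.

Nikhil Mahant, philosopher of language, Department of Philosophy, Uppsala University, Sweden
Compiled by Wang Baizhen

In the film Arrival (2016), a group of alien beings with seven limbs visits Earth, bringing a language that no one can decipher. Nicknamed "heptapods," they generously make room on their ship for linguistic exchange with humans, yet the translation team is left baffled. The sentences heptapods write consist of circular symbols in blotted ink, utterly unlike any writing on Earth.

The film is adapted from a story by Ted Chiang (姜峯楠), and its dramatic conflict is built on the never-before-seen heptapod language. Yet Heptapod is not a thoroughly alien language. Apart from the science-fiction premise that learning Heptapod grants special abilities, the language does not differ significantly from ordinary human languages. The circular symbols are certainly strange, but they are still words expressing familiar grammatical categories such as nouns and verbs, and they can be translated into English. In fact, a key plot point in the film concerns a translator mistranslating the Heptapod noun "tool" as "weapon."

[Figure: still from Arrival. The circular patterns in the image are the heptapods' writing.]

The second level is structure, which concerns word structure, grammar, and syntax. ...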
Tencent Research Institute AI Express 20250701
Tencent Research Institute · 2025-06-30 15:51
Group 1: OpenAI Custom Services
- OpenAI has launched a custom AI consulting service starting at ten million dollars, with engineers helping clients with model fine-tuning and application development [1]
- The U.S. Department of Defense (with a contract worth $200 million) and Singapore's Grab are among the first clients, with services extending to military strategy and map automation [1]
- The move puts OpenAI in competition with firms like Palantir and may threaten smaller startups focused on specific AI applications [1]

Group 2: Gemini 2.5 Pro API
- The Gemini 2.5 Pro API is free to use again, with limits of five requests per minute, 250,000 tokens per minute, and 100 requests per day [2]
- Users obtain an API key by logging into Google AI Studio, creating the key, and saving it; the usage limits are more lenient than those of OpenAI's o3 model [2]
- The API can be accessed through third-party clients such as Cherry Studio or Chatbox, supporting text Q&A, image analysis, and built-in internet search [2]

Group 3: LeCun's PEVA World Model
- LeCun's team has released the PEVA world model, which achieves coherent scene prediction over 16 seconds and gives embodied agents human-like predictive capabilities [3]
- The model combines 48-dimensional human joint kinematics data with a conditional diffusion Transformer, trained on first-person videos and full-body pose trajectories [3]
- PEVA shows planning ability, selecting the best of several candidate action sequences for complex tasks and outperforming baseline models by over 15% [3]

Group 4: Huawei's Open Source Models
- Huawei has open-sourced two large models: the 72-billion-parameter mixture-of-experts model "Pangu Pro MoE" and the 7-billion-parameter dense model "Pangu Embedded 7B" [4][5]
- Pangu Pro MoE was trained on 4,000 Ascend NPUs and has 16 billion activated parameters; its performance is comparable to the Qwen3-32B and GLM-Z1-32B models, with single-card inference throughput reaching 1,528 tokens/s [5]
- Pangu Embedded 7B uses a dual-system architecture of "fast thinking" and "slow thinking," switching automatically according to task complexity and outperforming similarly sized models such as Qwen3-8B and GLM4-9B [5]

Group 5: Baidu's Wenxin Model 4.5 Series
- Baidu has officially open-sourced the Wenxin (ERNIE) 4.5 series, releasing ten models that range from a 47-billion-parameter mixture-of-experts model to a 0.3-billion-parameter lightweight model, along with API services [6]
- The series is released under the Apache 2.0 license and introduces a multi-modal heterogeneous model structure, improving multi-modal understanding while maintaining strong performance on text tasks [6]
- The models have been benchmarked against DeepSeek-V3 and are supported by the ERNIEKit development suite and the FastDeploy deployment suite [6]

Group 6: Zhihu's Knowledge Base Upgrade
- Zhihu has completed a major upgrade to its knowledge base, adding public subscription and link sharing and integrating deeply with community content for an immersive reading experience [7]
- Knowledge base capacity has expanded to 50GB, with support for uploading various file formats and more exposure through surfaces such as knowledge squares and personal homepages [7]
- Zhihu has started an incentive program that encourages users to create and share vertical knowledge bases, with awards for "most valuable" and "prompt creativity," running until July 18 [7]

Group 7: EVE 3D AI Companion
- EVE is a 3D AI companion application with gamified elements, a favorability system, and interactive features, creating a strong sense of "human-like" presence and proactivity [8]
- The AI can interact across the virtual-real boundary, for example by having milk tea delivered to users' homes and creating personalized songs [8]
- EVE enhances the companionship experience through fine-grained expression (emojis, trending topics) and a memory system, representing a significant breakthrough in the AI entertainment sector [8]

Group 8: Apple's XR Devices
- Apple is reportedly developing at least seven head-mounted devices, including three Vision-series products and four AI glasses; the first AI glasses are expected to launch in Q2 2027, targeting annual shipments of 3 to 5 million units [10]
- The lightweight Vision Air is expected to enter mass production in Q3 2027, more than 40% lighter than the Vision Pro and significantly cheaper, while XR glasses with display features are expected by late 2028 [10]
- These devices are expected to ignite the AI glasses market, with sales potentially exceeding 10 million units [10]

Group 9: Insights from Iconiq Capital's AI Report
- A survey of 300 AI companies indicates a shift from conceptual hype to practical implementation, with OpenAI and Claude leading enterprise AI selection and nearly 90% of high-growth startups deploying intelligent agents [12]
- In the structure of AI spending, data storage and processing costs far exceed training and inference costs, and companies are moving from traditional subscription models to usage-based hybrid pricing [12]
- Among AI-native companies, 47% have reached critical scale, compared with only 13% of AI-enhanced companies; 37% of rapidly growing companies focus on AI, and coding agents are the primary productivity application [12]
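
The free-tier limits quoted for the Gemini 2.5 Pro API in Group 2 above (five requests per minute, 250,000 tokens per minute, 100 requests per day) can be respected on the client side with a small guard. The sketch below is only an illustration under assumptions: `FreeTierLimiter` and `try_acquire` are hypothetical names, not part of any Google SDK, and rolling 60-second and 24-hour windows are assumed because the digest does not say how the quotas reset.

```python
import time
from collections import deque
from typing import Callable, Deque, Tuple


class FreeTierLimiter:
    """Client-side guard for the quoted limits (illustrative, not an official SDK)."""

    def __init__(self, rpm: int = 5, tpm: int = 250_000, rpd: int = 100,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rpm, self.tpm, self.rpd = rpm, tpm, rpd
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.events: Deque[Tuple[float, int]] = deque()  # (timestamp, tokens)

    def try_acquire(self, tokens: int) -> bool:
        """Record the request and return True if it fits every limit, else False."""
        now = self.clock()
        # Drop events older than 24 hours; they no longer count toward any limit.
        while self.events and now - self.events[0][0] >= 86_400:
            self.events.popleft()
        last_minute = [(t, n) for t, n in self.events if now - t < 60]
        if len(self.events) >= self.rpd:           # daily request limit
            return False
        if len(last_minute) >= self.rpm:           # per-minute request limit
            return False
        if sum(n for _, n in last_minute) + tokens > self.tpm:  # per-minute token limit
            return False
        self.events.append((now, tokens))
        return True
```

A caller would check `try_acquire(estimated_tokens)` before each request and wait or queue the request when it returns `False`; the token estimate itself is left to the caller.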
Lessons from Labubu's Rise: The New Code for Incubating Cultural IP in the Digital Age
Tencent Research Institute · 2025-06-30 08:21
Group 1: Core Insights
- The rise of Labubu represents a distinct path of IP incubation: rather than relying on traditional media such as film and animation, it uses innovative operating mechanisms and digital platforms to build influence and break out [1][3][7]
- Labubu's design and character traits resonate with contemporary users' aesthetics and emotional projection, expressing a rebellion against perfect imagery that matches current trends in IP development [4][5][6]
- Labubu's success owes much to its effective social media strategy, especially viral promotion by celebrities, which has expanded its reach and popularity across markets [6][7]

Group 2: Media Environment Changes
- Changes in the media environment have shifted IP incubation from a content-first approach to an interaction-first model, with social platforms and user co-creation central to building IP popularity [9][10]
- Traditional IP incubation relied heavily on large-scale productions in film, animation, and gaming, while Labubu's success illustrates a shift toward social media and short-video content for community building and engagement [10][11]
- China's content industry is gradually developing its own innovative paths for IP creation, moving toward a more integrated approach that combines various media forms [12][13]

Group 3: Evolution of IP Functions and Future Industry Trends
- The value of IP has evolved from being an extension of a single work to being a core asset that connects communities and embodies cultural identity, which makes commercial viability important [15][16]
- Successful IPs today act as powerful commercial amplifiers, generating significant revenue from merchandise and licensing, as seen with major franchises like Star Wars and Disney [15][16]
- Future IP development in China is expected to build on its growing digital content ecosystem, integrating social media, literature, short dramas, and gaming into a robust IP narrative system [13][17]
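Professor Xiao Yanghua: How Far Is Embodied Intelligence from "Emergence"? | AI&Society: 100 People, 100 Questions
Tencent Research Institute · 2025-06-27 06:59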
Core Viewpoint
- The article discusses the transformative impact of generative AI and embodied intelligence on technology, business, and society, and calls for exploring AI's opportunities and challenges from multiple angles [1].

Group 1: AI Development Trends
- AI development in recent years has followed two clear trajectories: generative AI (AIGC) and embodied intelligence [5][9].
- Generative AI aims to give machines human-like cognitive abilities, while embodied intelligence focuses on enabling machines to mimic human sensory and action capabilities [10][11].
- The current AI landscape shows that data quality and training strategies matter more than sheer data volume and computational power [6][19].

Group 2: Embodied Intelligence
- The next phase of embodied intelligence is expected to involve mind-body coordination, reflecting the philosophical question of how human-level intelligence arises [6][11].
- The application of embodied intelligence in consumer markets hinges on the machine's ability to empathize with and understand human emotional needs [6][10].
- There is a significant gap in the data that embodied intelligence needs to reach its potential, with current datasets lacking the scale required for generalization [7][24].

Group 3: AI as a Technological Revolution
- Generative AI is characterized as a technological revolution on three criteria: it is foundational, it raises productivity exponentially, and it has profound societal impact [13][14].
- The societal implications of AI's cognitive capabilities are vast, potentially affecting all human activities and raising concerns about cognitive laziness among humans [14][16].
- In contrast, the productivity impact of embodied intelligence is seen as limited compared with the cognitive advances of generative AI [15][16].

Group 4: Data and Model Relationships
- The relationship between model algorithms and data is crucial: algorithms determine the lower limit of model performance, and data defines the upper limit [20][21].
- The current focus in AI development is on improving data quality and training strategies, particularly for embodied intelligence [19][22].
- The industry faces challenges in acquiring data for embodied intelligence, which calls for innovative approaches to data collection and synthesis [25][26].

Group 5: Future Directions
- To overcome data scarcity in embodied intelligence, strategies that combine real, simulated, and synthetic data are being explored [25][26].
- Wearable devices capable of capturing real-world actions could provide a substantial data foundation for embodied intelligence [26].
- The complexity of human experience and environmental interaction presents significant challenges for the data-driven advancement of embodied intelligence [34][35].
Tencent Research Institute AI Weekly Top 50 Keywords
Tencent Research Institute · 2025-06-27 05:22
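AI Frontier Weekly Top 50 Keywords (0623-0627)
50 keywords each week to keep track of the overall AI landscape.

Event: Kaiming He joins Google

Recommended reading: Wang Qiang and Bai Huitian, "A 10,000-Word Reading of 'Intelligence+': What to Add, and How?" (《万字解读"智能+":加什么,怎么加?》)

| Category | Top keyword | Entity |
| --- | --- | --- |
| Compute | MPK compiler | CMU |
| Model | Keye-VL | Kuaishou |
| Model | Mu model | Microsoft |
| Model | Kimi-VL open-sourced | Moonshot AI |
| Model | Reinforcement-learned teachers | Sakana AI |
| Application | AI app building | Anthropic |
| Application | Gemini CLI | Google |
| Application | AI anthology story series | Kuaishou |
| Application | Voice cloning upgrade | iFLYTEK |
| Application | Xiaomi AI glasses | Xiaomi |
| Application | AlphaGenome | Google |
| Application | Embodied Gemini | Google |
| Application | Imagen 4 | Google |

...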
From Language to Consciousness, "One Step Away": How Far Does AI Really Have to Go?
Tencent Research Institute · 2025-06-26 07:58
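The following article is from 追问nextquestion, by 追问. Account tagline: research means constantly exploring the boundaries of questions.

By George Musser
Compiled by Zhang Xuhui

The ultimate dream of artificial intelligence has never been limited to building a game engine that can beat chess grandmasters, or designing a chatbot that beguiles people with smooth talk. Its true mission is to become a mirror that reflects human intelligence and helps us understand ourselves more deeply.

Researchers' goal likewise goes beyond narrow AI: they are pursuing artificial general intelligence (AGI), an intelligent system with human-like adaptability and creativity.

Admittedly, the problem-solving ability of today's large language models (LLMs) has impressed most researchers, but the models still have obvious shortcomings, such as the lack of continual learning: once training on books, web text, and other material is complete, their knowledge base is frozen and can no longer be "updated." As Ben Goertzel of the AI company SingularityNET vividly puts it: "You can't send a large language model to college, or even to kindergarten." They cannot pass the comprehensive test known as the "robot college entrance exam."

Having "mastered" language, how far is AI from simulating thought?

In language processing, current LLMs do display what experts call AGI "formal competence": even if you provide ...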