Tencent Research Institute
How Have Youth and Technology Changed Museums? | International Museum Day 2025
Tencent Research Institute · 2025-05-16 06:53
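International Museum Day 2025 takes as its theme "The Future of Museums in Rapidly Changing Communities," focusing on how museums can keep exerting influence and building consensus as social, technological, and environmental change accelerates. The theme both affirms the traditional mission of museums and looks ahead to their future direction.

Facing profound global social and technological change, museums must hold to their core cultural values while actively embracing digital innovation. They should advance cultural heritage and social progress together, shape a more inclusive, intelligent, and sustainable cultural ecosystem, and continue to play the dual role of cultural guardian and engine of innovation.

In this year's "Notice on Organizing Activities for International Museum Day 2025," the National Cultural Heritage Administration specifically stresses that "importance should be attached to young people and their strengths mobilized." "Through high-quality exhibitions and activities, museums interpret history and culture, present social change, and build collective identity, and they are favored and welcomed by young people. At the same time, they break with the one-way model of education and turn museum spaces into laboratories where young people express cultural positions and explore social issues, providing sustainable momentum and vitality for cultural heritage." [1]

Tong Qi, Contributing Author, Tencent Research Institute

As digital transformation and artificial intelligence increasingly permeate the cultural sphere, museums are shifting from traditional places for collecting and displaying artifacts into ones that connect past and future, promote social dialogue and cultural heritage, local culture ...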
Tencent Research Institute AI Express 20250516
Tencent Research Institute · 2025-05-15 14:38
Group 1: Regulatory Developments
- A U.S. senator proposed a bill requiring companies such as NVIDIA and AMD to embed geolocation tracking in high-end GPUs and AI chips, effective in six months [1]
- The regulation covers AI processors, high-performance servers, and high-end graphics cards such as the RTX 5090, and aims to prevent strategic hardware from flowing to unauthorized countries [1]
- Chip manufacturers would be responsible for product tracking, and the bill mandates annual assessments for three years, potentially leading to further restrictions [1]

Group 2: AI Model Updates
- OpenAI officially launched the GPT-4.1 model in ChatGPT for Plus, Pro, and Team users, with enterprise and education users to gain access in the coming weeks [2]
- GPT-4.1 performs strongly on coding tasks and instruction following and generates output significantly faster, making it a natural replacement for previous models [2]
- The context window for GPT-4.1 in ChatGPT is limited to 128k tokens, short of the 1 million tokens promised for the API version, which disappointed users [2]

Group 3: New AI Models and Features
- Anthropic plans to release new versions of Claude Sonnet and Opus featuring "extreme reasoning" capabilities that establish a dynamic loop between reasoning and tool use [3]
- The new models can autonomously pause, reassess a problem, and adjust strategy, and can automatically test and correct errors in code generation tasks [3]
- A new model, codenamed Neptune, is reportedly in testing and supports a maximum context length of 128k tokens [3]

Group 4: Advancements in Voice Technology
- MiniMax's new voice model, Speech-02, surpasses OpenAI and ElevenLabs on metrics such as word error rate and speaker similarity, reaching state-of-the-art levels [4][5]
- Speech-02 enables true zero-shot voice cloning and uses a Flow-VAE architecture, requiring only a few seconds of audio to replicate a speaker's characteristics [5]
- The model supports 32 languages and allows flexible control over tone and emotion, at roughly a quarter of the price of competitors such as ElevenLabs, marking a shift towards personalized AI voice technology [5]

Group 5: Browser and Audio Innovations
- Tencent launched the Yuanbao browser plugin for Chrome, offering features such as asking questions about highlighted text, content summarization, translation of foreign-language webpages, and one-click bookmarking [6]
- The plugin includes a floating ball and a sidebar for quick access to screenshot-based questions, file uploads, and content search, making web browsing more efficient [6]
- Stability AI partnered with Arm to introduce Stable Audio Open Small, the fastest audio generation model for mobile, capable of generating 11 seconds of audio in 8 seconds [7]
- The model has 341 million parameters and is designed for short audio and sound-effect generation; it is trained on data from copyright-free sources and currently supports only English prompts [7]

Group 6: Video Generation and Gaming AI
- Alibaba released the open-source Wan2.1-VACE video generation model, which supports multiple tasks such as text-to-video and image-reference generation and runs on consumer-grade graphics cards [8]
- The model comes in two versions, 1.3B (supporting 480P) and 14B (supporting 720P), and uses a video condition unit to handle various input types [8]
- Tencent's Hunyuan model powers an intelligent NPC system for the game "BUD," enabling autonomous actions, personalized interactions, emotional expression, and memory-based reasoning [10]
- The game logged over 20 million AI dialogues within three months; the upcoming Hunyuan Image 2.0 release is intended to strengthen the AI product matrix [10]

Group 7: AI Opportunities and Challenges
- Sequoia Capital detailed the "trillion-dollar AI opportunity," emphasizing that AI is disrupting both software and services profit pools and that the application layer is the most valuable [12]
- The emerging economy of intelligent agents will not only convey information but also facilitate transactions, track relationships, and build trust, leading to a nested economic network of human-machine collaboration [12]
- The industry faces three major technical challenges: persistent identity authentication for intelligent agents, seamless communication protocols, and security assurance, as it enters a new era of "high leverage, low certainty" [12]
The U.S. Housing Assistance System: History, Current State, and Lessons
Tencent Research Institute · 2025-05-15 09:49
Core Viewpoint
- The article discusses the U.S. housing assistance system. Housing in the U.S. relies primarily on the private market, and the assistance system's social-security coverage is low, benefiting only 2.7% of the total population. Despite its small scale, the system has nearly a century of history, has been revised and improved many times, and has developed some equitable and efficient institutional arrangements worth studying [2][4][26].

Group 1: Overview of the U.S. Housing Assistance System
- The system is funded by the federal government and executed by state and local governments. It supports low-income families through three main forms: public rental housing, project-based rental assistance, and housing vouchers [2][5][6].
- The system has evolved since the 1930s. Significant changes in the 1960s brought in the private sector, leading to a model in which private housing sources dominate and public housing plays a supplementary role [6][9].
- As of 2023, approximately 5.13 million units are included in the housing assistance system, accounting for 3.6% of the total housing stock; public housing makes up only 17.3% of the assistance forms [9][12].

Group 2: Evaluation and Management of Public Housing
- The federal government has established a multi-dimensional public housing evaluation system to monitor and assess local public housing agencies, ensuring efficient, high-quality operations [3][15].
- Local public housing agencies manage applications and set rent standards. Eligibility typically requires income below 80% of the area median income [15][16].
- Because of insufficient funding and limited housing stock, many eligible families face long waits, averaging 25 months, before receiving assistance [16].

Group 3: Financing Support for Homebuyers
- Beyond public housing, the federal government has set up official or semi-official institutions that provide mortgage insurance and support mortgage securitization, helping homebuyers obtain better financing terms at lower cost [18][20].
- The establishment of the Home Owners' Loan Corporation in 1933 and the Federal Housing Administration in 1934 were significant steps in providing long-term, fixed-rate mortgage products to stabilize the housing market [19][20].
- By 2023, the U.S. mortgage market had grown to nearly $14 trillion, with a mortgage-to-GDP ratio exceeding 50%, indicating a robust financing environment for homebuyers [20][23].

Group 4: Lessons and Insights
- Although limited in scope, the U.S. housing assistance system has developed effective practices over nearly a century that balance equity and efficiency, such as the division of responsibilities between federal and local governments [26][30].
- The comprehensive evaluation and incentive mechanism established by the Department of Housing and Urban Development (HUD) helps prevent local agencies from neglecting management in favor of supply [31].
- The relationship between government and market is crucial: the system relies heavily on private housing resources, while the government provides the support needed to facilitate homeownership [32].
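The figures quoted in the summary above can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the original article: it assumes the 17.3% public housing share is measured in units, and the helper `is_income_eligible` is a hypothetical name that encodes only the "typically below 80% of area median income" rule mentioned above, ignoring any other criteria local agencies may apply.

```python
# Figures quoted in the summary (2023).
assisted_units = 5_130_000       # units in the housing assistance system
share_of_stock = 0.036           # 3.6% of the total housing stock
public_housing_share = 0.173     # assumption: share measured in units

# Total housing stock implied by the two figures above.
implied_total_stock = assisted_units / share_of_stock

# Public housing units implied by the 17.3% share.
implied_public_units = assisted_units * public_housing_share


def is_income_eligible(household_income: float,
                       area_median_income: float,
                       limit: float = 0.80) -> bool:
    """Hypothetical helper: income must be below `limit` x area median income."""
    return household_income < limit * area_median_income


print(f"Implied total housing stock: {implied_total_stock:,.0f} units")
print(f"Implied public housing units: {implied_public_units:,.0f} units")
print(is_income_eligible(40_000, 60_000))
```

Under these assumptions the quoted numbers imply a total stock of roughly 142.5 million units, of which fewer than 0.9 million would be public housing, which is consistent with the article's point that public housing plays only a supplementary role.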
Tencent Research Institute AI Express 20250515
Tencent Research Institute · 2025-05-14 13:51
Group 1: AI Product Developments
- Notion launched three new AI features, including an AI meeting notes function that integrates seamlessly with calendar systems [1]
- Tencent's CodeBuddy 3.0, a code assistant, is now available as a plugin that integrates with various IDEs and is deeply connected with WeChat developer tools [2]
- Step1X-3D, an open-source 3D model from StepFun, has 4.8 billion parameters and improves 3D content generation, with a 20% higher geometric conversion success rate [3]

Group 2: AI Model Innovations
- ByteDance introduced the lightweight multimodal inference model Seed1.5-VL, which sets new records on 38 benchmarks with a visual encoder of only 532 million parameters [4][5]
- Qwen's Deep Research assistant system automates complex research tasks, reducing hours of work to minutes and generating comprehensive reports with citations [6]
- OpenMemory MCP runs 100% locally and shares memory among different AI tools, addressing the loss of memory between sessions [7]

Group 3: AI in Education and User Engagement
- Duolingo generated 148 courses in one year using AI and shifted its strategy to "All in AI" [8]
- The platform's design encourages continuous learning, and it has more than 10 million users with a 365-day learning streak [8]

Group 4: AI and Brain-Computer Interfaces
- Apple partnered with Synchron to develop a brain-computer interface that lets users control iPhones using brain signals, targeting individuals with mobility impairments [10]
- The technology uses a minimally invasive method for electrode implantation, making it safer than other approaches [10]

Group 5: Robotics and AI Integration
- Tesla showcased advances in its Optimus robot, demonstrating "zero-shot transfer" capabilities for executing complex dance moves [11]
- The training method used for the robot emphasizes efficiency and safety, although challenges remain in bridging the gap between simulation and real-world performance [11]

Group 6: AI Usage Trends
- Poe's report indicates that DeepSeek usage declined from 7% to 3%, while usage of OpenAI's GPT-4o surged thanks to new features [12]
- Competition in image generation is intensifying, with GPT-Image-1 reaching a 17% usage share within two weeks [12]
Coping with Boredom Is the Greatest Challenge of the Post-Scarcity Era
Tencent Research Institute · 2025-05-14 08:35
Core Viewpoint
- The book "Deep Utopia: Life and Meaning in a Solved World" by Nick Bostrom explores the potential for an ideal society in the context of rapid technological advancement, questioning how such a society could be achieved and what it would mean for humanity [3][4][14].

Summary by Sections

Author Background
- Nick Bostrom, born in 1973 in Sweden, has a diverse academic background including degrees in philosophy, physics, and computational neuroscience, and has focused on existential risks and the future of humanity [1][2].

Concept of Negative Entropy
- Bostrom's engagement with "Extropianism" suggests that technology could eventually allow for infinite human life, leading to significant political and economic changes [2].

Shift in Focus
- Unlike his previous work on the dangers of superintelligent AI, "Deep Utopia" revives discussions on ideal societies, drawing from historical philosophical traditions [3][4].

Technological Progress and Society
- Bostrom acknowledges that technological advancements do not guarantee a better society, citing historical examples where progress led to increased oppression [3][4].

Imagining a Solved World
- The book hypothesizes a world where technological issues are resolved, exploring the implications and desirability of such a scenario [4][5].

Structure of the Book
- The narrative is structured around a series of lectures by Bostrom, interspersed with discussions from his audience and fictional correspondence, creating a philosophical dialogue [5][13].

Key Themes Discussed
1. The source of progress in a society with surplus wealth [5].
2. The balance between leisure and productivity in a future society [5].
3. The significance of meaningful living [5].
4. Addressing boredom in a leisure-rich society [5].

Paradox of Equality and Progress
- Bostrom identifies a paradox where a society that achieves equality may lose the motivation for progress, leading to a potential decline in innovation [6][7].

New Forms of Consumption
- He proposes three potential new consumption forms to stimulate progress:
1. New products unaffected by diminishing returns [8].
2. Public projects that absorb social capital [8].
3. Status competition in an equal society [8].

Addressing Deep Redundancy
- Bostrom outlines five mechanisms to counteract the loss of purpose in a post-work society, including pleasure, quality of experience, self-justifying activities, artificial purposes, and cultural engagement [9][10][11].

The Challenge of Boredom
- The book emphasizes the need to create engaging experiences to combat boredom, which is seen as a significant challenge in a post-scarcity society [11][12].

Philosophical Implications
- The discussions in the book reflect on the nature of happiness and fulfillment, suggesting that true enjoyment comes from deeper engagement with experiences [12][14].

Conclusion
- Bostrom's work serves as a reflection on the potential paths humanity may take in the face of technological advancement, emphasizing the importance of choice and the ongoing nature of these discussions [14][15].
Tencent Research Institute AI Express 20250514
Tencent Research Institute · 2025-05-13 15:57
Group 1: OpenAI Developments
- OpenAI has launched a new PDF export feature for Deep Research, which supports tables, images, and clickable reference links, receiving positive feedback from users [1]
- This update marks the first action under the new head of the application division, Fidji Simo, indicating OpenAI's acceleration towards enterprise market transformation [1]
- The competition among AI research assistants is intensifying, shifting from feature comparison to optimizing user experience and workflow integration, with PDF export becoming a basic requirement for enterprise-level AI tools [1]

Group 2: Lovart Design Agent
- Lovart is the first design-specific agent that can generate design specifications, images, and execute plans based on professional design knowledge [2]
- The product supports a full design workflow, integrating various tools to convert static images into dynamic videos [2]
- This signifies a major transformation in design workflows, moving from mere creation to complete product asset delivery, with vertical agents likely becoming a trend in the industry [2]

Group 3: Kunlun Wanwei's Matrix-Game
- Kunlun Wanwei has open-sourced Matrix-Game, an interactive world model capable of generating coherent game interaction videos based on user input, surpassing existing open-source models in visual quality and physical consistency [3]
- The model employs a two-phase training process and a unique architecture for high-precision action response and scene generalization [3]
- This represents a significant breakthrough in spatial intelligence, applicable not only in game development but also in film, advertising, and XR content production [3]

Group 4: Tencent's Unified Reward Model
- Tencent has launched UnifiedReward-Think, a unified multi-modal reward model with long-chain reasoning capabilities, enhancing evaluation ability through a three-phase training process [4][5]
- This model addresses the limitations of existing reward models, demonstrating explicit and implicit reasoning capabilities, significantly improving performance in image generation and understanding tasks while maintaining high interpretability [5]
- UnifiedReward-Think has been fully open-sourced, marking a shift from simple scoring systems to intelligent evaluation systems with cognitive understanding [5]

Group 5: Manus AI's Free Access
- Manus AI has removed the invitation system, allowing free access for all users, with each user receiving daily free task credits and a one-time bonus [6]
- The platform offers three paid subscription tiers, unlocking additional features and priority services, while free credits are valid for one day only [6]
- Manus AI recently completed a $75 million funding round, raising its valuation to $500 million, with plans to expand into overseas markets [6]

Group 6: US AI Regulation Changes
- The US Department of Commerce has repealed the Biden-era AI diffusion rules, citing concerns over innovation and diplomatic relations, while proposing new simplified regulations [7]
- The new rules will strengthen controls on overseas AI chip exports, particularly targeting Huawei's Ascend chips, and may push tech giants towards Chinese AI technologies [7]
- Saudi Arabia has pledged to invest $600 billion in various sectors, including AI data centers, leading to a surge in tech stocks like NVIDIA [7]

Group 7: OpenAI's HealthBench
- OpenAI has introduced HealthBench, a medical evaluation benchmark developed with the participation of 262 doctors, containing 5,000 real dialogues for comprehensive AI model assessment [8]
- The latest model, o3, scored 60%, significantly outperforming earlier GPT models, with notable performance improvements in smaller models and reduced costs [8]
- The project has been open-sourced, providing a complete evaluation tool that aligns model scoring with physician judgments [8]

Group 8: NVIDIA's AI Factory Vision
- NVIDIA's CEO Jensen Huang believes AI factories will lead the next industrial revolution, with plans to invest $50-60 billion in building large-scale AI factories over the next decade [9]
- AI is seen as a true digital labor force expansion, impacting nearly all industries and becoming a new generation of infrastructure following information and energy [9]
- NVIDIA is transitioning from a chip company to an AI infrastructure company, investing $20-30 billion annually in R&D to establish global AI ecosystem standards [9]

Group 9: Future of AI Agents
- OpenAI aims to develop ChatGPT into a personalized AI service, with predictions of widespread AI agent applications by 2025 and capabilities for knowledge discovery by 2026 [10]
- The team focuses on maintaining an efficient structure and rapid iteration, positioning itself as a core AI subscription service provider [10]
- Different age groups perceive AI applications differently, with younger generations viewing AI as an operating system [10]
The Fantastic Future of Human Skills
Tencent Research Institute · 2025-05-13 08:06
Group 1
- The article discusses the future of skill development, emphasizing the integration of technology and artificial intelligence to enhance human skills [2][3]
- It presents a vision for 2037 where a platform called SkillNet, driven by AR and AI, enables rapid skill acquisition [2][4]
- The impact of quantum computing on accelerating scientific discovery and machine learning is highlighted, indicating a growing demand for skills [2][4]

Group 2
- The challenges of skill development include skill inequality, where technological advancements may exacerbate disparities, particularly in low-wage and repetitive jobs [2][3]
- The phenomenon of de-skilling and job simplification is discussed, where industrial engineers redesign work to reduce technical contact, leading to skill degradation among workers [2][3]
- The social and economic implications of skill inequality are emphasized, calling for measures to prevent such outcomes [2][3]

Group 3
- Proposed solutions include digital apprenticeship programs that leverage digital technology and AI to create new skill development infrastructures [2][3]
- The potential of hybrid systems, combining human and AI capabilities, to enhance productivity and skills in complex tasks is introduced [2][3]
- The need for open and global learning platforms to facilitate knowledge sharing and collaboration is advocated [2][3]

Group 4
- The article illustrates a futuristic scenario where a skilled worker named Sara uses SkillNet to learn a new skill in ultrasonic welding, showcasing the platform's capabilities [4][5]
- Sara's experience highlights the importance of real-time mentorship and feedback from experts, facilitated by the SkillNet platform [6][7]
- The narrative emphasizes the collaborative learning environment created by SkillNet, benefiting both experts and novices [8][9]

Group 5
- The article argues that the future of skill development will be hybrid, involving a network of human experts, novices, and AI focused on building capabilities in work settings [25][26]
- It discusses the concept of "chimera," where human and AI collaboration enhances learning and productivity beyond what either could achieve alone [27][28]
- The need for a digital apprenticeship system to preserve human capabilities in the age of intelligent machines is stressed [28][29]
Tencent Research Institute AI Express 20250513
Tencent Research Institute · 2025-05-12 14:46
Group 1
- Sakana AI introduces the Continuous Thought Machine (CTM), which synchronizes neuronal activity to achieve complex reasoning similar to human thought processes [1]
- CTM demonstrates human-like reasoning in tasks such as maze solving and image recognition, with accuracy improving as thinking time increases [1]
- Apple launches FastVLM, a mobile visual language model that processes images efficiently, achieving 85 times faster token output than LLaVA [2]

Group 2
- Tencent upgrades its Hunyuan T1-Vision model to enhance image understanding and support multi-modal reasoning, improving response speed by 1.5 times [3]
- Perplexity's Comet AI browser, based on Chromium, is set to enter beta testing, featuring AI agent capabilities to automate complex tasks [4][5]
- Kuaishou releases Poify, an AI image generation tool focused on e-commerce, offering features such as background replacement and AI model fitting [6]

Group 3
- ByteDance open-sources the 8B-parameter code model Seed-Coder, which uses an "LLM teaches LLM" approach for data selection and supports 89 programming languages [7]
- The model surpasses 70B models on certain tests, indicating strong potential in code generation [7]
- Reverse engineering reveals the hidden personas of major AI systems, which influence user interaction and model behavior [8]

Group 4
- A high school student discovers 1.5 million unknown celestial bodies using AI on NASA's NEOWISE data, showcasing the potential of AI in astronomical research [10]
- The student developed the VARnet model, which identifies celestial variability rapidly, at a processing speed of 53 microseconds per object [10]
- The research contributes to a comprehensive infrared variability survey project, aiding the exploration of cosmic origins [10]

Group 5
- AI product pricing is evolving from usage-based models to more sophisticated models aligned with customer value, including workflow-based and outcome-based pricing [11]
- AI applications are best suited to sectors reliant on business process outsourcing rather than to high-salary jobs, where AI serves as an auxiliary tool [11]
- Companies such as Paid are emerging to address AI product pricing challenges, providing backend systems for billing and pricing [11]

Group 6
- a16z predicts that software development will be transformed around AI agents, with new trends including intent-driven version control replacing Git [12]
- Development approaches are shifting from bottom-up to top-down, allowing developers to describe intentions for AI agents to execute [12]
- The Model Context Protocol (MCP) is expected to become a universal standard for AI agent capabilities, facilitating direct integration with tools and services [12]
When Will Artificial General Intelligence Arrive?
Tencent Research Institute · 2025-05-12 08:11
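Yan Deli, Senior Expert, Tencent Research Institute

I. AI Has Already Surpassed Humans in Many Task Areas

AI is advancing rapidly and has successively exceeded the human baseline on many tasks: image classification in 2015; intermediate-level reading comprehension in 2018; visual reasoning and English language understanding in 2020; multitask language understanding and competition-level mathematics in 2023; and PhD-level science questions in 2024. Of the eight key task skills shown in the figure below, AI still trails humans slightly only in multimodal understanding and reasoning, and even there progress has accelerated since 2023. We may soon witness the singular moment at which AI capability exceeds human level on all existing mainstream benchmarks.

Figure: Selected AI Index technical performance benchmarks vs. human performance

II. The Ultimate Goal of AGI May Be Reached Within the Year

We have built countless AI systems that outperform humans on specific tasks, but they lack generality and cannot handle problems outside their predefined tasks; they remain at the stage of "narrow AI." As AI performance improves sharply, more powerful AI that works across domains and matches or even exceeds humans in many respects is now on the agenda. It is commonly called "artificial general intelligence (AGI)."

Countries attach great importance to AGI. The April 28, 2023 meeting of the Political Bureau of the CPC Central Committee stated that "importance must be attached to the development of artificial general intelligence"; the UK's National AI Strategy (2021) gave AGI specific emphasis, noting that it "must take seriously A ...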
Tencent Research Institute AI Express 20250512
Tencent Research Institute · 2025-05-11 14:17
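Generative AI

I. OpenAI's reinforcement fine-tuning finally launches; a few dozen samples are enough to build an AI expert
1. OpenAI has officially released RFT (reinforcement fine-tuning), which uses chain-of-thought reasoning and a task-specific grading mechanism to quickly improve a model's expert performance in a particular domain from very few samples;
2. RFT mainly applies to three scenarios: instruction-to-code, extracting key points from text, and applying complex rules; several companies, including ChipStack, have already seen notable results;
3. An evaluation framework must be created before implementing RFT, with a clear task definition and grading scheme, avoiding ambiguous task objectives.
https://mp.weixin.qq.com/s/c7RfeoWNwh3NZDeuTCXXLw

II. Gemini 2.5 achieves a major breakthrough in video understanding: processing six hours of video in one go
1. Gemini 2.5 Pro breaks the limit on video length: using low media resolution, it can process videos up to 6 hours long and sets new records on several academic benchmarks;
2. It combines video content seamlessly with code, turning videos directly into interactive web apps, p5.js animations, and other new forms of application;
3. It offers precise video clip retrieval and temporal reasoning, enabling advanced analysis such as counting in complex scenes and timestamp localization.
https://mp.weixin.qq.com/s/FkaOacVuVCS7wzny5l1jFQ ...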
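The point in item I about defining a clear grading scheme before running RFT can be illustrated with a small sketch. This is plain Python written for illustration under stated assumptions: it is not OpenAI's grader API, and the function name `grade_extraction`, the field names, and the partial-credit rule are all hypothetical. It scores a text-extraction task by the fraction of expected fields the model got right, which is one way to make the task objective unambiguous.

```python
def grade_extraction(expected: dict, predicted: dict) -> float:
    """Hypothetical grader: fraction of expected fields matched, in [0, 1].

    Matching is case-insensitive and ignores surrounding whitespace.
    Fields missing from `predicted` count as wrong.
    """
    if not expected:
        raise ValueError("expected must define at least one field")

    def normalize(value) -> str:
        return str(value).strip().lower()

    matched = sum(
        1
        for field, value in expected.items()
        if field in predicted and normalize(predicted[field]) == normalize(value)
    )
    return matched / len(expected)


# Illustrative reference answer and model output (made-up values).
reference = {"vendor": "Acme", "total": "42.00"}
model_output = {"vendor": "acme ", "total": "41.00"}

score = grade_extraction(reference, model_output)
print(score)  # 0.5: vendor matches after normalization, total does not
```

A graded score between 0 and 1, rather than a pass/fail flag, gives partial credit for partly correct outputs; whether that suits a given task is a design decision to settle when the evaluation framework is defined.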