Tencent Research Institute AI Express 20250721
Tencent Research Institute · 2025-07-20 16:02
Group 1
- Kimi K2 surpasses DeepSeek to become the top open-source model globally, ranking fifth overall and closely following the leading closed-source models [1]
- K2 inherits the DeepSeek V3 architecture with parameter adjustments, including more experts and fewer attention heads [1]
- Two of the top 10 open-source models are from China, challenging the perception that "open-source equals weak performance" [1]

Group 2
- Decart releases MirageLSD, the first real-time, unlimited diffusion video model, capable of processing any video stream with a 40-millisecond delay [2]
- Karpathy joins as an angel investor, foreseeing broad applications in real-time film production, game development, and AR [2]
- The breakthrough lies in the real-time stream diffusion architecture, which addresses error accumulation through frame-by-frame generation and historical enhancement methods [2]

Group 3
- Suno V4.5+ offers layered generation and fusion of vocals and instruments, allowing users to upload their own vocals or accompaniments for AI-assisted creation [3]
- The new "Inspire" mode lets users upload their own dry vocals so the AI can learn them and create music that matches their vocal characteristics [3]
- With the launch of Suno V4.5+, the platform lowers the barrier to creation and makes collaboration with the AI more efficient [3]

Group 4
- The Tencent Yuanbao app integrates QQ Music services, enabling users to search for a song with a single phrase and play it instantly without leaving the chat interface [4]
- The technology is driven by a dual-engine system combining the Hunyuan model and DeepSeek-R1, capable of recognizing vague music descriptions and providing contextual recommendations [4]
- User experience improvements include seamless account connectivity, multimodal interaction, and creative assistance, reflecting the evolution of AI assistants from tools to partners [4]

Group 5
- OpenAI's ChatGPT agent faces criticism from competitors such as Manus and Genspark, who highlight its limitations despite its integration of multiple functionalities [5]
- The ChatGPT agent can automate tasks like retirement planning and shopping lists, but its output is considered simplistic compared with that of competitors [5]

Group 6
- PhysRig, developed by UIUC and Stability AI, introduces a differentiable physics-based rigging framework for character animation that embeds rigid skeletons into elastic soft bodies [6]
- The method replaces traditional techniques with differentiable physics simulation, addressing volume loss and deformation artifacts [6]
- The framework outperforms traditional methods across 17 character types and 120 animation tests, and supports cross-species motion transfer [6]

Group 7
- OpenAI's mysterious general reasoning model achieved gold-medal-level performance at IMO 2025, solving five problems and scoring 35 points [7]
- The model demonstrates deep creative thinking sustained over several hours, surpassing the minute-level reasoning of previous AI systems [7]
- The achievement results from breakthroughs in general reinforcement learning rather than task-specific training, although the model will not be released [7]

Group 8
- The creator of Claude Code emphasizes that the best AI tools should empower users, advocating simple, universal tools rather than complex systems [8]
- The focus is on providing foundational capabilities that let users control their workflows rather than having the tools dictate them [8]
- Effective workflows should begin with exploration and planning, followed by user confirmation before coding, and use test-driven development for iterative improvement [8]

Group 9
- The focus on agents, open source, and the choice of the DeepSeek V3 (DSV3) architecture is justified by the need to stimulate model capabilities without relying on external products [9]
- Open-sourcing enhances visibility and community contributions, ensuring genuine model progress rather than superficial improvements [9]
- The DSV3 architecture has been proven superior in experiments, allowing cost-effective adjustments without introducing ineffective variables [9]

Group 10
- Many current AI products are expected to be replaced because they do not adhere to scaling laws; the focus should be on enhancing model capabilities rather than merely expanding tools [10]
- Current AI models are less data-efficient than humans, indicating that algorithmic improvements matter more than simply increasing data scale [10]
- Research on multi-agent systems is evolving to explore not just interactions but also the extension of reasoning from minutes to hours or even days [10]
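From "Plastic Figures" to "Flesh and Blood": A Physics Revolution in Character Animation, with PhysRig Delivering More Realistic, More Natural Character Deformation
机器之心 (Jiqizhixin) · 2025-07-10 08:35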
Core Viewpoint
- The article discusses the limitations of traditional Linear Blend Skinning (LBS) in character animation and introduces a new framework, PhysRig, which aims to enhance the realism and controllability of character animation through physics-based modeling [2][3][9].

Summary by Sections

Introduction to the Problem
- Current animation techniques often make characters look unrealistic, with issues such as volume loss and distortion, particularly in soft materials like skin and fat [2][6][11].

PhysRig Framework
- PhysRig integrates a rigid skeleton with a deformable soft-body model, using differentiable physics simulation to achieve more natural character deformation [3][9].
- The framework consists of three key components: a differentiable physics simulator, a driving point system, and an optimization strategy [10][13].

Physics Simulation and Optimization Strategy
- The optimization process infers internal skeletal movements and material parameters from observed animation results, with temporal consistency and local frame optimization ensuring stability and efficiency [15][17][20].

Comprehensive Evaluation and Dataset
- A dataset was created to validate PhysRig's effectiveness, comprising 17 character types and 120 animation sequences; metrics such as user ratings and Chamfer distance show significant improvements over traditional methods [19][22].

Applications and Future Directions
- PhysRig supports pose transfer, generating natural, volume-preserving animations from the skeletal angles of existing animations [24][26].
- The project aims to move from traditional rigging to physically realistic rigging, with plans to open-source the code and dataset and to develop a Blender plugin for animation artists [29][30].
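
The volume loss that the summary attributes to Linear Blend Skinning can be illustrated with a minimal Python sketch. This is not PhysRig's code: the 2D setup, the two-bone joint, the equal 0.5/0.5 weights, and all function names are assumptions chosen for illustration. LBS computes each skinned vertex as a weighted average of the positions produced by each bone's transform, and averaging two rotated copies of a point pulls it toward the joint.

```python
import math


def rotation(theta):
    """Return a 2x2 rotation matrix (nested tuples) for angle theta in radians."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s), (s, c))


def apply(matrix, point):
    """Multiply a 2x2 matrix by a 2D point."""
    (a, b), (c, d) = matrix
    x, y = point
    return (a * x + b * y, c * x + d * y)


def linear_blend_skinning(point, bone_transforms, weights):
    """Skin one vertex: v' = sum_i w_i * (T_i v), with weights summing to 1."""
    assert abs(sum(weights) - 1.0) < 1e-9, "skinning weights must sum to 1"
    x = y = 0.0
    for transform, w in zip(bone_transforms, weights):
        px, py = apply(transform, point)
        x += w * px
        y += w * py
    return (x, y)


def blended_radius(bend_angle):
    """Distance from the joint (at the origin) of a vertex that started at
    distance 1 and is influenced equally by a fixed bone and a bone rotated
    by bend_angle. A rigid rotation would keep this distance at 1."""
    rest_vertex = (1.0, 0.0)
    bones = [rotation(0.0), rotation(bend_angle)]
    skinned = linear_blend_skinning(rest_vertex, bones, [0.5, 0.5])
    return math.hypot(*skinned)


if __name__ == "__main__":
    for degrees in (0, 45, 90, 135, 180):
        radius = blended_radius(math.radians(degrees))
        print(f"bend {degrees:3d} deg -> distance from joint {radius:.3f}")
```

In this toy setup the distance shrinks as cos(bend_angle / 2): the vertex stays at distance 1 with no bend, moves in to roughly 0.707 at a 90 degree bend, and collapses onto the joint at 180 degrees. That shrinkage is the kind of artifact a physics-based soft-body model is meant to avoid.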