AI Digital Humans
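Long-Form AI Digital Humans Are Here! ByteDance × Zhejiang University Launch InfinityHuman, a Commercial-Grade Audio-Driven Digital Human Model
Jiqizhixin · 2025-09-04 04:11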
Core Viewpoint
- The article discusses the launch of InfinityHuman, a commercial-grade long-sequence audio-driven video generation model developed by ByteDance's GenAI team in collaboration with Zhejiang University, aimed at addressing the industry's pain points in high-quality digital human video creation [2][6].

Group 1: Technology Breakthroughs
- InfinityHuman can generate coherent, high-resolution long videos from a single image and corresponding audio, enabling professional-grade presentations for various formats, from 30-second product pitches to 3-minute speeches [4][11].
- The model effectively addresses two major challenges in long video animation: identity drift and detail distortion, ensuring consistent facial features and natural hand movements throughout the video [8][14].

Group 2: Commercial Applications
- InfinityHuman has been successfully applied in multiple commercial scenarios, particularly excelling in supporting Chinese speech, maintaining identity stability and natural hand movements in longer videos [7][13].
- Potential applications include virtual hosts for e-commerce, virtual instructors for corporate training, and digital human anchors for content creation in social media [8][15].

Group 3: Technical Framework
- The model employs a unified framework that generates long, high-resolution speaking videos using a reference image, audio, and optional text prompts, ensuring visual consistency and accurate lip synchronization [11][16].
- It utilizes a "coarse-to-fine" strategy, starting with low-resolution video generation and refining it through a pose-guided module to enhance realism and structural integrity of hand movements [11][16].

Group 4: Performance Metrics
- Experimental results indicate that InfinityHuman outperforms mainstream baseline methods in visual realism and temporal coherence, with significant improvements in overall video quality [13][14].
- The model maintains identity consistency and enhances hand movement accuracy, particularly in complex gesture scenarios, addressing common issues like finger distortion and joint anomalies [13][14].
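
The "coarse-to-fine" strategy described under the technical framework can be pictured as a two-stage data flow. The sketch below is a toy illustration only, not InfinityHuman's actual implementation: the names `Frame`, `generate_coarse`, `refine`, and `coarse_to_fine`, the 64-pixel base resolution, and the 4x scale are all hypothetical, and the "refinement" is a placeholder that merely changes the recorded resolution while carrying pose information forward.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    width: int
    height: int
    pose: tuple          # hypothetical pose keypoints for this frame
    refined: bool = False


def generate_coarse(audio_chunks, width=64, height=64):
    """Stage 1 placeholder: one low-resolution frame per audio chunk."""
    return [Frame(width, height, pose=(i,)) for i, _ in enumerate(audio_chunks)]


def refine(frame, reference_image, scale=4):
    """Stage 2 placeholder: upscale a frame, keeping its pose as guidance."""
    # A real refiner would also use `reference_image` to preserve identity;
    # here the argument is only passed through to show where it would enter.
    return Frame(frame.width * scale, frame.height * scale, frame.pose, refined=True)


def coarse_to_fine(reference_image, audio_chunks, scale=4):
    """Run the coarse pass first, then refine every frame independently."""
    coarse = generate_coarse(audio_chunks)
    return [refine(f, reference_image, scale) for f in coarse]


# Hypothetical inputs: a reference image path and three audio chunks.
frames = coarse_to_fine("speaker.png", ["chunk0", "chunk1", "chunk2"])
print(frames[0])
```

The point of the split is that the cheap low-resolution pass fixes motion and timing over the whole sequence, while the second pass only has to add detail to frames whose pose is already decided.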
Those Who Voluntarily Sold Their Faces to AI Already Regret It
36Kr · 2025-09-03 23:55
Core Insights
- The article discusses the emerging trend of companies, particularly TikTok, purchasing the likenesses of individuals for advertising purposes, raising ethical concerns about consent and control over personal images [2][6][9]

Group 1: TikTok's Digital Avatar Initiative
- TikTok launched a tool called Symphony Digital Avatars, which utilizes AI-generated digital personas for advertising [6]
- The company reportedly paid Scott, a minor actor, $750 plus a free trip for a one-year license to use his likeness [9]
- This initiative has led to many individuals, like Scott, unknowingly appearing in various advertisements, often for products they do not endorse [11][20]

Group 2: Financial Aspects and Market Practices
- Other actors who licensed their likenesses to TikTok received between $500 and $1,000 for a year, which is significantly lower than typical advertising fees in the industry [13]
- Companies like Synthesia in the UK also engage in similar practices, purchasing likeness rights from individuals for digital avatars [16]

Group 3: Ethical and Control Issues
- The use of digital avatars raises concerns about the lack of control individuals have over how their likeness is used, including potential misrepresentation in advertisements [20][22]
- There is a risk that individuals may find their digital likenesses used in contexts they find objectionable, such as promoting controversial products or ideologies [22][26]
- The article emphasizes the need for clear contracts that outline the scope of use, duration, and revenue sharing to protect individuals' rights [26]
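Huashi Group Holdings Surges Over 41% After Positive Profit Alert; Expects First-Half Net Profit to Grow About 45% to 55% Year-on-Year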
Zhi Tong Cai Jing· 2025-08-27 08:49
Core Viewpoint
- Huashi Group Holdings (01111) experienced a significant stock price increase of over 41% following the announcement of a positive profit alert, indicating strong growth expectations for the upcoming period [1]

Financial Performance
- As of August 27, the stock closed at 0.375 HKD, reflecting a 41.51% increase with a turnover of 15.8335 million HKD [1]
- The company anticipates a net profit increase of approximately 45% to 55% year-on-year for the first half of 2025 [1]

Business Drivers
- The expected growth is primarily driven by the company's active expansion in AI digital human and internet marketing businesses [1]
- There is a significant increase in operating revenue compared to the same period last year [1]
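Huashi Group Holdings (01111) Surges Over 41% After Positive Profit Alert; Expects First-Half Net Profit to Grow About 45% to 55% Year-on-Year
Zhi Tong Cai Jing Wang · 2025-08-27 08:48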
Core Viewpoint
- Huashi Group Holdings (01111) experienced a significant stock price increase of over 41% following the announcement of a positive profit alert, indicating strong market confidence in the company's future performance driven by its expansion in AI digital human and internet marketing businesses [1].

Financial Performance
- As of August 27, the stock closed at 0.375 HKD, reflecting a 41.51% increase with a trading volume of 46.23 million shares and a turnover of 15.83 million HKD [1].
- The company anticipates a net profit increase of approximately 45% to 55% year-on-year for the first half of 2025, highlighting substantial growth in revenue compared to the same period last year [1].
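
The reported figures can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming the 41.51% gain is measured against the previous close and that turnover divided by share volume approximates the day's average traded price; the variable names are illustrative and do not come from the article.

```python
# Figures as reported in the summary above.
close_price = 0.375            # HKD, close on August 27
pct_gain = 41.51               # percent gain on the day
volume_shares = 46.23e6        # shares traded
turnover_hkd = 15.83e6         # HKD traded

# Assumption: the gain is relative to the previous close.
prev_close = close_price / (1 + pct_gain / 100)

# Assumption: turnover / volume approximates the average traded price.
avg_price = turnover_hkd / volume_shares

print(f"implied previous close: {prev_close:.3f} HKD")   # roughly 0.265 HKD
print(f"implied average price:  {avg_price:.3f} HKD")    # roughly 0.342 HKD
```

An average traded price below the close is consistent with a stock that climbed through the session and finished near its high.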
Share Price Has More Than Doubled Since August; Ocean Group Unveils Web 4.0 Transformation Plan
Core Insights
- The stock price of Ocean Group (01991.HK) surged over 6% intraday on August 22, closing up 4.03% at HKD 1.55 per share, with a cumulative increase of over 115% from August 1 to date [1][2]

Group 1: Web 4.0 Strategy
- Ocean Group held a Web 4.0 strategy launch event on August 20, unveiling its transformation blueprint and initiating the Real World Asset (RWA) ecosystem [1]
- The chairman, Shi Qi, emphasized that Web 4.0 represents not only a technological evolution but also an ecological reconstruction, aiming to build a smart, autonomous, and sustainable digital economy [1]
- The core framework of the Web 4.0 strategy includes three pillars: Data Standardization, Asset Tokenization, and Transnational Value, leveraging AI digital humans to empower education, gaming, and healthcare sectors [1][2]

Group 2: Sector-Specific Applications
- In the education sector, AI digital human technology will generate personalized content, converting research outcomes into RWA for knowledge asset sharing and cash flow stabilization [2]
- The gaming sector will utilize AI digital human technology to enhance immersion, tokenizing virtual assets like props and land to construct a metaverse economy [2]
- In the healthcare sector, AI digital human medical assistants will convert health data and service assets into RWA, improving user experience and generating capital returns [2]

Group 3: Revenue Streams
- Ocean Group anticipates deriving four types of revenue from its involvement in the AI industry: income from AI digital human traffic, one-stop financing consulting for SMEs involving AI and RWA, transaction fees related to RWA trading, and subscription fees for multilingual and multicultural AI customer service and marketing outsourcing [2]

Group 4: Company Background
- Established in 1991 and listed on the Hong Kong Stock Exchange in 2007, Ocean Group is one of the largest integrated service providers in silicone raw material production and processing [2]
- The company has achieved rapid asset growth through comprehensive industry integration, structural adjustments, and value reconfiguration, with main business segments including silicone, digital marketing, retail, healthcare, and hospitality [2]
Ocean Group (01991.HK) Pursues Web 4.0 Strategic Transformation, Targeting Three Trillion-Scale Sectors
Ge Long Hui· 2025-08-22 10:42
Core Insights
- The core focus of the news is the launch of the Web 4.0 strategic blueprint and the initiation of the RWA ecosystem by Ocean Group on August 20 [1]

Group 1: Web 4.0 Strategy
- The Web 4.0 strategy is built on three main pillars: Data Standardization, Asset Tokenization, and Transnational Value [3]
- The strategy aims to empower three key sectors: education, gaming, and healthcare through AI digital human technology [3]
- The education sector will utilize AI digital human technology to generate personalized content, transforming research outcomes into RWA for knowledge asset sharing and cash flow stabilization [3]

Group 2: Gaming and Healthcare Sectors
- In the gaming sector, AI digital human technology will enhance immersion, allowing for the tokenization of virtual assets such as items and land, thereby constructing a metaverse economy [3]
- The healthcare sector will integrate AI digital human medical assistants with health data and service assets, facilitating RWA transformation to improve user experience and generate capital returns [3]

Group 3: Vision and Future Outlook
- The Chairman of Ocean Group, Shi Qi, emphasized that Web 4.0 represents not just a technological evolution but a reconstruction of the ecosystem [3]
- Ocean Group aims to build a smart, autonomous, and sustainable new paradigm of the digital economy, leveraging AI digital humans as the engine, RWA as the value channel, and global traffic as the link [3]
The Post-2000s Generation Looks at the Big Data Expo (Part 2) | The "Tech Imprint" in the Social Media Wave
Sou Hu Cai Jing· 2025-08-13 12:23
Core Insights
- The 2025 China International Big Data Industry Expo (Big Data Expo) will be held from August 28 to 30 in Guiyang, focusing on the integration of data elements and artificial intelligence technology to drive industrial transformation and high-quality economic development [1]

Group 1: Event Overview
- The theme of this year's Big Data Expo is "Data Gathers Industrial Momentum, Intelligence Opens a New Chapter in Development," aiming to showcase the latest achievements in the fusion of data and AI technology [1]
- The event is expected to highlight the efficient aggregation and utilization of data resources, providing strong momentum for industrial upgrades [1]

Group 2: AI Innovations
- Tencent Cloud showcased three PaaS products at the 2024 Big Data Expo, including the "Large Model Image Creation Engine," demonstrating the powerful capabilities of large model native toolchains in knowledge services and content creation [7]
- The "Image Creation Engine" utilizes Tencent's self-developed image creation model to provide high-quality AI image generation and editing capabilities, significantly shortening the creative and production cycle for enterprise clients [7]
- The release of Tencent's HunyuanImage2.0 model in May 2023 emphasized real-time efficiency and ultra-realistic image quality, addressing common issues in AI-generated art [7]

Group 3: AI in Social Media
- AI-generated user avatars are increasingly popular on social media, allowing users to upload several photos and receive images in diverse styles, catering to the aesthetic preferences of the younger generation [5]
- AI synthetic anchors have become common symbols of the era, with advancements in digital human generation technology enabling realistic simulations of appearance, expression, and voice [13][15]
- The integration of AI technology in content production has created new possibilities for virtual images and content creation, enhancing user engagement across various platforms [15]

Group 4: AI Chat Solutions
- NetEase Cloud's AI chat feature addresses social anxiety among the younger generation by generating personalized opening lines based on user interests and personality traits [19][23]
- The AI chat function can monitor conversation dynamics and suggest engaging topics to maintain interaction, enhancing the overall social experience [25]
- The technologies showcased at the Big Data Expo are already being applied in social media, enriching the daily lives of the younger generation and leaving a technological imprint on social platforms [25]
Design Analysis of the Functional Modules of an AI Digital Human Assistant Mini Program
Sou Hu Cai Jing· 2025-08-06 08:00
Core Concept
- The article discusses the development of AI digital assistants that enhance human-computer interaction by simulating human communication, aiming to provide natural and efficient service support in daily scenarios [1]

Natural Language Interaction System
- The dialogue interface utilizes multi-turn conversation technology, enabling contextual semantic understanding and intent recognition. Users can input requests via text or voice, with the system automatically correcting and completing key information [2]
- The response module is designed to produce human-like responses, matching emojis and tone words to the conversation content to avoid mechanical replies [2]

Task Management and Scheduling
- The digital assistant can parse complex user requests and break them down into executable steps. For example, if a user inputs "prepare for a weekend family gathering," the system generates a shopping list, venue setup suggestions, and a schedule [4]
- The scheduling module synchronizes with the user's mobile calendar, setting reminders and detecting conflicts, automatically suggesting adjustments when overlapping events are detected [4]

Preference Model and Service Recommendations
- Based on historical dialogue data, the digital assistant can proactively push relevant services. For instance, if a user frequently inquires about fitness plans, the system will regularly send workout tutorials and dietary suggestions [5]
- Recommended content spans various categories, including lifestyle services, learning resources, and entertainment activities, with each recommendation accompanied by a brief description and action entry [5]

Multimodal Interaction Expansion
- In addition to basic text interaction, the digital assistant supports simple gesture recognition and emotional feedback. Users can express satisfaction through a thumbs-up gesture, which the system records to enhance similar recommendations [6]
- The visual presentation adopts a 2.5D cartoon style to avoid discomfort from excessive realism, maintaining a consistent hairstyle and outfit for brand recognition while reducing cognitive load [6]

Privacy Protection and Permission Management
- Dialogue data is secured with end-to-end encryption, allowing users to choose data retention periods. The permission settings page offers detailed control options, such as allowing calendar access while prohibiting contact list access [7]
- Sensitive operations require secondary verification, such as entering a preset password or biometric information to modify schedule arrangements [7]

Visual Standards and Adaptation Optimization
- The interface design adheres to brand color standards, primarily using a light blue color scheme to create a technological feel. Key operation buttons are sized no less than 44px to ensure accurate touch response across different devices [8]
- Animation frame rates are maintained above 30fps to prevent lag during interactions. Testing shows that optimized versions have reduced the error rate by 40% among elderly users [8]
- Through the collaborative operation of these functional modules, the AI digital assistant can establish a complete chain of "demand understanding - task breakdown - service push," balancing technical advancement with emotional value to provide users with an efficient and warm digital assistance experience [8]
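
The scheduling module's conflict detection and suggested adjustment can be illustrated with a short sketch. This is a minimal example under stated assumptions, not the mini program's actual logic: the `Event` type, the function names `find_conflicts` and `suggest_adjustment`, and the policy of moving the later event to start when the earlier one ends are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Event:
    title: str
    start: datetime
    end: datetime


def find_conflicts(events):
    """Return pairs of events whose time ranges overlap."""
    ordered = sorted(events, key=lambda e: e.start)
    conflicts = []
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            if second.start >= first.end:
                break  # sorted by start: no later event can overlap `first`
            conflicts.append((first, second))
    return conflicts


def suggest_adjustment(first, second):
    """Propose moving the later event so it starts when the earlier one ends."""
    duration = second.end - second.start
    return Event(second.title, first.end, first.end + duration)


# Hypothetical calendar for a Saturday.
day = (2025, 8, 9)
events = [
    Event("Grocery run", datetime(*day, 10, 0), datetime(*day, 11, 0)),
    Event("Venue setup", datetime(*day, 10, 30), datetime(*day, 11, 30)),
    Event("Family lunch", datetime(*day, 12, 0), datetime(*day, 13, 0)),
]

conflicts = find_conflicts(events)
moved = suggest_adjustment(*conflicts[0])
print(moved)
```

A production scheduler would also need to re-check the proposed slot against the rest of the calendar; here the moved event ends exactly when the next one starts, so no new overlap is created.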
Exclusive | A Conversation with Baidu Vice President Ping Xiaoli: An In-Depth Review of the Digital Human Business Logic
Hu Xiu· 2025-07-31 11:10
Core Insights
- Baidu views digital humans as a "key weapon" for growth, with plans to increase investment in the digital human project by the second half of 2025 [1]
- The evaluation metrics for digital humans focus on customer volume, user engagement, and advertising revenue generated from digital human products [1][18]
- The digital human project is a significant revenue and profit growth driver for Baidu's e-commerce business, which has been profitable since becoming a primary business unit in 2023 [1][4]

Investment and Business Strategy
- Baidu's e-commerce division launched the "digital human" of Luo Yonghao during the 618 shopping festival, seen as a pivotal move for 2025 [3]
- Baidu's founder, Li Yanhong, has expressed confidence in the digital human technology, highlighting it as one of the most exciting applications of AI by 2025 [3]
- The digital human project is part of Baidu's strategy to leverage AI for enhancing user shopping experiences and reducing operational costs for merchants [10][12]

Technological Development
- The digital human technology has evolved through several stages, with the latest being the "high persuasive digital human" (3.0 stage), which possesses decision-making capabilities and can interact in real-time with users [7][8]
- The introduction of large models like ChatGPT has significantly improved the capabilities of digital humans, allowing for real-time interaction and script generation [6][11]
- Baidu aims to make digital humans capable of surpassing real humans in certain scenarios, with ongoing improvements in technology and user experience [21]

Financial Performance
- The digital human project is self-sustaining within the e-commerce division, covering its operational costs and contributing to overall profitability [20]
- Revenue growth for digital humans accelerated significantly starting Q3 2024, particularly after the launch of a low-cost cloning feature [20]
- The company anticipates further growth in digital human applications across various sectors, including education, healthcare, and automotive [20][21]
2025 AI Digital Human Rankings: Five Outstanding Digital Human Companies Revealed!
Sou Hu Cai Jing· 2025-07-25 01:52
Group 1
- The ranking of digital human technology manufacturers in 2025 is based on technical strength, market application, and influence, highlighting their performance in the AI digital human field across various industries such as e-commerce, education, live streaming, and customer service [1][12]
- Kexiyun focuses on AI digital human OEM customization services, providing specialized image customization, intelligent content generation, and live streaming solutions, aiding enterprises in precise marketing and efficient customer acquisition [1]
- Polang offers powerful video editing and AI capabilities, providing virtual anchor and intelligent digital human solutions, enabling rapid brand image generation for enterprises [1]

Group 2
- Ronghuike is a leading AI digital human technology manufacturer, specializing in creating virtual anchors and AI assistants for live streaming, intelligent customer service, and digital immortality, offering highly realistic virtual human images [3]
- Hourone.ai utilizes AI technology to generate digital human videos, allowing users to create professional-level videos from text or PPT, supporting multiple languages and customizable digital human images [5]
- Colossyan provides an AI-driven video generation service platform, enabling users to create talking avatar videos through text input, suitable for corporate training, marketing, and educational content production [9]