Tencent Hunyuan
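Sora 2 Is Still Stuck at 5 Seconds While ByteDance's AI Video Generation Has Already "Taken Off" at 4 Minutes
QbitAI · 2025-10-06 05:42
By Lu Yu, reporting from Aofeisi. QbitAI | WeChat official account QbitAI
From 5 seconds to 4 minutes: ByteDance has achieved minute-level long video generation, something even Sora 2 cannot do! First, take a look at a "real" undersea vlog shot by a diver on the scene: Notice anything, Watson? Typical AI-generated videos last only a few seconds, but this clip runs a full 1 minute 40 seconds, and every bit of it is "water": all of it is AI. This is Self-Forcing++, a new method jointly proposed by ByteDance and UCLA. Without changing the model architecture or collecting a new long-video dataset, it can generate minute-level long videos, and the picture does not suddenly blur or freeze later in the clip. By using teacher knowledge and self-generated video segments to guide autoregressive generation, it produces videos up to 4 minutes 15 seconds long, at high quality, and it is open source. Without further ado, here are a few more samples. A 3-minute drone's-eye view of a coastline looks like this: Pushed to the maximum length, a 4-minute-15-second clip follows an elephant across the grassland. At the same length, the previous long-video SOTA, SkyReels, produces this: (caption: "Reborn as an ant") At short durations Self-Forcing++ inherits the high image quality of Self-Forcing; at long durations it also leads on every performance metric, with visual stability far ahead of methods such as CausVid. Perhaps the era of AI film is already ...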
Financial Observation | The Economic Engine Gets AI's "New Three": Not "the Future Is Here" but "Already Making Money"
Sohu Finance · 2025-09-30 12:58
Core Insights
- The rapid integration of artificial intelligence (AI) into various industries is transforming China's economic landscape, with AI becoming a key driver of growth and efficiency [4][22]
- The core AI industry in China surpassed 700 billion RMB in 2024, maintaining over 20% annual growth, significantly outpacing overall economic growth [4][22]
- AI is being recognized as a strategic tool for enhancing productivity and facilitating deep structural transformation across traditional sectors [4][22]

Group 1: AI's Impact on Industries
- AI has demonstrated its effectiveness in retail, with a sales competition showing AI-driven sales outperforming human efforts by over three times [1]
- The AI-driven transformation is evident in various trillion-yuan markets, including retail, finance, and logistics, reshaping efficiency and growth [8]
- AI is not merely a replacement for traditional methods but acts as a catalyst for innovation and productivity in established industries [20]

Group 2: Infrastructure and Technological Advancements
- The foundation of AI's success lies in robust cloud computing and computational power, which are essential for its widespread application [9][10]
- Major Chinese tech companies like Tencent, Alibaba, and Huawei are competing to enhance their cloud services to support AI operations [9][10]
- The development of domestic large models and stable computational power is crucial for the advancement of AI applications across the country [12][22]

Group 3: New Business Models and Opportunities
- AI is creating new business models and industries, significantly lowering the barriers to creativity and production, as seen in the 3D printing sector [14][18]
- The integration of AI in 3D printing allows users to generate high-quality models easily, marking a shift towards an AI-driven era in consumer-grade 3D printing [18]
- AI's capabilities in cross-cultural understanding and content generation are opening new markets for Chinese enterprises, enhancing their global competitiveness [19]

Group 4: Traditional Industries and AI Integration
- AI is enhancing traditional industries by improving productivity and addressing challenges such as rising labor costs and declining capital returns [20]
- Collaborations between tech firms and traditional manufacturers, such as the partnership between GAC Group and Tencent, are leading to advancements in smart manufacturing and global expansion [20][21]
- In sectors like healthcare, AI is streamlining processes and improving decision-making, as demonstrated by the applications in hospitals and medical institutions [22]
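A Conversation with Tencent Senior Executive Vice President Tang Daosheng (Dowson Tong): AI Infrastructure Investment Is Huge, and Compute Constraints Force the Search for an "Optimal Cost + Scaled Application" Path
National Business Daily (Mei Ri Jing Ji Xin Wen) · 2025-09-17 14:37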
Core Insights
- As the hype around large AI models returns to rationality, the industry's focus is on how companies can apply cutting-edge AI technologies in practical business scenarios to achieve sustainable growth [2]
- Tencent Senior Executive Vice President Tang Daosheng emphasized that "driving industrial efficiency through intelligence and revenue scale through globalization" are the two core drivers of corporate growth [2]

Group 1: AI Infrastructure and Investment
- Tencent is investing heavily in AI infrastructure, with a strong emphasis on providing comprehensive support from infrastructure to model training and inference acceleration tools [4]
- The shift in the large-model industry's focus from training to inference has become an industry consensus, leading to a surge in inference demand [4]
- Tencent has established 11 regional offices globally and deployed 9 global technical support centers, enhancing its international infrastructure investment [5]

Group 2: AI Strategy and Development
- Tencent aims to create "human-centered AI," with a clear positioning that embraces AI across all business sectors [5]
- The company has released over 30 models in the past year, focusing on achieving stronger model performance at lower deployment and inference costs [6]
- Tencent's intelligent agent strategy was officially launched, providing a comprehensive open development platform and support for various application scenarios [6]

Group 3: Market Dynamics and User Engagement
- The demand for intelligent agents is diverse, with small and medium enterprises seeking more commercial support products from Tencent's intelligent agent development platform [6]
- Tencent's AI applications are still in the investment phase, with a focus on enhancing product and service experiences rather than immediate commercialization [7]
- Daily user inquiries to Tencent's AI applications now match the total for a full month earlier this year, indicating growing engagement [7]
A Conversation with Tang Daosheng: How Is AI "Remaking" Tencent?
Beike Finance (Bei Ke Cai Jing) · 2025-09-17 07:21
Core Insights
- Tencent's AI strategy has become a fundamental part of its business model, with AI capabilities integrated across various sectors [2][10]
- The company emphasizes a smarter investment approach in AI, focusing on efficiency and cost-effectiveness while serving over 1 billion users [5][6]
- Tencent's AI application, Yuanbao, has gained significant traction, becoming one of the top AI-native applications in China, with daily user interactions surpassing previous monthly totals [10]

AI Investment Strategy
- Tencent aims to balance AI investment and output, recognizing the need for substantial ongoing investment while optimizing efficiency [6]
- The company has released over 30 AI models in the past year, focusing on reducing deployment and inference costs while enhancing model performance [7][8]
- Tencent's historical focus on core products has led to a competitive position in the cloud market, emphasizing the importance of sustainable revenue streams [9]

Yuanbao Development
- Yuanbao has rapidly evolved, integrating its capabilities into various Tencent products, enhancing user interaction across platforms like WeChat [10][12]
- The future development of Yuanbao is seen as a continuous process, with ongoing discussions about user experience and potential feature testing [12]

Hardware and Software Integration
- Tencent has fully adapted to mainstream domestic chips and is committed to optimizing its software capabilities in conjunction with hardware [14]
- The company collaborates with multiple chip manufacturers to ensure the best hardware configurations for different AI models and scenarios [14]
Tencent Research Institute AI Express 20250916
Tencent Research Institute · 2025-09-15 16:01
Group 1: Google Gemini and AI Tools
- Google Gemini has topped the App Store free chart, surpassing ChatGPT, due to its popular Nano Banana image editing feature [1]
- Gemini is a comprehensive AI toolkit that includes Canvas, Veo 3 video generation, Storybook, and Deep Research among other functionalities [1]
- The Google AI suite also features NotebookLM knowledge base (allowing up to 300 file uploads), Flow video generation (supporting 1080p HD), AI Mode search, and Gemini CLI local assistant [1]

Group 2: xAI's Grok 4 Fast Model
- xAI has launched the Grok 4 Fast model, achieving a generation speed of 75 tokens per second, which is ten times faster than the standard version [2]
- User tests indicate that the new model excels in programming and middle school math tasks, solving LeetCode problems in under 2 seconds [2]
- Despite its speed advantage, Grok 4 Fast compromises on accuracy, making it suitable for simple queries or tool usage, reflecting xAI's recent focus on speed [2]

Group 3: Kling AI's Digital Human
- Kling AI has introduced an upgraded digital human feature that supports up to 60 seconds of output at 1080P/48fps, significantly enhancing facial recognition and lip-sync accuracy [3]
- The new feature allows for prompt-based control of character emotions and actions, enabling digital humans to display richer expressions and body language [3]
- Kling's digital human service is priced at 0.12 yuan per second at 720P, approximately one-third the cost of similar products from HeyGen, nearing the industry's lowest price [3]

Group 4: Tencent's AI Painting Upgrade
- Tencent's Hunyuan team has proposed a new method to optimize AI painting, improving diffusion model training through Direct-Align and Semantic Relative Preference Optimization (SRPO) techniques [4]
- Direct-Align optimizes the entire diffusion trajectory, addressing the "reward hacking" issue seen in traditional methods that only optimize later stages [4]
- The FLUX.1-dev model trained with SRPO shows a roughly threefold increase in realism and aesthetic scores, after only 10 minutes of training on 32 H20 GPUs [4]

Group 5: Albania's AI Minister
- Albania has become the first country to appoint an "AI Minister," named Diella, which will oversee public procurement projects [5]
- Diella aims to serve as a benchmark for government transparency reforms, responsible for evaluating tenders and selecting personnel to achieve 100% integrity in public bidding [5]
- This initiative seeks to address long-standing issues of corruption in public procurement in Albania while promoting the country's digital government transformation [5]

Group 6: xAI's Workforce Changes
- xAI has reportedly laid off about 500 employees from its data labeling team, accounting for one-third of that team, with affected employees receiving salary payments until the end of November [6]
- The company announced a strategic shift to reduce general AI tutors while expanding the specialist AI tutor team tenfold, focusing on recruiting talent from STEM, finance, and medicine [7]
- Prior to the layoffs, xAI required employees to participate in tests determining their job security, leading to concerns about the fairness of the process among some employees [7]

Group 7: UCLA's Energy-Efficient Imaging
- A research team from UCLA has published a paper in Nature on a nearly zero-energy optical image generation model, with Shiqi Chen, a Zhejiang University alumnus, as the first author [8]
- The system generates static noise using digital encoders, imprinting noise patterns onto laser beams via spatial light modulators, and then converting the noise into images with a second device [8]
- This system can produce images of handwritten digits, fashion items, and Van Gogh-style artworks, making it suitable for VR, AR displays, and wearable devices due to its ultra-fast and low-energy characteristics [8]

Group 8: AI Programming Challenges
- A senior developer, Carla Rover, experienced significant issues with "vibe coding," leading to a project overhaul and emotional distress [9]
- A report from Fastly indicates that 95% of developers require additional time to fix AI-generated code, leading to the emergence of "vibe coding cleanup specialists" with salaries reaching $100,000 [9]
- Many experienced developers say that AI programming resembles "caring for a 6-year-old": it lacks systematic thinking and often introduces security vulnerabilities, with 50% of their time spent on requirements and 30-40% on fixing AI code [9]

Group 9: Anthropic's AI Economic Index
- Anthropic has released its first comprehensive AI economic index report, revealing that the proportion of users assigning complete tasks to Claude has increased from 27% to 39% [10]
- The report highlights a close correlation between AI usage and regional economic characteristics, with Washington D.C. and Utah showing the highest per capita usage, while Hawaii focuses on travel planning and Massachusetts on scientific research [10]
- Data indicates that regions with higher GDP exhibit greater AI usage rates, with wealthier countries showcasing more diverse use cases, while enterprise users have an automation rate of 77%, significantly higher than that of individual users [10]
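Tencent Hunyuan Upgrades the AI Image Fine-Tuning Paradigm: Optimizing Over the Entire Diffusion Trajectory, Human Evaluation Scores Up 300%
QbitAI · 2025-09-15 03:59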
Core Viewpoint
- The article discusses advancements in AI image generation, specifically focusing on the introduction of two key methods, Direct-Align and Semantic Relative Preference Optimization (SRPO), which significantly enhance the quality and aesthetic appeal of generated images [5][14].

Group 1: Current Challenges in Diffusion Models
- Existing diffusion models face two main issues: limited optimization steps leading to "reward hacking," and the need for offline adjustments to the reward model for achieving good aesthetic results [4][8].
- The optimization process is constrained to the last few steps of the diffusion process due to high gradient computation costs [8].

Group 2: Direct-Align Method
- The Direct-Align method allows the original image to be recovered from any time step by pre-injecting noise, thus avoiding the limitation of optimizing only in later steps [5][10].
- This method enables the model to recover clear images from high-noise states, addressing the gradient explosion problem during backpropagation through early time steps [11].
- Experiments show that even at just 5% denoising progress, Direct-Align can recover a rough structure of the image [11][19].

Group 3: Semantic Relative Preference Optimization (SRPO)
- SRPO redefines rewards as text-conditioned signals, allowing for online adjustments without additional data by using positive and negative prompt words [14][16].
- The method enhances the model's ability to generate images with improved realism and aesthetic quality, achieving improvements of approximately 3.7 times and 3.1 times, respectively [16].
- SRPO allows for flexible style adjustments, such as brightness and cartoon style conversion, based on the frequency of control words in the training set [16].

Group 4: Experimental Results
- Comprehensive experiments on the FLUX.1-dev model demonstrate that SRPO outperforms other methods like ReFL, DRaFT, and DanceGRPO across multiple evaluation metrics [17].
- In human evaluations, the excellent rate for realism increased from 8.2% to 38.9% and for aesthetic quality from 9.8% to 40.5% after SRPO training [17][18].
- Notably, a mere 10 minutes of SRPO training allowed FLUX.1-dev to surpass the latest open-source version, FLUX.1 Krea, on the HPDv2 benchmark [19].
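
To make the two ideas in this summary concrete, here is a minimal sketch in plain Python. It is not the Hunyuan team's code. It assumes the common diffusion forward form x_t = alpha * x_0 + sigma * noise, so that keeping the injected noise lets the clean sample be recovered in closed form (one reading of "pre-injecting noise"), and it assumes the text-conditioned reward can be approximated as a score under a positive prompt minus a score under a negative prompt. The function names, the toy cosine scorer, and the embeddings are all hypothetical.

```python
import math


def add_known_noise(x0, noise, alpha, sigma):
    """Blend a clean sample with a noise vector that the caller keeps (assumed form)."""
    return [alpha * x + sigma * e for x, e in zip(x0, noise)]


def recover_clean(xt, noise, alpha, sigma):
    """Invert the blend analytically; possible only because the noise is known."""
    return [(x - sigma * e) / alpha for x, e in zip(xt, noise)]


def cosine(u, v):
    """Cosine similarity, used here as a stand-in for a learned reward model."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("zero-length embedding")
    return dot / (norm_u * norm_v)


def relative_reward(image_emb, pos_text_emb, neg_text_emb):
    """Hypothetical relative preference: positive-prompt score minus negative-prompt score."""
    return cosine(image_emb, pos_text_emb) - cosine(image_emb, neg_text_emb)
```

With alpha set to 0.05, a heavily noised sample still maps back to the original values when the same noise vector is passed to recover_clean, which mirrors the claim that structure is recoverable at only 5% denoising progress. The relative reward is zero whenever the positive and negative prompts embed identically, so only the difference between the two prompts drives the signal.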
Kling vs. Jimeng: A First Look at "Multimodal"
TMTPost App · 2025-09-11 05:33
Core Insights
- The article discusses the current state of AI-generated video platforms in China, specifically focusing on two leading platforms: Kling and Jimeng [1]
- It explores the process of creating a film using AI, highlighting the roles of AI in scriptwriting, storyboarding, and directing [5][10][18]
- The article emphasizes the strengths and weaknesses of the AI platforms in generating videos, particularly in terms of creativity and fidelity [35][42]

Group 1: AI Video Generation Process
- The first step involves using AI as a screenwriter to create scripts, demonstrating that AI can effectively handle text-based tasks [7][8]
- The second step is utilizing AI as an artist to create storyboards, where the quality of images generated can vary, with some instances of misunderstanding instructions [12][14]
- The third step involves AI directing the video, where initial results may be impressive, but inconsistencies and logical errors become apparent in later outputs [18][20][24]

Group 2: Performance of AI Platforms
- Kling shows better performance in understanding abstract concepts and artistic interpretation, often producing videos that reflect the intended themes [36][38]
- Jimeng excels in image fidelity and stability, ensuring that the generated videos maintain a consistent visual quality [43][44]
- Both platforms face challenges in simulating physical realism and maintaining narrative coherence, leading to issues such as "memory loss" within short video segments [31][50]

Group 3: Technical and Cost Considerations
- The article notes that the current technology in AI video generation struggles to balance fidelity and creativity, with limitations on video length impacting content expression [50][52]
- The cost of using these platforms can be significant, with basic configurations priced at 1 yuan per video for Jimeng and 2 yuan for Kling, indicating that achieving high-quality outputs may require additional investment [59][60]
- The need for patience is emphasized, as generating visually appealing films with AI may take time and repeated adjustments [62]
Cinda International Hong Kong Stocks Morning Express - 20250902
Cinda International Holdings · 2025-09-02 02:06
Market Overview
- The Hang Seng Index is facing resistance at 26,000 points, influenced by the extension of the US-China tariff truce and the Federal Reserve's potential policy adjustments due to changing risk balances [2][5]
- The overall market is active with a positive risk appetite, and capital is rotating among different sectors [2]

Company News
- JD Group (9618) has made a takeover offer for CECONOMY, with the acceptance period until November 10 [4][10]
- BYD (1211) reported a 0.2% increase in August vehicle sales, totaling 373,600 units [10]
- Xiaomi (1810) delivered over 30,000 vehicles in August and opened 18 new stores [10]
- China Merchants Bank (3968) is working on anti-competitive measures under regulatory guidance [4]
- Shandong Gold (1787) raised 3.9 billion yuan through a discounted share placement to repay debts [4]

Economic Indicators
- The S&P Global Manufacturing PMI for China rose to 50.5 in August, indicating a recovery in manufacturing activity [7][8]
- The average price of second-hand residential properties in 100 cities in China fell by 0.76% month-on-month in August, while new residential prices saw a slight increase [8]
- Hong Kong's retail sales increased by 1.8% in July, although this was below market expectations [8]

Sector Focus
- The smartphone parts sector is entering a traditional peak season with major brands set to launch new devices [7]
- The AI and robotics sectors are seeing increased activity, with advancements in humanoid robots and smart glasses [7]

Stock Market Performance
- The Hang Seng Index closed at 25,617, up 2.15%, with a total market turnover of 380.2 billion HKD [6]
- The Hang Seng Tech Index rose by 2.20%, reflecting strong performance in technology stocks [6]

International Market Insights
- The US Federal Reserve is expected to maintain a cautious approach to interest rate cuts, with projections indicating two rate cuts totaling 50 basis points this year [5]
- Global trade negotiations are ongoing, with some progress reported, but uncertainties remain [5]
AI Image Generation: Which Model Is Best?
36Kr · 2025-08-29 06:26
Group 1
- The article discusses the rapid growth of AI-generated images and their increasing integration into various platforms, highlighting their efficiency in work and study despite ongoing artistic controversies [1]
- The evaluation focuses on six AI models, namely Tencent's Hunyuan, Zhipu CogView-4, Tongyi Qianwen, Jimeng, Kling, and Gemini 2.5 Flash Image, to assess their performance in generating images from text prompts [2][3]
- Gemini 2.5 Flash Image, previously known as nano-banana, has gained significant attention for its superior performance in generating images [4][5]

Group 2
- The evaluation criteria include basic aesthetics and realism, imagination and creativity, instruction understanding and execution, style imitation and mastery, and cultural understanding and concept expression [9][26][40][48]
- In the first dimension, various models showed differing levels of realism, with some generating images that were too smooth or lacked natural proportions, while others performed exceptionally well [16][18]
- The second dimension revealed challenges for AI in understanding abstract concepts, with models struggling to accurately depict a lion made of nebulae, indicating limitations in their imaginative capabilities [25]

Group 3
- The third dimension highlighted that only a few models correctly executed simple instructions, suggesting that AI does not process numerical instructions in the same way humans do, but rather interprets them based on learned patterns [30][39]
- In the fourth dimension, Gemini excelled in mimicking traditional Chinese ink painting styles, while other models struggled to meet the artistic requirements, indicating a lack of mastery in specific artistic styles [44]
- The fifth dimension showed that Gemini and Kling demonstrated a strong understanding of cultural elements, effectively incorporating traditional features into their generated images, while others fell short [57]

Group 4
- The overall scores from the evaluation ranked Gemini highest with 44 points, followed by Kling and Jimeng, indicating that these models produced the most visually appealing results [58][59]
- The article emphasizes that while AI can produce impressive images, it does not create art in the same way humans do, as it relies on probabilistic models rather than creative inspiration [61][62]
- The complexity of AI image generation processes is acknowledged, with the article noting that the exact sources of errors in image generation remain unclear [65][66]
Global AI Weekly: Tencent Earnings Beat Expectations; AI Has Become the Core Driver of Business Growth - 20250819
Tianfeng Securities · 2025-08-19 13:06
Investment Rating
- The industry investment rating is "Strong Outperform" with an expected industry index increase of over 5% in the next six months [49].

Core Insights
- Tencent's FY25Q2 revenue reached 184.5 billion CNY, a year-on-year increase of 14.5%, exceeding Bloomberg's consensus estimate of 178.9 billion CNY [4][14].
- CoreWeave's FY25Q2 revenue was 1.21 billion USD, a year-on-year increase of 207%, surpassing the expected 1.08 billion USD [18].
- The AI sector is experiencing rapid growth, with significant advancements in model capabilities and applications, particularly in China and overseas [7][5].

Summary by Sections

Financial Performance
- Tencent's gross profit for FY25Q2 was 105 billion CNY, up 22.3% year-on-year, exceeding the expected 98.8 billion CNY [4][14].
- CoreWeave's remaining performance obligations reached 30.1 billion USD, a year-on-year increase of 86%, surpassing the expected 14.9 billion USD [18][22].

AI Developments
- Tencent's AI initiatives have significantly enhanced user experience and operational efficiency, particularly in gaming and marketing [7][17].
- CoreWeave is expanding its capacity to meet strong customer demand across various sectors, including media and finance [22][26].
- The launch of the GLM-4.5V model by Zhipu demonstrates significant advancements in visual reasoning capabilities, achieving state-of-the-art performance in multiple benchmarks [33][31].

Investment Recommendations
- The report suggests a focus on companies like Alibaba, Tencent, Baidu, and Xiaomi for long-term investment opportunities in the AI sector [5].
- For overseas AI applications, companies such as Duolingo, Palantir, and AppLovin are highlighted for their strong growth potential in high-frequency, high-value verticals [5][7].

Capital Expenditure
- Tencent's capital expenditure for the quarter was 17.9 billion CNY, a year-on-year increase of 149%, driven by investments in GPU and server capabilities [4][16].
- CoreWeave's capital expenditure for FY25Q2 reached 2.9 billion USD, with expectations of continued high spending to support growth [26][24].

Model Innovations
- Tencent's new multi-modal understanding model, Hunyuan Large-Vision, has achieved top rankings in international evaluations, showcasing its advanced capabilities in multi-language understanding [34][35].
- Kunlun Wanwei's Skywork Deep Research Agent v2 has set new industry standards for performance in complex task handling [43][44].
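
As a quick sanity check on the growth figures quoted in this report summary, the year-earlier base can be backed out from the current value and the year-on-year percentage. The helper below is a hypothetical illustration, not something taken from the report, and the implied figures are simple arithmetic rather than reported numbers.

```python
def implied_prior(current, yoy_pct):
    """Back out the year-earlier value from a current value and a year-on-year % increase."""
    return current / (1 + yoy_pct / 100)


# Tencent FY25Q2 revenue: 184.5 billion CNY at +14.5% year on year
tencent_prior = implied_prior(184.5, 14.5)

# CoreWeave FY25Q2 revenue: 1.21 billion USD at +207% year on year
coreweave_prior = implied_prior(1.21, 207)

print(f"Implied Tencent revenue a year earlier: {tencent_prior:.1f} billion CNY")
print(f"Implied CoreWeave revenue a year earlier: {coreweave_prior:.3f} billion USD")
```

Under these inputs the implied bases are about 161.1 billion CNY and about 0.394 billion USD, which shows how differently a 14.5% and a 207% increase scale relative to their starting points.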