Tencent Research Institute AI Express 20250701

Group 1: OpenAI Custom Services
- OpenAI has launched a custom AI consulting service with a minimum engagement of $10 million, in which its engineers help clients fine-tune models and build applications [1]
- Early clients include the U.S. Department of Defense (a $200 million contract) and Singapore's Grab, with work spanning military strategy and map automation [1]
- The move puts OpenAI in direct competition with firms such as Palantir and may threaten smaller startups focused on specific AI applications [1]

Group 2: Gemini 2.5 Pro API
- The Gemini 2.5 Pro API is free to use again, with limits of 5 requests per minute, 250,000 tokens per minute, and 100 requests per day [2]
- To obtain an API key, log in to Google AI Studio, create a key, and save it; the usage limits are more lenient than those for OpenAI's o3 model [2]
- The API can be used through third-party clients such as Cherry Studio or Chatbox, and supports text Q&A, image analysis, and built-in web search [2]

Group 3: LeCun's PEVA World Model
- LeCun's team has released the PEVA world model, which produces coherent scene predictions over 16 seconds and gives embodied agents a human-like ability to anticipate outcomes [3]
- The model combines 48-dimensional human joint kinematics data with a conditional diffusion Transformer and is trained on first-person video paired with full-body pose trajectories [3]
- PEVA shows planning ability, selecting the best option among multiple candidate actions for complex tasks, and outperforms baseline models by more than 15% [3]

Group 4: Huawei's Open Source Models
- Huawei has open-sourced two large models: the 72-billion-parameter mixture-of-experts model "Pangu Pro MoE" and the 7-billion-parameter dense model "Pangu Embedded 7B" [4][5]
- Pangu Pro MoE was trained on 4,000 Ascend NPUs and activates 16 billion parameters; its performance is comparable to Qwen3-32B and GLM-Z1-32B, with single-card inference throughput of 1,528 tokens/s [5]
- Pangu Embedded 7B uses a dual-system architecture of "fast thinking" and "slow thinking" that switches automatically based on task complexity, and it outperforms similarly sized models such as Qwen3-8B and GLM4-9B [5]

Group 5: Baidu's Wenxin Model 4.5 Series
- Baidu has officially open-sourced the Wenxin (ERNIE) 4.5 series: ten models ranging from a 47-billion-parameter mixture-of-experts model to a lightweight 0.3-billion-parameter model, along with API services [6]
- The series is released under the Apache 2.0 license and introduces a multimodal heterogeneous model structure that improves multimodal understanding while maintaining strong performance on text tasks [6]
- The models are benchmarked against DeepSeek-V3 and are supported by the ERNIEKit development suite and the FastDeploy deployment suite [6]

Group 6: Zhihu's Knowledge Base Upgrade
- Zhihu has completed a major upgrade of its knowledge base feature, which now supports public subscription and link sharing and integrates closely with community content for an immersive reading experience [7]
- Knowledge base capacity has been expanded to 50GB, multiple file formats are supported for upload, and knowledge bases now appear in more places, such as the knowledge square and personal homepages [7]
- Zhihu has launched an incentive program that encourages users to create and share vertical knowledge bases, with awards for "most valuable" and "prompt creativity," running until July 18 [7]

Group 7: EVE 3D AI Companion
- EVE is a 3D AI companion app with gamified elements, a favorability system, and interactive features that give it a strong sense of human-like presence and initiative [8]
- The AI can carry out "cross-dimensional" interactions, such as having milk tea delivered to users' homes and creating personalized songs, blurring the line between virtual and real experiences [8]
- EVE enriches the companionship experience through expressive detail (emojis, trending topics) and a memory system, and represents a significant breakthrough in the AI entertainment sector [8]

Group 8: Apple's XR Devices
- Apple is reportedly developing at least seven head-mounted devices, including three in the Vision series and four AI glasses; the first AI glasses are expected to launch in Q2 2027, targeting annual shipments of 3 to 5 million units [10]
- The lightweight Vision Air is expected to enter mass production in Q3 2027 and to be more than 40% lighter than the Vision Pro and significantly cheaper; XR glasses with a display are expected by late 2028 [10]
- These devices are expected to ignite the AI glasses market, with sales potentially exceeding 10 million units [10]

Group 9: Insights from Iconiq Capital's AI Report
- A survey of 300 AI companies indicates a shift from hype to practical deployment: OpenAI and Claude lead among enterprise model choices, and nearly 90% of high-growth startups are deploying AI agents [12]
- The breakdown of AI spending shows data storage and processing costs far exceeding training and inference costs, and companies are moving from traditional subscription pricing to hybrid, usage-based pricing [12]
- Among AI-native companies, 47% have reached critical scale, compared with only 13% of AI-enhanced companies; 37% of fast-growing companies focus on AI, and coding agents are the leading productivity application [12]