Tencent Research Institute AI Express 20260123
Tencent Research Institute · 2026-01-22 16:01

Group 1
- Runway has launched its new Gen 4.5 model, which significantly improves camera control and storytelling and can generate three shots (close-up, medium, and long) within 5 seconds [1]
- In a test with 1,000 participants, only 57% could tell AI-generated videos from real ones; the model approaches cinematic quality in facial consistency, lighting logic, and adherence to physical laws [1]
- Video generation models are entering a new upgrade phase, trending toward greater realism, audio-visual synchronization, finer local control, and longer clip durations [1]

Group 2
- Google has partnered with The Princeton Review to integrate a full set of SAT practice tests into Gemini, letting users take free full-length mock exams with immediate scoring and detailed error analysis [2]
- The tests cover the reading, writing, and math modules and support customizable countdown timers and hints, with Gemini breaking down the problem-solving steps to aid understanding [2]
- The SAT is only the beginning: Google plans to extend Gemini to more standardized tests, positioning AI as an expert assistant across industries [2]

Group 3
- Rapid user growth for Zhipu's GLM-4.7 has strained its compute capacity, so some users are experiencing throttling and slower model speeds at peak times [3]
- Starting January 23, the GLM Coding Plan will be sold in limited quantities, with daily sales cut to 20% of the current volume to protect the programming experience of existing users [3]
- Zhipu is developing more powerful and efficient models while accelerating the expansion of its compute capacity; automatic renewals are unaffected, and the end date of the limited sale will be announced later [3]

Group 4
- Baichuan has released the medical model M3 Plus, which it reports has a hallucination rate of 2.6%, the lowest globally, and which introduces "evidence anchoring" technology that links each medical conclusion to the corresponding passage of the original paper [4]
- M3 Plus topped authoritative evaluations such as HealthBench, surpassing GPT-5.2, and its API pricing is 70% lower than that of the previous generation [4]
- Baichuan has launched the "Haina Baichuan" initiative, offering Chinese medical service institutions free access to the M3 Plus API to promote the development of the AI medical ecosystem [4]

Group 5
- Apple is secretly developing an AirTag-like AI device equipped with dual cameras and three microphones, similar to the Ai Pin; it plans to produce 20 million units and may launch the device in 2027 [5]
- Apple plans to introduce a new Siri, codenamed "Campos," that is deeply integrated with iOS 27 and offers ChatGPT-like capabilities: web search, email writing, image generation, and screen awareness [5]
- The new Siri's foundation model will be based on Google Gemini 3, with Apple paying Google approximately $1 billion annually and possibly moving to hosting on TPU servers [5]

Group 6
- Remotion is an open-source library for creating videos programmatically with React code, with dedicated Skills that can be installed in development tools such as Cursor and Claude Code [6]
- Users only need to supply the copy and pacing requirements, and AI generates the animated video automatically; the output suits product demonstrations and promotional videos, and a web editor is available for fine-tuning details [6]
- The tool lets independent developers produce their own promotional videos, pushing toward "video editing approaching programming" and supporting iterative adjustment with AI [6]

Group 7
- AAAI 2026 announced five outstanding papers, three of which were led by Chinese teams from various universities [7]
- The awarded papers cover cutting-edge topics such as vision-language-action models for robotics, multimodal representation learning, and causal discovery in dynamic systems [7]
- AAAI 2026 received 23,680 submissions and accepted 4,167, an acceptance rate of 17.6%; the conference is scheduled for January 20-27 in Singapore [7]

Group 8
- a16z reviewed the consumer AI landscape and found that the general-purpose LLM assistant market is trending toward "winner-takes-all": ChatGPT has 800-900 million weekly active users, and only 9% of users are willing to pay for multiple AI products [8]
- In 2025, image and video generation models made significant advances in realism and reasoning, with Veo 3's audio-video integration and Nano Banana Pro's search integration as key breakthroughs [8]
- Leading labs have excelled at model development, but their new consumer products have fallen short of expectations, leaving startups substantial room to grow in niche application scenarios in 2026 [8]

Group 9
- Anthropic has released the 84-page "Claude Constitution" under the CC0 license, a statement of values addressed directly to AI models that defines Claude's identity and operating principles [9]
- The constitution establishes a four-tier value priority: broad safety > broad ethics > adherence to guidelines > genuine helpfulness, and emphasizes "corrigibility" as the most critical safety property at this stage [9]
- The document sets strict boundaries, including prohibitions on assisting in the creation of weapons of mass destruction and on generating CSAM, while encouraging Claude to develop a stable and positive self-identity [9]
