Tencent Research Institute AI Express 20260129

Group 1: OpenAI Developments
- OpenAI launched Prism, a cloud-based LaTeX workspace powered by GPT-5.2 that integrates drafting, editing, collaboration, and publishing, and can read a paper's overall structure and context [1]
- Prism offers intelligent literature search, sketch-to-LaTeX conversion, and voice editing; it supports unlimited collaborators and is free for all ChatGPT users [1]
- OpenAI says AI transformed software development in 2025 and expects it to transform science in 2026, positioning Prism as a pioneer in accelerating scientific discovery [1]

Group 2: Google AI Plus Initiative
- Google officially launched the AI Plus plan globally, priced at $7.99 per month in the U.S. with a 50% discount for the first two months, targeting budget-conscious users [2]
- The plan includes access to Gemini 3 Pro, Flow video creation, NotebookLM research assistance, and 200GB of cloud storage, and can be shared with up to six family members [2]
- Existing Google One Premium 2TB users will automatically receive all AI Plus benefits; the launch is seen as a direct response to OpenAI's ChatGPT Go [2]

Group 3: Clawdbot Rebranding
- The open-source project Clawdbot was forced to rebrand as Moltbot after trademark infringement claims from Anthropic, with the developers humorously noting "same lobster spirit, new shell" [3]
- During the rebranding, a problem on GitHub allowed cryptocurrency scammers to seize the old ID and use it for blockchain fraud, prompting the author to clarify that no tokens were ever issued [3]
- The author also advised that "most non-technical users should not install this," as the project is still in its early stages and poses security risks [3]

Group 4: Tencent's Hunyuan Image 3.0
- Tencent open-sourced Hunyuan Image 3.0, a state-of-the-art image generation model built on an 80-billion-parameter Mixture-of-Experts architecture; it ranks seventh globally on the LMArena image editing leaderboard [4]
- The model employs a "think before edit"
workflow, supporting diverse editing capabilities such as addition, deletion, style transformation, and old photo restoration [4]
- Training involved constructing a dataset with millions of entries covering more than 80 image generation task types, and used a proprietary MixGRPO algorithm to align the model with user preferences [4]

Group 5: Kunlun Tiangong's Mureka V8
- Kunlun Tiangong released the Mureka V8 music model, which uses MusiCoT technology to improve musicality, arrangement completeness, and vocal expression, moving AI music from "generable" to "publishable" [5][6]
- The V8 model surpassed Suno in subjective scoring for Chinese song generation, and the company has formed a strategic partnership with Taihe Music Group to bring AI music into mainstream production and distribution [6]
- The platform has served over 8,000 global clients and plans to iterate 2-3 versions annually, aiming to become the leading platform in the global AI music sector [6]

Group 6: Vidu's Q2 Reference Model
- Vidu launched the Q2 Reference Pro model, featuring a unique "everything can be referenced" capability that supports six types of references: effects, expressions, textures, actions, characters, and scenes [7]
- The model enables fine-grained video editing, allowing users to add, delete, modify, and replace any element, with one-click switching between real and animated styles [7]
- Users can now create special-effects films without learning professional tools such as C4D or AE, which accelerates the production of AI-driven short dramas [7]

Group 7: Ant Group's LingBot-VLA
- Ant Group released LingBot-VLA, an embodied-intelligence foundation model trained on approximately 20,000 hours of real data covering nine dual-arm robot configurations; it outperforms Pi0.5 on the GM-100 benchmark [8]
- The model uses a Mixture-of-Transformers architecture and integrates visual distillation to achieve strong generalization across different robot embodiments and scenes [8]
- The research
revealed a scaling law for the VLA model: performance kept improving as training data grew from 3,000 to 20,000 hours, with no sign of saturation [8]

Group 8: Establishment of the Interstellar Navigation Academy
- The Interstellar Navigation Academy was officially established at the Chinese Academy of Sciences, with Academician Zhu Junqiang as director, aiming to build a curriculum system covering 14 primary disciplines [9]
- The academy will introduce 22 core courses focusing on cutting-edge topics such as interstellar dynamics and governance, along with six specialized teaching practice platforms [9]
- The initiative is positioned as a key measure to seize the technological high ground, providing talent support for national deep space exploration and space science research [9]

Group 9: OpenAI CEO's Acknowledgment
- OpenAI's CEO acknowledged during a developer meeting that GPT-5.2 sacrificed writing capabilities for improved reasoning and coding, stating "we messed up," with plans to address this in future versions [10]
- The CEO predicted that by the end of 2027, the cost of GPT-5.2 level intelligence will fall by at least 100 times, leading to personalized app versions for everyone [10]
- He emphasized that the most important skills in the AI era will be high adaptability and the ability to generate ideas, noting that while the definition of an engineer may change, demand for engineers will remain [10]

Group 10: AI for Science Competition
- OpenAI's Vice President Kevin Weil stated that GPT-5's reasoning capabilities have reached the forefront of human performance, scoring 92% on the GPQA doctoral-level test, far above GPT-4's 39% [11]
- Weil believes the greatest value of large language models lies in discovering interdisciplinary connections and forgotten research, and he is exploring ways to instill "cognitive humility" and self-fact-checking abilities in models [11]
- He predicts that 2026 will be a pivotal year for AI-enabled research, warning that
researchers who do not deeply utilize AI tools will miss opportunities to enhance efficiency [11]