Character.ai
Search documents
腾讯研究院AI速递 20250604
腾讯研究院· 2025-06-03 14:49
Group 1 - Microsoft launched Bing Video Creator, supported by OpenAI's Sora technology, allowing users to generate various types of videos through natural language [1] - The service is free and offers two generation modes: quick and standard, with an initial allowance of 10 quick generation opportunities, producing videos of 5 seconds in length [1] - Built-in safety measures are included to prevent misuse, and each generated video is tagged with content credentials and traceability information; currently, it is not available in the national region [1] Group 2 - Manus introduced a new slide feature that can generate 8 professional PPT slides in 10 minutes, receiving positive feedback [2] - The testing process showed that Manus can automatically search for information, plan structure, and generate content, supporting instant modifications and various export formats, although there are issues with incomplete page displays [2] - Compared to Genspark, Manus is faster (10 minutes vs. 20 minutes) and more powerful, being rated as the best PPT creation tool currently [2] Group 3 - Character.ai launched AvatarFX, enabling static images to speak, sing, and interact with users [3] - AvatarFX is based on the DiT architecture, featuring high fidelity and strong temporal consistency, maintaining stability even in complex scenarios with multiple characters and long sequences [3] - Character.ai also introduced several AI creation features, including immersive narrative experiences and animated chat, while facing an antitrust investigation regarding Google's acquisition of the platform [3] Group 4 - Fellou 2.0 was officially released, functioning as an intelligent agent similar to "Jarvis," enabling 24/7 batch production of AI tasks [4][5] - The new version boasts improved speed (1.2-1.5 times faster), enhanced capabilities (supporting diverse delivery), and increased reliability (success rate improved from 31% to 80%) [5] - Built on the new Eko 2.0 architecture, it supports parallel processing of multiple tasks and plans to release a Windows version while continuously optimizing user experience and model intelligence [5] Group 5 - YouWare is an "ambient programming" platform designed for creators in the AI era, allowing non-programmers to convert ideas into web pages and share them online [6] - The platform's core advantage lies in its "what you see is what you think" experience, where users describe their ideas, and AI generates code for immediate visualization and sharing [6] - YouWare is supported by self-developed AI Agent and Sandbox technology, creating a community similar to "Instagram" and implementing a "Knot" reward mechanism to encourage quality content creation [6] Group 6 - Zhiyuan Research Institute open-sourced the lightweight long video understanding model Video-XL-2, capable of efficiently processing video inputs of up to ten thousand frames on a single card [7] - The model consists of a visual encoder, dynamic token synthesis module, and a large language model, employing a four-stage progressive training method and introducing a segmented pre-filling strategy [7] - Video-XL-2 outperforms all lightweight open-source models on mainstream evaluation benchmarks, encoding 2048 frames of video in just 12 seconds, applicable in film content analysis and anomaly behavior monitoring [7] Group 7 - Salesforce, the leading global CRM platform, acquired the AI Agent platform Moonhub, with the entire team joining Salesforce to develop the Agentforce platform [8] - Salesforce CEO Marc Benioff is optimistic about the development of intelligent agents, aiming to create one billion agents through Agentforce by the end of 2025, with 3,000 paying customers already onboard [8] - Moonhub specializes in recruiting intelligent agents, autonomously searching and screening candidates, complementing Salesforce's existing HR intelligent agent functions and enhancing its influence in the intelligent agent sector [8] Group 8 - Li Feifei's World Labs open-sourced the Forge renderer, enabling real-time rendering of AI-generated 3D worlds on ordinary devices [10] - Forge is a web-based 3D Gaussian splat (3DGS) renderer, seamlessly integrating with three.js, supporting multiple splat objects, cameras, and real-time animation/editing [10] - The technology's key lies in an efficient painter's algorithm for sorting issues and a programmable data pipeline, allowing developers to handle AI-generated 3D worlds as easily as processing triangular meshes [10] Group 9 - The report discusses the model selection guide by Kapasi, recommending GPT-4o for simple daily questions and switching to o3 for complex tasks [11] - Specific usage scenarios include 40% for simple daily questions with 4o, 40% for complex important issues with o3, and using GPT-4.1 for code refinement [11] - The core principle for model selection is "either-or": first determine if the task is important and if one is willing to wait (choose o3) or if it is unimportant and needs quick understanding (choose 4o) [11] Group 10 - ChatGPT's memory system consists of two main components: saving memories and chat history, which is further divided into current session history, dialogue history, and user insights [12] - The technical implementation of memory saving is achieved through bio tools, while dialogue history utilizes vector space to establish multi-layer indexing [12] - The user experience is significantly enhanced by the memory mechanism, particularly the user insight system, which may contribute over 80% to ChatGPT's improved understanding, transforming it from "you tell me" to "I can see" [12]
我在 Character.ai 做 Post Training|42章经
42章经· 2024-11-24 14:09
在我 9 月份的硅谷行程里,让我印象最深、最有收获的人之一就是 Ted。 他先后在 Meta、Apple、Google 和 Roblox 都工作过,并在 23 年年底加入了 Character.ai,做 Post Training。作为 C.AI 第四十来号员工,他对于 C.AI 的产品、模 型、训练等等的熟悉程度都非常高。 所以我这次特别把他请来,跟大家一起分享下美国最好的 AI 公司内部是如何运作的,Post Training 的最佳实践是怎么做的等等。 Inside C.AI 曲凯 : 我首先问一个问题,C.AI 一直是 AI 陪聊类产品的代表,各项数据都非常好,所以你们到底是哪个点做得比别人好? Ted: 我觉得 C.AI 走到现在,核心优势有三个: 1) 模型全自研带来的性能优势。自研模型有更大的自由度,我们可以自如地调整预训练阶段的语料比例,从而极大地提升对话效果。 2) Noam Shazeer 带来的成本优势。Noam 是创造 Transformer 的核心人物之一,一个真正少有的技术天才。创立 C.AI 后,他带领着一群业界最顶尖的技术团队,把 我们的推理成本压缩到了其它同参数量模型的 1% ...