Seko

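Sora Tops Apple's App Store; Alibaba Cloud Upgrades Its Full-Stack AI System; Nvidia Plans to Invest $100 Billion to Help OpenAI Build Data Centers | Weekly AI News Recap
36Kr · 2025-10-04 13:22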
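Source | Future Human Lab (未来人类实验室, ID: LabforAI) Cover image | Unsplash
OpenAI launches AI video generation app Sora, which tops Apple's App Store
Sora, the AI video generation app OpenAI released this week, has made a strong debut, overtaking Google's Gemini and OpenAI's own ChatGPT to become the top free app on Apple's App Store. Users can create and share AI videos generated from copyrighted content, and can remix other users' videos and post them to a social-media-style feed. The app is available only on iOS, and access requires an invite code. The underlying Sora 2 model can generate highly realistic scenes and sound. OpenAI says it has taken steps to keep the use of users' likenesses safe, but some controversial videos involving OpenAI CEO Sam Altman have prompted debate about the app's uses, harms, and legality. (CNBC)
Alibaba Cloud upgrades its full-stack AI system at the Apsara Conference and will invest RMB 380 billion over the next three years in cloud and AI infrastructure
On September 24, at the 2025 Apsara Conference, Alibaba Cloud CTO Zhou Jingren unveiled seven Tongyi large models spanning language, vision, speech, multimodal, and coding. The flagship Qwen3-Max has more than one trillion parameters and was pretrained on 36T tokens; in tests covering coding, agent tool calling, and other tasks it surpasses GPT-5, Cla ...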
SenseTime Launches Short-Film Creation Platform Seko; Sharply Lower Costs Draw 100,000 Creators
Zheng Quan Shi Bao Wang· 2025-09-30 11:41
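AI had already lowered the barrier to film creation considerably, but the production process remained cumbersome: text-to-image, image-to-video, editing, and dubbing each required separate software, so creators had to master several platforms and production efficiency suffered. SenseTime's Seko pioneers an integrated "create and edit in one" model that restructures the intelligent workflow of film production, letting users produce complete, coherent, professional AI films in one place. It covers recently popular formats such as AI short dramas and AI comic dramas, and has been described as an "AI short-drama master."
Breaking a script down into storyboard scenes is the core difficulty of film production. Seko can accurately decompose complex scripts and generate detailed storyboards and character dialogue. SenseTime's in-house SekoTalk technology supports generating video from long images with matched lip movements, with no limit on duration. The platform also provides dubbing and soundtrack services with 60 natural-sounding voices, and uses its generation capabilities to create character visuals, dialogue, and voice-over that closely match one another.
Generating short dramas with artificial intelligence (AI) is becoming a new trend. SenseTime (00020.HK) recently launched Seko, an AI short-film creation platform and the industry's first AI video agent to integrate the creation and editing workflow. Users generate films by entering text instructions and conversing with the AI, and characters and visual style stay consistent across shots.
The platform has so far attracted more than 100,000 creators, including film production teams, well-known independent content creators (KOLs), and popular short-drama directors. Among them, AIG ...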
Tencent Research Institute AI Express 20250813
Tencent Research Institute · 2025-08-12 16:01
Group 1
- Nvidia and AMD have agreed to pay 15% of their revenue from specific AI chips sold in China to the U.S. government in exchange for export licenses [1]
- Nvidia will pay 15% of its revenue from H20 chips, while AMD will do the same for MI308 chips [1]
- The U.S. Department of Commerce has begun issuing export licenses for these products, but the Trump administration has not yet decided how to use the funds collected [1]

Group 2
- OpenAI achieved a gold medal in the AI category at the 2025 International Olympiad in Informatics, ranking first among AI participants and behind only five human competitors [2]
- OpenAI's performance improved from the 49th percentile last year to the 98th percentile this year, using a general reasoning model without specialized training for the competition [2]
- The model is the same one that won a gold medal at the International Mathematical Olympiad, showcasing its strong general reasoning capabilities [2]

Group 3
- Zhipu released and open-sourced the GLM-4.5V model, which has 106 billion parameters and achieved state-of-the-art performance on 41 multimodal benchmarks [3]
- The model outperformed 99% of human players in image recognition and reasoning tests, achieving a notable rank in a global scoring competition [3]
- It uses a three-stage training strategy and supports long-context multimodal inputs, with low API usage costs [3]

Group 4
- Kunlun Wanwei launched the Matrix-3D model, which generates high-quality panoramic videos from single images and enables immersive 3D space exploration [4]
- The model's stated advantages are global scene consistency, large generation range, high controllability, strong generalization ability, and fast generation speed [4]
- A dataset containing 116,000 panoramic videos and 22 million frames was created to support the model's training [4]

Group 5
- Tencent introduced the Hunyuan Large-Vision model, which has 52 billion active parameters and enhances multimodal understanding capabilities [5]
- The model scored 1256 points on the international LMArena Vision leaderboard, ranking first among Chinese models and comparable to GPT-4.5 and Claude-4-Sonnet [5]
- It consists of three core modules and was trained on a large dataset [5]

Group 6
- GitHub will no longer operate independently and will be integrated into Microsoft's newly established CoreAI group [7]
- The integration will be overseen by multiple Microsoft executives, with a focus on making GitHub a core component of Microsoft's AI strategy [7]
- The goal is to develop GitHub into an "AI agent factory" [7]

Group 7
- SenseTime launched the AI tool Seko, which automates the video production process based on user descriptions [8]
- Seko integrates various models to keep character portrayal, scene materials, and camera movements consistent [8]
- The tool offers a visual editing experience, and advanced features are planned for the future [8]

Group 8
- Apple is gradually revamping Siri, with a new architecture set to launch by late 2025 or early 2026 [9]
- The new Siri will enhance inter-application communication and support continuous dialogue [9]
- Apple is conducting extensive internal testing with strategic partners to ensure security and reliability [9]

Group 9
- Periodic Labs, co-founded by former OpenAI and Google DeepMind leaders, aims to create a "ChatGPT for materials science" and has secured $200 million in funding [10]
- The startup achieved a pre-money valuation of $1 billion shortly after its establishment [10]
- The funding will be used to develop AI for discovering and analyzing new compounds [10]

Group 10
- GPT-5 consumed significantly fewer tokens than Claude Opus 4.1 on algorithmic tasks, saving approximately 90% in overall token usage [12]
- Claude Opus 4.1 excelled in web development tasks but at a higher token cost [12]
- The cost comparison shows GPT-5 completing tasks at about $3.50, while Claude Opus 4.1 costs around $7.58 [12]