Mirage
Tencent Research Institute AI Express 20250704
Tencent Research Institute · 2025-07-03 15:31
Group 1
- Google, Nvidia, and seven other institutions have launched the world's first AI-native UGC game engine, Mirage, which can generate game content in real-time through natural language commands [1]
- Mirage supports a smooth experience at 16 FPS, allowing for 5-10 minutes of continuous gameplay, with graphics quality comparable to GTA and Forza [1]
- The core technology is based on a "world model" created using Transformer and diffusion models, trained on extensive gaming data to enable dynamic interaction and real-time control [1]

Group 2
- Zhiyuan Research Institute has released OmniGen2, a unified image generation model that supports text-to-image, image editing, and theme-driven image generation [2]
- The model introduces an innovative image generation reflection mechanism, significantly enhancing context understanding, instruction adherence, and image generation quality [2]
- OmniGen2 has an open research experience version, with model weights, training code, and training data fully open-sourced, achieving over 2000 stars on GitHub within a week [2]

Group 3
- Google has announced the free provision of the Gemini AI tool suite to global educators, deeply integrated into Google Classroom and ChromeOS [3]
- Gemini in Classroom includes over 30 AI tools that can automatically generate lesson plans, classroom activities, and quiz questions, saving teachers preparation time [3]
- New AI tools like NotebookLM and Gems, along with data analysis features, aim to create personalized learning experiences and data-driven teaching [3]

Group 4
- Xingliu Agent is a multifunctional AI creation platform that can complete various creative tasks such as batch emoji generation, brand VI design, video generation, and 3D modeling through natural language commands [4][5]
- Key features include high-quality content generation in bulk, Kontext intelligent image editing, and full media workflow support, establishing a new design paradigm of "Vibe designing" [5]
- The platform offers free experience credits and supports diverse creative outputs, shifting the designer's role from "mastering technology" to "understanding needs and expressing creativity" [5]

Group 5
- Tencent Yuanbao has introduced a new feature that supports AI-based image and video content search, allowing intelligent matching of content without restrictions on model usage [6]
- The results can intelligently reference related video tutorials, facilitating a combination of text and video explanations, with one-click access to watch the videos [6]
- Users can continue to ask follow-up questions after receiving initial answers, enhancing the interactive experience [6]

Group 6
- The Xie Saineng team has released the Blender Fusion framework, enabling precise control of 3D scenes without relying on text prompts [7]
- The core technology involves a three-step process: separating objects and scenes using the SAM model, editing in Blender, and generating high-quality composite images with a diffusion model [7]
- The system employs a dual-stream diffusion synthesizer to enhance generalization and realism through techniques like source occlusion and simulated object jitter [7]

Group 7
- xAI is set to release the new Grok 4 series, including the flagship Grok 4 and the specialized programming model Grok 4 Code, with a launch expected after the U.S. National Day [8]
- Grok 4 features a context window of 130,000 tokens, supports function calls, structured outputs, and reasoning capabilities, but currently lacks visual and image generation functions [8]
- Elon Musk aims for Grok 4 to rewrite the human knowledge base, filling in missing information and correcting errors, while Grok 4 Code will serve as a professional programming assistant [8]

Group 8
- The U.S. Department of Commerce has lifted temporary bans on the three major EDA companies, Siemens, Synopsys, and Cadence, allowing full access to their software and technology for Chinese customers [11]
- Previously, a sudden export restriction led to a significant drop in stock prices, with Synopsys predicting a 28% year-on-year decline in revenue from the China region [11]
- The domestic EDA industry faces challenges regarding maturity and market share, as chip design companies prefer using more mature foreign products to ensure successful tape-out [11]

Group 9
- The World Economic Forum's "2025 Global Future of Jobs Report" indicates that AI and machine learning specialists will be the fastest-growing occupations, with an expected growth of 86% in job numbers [12]
- AI is set to reshape the global labor market, with data analytics, cybersecurity, and technical literacy emerging as the three fastest-growing skills, while traditional roles like data entry clerks and administrative assistants face declining demand [12]
- Approximately 39% of employees' skills are expected to change significantly between 2025 and 2030, yet only 50% of employees have received systematic training, with 63% of employers viewing skill gaps as the biggest obstacle to business transformation [12]
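The world's first AI-native UGC game engine is born! Type text to build a GTA-style world in seconds, and a hands-on trial is here
Jiqizhixin (机器之心) · 2025-07-03 03:26
Reported by Jiqizhixin. Editors: Du Wei, Panda. From now on, the future of games will not be built level by level by professional designers alone; everyone will be able to conceive, generate, and experience game worlds in real time. Just today, the world's first AI-native game engine driven by a real-time world model was released. The engine is called "Mirage" and was developed by Dynamics Lab. The system is designed for building dynamic, interactive, and continuously evolving game experiences: players can generate and modify the entire game world in real time through natural language, a keyboard, or a controller. In terms of positioning, Mirage supports game development across multiple genres. Two playable demos have been released so far: Urban Chaos (GTA style) and Coastal Drift (Forza Horizon style). All scenes are generated dynamically in real time rather than pre-scripted. What we see is an interactive, dynamically simulated world that evolves in real time with the player's actions. Urban Chaos (GTA style): https://demo.dynamicslab.ai/chaos Coastal Drift (Forza Horizon style): https://demo.dynamicslab.ai/drift Jiqizhixin tried Urban Chaos (GTA style) hands-on. Once opened, the interface shows control options on the left and street-scene options on the right. After playing for a short while, we found that the game's latency is still fairly high, ...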
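Ditch CUDA programming! CMU and others compile LLMs into megakernels with a few dozen lines of code, cutting inference latency by up to 6.7x
Jiqizhixin (机器之心) · 2025-06-21 01:33
Reported by Jiqizhixin. Editor: Du Wei. In AI, CUDA, developed by Nvidia, is the core compute engine driving large language model (LLM) training and inference. However, CUDA-driven LLM inference suffers from high manual optimization costs and high end-to-end latency, so it needs further optimization or a more efficient alternative. Recently, the team of CMU assistant professor Zhihao Jia (贾志豪) introduced a compiler named "Mirage Persistent Kernel (MPK)" that automatically transforms an LLM into an optimized megakernel, reducing LLM inference latency by a factor of 1.2 to 6.7. MPK is easy to use: a few dozen lines of Python code are enough to compile an LLM into a high-performance megakernel for fast inference, with no CUDA programming required. MPK pushes LLM inference latency close to the hardware limit. On a single A100-40GB GPU, MPK reduces the per-token latency of Qwen3-8B from 14.5 ms (vLLM/SGLang) to 12.5 ms, approaching the theoretical lower bound of 10 ms calculated from memory bandwidth. GitHub: https://github.com/mirage-project/mirage/ ...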
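The 10 ms lower bound quoted for Qwen3-8B can be roughly reproduced with back-of-the-envelope arithmetic: during decoding, every weight has to be read from GPU memory once per token, so the floor is approximately model size divided by memory bandwidth. The sketch below is only an illustration under assumptions that are not stated in the snippet: about 8e9 parameters stored in BF16 (2 bytes each) and about 1.6 TB/s of A100-40GB memory bandwidth. The function name is hypothetical; the 14.5 ms and 12.5 ms figures are the ones reported above.

```python
# Back-of-the-envelope check of the per-token latency figures quoted above.
# ASSUMPTIONS (not from the article): ~8e9 parameters, BF16 weights
# (2 bytes per parameter), ~1.6 TB/s A100-40GB memory bandwidth.
PARAMS = 8e9
BYTES_PER_PARAM = 2
BANDWIDTH_BYTES_PER_S = 1.6e12


def decode_latency_floor_ms(params, bytes_per_param, bandwidth_bytes_per_s):
    """Time to stream all weights from memory once, in milliseconds."""
    return params * bytes_per_param / bandwidth_bytes_per_s * 1e3


floor_ms = decode_latency_floor_ms(PARAMS, BYTES_PER_PARAM, BANDWIDTH_BYTES_PER_S)

# Figures reported in the article for Qwen3-8B on a single A100-40GB.
baseline_ms = 14.5  # vLLM/SGLang
mpk_ms = 12.5       # MPK

speedup = baseline_ms / mpk_ms
# Share of the gap between the baseline and the bandwidth floor that is removed.
gap_closed = (baseline_ms - mpk_ms) / (baseline_ms - floor_ms)

print(f"bandwidth floor: {floor_ms:.1f} ms")
print(f"speedup: {speedup:.2f}x")
print(f"gap to floor closed: {gap_closed:.0%}")
```

Under these assumptions the floor comes out to about 10 ms, consistent with the quoted figure, and the move from 14.5 ms to 12.5 ms corresponds to a 1.16x speedup for this particular model and GPU, which removes a little under half of the distance to the floor.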
SenseTime-W (00020) - 2023 H2 - Earnings Call
2024-03-26 09:00
SenseTime Group (00020) H2 2023 Earnings Call March 26, 2024 05:00 AM ET
Speaker0: Good evening, everyone, and welcome to SenseTime Group's 2023 Annual Results Presentation. I'm Jessie Lin, the Joint Company Secretary and today's MC. Let me introduce the management representatives joining us today: Dr. Xu Li, Chairman and CEO of SenseTime Group; Mr. Xu Bin, Co-Founder and Executive Director; and Mr. Wang Zheng, CFO. First of all, let me read the disclaimer. Today's discussions may contain for ...