Tencent Research Institute AI Express 20250704
Tencent Research Institute · 2025-07-03 15:31
Group 1
- Google, Nvidia, and seven other institutions have launched Mirage, the world's first AI-native UGC game engine, which generates game content in real time from natural-language commands [1]
- Mirage runs smoothly at 16 FPS and supports 5-10 minutes of continuous gameplay, with graphics quality comparable to GTA and Forza [1]
- The core technology is a "world model" built from Transformer and diffusion models, trained on extensive gaming data to enable dynamic interaction and real-time control [1]

Group 2
- Zhiyuan Research Institute has released OmniGen2, a unified image-generation model that supports text-to-image, image editing, and subject-driven image generation [2]
- The model introduces an innovative image-generation reflection mechanism that significantly improves context understanding, instruction following, and image quality [2]
- OmniGen2 offers an open research preview, with model weights, training code, and training data fully open-sourced; it passed 2,000 GitHub stars within a week [2]

Group 3
- Google is providing the Gemini AI tool suite free to educators worldwide, deeply integrated into Google Classroom and ChromeOS [3]
- Gemini in Classroom includes over 30 AI tools that automatically generate lesson plans, classroom activities, and quiz questions, saving teachers preparation time [3]
- New AI tools such as NotebookLM and Gems, along with data-analysis features, aim to deliver personalized learning experiences and data-driven teaching [3]

Group 4
- Xingliu Agent is a multifunctional AI creation platform that handles creative tasks such as batch emoji generation, brand VI design, video generation, and 3D modeling through natural-language commands [4][5]
- Key features include high-quality bulk content generation, Kontext intelligent image editing, and full-media workflow support, establishing a new "vibe designing" paradigm [5]
- The platform offers free trial credits and supports diverse creative outputs, shifting the designer's role from "mastering technology" to "understanding needs and expressing creativity" [5]

Group 5
- Tencent Yuanbao has added a feature supporting AI-based image and video content search, intelligently matching content without restrictions on model usage [6]
- Results can intelligently reference related video tutorials, pairing text with video explanations and offering one-click playback [6]
- Users can continue asking follow-up questions after the initial answer, improving the interactive experience [6]

Group 6
- Saining Xie's team has released the Blender Fusion framework, enabling precise control of 3D scenes without relying on text prompts [7]
- The core technology is a three-step pipeline: separating objects from the scene with the SAM model, editing in Blender, and generating high-quality composite images with a diffusion model [7]
- The system employs a dual-stream diffusion compositor, using techniques such as source masking and simulated object jittering to improve generalization and realism [7]

Group 7
- xAI is set to release the new Grok 4 series, including the flagship Grok 4 and the specialized programming model Grok 4 Code, with launch expected after U.S. Independence Day [8]
- Grok 4 features a 130,000-token context window and supports function calling, structured outputs, and reasoning, but currently lacks vision and image-generation capabilities [8]
- Elon Musk aims for Grok 4 to rewrite the human knowledge base, filling in missing information and correcting errors, while Grok 4 Code will serve as a professional programming assistant [8]

Group 8
- The U.S. Department of Commerce has lifted the temporary bans on the three major EDA companies, Siemens, Synopsys, and Cadence, restoring Chinese customers' full access to their software and technology [11]
- The earlier, sudden export restriction triggered a sharp drop in their stock prices, with Synopsys forecasting a 28% year-on-year decline in revenue from the China region [11]
- The domestic EDA industry still faces challenges in maturity and market share, as chip-design companies prefer more mature foreign products to ensure successful tape-out [11]

Group 9
- The World Economic Forum's "Future of Jobs Report 2025" indicates that AI and machine-learning specialists will be the fastest-growing occupation, with job numbers expected to grow 86% [12]
- AI is set to reshape the global labor market: data analytics, cybersecurity, and technical literacy are the three fastest-growing skills, while traditional roles such as data-entry clerks and administrative assistants face declining demand [12]
- Roughly 39% of employees' skills are expected to change significantly between 2025 and 2030, yet only 50% of employees have received systematic training, and 63% of employers view skill gaps as the biggest obstacle to business transformation [12]
New from Saining Xie's team: precise 3D scene control without text prompts
QbitAI · 2025-07-03 04:26
Core Viewpoint
- The article discusses Blender Fusion, an innovative framework developed by Saining Xie's team that combines a graphics tool (Blender) with diffusion models to enable precise control and flexible manipulation of visual compositions, moving beyond traditional text prompts [6][9]

Group 1: Blender Fusion Framework
- Blender Fusion lets users control the position, rotation, and scale of objects in generated images using keyboard and mouse input [2][4]
- The framework runs as a new pipeline with three main steps: object and scene separation, 3D editing in Blender, and high-quality image generation using diffusion models [9][10]

Group 2: Step-by-Step Process
- The first step is object-centric layering: objects are separated from the original scene, and their 3D information is inferred using existing visual models such as Segment Anything Model (SAM) and Depth Pro [13][14]
- The second step is Blender-grounded editing, which allows detailed editing of the separated objects and camera control within Blender [18]
- The final step is generative compositing: a dual-stream diffusion compositor refines the visual quality of the rendered scene while maintaining global consistency [22][23]

Group 3: Techniques and Results
- Two important training techniques are introduced: source masking, which teaches the model to restore complete images from conditional information, and simulated object jittering, which improves the model's ability to decouple camera and object movements [24]
- Blender Fusion demonstrates strong visual-generation capability, maintaining spatial relationships and visual coherence in complex scene edits, including single-image processing and multi-image scene recomposition [25][29]

Group 4: User Experience and Implications
- The framework gives creators greater freedom and control, allowing them to manipulate visual elements without being constrained by text prompts [33]
- The pipeline from object layering to high-fidelity generation makes AI image synthesis more intuitive and flexible, akin to building with blocks [35]
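The object-centric layering step described above can be sketched as a depth unprojection: given a segmentation mask and a depth map, masked pixels are lifted into camera-space 3D points. This is a minimal pure-Python illustration assuming a pinhole camera model; in the actual pipeline the mask would come from SAM and the depth from Depth Pro, and the intrinsics `fx`, `fy`, `cx`, `cy` here are hypothetical toy values.

```python
def unproject(depth, mask, fx, fy, cx, cy):
    """Lift masked pixels into camera-space 3D points (x, y, z)
    under a pinhole camera model."""
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if mask[v][u] and z > 0:
                # Standard pinhole back-projection per pixel.
                x = (u - cx) * z / fx
                y = (v - cy) * z / fy
                points.append((x, y, z))
    return points

# Toy 2x2 depth map with a single masked (object) pixel.
depth = [[0.0, 2.0],
         [0.0, 0.0]]
mask  = [[0, 1],
         [0, 0]]
pts = unproject(depth, mask, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
# pts → [(1.0, -1.0, 2.0)]
```

The resulting point set (in practice, a textured mesh or point cloud per object layer) is what gets imported into Blender for the editing step.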
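The two training techniques from Group 3 can be illustrated in a few lines. This is a hedged sketch, not the paper's implementation: `mask_source` blanks the object's source-view pixels in the conditioning image so the model must regenerate the full image rather than copy, and `jitter_pose` applies a small random rigid perturbation to an object's pose so that object motion is decoupled from camera motion during training. All ranges and data shapes are illustrative.

```python
import math
import random

def mask_source(image, mask, fill=0.0):
    """Blank the object's source-view pixels in the conditioning
    image, forcing the model to restore the complete image."""
    return [[fill if mask[v][u] else px
             for u, px in enumerate(row)]
            for v, row in enumerate(image)]

def jitter_pose(position, yaw, max_shift=0.05,
                max_rot=math.radians(5.0), rng=None):
    """Perturb an object's pose with a small random translation and
    rotation, simulating object motion independent of the camera."""
    rng = rng or random.Random()
    x, y, z = position
    jittered = (x + rng.uniform(-max_shift, max_shift),
                y + rng.uniform(-max_shift, max_shift),
                z + rng.uniform(-max_shift, max_shift))
    return jittered, yaw + rng.uniform(-max_rot, max_rot)

# Toy 2x2 grayscale image; the top-right pixel belongs to the object.
image = [[0.3, 0.9],
         [0.5, 0.7]]
mask  = [[0, 1],
         [0, 0]]
conditioned = mask_source(image, mask)   # object pixel blanked

pos, yaw = jitter_pose((0.0, 0.0, 1.0), 0.0, rng=random.Random(42))
```

In training, the jittered pose would drive a re-render of the edited scene, and the masked conditioning view plus that render form the dual-stream input to the compositor.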