VAREdit
Image editing too slow and too coarse? A new open-source autoregressive model delivers precise, second-level edits | HiDream.ai
QbitAI · 2025-09-02 10:45
Contributed by the HiDream.ai team to QbitAI (WeChat account: QbitAI).

AI image editing technology is advancing rapidly, and diffusion models, with their strong generative capability, have become the industry mainstream. In practice, however, these models face two persistent problems. First, "touch one part, disturb the whole": even when only a single detail needs changing, the system may alter the entire image. Second, generation is slow, making real-time interaction hard to achieve.

To address these pain points, the HiDream.ai team has taken a new path, proposing VAREdit, a new autoregressive image editing framework. It introduces a visual autoregressive (VAR) architecture that, while following the instruction, edits exactly where it is told to, substantially improving both editing precision and generation speed and pushing image editing into a new stage. Both the model and the code are open-sourced; links are given at the end of the article.

A new autoregressive image editing framework: VAREdit

VAREdit brings visual autoregressive modeling into instruction-guided image editing, framing editing as a next-scale prediction problem: the model autoregressively generates the target feature residuals at the next scale to achieve precise edits.

Multi-scale quantized encoding: the image representation is encoded into a multi-scale sequence of residual visual tokens R₁, R₂, …, R_K, where the spatial size (h_k, w_k) of R_k grows as k increases; the continuous accumulated feature fusing the residual information of the first k scales is obtained by summing the codebook lookups of R₁, …, R_k after upsampling each to a common resolution. While this provides a per-scale reference, ...
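The multi-scale residual encoding described above can be sketched in a few lines. This is a minimal illustrative sketch, not VAREdit's actual tokenizer: the "codebook lookup" here is a stand-in that snaps each value to a 0.25-step lattice, the pooling and upsampling are the simplest possible choices, and all function names are hypothetical.

```python
def avg_pool(x, size):
    # Downsample a square 2D grid to size x size by block averaging.
    n = len(x)
    b = n // size
    return [[sum(x[i*b+di][j*b+dj] for di in range(b) for dj in range(b)) / (b*b)
             for j in range(size)] for i in range(size)]

def upsample(x, size):
    # Nearest-neighbor upsample a square 2D grid to size x size.
    n = len(x)
    return [[x[i*n//size][j*n//size] for j in range(size)] for i in range(size)]

def quantize(x):
    # Stand-in "codebook lookup": snap each value to the nearest 0.25 step.
    return [[round(v * 4) / 4 for v in row] for row in x]

def encode_decode(feature, scales):
    # Encode `feature` into residual tokens R_1..R_K at increasing scales,
    # accumulating the upsampled lookups into a reconstruction.
    n = len(feature)
    residual = [row[:] for row in feature]
    accum = [[0.0] * n for _ in range(n)]
    tokens = []
    for s in scales:
        r_s = quantize(avg_pool(residual, s))   # R_k: quantized residual at scale k
        tokens.append(r_s)
        up = upsample(r_s, n)                   # lift back to full resolution
        accum = [[a + u for a, u in zip(ar, ur)] for ar, ur in zip(accum, up)]
        residual = [[f - a for f, a in zip(fr, ar)] for fr, ar in zip(feature, accum)]
    return tokens, accum

# Toy 4x4 "feature map" encoded at scales 1x1, 2x2, 4x4.
feature = [[0.1, 0.9, 0.3, 0.7],
           [0.5, 0.2, 0.8, 0.4],
           [0.6, 0.1, 0.9, 0.2],
           [0.3, 0.7, 0.4, 0.8]]
tokens, recon = encode_decode(feature, (1, 2, 4))
```

Each scale quantizes only what the coarser scales failed to capture, so the accumulated feature converges toward the original representation as k grows; in the article's framing, it is precisely these next-scale residual tokens that the editor predicts autoregressively.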
HiDream.ai releases new autoregressive image editing framework VAREdit; Doubao launches minors' protection mode | AIGC Daily
Cyzone · 2025-08-27 00:12
1. [HiDream.ai releases new autoregressive image editing framework VAREdit] The HiDream.ai team has officially launched VAREdit, a new autoregressive image editing framework. It executes user instructions precisely, avoids over-editing, and brings editing time down to roughly 0.7 seconds. (National Business Daily)

2. [Doubao launches minors' protection mode] On August 26, Doubao officially launched a minors' protection mode. Once a parent enables it with a password, recommended videos, browsing third-party web pages, conversations with agents other than Doubao, and AI creation features are disabled by default; translation, deep research, and other functions remain available. (Jiemian News)

3. [Saudi Arabia's Humain data centers to go live in early 2026, importing chips from Nvidia and others] Saudi AI company Humain has broken ground on its first batch of data centers in the country, which are scheduled to begin operating in early 2026. CEO Tareq Amin said chips will be imported from U.S. suppliers such as Nvidia. By building data centers, AI infrastructure, and cloud compute, the company hopes to turn Saudi Arabia into a regional AI hub. (Cailian Press)

4. [Alibaba Cloud Bailian announces price cuts for context caching on some models] On August 26, Alibaba Cloud's large-model service platform Bailian announced that for some models, cont ...
Musk formally sues OpenAI and Apple; e-commerce becomes a top-level entry in Xiaohongshu | 蓝媒GPT
Sou Hu Cai Jing · 2025-08-26 10:48
Group 1: Legal Actions and Allegations
- Musk's company xAI has filed a lawsuit against OpenAI and Apple, accusing them of illegal collusion to hinder competition in the AI sector [1]
- Musk claims that Apple is violating antitrust laws by favoring OpenAI in its app store rankings, making it difficult for other AI companies to compete [1]
- Musk has threatened immediate legal action against Apple and questioned why Apple has not included xAI's applications in its recommended section [1]

Group 2: E-commerce Developments
- Xiaohongshu has launched a new version of its app, making e-commerce a primary entry point, indicating a significant investment in its e-commerce business [1]
- The new version features a "Market" page in the bottom navigation bar, positioned near the main "Home" page, showcasing Xiaohongshu's lifestyle e-commerce offerings [1]

Group 3: Technology and Product Innovations
- HiDream.ai has introduced a new autoregressive image editing framework called VAREdit, which executes user instructions accurately and brings editing time down to 0.7 seconds [1]
- Multiple smartphone manufacturers, including Vivo, Honor, Xiaomi, Huawei, and OPPO, are entering the mixed reality (MR) and augmented reality (AR) markets, reflecting a strategic shift toward new human-computer interaction opportunities [2]
- Meta Platforms plans to unveil a new type of smart glasses with a display at the upcoming Connect conference, priced around $800, indicating ongoing expansion in augmented reality and wearable technology [3]
Precise image editing in 0.7 seconds! HiDream.ai team proposes VAREdit, a new autoregressive image editing framework
Mei Ri Jing Ji Xin Wen · 2025-08-25 07:35
NBD AI Flash: According to HiDream.ai's WeChat account on August 25, to tackle the problems of uncontrolled results and low efficiency, the HiDream.ai team brought the visual autoregressive (VAR) architecture into image editing and proposed VAREdit, a new instruction-guided editing framework.

On EMU-Edit and PIE-Bench, two widely recognized benchmark datasets, VAREdit shows clear advantages on both the traditional CLIP metrics and the GPT metrics that better reflect editing precision. VAREdit-8.4B improves on the GPT-Balance metric by 41.5% over ICEdit and by 30.8% over UltraEdit; the lightweight VAREdit-2.2B completes high-fidelity editing of a 512×512 image within 0.7 seconds. VAREdit is now fully open-sourced on GitHub and Hugging Face. ...

(Source: National Business Daily)
HiDream.ai releases VAREdit, a new autoregressive image editing framework that completes high-fidelity image editing in 0.7 seconds
Ge Long Hui · 2025-08-25 06:26
Core Insights
- The launch of VAREdit marks a significant breakthrough in image editing technology, being the world's first purely autoregressive image editing model [1]
- VAREdit brings editing time down to 0.7 seconds, enabling real-time interaction and efficient creation [1]

Group 1: Technology and Innovation
- VAREdit addresses limitations of diffusion models in image editing, such as imprecise modifications and low efficiency in multi-step iterations [1]
- The framework introduces a visual autoregressive (VAR) architecture, defining editing as "next-scale prediction" to achieve precise local modifications while maintaining overall structure [1]
- The innovative Scale Alignment Reference (SAR) module effectively resolves scale-matching issues, further improving editing quality and efficiency [1]

Group 2: Performance Metrics
- In the authoritative benchmarks EMU-Edit and PIE-Bench, VAREdit outperforms competitors across various metrics, including CLIP and GPT [1]
- The VAREdit-8.4B model shows a 41.5% and 30.8% improvement in the GPT-Balance metric over ICEdit and UltraEdit, respectively [1]
- The lightweight VAREdit-2.2B model achieves high-fidelity editing of 512×512 images within 0.7 seconds, a severalfold speedup [1]

Group 3: Future Developments
- VAREdit is fully open-sourced on GitHub and Hugging Face, indicating a commitment to community engagement and collaboration [2]
- The company plans to explore applications in video editing and multimodal generation, aiming to advance AI image editing into a new era of efficiency, control, and real-time capability [2]
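The articles credit a Scale Alignment Reference (SAR) module with resolving the scale mismatch between the source image's features and each target scale being predicted, but do not describe its mechanism. The sketch below only illustrates the underlying scale-matching idea, resampling one full-resolution reference grid to every coarser prediction scale; the function and variable names are hypothetical and this is not the SAR module itself.

```python
def align_to_scale(src, size):
    """Nearest-neighbor resize of a square 2D feature grid to size x size.

    A stand-in for giving each prediction scale a reference feature map of
    matching resolution (illustrative only; not VAREdit's SAR module).
    """
    n = len(src)
    return [[src[i * n // size][j * n // size] for j in range(size)]
            for i in range(size)]

# A full-resolution 4x4 "source feature" grid, resampled to every
# coarse-to-fine scale so each scale's prediction sees a size-matched reference.
src = [[float(4 * i + j) for j in range(4)] for i in range(4)]
refs = {s: align_to_scale(src, s) for s in (1, 2, 4)}
```

The design question SAR answers is which reference resolution each scale should condition on; this sketch simply makes one reference available per scale, which is the minimal form of that idea.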