MirageLSD

Age 26, two years into his startup, and his company is valued at over 20 billion yuan
创业邦 (Cyzone) · 2025-08-23 03:25
Core Viewpoint
- The article highlights the rapid growth and technology of the Israeli startup Decart, which recently reached a valuation of $3.1 billion after raising $100 million in Series B funding, and argues that its AI video generation model, MirageLSD, could reshape several industries [4][20].

Group 1: Company Overview
- Decart, co-founded by Dean Leitersdorf and Moshe Shalev, quickly became a unicorn, raising a total of $153 million in just 11 months [4][9].
- The company focuses on real-time AI creative generation, and its team has grown from 15 to 60 members within two years [10][11].
- The founders have unusual backgrounds: Leitersdorf is a prodigy who earned a PhD at 23, and Shalev is a veteran of an elite Israeli intelligence unit [5][9].

Group 2: Technology and Innovation
- Decart launched MirageLSD, described as the world's first model capable of generating infinite-length video in real time, with a latency below 40 milliseconds, which the article characterizes as faster than human perception [6][20].
- The technology uses a method called "diffusion forcing" to keep video clear over long durations, and optimizes GPU performance for rapid frame generation [17][21].
- MirageLSD lets users modify video content interactively in real time, with applications in gaming, live streaming, and other areas [19][20].

Group 3: Business Model and Market Strategy
- Decart's business strategy has two product lines: a GPU optimization tool for enterprises and a consumer-facing AI game called "Oasis," which attracted over a million users shortly after launch [11][25].
- The GPU optimization tool cuts operating costs from approximately $100 per hour to $0.25 per hour and has generated millions in revenue [11][25].
- The company aims to become a trillion-dollar enterprise and aspires to build an app downloaded by a billion users, with an impact comparable to that of smartphones [25][26].

Group 4: Future Vision
- The founders envision transforming the entertainment and creative sectors through AI and changing how users interact with technology [28][29].
- They believe AI will significantly shape the future of the internet, particularly in areas such as video content and gaming, which have yet to fully leverage AI capabilities [28][29].
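As a rough check on the figures quoted in this summary, the short Python sketch below works out what they imply. The function names are illustrative only, and the frame-rate calculation assumes that the quoted 40 milliseconds is the full time budget for producing one frame sequentially, which the article does not state explicitly.

```python
def max_frame_rate(latency_ms: float) -> float:
    """Highest sequential frame rate if each frame takes latency_ms to produce."""
    return 1000.0 / latency_ms


def cost_reduction_factor(before_per_hour: float, after_per_hour: float) -> float:
    """How many times cheaper the 'after' hourly cost is than the 'before' cost."""
    return before_per_hour / after_per_hour


if __name__ == "__main__":
    # 40 ms per frame is the upper bound quoted for MirageLSD's latency.
    print(max_frame_rate(40.0))
    # $100 per hour down to $0.25 per hour, as quoted for the GPU tool.
    print(cost_reduction_factor(100.0, 0.25))
```

Under these assumptions, a 40-millisecond budget corresponds to 25 frames per second, and the quoted cost figures amount to a 400-fold reduction.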
Tencent Research Institute AI Express 20250721
腾讯研究院 (Tencent Research Institute) · 2025-07-20 16:02
Group 1
- Kimi K2 surpasses DeepSeek to become the top open-source model globally, ranking fifth overall and closely trailing the leading closed-source models [1].
- K2 inherits the DeepSeek V3 architecture with parameter adjustments, including more experts and fewer attention heads [1].
- Two of the top 10 open-source models are from China, challenging the perception that "open-source equals weak performance" [1].

Group 2
- Decart releases MirageLSD, the first real-time, unlimited-length diffusion video model, capable of processing any video stream with a 40-millisecond delay [2].
- Karpathy participates as an angel investor and foresees broad applications in real-time film production, game development, and AR [2].
- The breakthrough lies in the real-time stream diffusion architecture, which addresses error accumulation through frame-by-frame generation and history augmentation [2].

Group 3
- Suno V4.5+ offers layered generation and fusion of vocals and instruments, allowing users to upload their own vocals or accompaniments for AI-assisted creation [3].
- The new "Inspire" mode lets users upload their own dry vocals so the AI can learn from them and create music that matches their vocal characteristics [3].
- With the launch of Suno V4.5+, the platform lowers the barrier to creation and improves the efficiency of AI collaboration [3].

Group 4
- The Tencent Yuanbao app integrates QQ Music services, letting users find a song with a single phrase and play it instantly without leaving the chat interface [4].
- The feature is driven by a dual-engine system combining Tencent's Hunyuan model and DeepSeek-R1, which can recognize vague music descriptions and provide contextual recommendations [4].
- User experience improvements include seamless account connectivity, multimodal interaction, and creative assistance, reflecting the evolution of AI assistants from tools to partners [4].

Group 5
- OpenAI's ChatGPT agent faces criticism from competitors such as Manus and Genspark, who highlight its limitations despite its integration of multiple functions [5].
- The ChatGPT agent can automate tasks such as retirement planning and shopping lists, but its output is considered simplistic compared with that of competitors [5].

Group 6
- PhysRig, developed by UIUC and Stability AI, introduces a differentiable physics-based rigging framework for character animation that embeds rigid skeletons into elastic soft bodies [6].
- The method replaces traditional techniques with differentiable physical simulation, addressing volume loss and deformation artifacts [6].
- The framework outperforms traditional methods across 17 character types and 120 animation tests, and supports cross-species motion transfer [6].

Group 7
- OpenAI's unreleased general reasoning model reached gold-medal level at IMO 2025, solving five of the six problems for 35 points [7].
- The model sustains deep, creative reasoning for several hours, compared with the minute-level reasoning of previous AI systems [7].
- The achievement results from breakthroughs in general reinforcement learning rather than task-specific training, although the model will not be released [7].

Group 8
- The creator of Claude Code emphasizes that the best AI tools should empower users, advocating simple, universal tools rather than complex systems [8].
- The focus is on providing foundational capabilities that let users control their workflows rather than having the tools dictate them [8].
- Effective workflows involve exploration and planning, followed by user confirmation before coding, and use test-driven development for iterative improvement [8].

Group 9
- The focus on agents, open source, and the choice of the DSV3 (DeepSeek V3) architecture is justified by the need to bring out model capabilities without relying on external products [9].
- Open-sourcing increases visibility and community contributions, and ensures genuine model progress rather than superficial improvements [9].
- The DSV3 architecture has proven superior in experiments, allowing cost-effective adjustments without introducing ineffective variables [9].

Group 10
- Many current AI products are expected to be replaced because they do not follow scaling laws; the emphasis should be on enhancing model capabilities rather than merely adding tools [10].
- Current AI models are less data-efficient than humans, indicating that algorithmic improvements matter more than simply increasing data scale [10].
- Research on multi-agent systems is evolving to explore not just interactions but also extending reasoning from minutes to hours or even days [10].
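The real-time AI video generation model that even Karpathy invested in: transforms live streams instantly, with unlimited length and near-zero latency
量子位 (QbitAI) · 2025-07-19 05:15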
Core Viewpoint
- The article discusses the AI startup Decart and its video model MirageLSD, which enables real-time video generation with near-zero latency and could change live streaming, gaming, and video communication [4][5][7].

Group 1: Technology and Features
- MirageLSD is described as the first AI model to achieve real-time video generation of unlimited length with near-zero latency, producing continuous video streams without a time limit [4][5].
- The model runs 16 times faster than previous models, generating video at 24 frames per second while accepting ongoing prompts, transitions, and edits during generation [6][28].
- It addresses the "error accumulation" problem of traditional autoregressive video models, maintaining temporal coherence while generating content frame by frame [9][11].

Group 2: Innovations and Mechanisms
- The model is a custom real-time stream diffusion model (Live-Stream Diffusion) that generates each frame from previously generated frames and user prompts, rather than from the entire video sequence [14].
- It uses Diffusion Forcing during training to denoise individual frames independently, which keeps frame-by-frame generation coherent [15].
- It adds a history augmentation strategy that simulates artifacts during training, so the model learns to correct potential errors before they accumulate [16].

Group 3: Performance and User Interaction
- MirageLSD's architecture includes an improved Transformer model and a specially designed visual encoder, which increase processing speed and reduce latency [18][20].
- The system has a dynamic input mechanism that processes player inputs with ultra-low latency, allowing immediate responses to changes in the environment [22].
- Users can change outfits or transform objects with minimal delay, demonstrating the model's interactive capabilities [23].

Group 4: Company Background and Future Developments
- Decart, the company behind MirageLSD, was founded in 2023 and previously launched the Oasis model, which also supports real-time interaction [25][26].
- The team plans to release regular upgrades and new features for MirageLSD, including facial consistency, voice control, and precise object manipulation [28].
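The two training ideas summarized under "Innovations and Mechanisms" can be illustrated with a toy data-preparation step. This is a minimal sketch under stated assumptions, not Decart's implementation: frames are plain lists of floats, the "simulated artifacts" are assumed to be small Gaussian perturbations, and every name here (`add_gaussian_noise`, `augment_history`, `make_training_example`) is hypothetical.

```python
import random
from typing import Dict, List

Frame = List[float]  # toy stand-in for an image: a flat list of pixel values


def add_gaussian_noise(frame: Frame, sigma: float, rng: random.Random) -> Frame:
    """Return a noisy copy of `frame`; the input list is not modified."""
    return [pixel + rng.gauss(0.0, sigma) for pixel in frame]


def augment_history(history: List[Frame], rng: random.Random,
                    artifact_sigma: float = 0.05) -> List[Frame]:
    """History augmentation (assumed form): perturb past frames so they look
    like imperfect model outputs rather than clean ground truth."""
    return [add_gaussian_noise(f, artifact_sigma, rng) for f in history]


def make_training_example(clip: List[Frame], rng: random.Random,
                          max_sigma: float = 1.0) -> Dict[str, object]:
    """Turn a clean clip into one toy training example."""
    history, current = clip[:-1], clip[-1]
    context = augment_history(history, rng) + [current]
    # Diffusion-forcing-style noising: every frame gets its own noise level,
    # drawn independently of the other frames.
    noise_levels = [rng.uniform(0.0, max_sigma) for _ in context]
    inputs = [add_gaussian_noise(f, s, rng)
              for f, s in zip(context, noise_levels)]
    # The denoising targets are the original clean frames.
    return {"inputs": inputs, "noise_levels": noise_levels, "targets": clip}


if __name__ == "__main__":
    rng = random.Random(0)
    clip = [[0.0, 0.1], [0.2, 0.3], [0.4, 0.5]]
    example = make_training_example(clip, rng)
    print(example["noise_levels"])
```

A model trained on examples like these sees past frames that are already slightly wrong and must still recover clean frames, which is the intuition behind correcting errors before they accumulate.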
World's first "real-time, infinite" diffusion video generation model, backed by Karpathy as an investor
机器之心 (Synced) · 2025-07-19 03:13
Core Viewpoint
- The article discusses the launch of Decart's MirageLSD, which it presents as a breakthrough in AI video generation: real-time transformation of any video stream, at unlimited length, with a latency of 40 milliseconds [3][18].

Group 1: Technology and Features
- MirageLSD is described as the first video generation model capable of producing unlimited-length video, overcoming the error accumulation that limited earlier models [23][24].
- The technology achieves near-zero-latency generation and real-time interaction by producing each frame from previous frames and user prompts, so video can continue without a preset endpoint [28][32].
- The model uses a causal autoregressive structure, which supports immediate feedback and adapts to changes in video content and user input [34][35].

Group 2: Applications and Potential
- The technology opens up new applications such as transforming camera footage into alternate realities, real-time movie production, and simplified game development [7][8][9].
- It also enables new uses in video conferencing backgrounds, virtual try-ons, and augmented reality enhancements [11][12].
- The space for "killer applications" remains wide open, and the technology has been compared to concepts from popular culture such as "Sword Art Online" [15].

Group 3: Future Developments
- Decart plans to keep releasing model upgrades and new features, including facial consistency, voice control, and precise object manipulation [16].
- The platform will also add streaming support for live broadcasts and game integration [16].
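The causal, frame-by-frame generation described under "Technology and Features" can be sketched as a streaming loop. This is an illustrative Python sketch, not MirageLSD's actual code: `toy_model` is a hypothetical stand-in for the real network, the fixed-size context window is an assumption, and the prompt handling is deliberately trivial.

```python
from collections import deque
from typing import Callable, Deque, Iterable, Iterator, List, Tuple

Frame = List[float]  # toy stand-in for an image: a flat list of pixel values
Model = Callable[[Frame, List[Frame], str], Frame]


def toy_model(frame: Frame, history: List[Frame], prompt: str) -> Frame:
    """Hypothetical model: blend the input frame with the last output frame.
    The prompt only toggles a brightness gain in this toy version."""
    gain = 2.0 if prompt == "bright" else 1.0
    if not history:
        return [gain * p for p in frame]
    previous = history[-1]
    return [0.5 * gain * p + 0.5 * q for p, q in zip(frame, previous)]


def generate_stream(inputs: Iterable[Tuple[Frame, str]], model: Model,
                    context_len: int = 4) -> Iterator[Frame]:
    """Yield one output frame per input frame, using only past outputs."""
    history: Deque[Frame] = deque(maxlen=context_len)  # bounded memory
    for frame, prompt in inputs:
        # Causal step: the model sees the current input, the current prompt,
        # and previously generated frames, never any future frames.
        output = model(frame, list(history), prompt)
        history.append(output)
        yield output


if __name__ == "__main__":
    stream = [([1.0], "plain"), ([1.0], "plain"), ([1.0], "bright")]
    for out in generate_stream(stream, toy_model):
        print(out)
```

Because the loop is a generator over an arbitrary iterable, it has no built-in endpoint, and the prompt can change between frames, which mirrors the continuous, promptable generation the article describes.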