Inference Acceleration
WaveSpeedAI's Cheng Zeyi: AI Infra Has Always Been a Business That Makes Money
Founder Park· 2025-06-10 12:59
Core Viewpoint
- The article discusses the journey of Cheng Zeyi, who transitioned from working in a large tech company to founding WaveSpeedAI, a startup focused on AI infrastructure, emphasizing the importance of inference acceleration in the AI industry and the potential for significant market growth in AI video generation [4][39].

Group 1: Background and Motivation
- Cheng Zeyi initially did not plan to start a business but felt constrained in a large company environment after rapid promotions [1][6].
- Technical professionals often seek to prove their value and find better opportunities to utilize their skills [2][3].
- After leaving a large company, Cheng Zeyi validated his skills by creating a new model that gained significant attention on GitHub, leading him to realize the market demand for his expertise [8][11].

Group 2: Company Formation and Strategy
- WaveSpeedAI was founded to provide inference acceleration for image and video generation, with early revenue growth indicating strong market demand [4][26].
- The company adopted a unique approach of prioritizing revenue generation before expansion, focusing on a lean team structure to maintain agility and responsiveness [27][30].
- Cheng Zeyi emphasized the importance of infrastructure in AI, likening it to a vehicle's transmission system that affects performance and user experience [15][20].

Group 3: Market Insights and Opportunities
- The AI video generation market is projected to grow significantly, with a compound annual growth rate that could lead to billions in revenue by 2030 [42].
- Current high costs of AI video generation limit widespread adoption, creating a demand for more cost-effective solutions [43][44].
- WaveSpeedAI aims to reduce costs to one-fifth of existing platforms while maintaining high quality and low latency, addressing a critical need in the market [46].

Group 4: Collaborative Ecosystem and Future Plans
- The company collaborates with various partners to enhance its service offerings and expand its market reach, focusing on creating a symbiotic ecosystem rather than competing directly with larger firms [32][48].
- WaveSpeedAI is committed to empowering global creators by providing resources and support for developers, aiming to foster innovation in the AI space [55][56].
- The company aspires to be a model for Chinese AI enterprises in global markets, encouraging confidence and ambition among local entrepreneurs [57][58].
ICLR 2025 | The First Dynamic Vision-Text Sparsification Framework Is Here, Cutting Computational Overhead by 50%-75%
机器之心· 2025-04-29 03:22
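This work was completed jointly by East China Normal University and Xiaohongshu. The co-first authors are Huang Wenxuan and Zhai Zijie, master's students at East China Normal University and interns on the Xiaohongshu NLP team. The corresponding authors are Cao Shaosheng, head of the Xiaohongshu NLP team, and Lin Shaohui, a researcher at East China Normal University.

Multimodal large language models (MLLMs) have achieved remarkable results in visual understanding and reasoning. However, as the decoding stage keeps generating new tokens, the computational complexity and GPU memory footprint of inference grow steadily, which lowers the inference efficiency of MLLMs. Existing methods accelerate inference by reducing visual token redundancy in the prefill stage. Unfortunately, the speedup gained from sparsifying visual tokens during prefill gradually diminishes in the decoding stage. As the number of decoded text tokens grows, these methods still hit a performance bottleneck.

To address this problem, the team proposes Dynamic-LLaVA, a new dynamic vision-text context sparsification framework for inference acceleration. The framework designs tailored sparsification schemes for the different inference modes of MLLMs (the prefill stage, and the decoding stage with and without KV cache) to achieve efficient inference. Experimental results show that Dynamic-LLaVA, with almost no loss in visual understanding and generation capability ...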
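
The claim that a prefill-only speedup fades during decoding can be illustrated with simple arithmetic. The sketch below is a toy model, not the Dynamic-LLaVA method: it assumes that per-step decoding cost scales with the number of context tokens held in the KV cache, and every number in it (576 visual tokens, 64 prompt text tokens, the keep ratios) is an illustrative assumption rather than a figure from the article.

```python
def context_tokens(num_visual, num_prompt_text, num_generated,
                   visual_keep=1.0, generated_keep=1.0):
    """Toy count of context tokens attended to at one decode step.

    visual_keep and generated_keep are hypothetical keep ratios (1.0 = dense).
    """
    return (visual_keep * num_visual
            + num_prompt_text
            + generated_keep * num_generated)


def relative_saving(num_visual, num_prompt_text, num_generated,
                    visual_keep=1.0, generated_keep=1.0):
    """Fraction of context tokens removed compared with the dense baseline."""
    dense = context_tokens(num_visual, num_prompt_text, num_generated)
    sparse = context_tokens(num_visual, num_prompt_text, num_generated,
                            visual_keep, generated_keep)
    return 1.0 - sparse / dense


if __name__ == "__main__":
    # Illustrative assumptions: 576 visual tokens, 64 prompt text tokens.
    for generated in (0, 1000, 4000):
        # Prefill-only: keep 20% of visual tokens, keep all generated text.
        prefill_only = relative_saving(576, 64, generated, visual_keep=0.2)
        # Also drop half of the generated text tokens from the context.
        both = relative_saving(576, 64, generated,
                               visual_keep=0.2, generated_keep=0.5)
        print(f"generated={generated:5d}  "
              f"prefill-only saving={prefill_only:.2f}  "
              f"vision+text saving={both:.2f}")
```

Under these assumptions, the saving from pruning only visual tokens shrinks as the generated sequence grows, because the untouched text tokens come to dominate the context. Sparsifying the generated text context as well keeps the saving from collapsing, which is the motivation the article gives for handling the decoding stage.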