InferenceX v2: NVIDIA Blackwell vs AMD vs Hopper (Formerly InferenceMAX)
2026-02-24 14:19
Summary of InferenceX v2: NVIDIA Blackwell vs AMD vs Hopper

Industry and Company Involved
- The document covers the competitive landscape of AI inference performance, focusing on NVIDIA's Blackwell architecture and AMD's offerings in the context of inference benchmarks and optimizations.

Core Points and Arguments
- **InferenceX v2 Overview**: InferenceX v2 builds on InferenceMAXv1, establishing a new standard for AI inference performance and economics through continuous testing across numerous GPUs and frameworks [3][4][7]
- **Benchmarking Capabilities**: InferenceX v2 is the first suite to benchmark NVIDIA's Blackwell Ultra GB300 NVL72 and B300, as well as AMD's MI355X, across the entire Pareto frontier curve [9][10] (a minimal illustration of what a frontier computation involves follows this summary)
- **Performance Comparison**:
  - AMD's MI355X is competitive with NVIDIA's B200 on performance per total cost of ownership (TCO) at FP8 precision when using disaggregated serving and wide expert parallelism [21][23]
  - NVIDIA's solutions, particularly the B200 and B300, nonetheless maintain a significant performance lead over AMD's offerings in many scenarios [28][34]
- **Energy Efficiency**: NVIDIA GPUs demonstrate superior energy efficiency, consuming significantly fewer picojoules per token than AMD across all workloads [28]
- **Composability Issues**: AMD's inference optimizations struggle with composability: individual optimizations perform well in isolation but fail to deliver competitive results when combined [29][30][31]
- **Future Focus for AMD**: AMD is advised to improve the composability of its inference optimizations and is reportedly planning to focus on software composability of FP4 and distributed inference after Chinese New Year [31][33][70]

Additional Important Content
- **Performance Improvements**: AMD has made notable gains in SGLang DeepSeek R1 FP4 configurations, nearly doubling throughput in under two months [66][67]
- **NVIDIA's Consistency**: NVIDIA's results have been more stable, with minor improvements noted for B200 SGLang over a similar timeframe [73]
- **Market Dynamics**: The document highlights the competitive dynamics between NVIDIA and AMD, emphasizing that AMD needs to increase its contributions to open-source projects and improve its software stack to remain competitive [70][42]
- **Technical Concepts**: The document explains key technical concepts such as disaggregated prefill, tensor parallelism, and the trade-off between interactivity and throughput in LLM inference [49][57][61]

This summary encapsulates the critical insights and data points from the InferenceX v2 report, providing a comprehensive overview of the competitive landscape in AI inference technology.
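To make the report's organizing concept concrete, here is a minimal Python sketch of how a throughput-versus-interactivity Pareto frontier can be computed from benchmark runs. The configuration names and numbers are entirely hypothetical; this illustrates the concept, not InferenceX's actual harness.

```python
from dataclasses import dataclass

@dataclass
class Run:
    """One benchmarked serving configuration (all values hypothetical)."""
    name: str
    tok_s_gpu: float   # system throughput: tokens/s per GPU
    tok_s_user: float  # interactivity: tokens/s per user

def dominates(a: Run, b: Run) -> bool:
    """a dominates b if it is at least as good on both axes and better on one."""
    return (a.tok_s_gpu >= b.tok_s_gpu and a.tok_s_user >= b.tok_s_user
            and (a.tok_s_gpu > b.tok_s_gpu or a.tok_s_user > b.tok_s_user))

def pareto_frontier(runs: list[Run]) -> list[Run]:
    """Keep only the runs no other run dominates, sorted by interactivity."""
    frontier = [r for r in runs if not any(dominates(o, r) for o in runs)]
    return sorted(frontier, key=lambda r: r.tok_s_user)

runs = [
    Run("TP8, batch 256", 2600.0, 20.0),
    Run("TP8, batch 64",  1800.0, 45.0),
    Run("TP4, batch 32",  1200.0, 60.0),
    Run("TP4, batch 128", 1500.0, 18.0),  # dominated by "TP8, batch 64"
]
for r in pareto_frontier(runs):
    print(f"{r.name}: {r.tok_s_gpu:.0f} tok/s/GPU at {r.tok_s_user:.0f} tok/s/user")
```

The same framing extends to the report's energy metric: picojoules per token is just sustained power divided by token throughput, so each run could carry a third axis without changing the dominance logic.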
Come to This Meetup for a Look at Frontier Technical Practice: SGLang Ultra-Long-Context Scaling, RL Post-Training Frameworks, Diffusion Language Models, and More
机器之心· 2026-01-29 08:12
Core Insights
- The article discusses the transition of artificial intelligence from a "chat" paradigm to an "actionable" intelligent-agent era, emphasizing the need for deep collaboration and experience sharing among developers in optimizing LLM systems [2]

Event Overview
- A meetup organized by the SGLang community, Machine Heart, and Zhangjiang Incubator will take place on February 6, focusing on LLM system optimization and practical implementation [2]
- The event will feature discussions on SGLang's technical roadmap, long-context expansion, RL post-training frameworks, and diffusion language model exploration [2]

Event Schedule
- 13:30-14:00: Registration
- 14:00-14:30: Keynote on the SGLang roadmap by Zhang Bozhou, core developer of SGLang [5]
- 14:30-15:00: Keynote on Omni-infer performance optimization by Zheng Jinhwan, core developer of Omni-infer [5]
- 15:00-15:30: Keynote on the slime RL scaling post-training framework by Xie Chengxing, Tsinghua University PhD student [5]
- 15:30-16:00: Keynote on SGLang CPP for long-context scaling by Cai Shangming, core developer of SGLang and Mooncake [5]

Guest Introductions
- Zhang Bozhou: Core developer of SGLang, focusing on open-source LLM support and optimization across different CUDA hardware [8]
- Zheng Jinhwan: Huawei technical expert and core contributor to Omni-infer, specializing in high-performance systems and inference optimization [9]
- Xie Chengxing: PhD student at Tsinghua University and core developer of the slime RL framework, focusing on enhancing LLM reasoning and decision-making capabilities [10]
- Cai Shangming: Researcher at Alibaba Cloud and core contributor to SGLang and Mooncake, with expertise in high-performance inference systems and distributed machine learning [10]
- Li Zehuan: Systems engineer at Ant Group and core contributor to SGLang, focusing on AI infrastructure optimization [11]
Building a Production-Grade Cloud-Native Large-Model Inference Platform on SGLang RBG + Mooncake
AI前线· 2025-12-12 00:40
Core Insights
- The article emphasizes the rapid evolution of large language model (LLM) inference services into core enterprise infrastructure, focusing on balancing performance, stability, and cost when building high-performance inference systems [2]
- It discusses the transition from monolithic to distributed architectures in LLM inference, highlighting the need for an external KVCache to relieve memory pressure and improve performance in high-demand scenarios [2][4]

Distributed KVCache and Mooncake
- Mooncake is introduced as a leading distributed KVCache storage engine designed to provide high throughput and low latency for inference frameworks like SGLang [3]
- The article outlines the challenges of managing distributed KVCache systems in production, which motivated the development of RoleBasedGroup (RBG) for unified management of caching and inference nodes [4]

RoleBasedGroup (RBG) Design and Challenges
- RBG is presented as a Kubernetes-native API aimed at AI inference, facilitating multi-role orchestration to ensure stable, high-performance operation [4][12]
- The article identifies five fundamental challenges in deploying large-model inference services, including the need for strong state management and performance optimization [12][15]

SCOPE Framework
- The SCOPE framework is introduced, covering five core capabilities: Stability, Coordination, Orchestration, Performance, and Extensibility, which are essential for managing LLM inference services [16][18]
- RBG's design allows for rapid architecture iteration and performance-sensitive operations, addressing the complexities of multi-role dependencies and operational efficiency [15][24]

Benchmark Testing and Performance Metrics
- Benchmark tests demonstrate significant improvements in KVCache hit rates and inference performance, with the L3 Mooncake cache achieving a 64.67% hit rate and reducing average TTFT (time to first token) to 2.58 seconds [32][48]
- The article highlights the importance of a multi-tier caching architecture in improving performance for applications like multi-turn dialogue and AI agents (a sketch of such a lookup path follows this summary) [44]

Conclusion and Future Outlook
- The integration of RBG and Mooncake is positioned as a transformative approach to building production-grade LLM inference services, emphasizing deep integration of high-performance design with cloud-native operational capabilities [43][44]
- The article concludes with a call for community collaboration to advance this paradigm and lay the foundation for the next generation of AI infrastructure [43]
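The benchmark results above hinge on the multi-tier caching architecture. As a rough Python sketch of how such a lookup path can be organized, the toy below models three tiers (GPU HBM, host DRAM, and a distributed L3 store in the role Mooncake plays), promoting hits toward the faster tiers. Tier names, capacities, and the promotion policy are illustrative assumptions, not Mooncake's or RBG's actual APIs.

```python
from collections import OrderedDict
from typing import Optional

class Tier:
    """A single LRU cache tier (a toy stand-in, not Mooncake's API)."""
    def __init__(self, name: str, capacity: int):
        self.name, self.capacity = name, capacity
        self.store = OrderedDict()  # key -> KV blocks, LRU order

    def get(self, key: str) -> Optional[bytes]:
        if key not in self.store:
            return None
        self.store.move_to_end(key)          # mark recently used
        return self.store[key]

    def put(self, key: str, value: bytes) -> None:
        self.store[key] = value
        self.store.move_to_end(key)
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)   # evict least recently used

class TieredKVCache:
    """Look up a prefix's KV blocks in L1 (HBM), then L2 (DRAM), then the
    shared L3 store; promote hits toward the faster tiers."""
    def __init__(self) -> None:
        self.tiers = [Tier("L1-HBM", 4), Tier("L2-DRAM", 16), Tier("L3-Mooncake", 256)]
        self.hits = self.misses = 0

    def lookup(self, prefix_hash: str) -> Optional[bytes]:
        for i, tier in enumerate(self.tiers):
            value = tier.get(prefix_hash)
            if value is not None:
                self.hits += 1
                for faster in self.tiers[:i]:   # promote into faster tiers
                    faster.put(prefix_hash, value)
                return value
        self.misses += 1                        # full miss: prefill recomputes
        return None

    def insert(self, prefix_hash: str, kv_block: bytes) -> None:
        for tier in self.tiers:                 # write through all tiers
            tier.put(prefix_hash, kv_block)

cache = TieredKVCache()
cache.insert("sys-prompt:v1", b"<kv blocks>")
for _ in range(5):                              # repeated multi-turn reuse
    cache.lookup("sys-prompt:v1")
cache.lookup("unseen-prefix")
print(f"hit rate: {cache.hits / (cache.hits + cache.misses):.0%}")  # 83%
```

In this toy, write-through insertion is what lets the shared L3 tier serve prefixes across instances, mirroring the article's point that an external KVCache relieves per-node memory pressure.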
2025 New Generation Computing Industry Conference Held, Focusing on Computing-Power Standards and Technological Innovation
Zhong Guo Xin Wen Wang· 2025-09-17 08:59
Core Insights
- The 2025 New Generation Computing Industry Conference was held in Beijing, focusing on the standardization of computing power and paths for technological innovation [1][3]
- Key discussions covered the entire AI large-model pipeline of data acquisition, preprocessing, training, fine-tuning, and inference, emphasizing the use of open-source foundation models to deliver application value [3]

Standardization and Innovation
- The conference highlighted the need for high-level planning, collaboration, and quality application in building new-generation computing standards [3]
- The establishment of working groups for GPU, DPU, computing product components, liquid-cooling ecosystems, and heterogeneous computing was announced, along with the initiation of two national standards for server power supplies [4]

Technical Challenges and Solutions
- The DPU was identified as a core chip for computing power, offloading data-processing and network-forwarding tasks to free up the CPU and GPU, though the lack of unified technical standards still hinders large-scale adoption [3]
- Two core technologies were introduced to address memory challenges in inference: Mooncake, which reduces memory consumption through shared public storage, and KTransformers, which enables CPU and GPU memory to work in concert (a sketch of this offloading pattern follows below) [3]
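To give a flavor of the CPU-GPU memory collaboration that KTransformers-style systems exploit, here is a minimal PyTorch sketch that keeps bulk expert weights in pinned host memory and stages only the experts a batch actually routes to onto the GPU. The expert count, routing, and staging policy are hypothetical, and a CUDA device is assumed; this shows the general offloading pattern, not KTransformers' implementation.

```python
import torch

NUM_EXPERTS, HIDDEN = 8, 1024  # hypothetical model dimensions

# Bulk expert weights stay in pinned host RAM so host-to-device copies
# can run asynchronously; only routed experts ever touch the GPU.
cpu_experts = [torch.randn(HIDDEN, HIDDEN).pin_memory() for _ in range(NUM_EXPERTS)]

def run_expert(x: torch.Tensor, expert_id: int) -> torch.Tensor:
    """Stage one expert's weights onto the GPU, apply it, drop the device copy."""
    w = cpu_experts[expert_id].to("cuda", non_blocking=True)
    y = x @ w     # the expert's actual computation
    del w         # device copy is transient; host copy stays authoritative
    return y

x = torch.randn(4, HIDDEN, device="cuda")
routed = [0, 3]  # hypothetical router output for this batch
out = sum(run_expert(x, e) for e in routed) / len(routed)
print(out.shape)  # torch.Size([4, 1024])
```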
For a Product to Feel "Premium", the Pairing Can't Be Basic | A Guide to "Premium-Feel" Pairings in Baking
东京烘焙职业人· 2025-08-26 08:39
Core Insights
- The article emphasizes the importance of balancing basic and extravagant elements in baked goods to create high-value offerings that resonate with consumers [1][42]
- The perception of ingredients plays a crucial role in product pricing and consumer appeal, particularly among Gen Z and new middle-class consumers [2][4]

Ingredient Trends
- Ingredient quality is the primary language of product premiumization, with consumers increasingly valuing ingredient transparency [2]
- Popular ingredient trends on platforms like Xiaohongshu and Douyin include contrasting flavors and regional specialties, such as mint chocolate and Yunnan mushrooms [5][4]

Pricing Strategies
- The article discusses pricing differences within the same product category, noting that a well-curated selection of ingredients can significantly raise perceived value [9][11]
- Seasonal ingredients evoke emotional connections, with specific keywords associated with each season influencing consumer choices [13][18]

Consumer Experience
- Visual appeal alone is no longer sufficient; products must offer multi-sensory experiences to justify premium pricing [19]
- "Surprise fillings" and layered textures can create memorable experiences that drive repeat purchases [20][23]

Marketing and Storytelling
- The ultimate competitive edge of baked goods lies in storytelling: consumers seek not just food but an experience and a lifestyle [29][30]
- Limited editions and seasonal offerings serve as emotional leverage, enhancing the perceived value of products [31]

Social Media and Branding
- Products that are visually appealing and suited to social-media sharing tend to perform better in consumer engagement and sales [38][39]
- The article highlights the importance of building a narrative around products to enhance marketability and consumer interest [34][42]
Promoting Open Collaboration and Cross-Disciplinary Integration: 2025 CCF China Open Source Conference Held in Shanghai
Zhong Guo Xin Wen Wang· 2025-08-02 13:15
Core Insights
- The 2025 CCF China Open Source Conference opened in Shanghai, focusing on key directions such as open-source large models and embodied intelligence [1][3]
- Experts from academia and industry shared forward-looking views on critical technology areas including large models, open-source hardware, and intelligent operating systems [3]

Key Developments
- The conference featured the efficient inference systems Mooncake and KTransformers, developed by a team led by Zheng Weimin, showcasing their core role in supporting workloads in the intelligent era [3]
- Academician E Weinan emphasized the paradigm shift in AI from a "model-centric" to a "data-centric" approach, highlighting the need for high-quality data infrastructure to lower the barriers to AI adoption [3]

Community and Ecosystem Initiatives
- The CCF Ubiquitous Operating System Open Community was established with participation from top universities and research institutions, focusing on technology research, project incubation, standards development, application promotion, and talent cultivation [4]
- A series of strategic initiatives was launched, including the CCF-Mulan Innovation Open Source Incubator and the Omni-Infer Cloud Co-Creation Plan [3][4]

Educational and Collaborative Efforts
- Shanghai Jiao Tong University aims to integrate open-source concepts into its curriculum, fostering talent for next-generation operating systems [5]
- The collaboration model between Shanghai Jiao Tong University and Huawei emphasizes shared goals and resources to support breakthroughs in core technologies [5]