EXAScaler
Hands-On, Enabling KV Cache on EXAScaler
DDN· 2026-01-30 19:12
Your EXAScaler is AI-ready. Join us in this hands-on technical session as we walk through how KV Cache is enabled on EXAScaler and what to expect in real environments. This session is focused on practical enablement and operational considerations, building directly on the KV Cache demo video.

What You’ll Learn
• How KV Cache fits into the EXAScaler architecture
• Environment prerequisites and key considerations
• How to enable and verify KV Cache and interpret performance results

Who Should Attend
• Enginee ...
KV Cache and EXAScaler, Enabling AI Without New Systems
DDN· 2026-01-29 20:01
Your EXAScaler is AI-ready. Join us to learn how to unlock it. Improve performance and GPU efficiency. In this live session, Joel Kaufman, Senior Product Manager at DDN, explains how KV Cache works with EXAScaler to address these challenges. The session focuses on architecture, impact, and when KV Cache delivers the most value, without requiring new infrastructure. This session is designed to help teams understand why KV Cache matters and where it fits as AI workloads grow.

What You’ll Learn
• How KV Cache ...
KV Cache Acceleration of vLLM using DDN EXAScaler
DDN· 2025-11-11 16:44
AI Inference Challenges & KV Caching Solution
- AI inference faces challenges with large context windows, which impact tokenization and latency [1][2]
- Caching context tokens speeds up responsiveness, lowers latency, and allows larger amounts of context to be stored [4]
- Effective caching requires storage systems with low latency and large capacity at scale [5]

DDN's Solution & Performance
- DDN's EXAScaler platform enables high-performance KV caching for AI inference, improving user concurrency, responsiveness, and user experience [7]
- DDN leverages GPUDirect Storage (GDS) for the cache engine [9]
- Caching demonstrates a 10x improvement in performance with larger contexts [14]
- EXAScaler can improve time to first token during inference by 10-25x [16]
- DDN improves response times, provides a larger cache repository, and delivers cost-effective performance and capacity density [17]

Capacity Implications
- KV caching accelerates the end-user experience, putting a premium on high-performance shared storage [16]
- Approximately 200,000 input characters resulted in a cache of 796 files, totaling almost 13 gigabytes [15]
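To put the capacity figures quoted for this video in perspective, the short sketch below derives the average cache file size and the cache footprint per input character. It uses only the three approximate numbers from the summary; treating "almost 13 gigabytes" as 13 decimal GB is an assumption, and the function name `cache_footprint` is made up for illustration. The ratios describe that one demo and are not a general sizing rule.

```python
# Approximate figures quoted in the video summary.
INPUT_CHARS = 200_000   # "approximately 200,000 input characters"
CACHE_FILES = 796       # number of cache files written
CACHE_BYTES = 13e9      # "almost 13 gigabytes", taken as decimal GB (assumption)


def cache_footprint(input_chars, cache_files, cache_bytes):
    """Return average bytes per cache file and bytes of cache per input character."""
    return {
        "bytes_per_file": cache_bytes / cache_files,
        "bytes_per_char": cache_bytes / input_chars,
    }


fp = cache_footprint(INPUT_CHARS, CACHE_FILES, CACHE_BYTES)
print(f"~{fp['bytes_per_file'] / 1e6:.1f} MB per cache file")
print(f"~{fp['bytes_per_char'] / 1e3:.0f} KB of cache per input character")
```

With these inputs the cache works out to roughly 16 MB per file and about 65 KB per input character, which is why the summary stresses large, fast shared storage.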
EXAScaler Multi-Tenancy Demo
DDN· 2025-09-17 23:03
Core Functionality
- The EXAScaler data intelligence platform supports multi-tenancy by leveraging VLANs and secure data partitions [1][2]
- Client access controls prevent unauthorized data access, enhancing security [2]
- Capacity management controls via quotas allow flexible space allocation to tenants [2]

Technical Implementation
- Network configuration utilizes VLANs with paired IP addresses for intracluster networking and tenant connections [3]
- Each tenant maps to two IPs for multiple connections to each VLAN, ensuring high availability [3]
- Multi-tenancy is enabled via EMF settings and synced across the cluster [4]
- Clients without registered IPs on the appropriate VLAN lose system access due to VLAN isolation [5]

Quota Management
- Hard quotas enforce strict limits, preventing tenants from exceeding their allocated capacity and ensuring that the total capacity allocated to all tenants never exceeds the cluster's capacity [7][9]
- Soft quotas allow tenants to use shared capacity by overallocating quotas, potentially leading to less waste but requiring trust [7][10]
- A hybrid approach combines soft and hard quotas, providing leeway while preventing excessive consumption of free space [11][12]

Data Handling
- The system supports on-the-fly quota adjustments while serving data to clients [9]
- The demo created a 10 TB test file to illustrate quota enforcement [8]
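The difference between strict and overallocated quotas described above can be illustrated with a toy model. This is a minimal sketch and not the EXAScaler or EMF interface: the `Tenant` class, the helper functions, the 100 TB cluster size, and the tenant quotas are all hypothetical. Only the 10 TB test file size comes from the summary.

```python
from dataclasses import dataclass


@dataclass
class Tenant:
    name: str
    quota_tb: float        # capacity limit assigned to the tenant
    used_tb: float = 0.0   # capacity currently consumed


def is_overallocated(tenants, cluster_tb):
    """True when the tenants' quotas sum to more than the cluster can hold."""
    return sum(t.quota_tb for t in tenants) > cluster_tb


def can_write(tenant, size_tb, tenants, cluster_tb):
    """Allow a write only if it fits both the tenant's quota and the free space."""
    free_tb = cluster_tb - sum(t.used_tb for t in tenants)
    return tenant.used_tb + size_tb <= tenant.quota_tb and size_tb <= free_tb


CLUSTER_TB = 100.0  # hypothetical cluster capacity

# Strict allocation: quotas add up to exactly the cluster capacity.
strict = [Tenant("a", 60.0), Tenant("b", 40.0)]
# Overallocation: quotas exceed the cluster, so tenants share capacity.
shared = [Tenant("a", 80.0), Tenant("b", 80.0)]

strict_over = is_overallocated(strict, CLUSTER_TB)
shared_over = is_overallocated(shared, CLUSTER_TB)

# A 10 TB test file is rejected under a 5 TB quota, then accepted
# once the quota is raised on the fly.
small = Tenant("c", quota_tb=5.0)
before = can_write(small, 10.0, [small], CLUSTER_TB)
small.quota_tb = 20.0
after = can_write(small, 10.0, [small], CLUSTER_TB)

print(strict_over, shared_over, before, after)
```

In this model the strict layout can never run out of space before a tenant reaches its limit, while the shared layout relies on tenants not all filling their quotas at once, which matches the trust trade-off noted in the summary.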