DataPelago Nucleus

DataPelago Nucleus Outperforms cuDF, Nvidia's Data Processing Library, Raising The Roofline of GPU-Accelerated Data Processing
GlobeNewswire News Room · 2025-08-22 10:00
Core Insights
- DataPelago Nucleus significantly outperforms Nvidia's cuDF in compute-intensive operations on Nvidia GPUs, enhancing price/performance for data processing workloads without requiring code or infrastructure changes [1][4][5]

Industry Context
- Businesses are facing challenges in managing growing volumes of complex data for ETL, business intelligence, and GenAI workloads, necessitating the use of GPUs for better performance due to their massive parallelism and throughput advantages [2][5]
- The limitations of CPU-based data processing are becoming apparent, as they cannot keep pace with the demands of modern data workloads [2]

Product Performance
- Nucleus is designed to overcome challenges associated with GPU data processing, such as I/O bottlenecks and limited GPU memory, by utilizing better parallel algorithms and optimized multi-column support [4][5]
- Benchmark results indicate that Nucleus is up to 10.5x faster for project operations, 10.1x faster for filter operations, and 4.3x faster for aggregate operations compared to cuDF [8]
- For hash join operations, Nucleus achieves up to 38.6x faster throughput for smaller strings and up to 4x faster for larger strings, with significant improvements in hash aggregate operations [8]

Company Vision
- DataPelago aims to set a new standard in data processing for the accelerated computing era, addressing performance, cost, and scalability limitations faced by organizations [5][6]
- The company is focused on transforming data processing economics to support the growing demands of AI and data acceleration [6][7]