Ventus (乘影) GPGPU
Tsinghua University School of Integrated Circuits Successfully Holds the "Ventus: A High-performance Open-source GPGPU Based on RISC-V" Tutorial at MICRO 2025
半导体行业观察 (Semiconductor Industry Observation) · 2025-10-26 03:16
Core Insights
- The article reports on the tutorial "Ventus: A High-performance Open-source GPGPU Based on RISC-V and Its Vector Extension", organized by Tsinghua University at the IEEE/ACM International Symposium on Microarchitecture (MICRO 2025) [1][15]
- The tutorial included eight presentations and a hands-on demonstration, showcasing Tsinghua University's comprehensive research achievements in the open-source GPGPU project "Ventus" [3][15]

Group 1: Project Overview
- Professor He Hu introduced the Ventus GPGPU project, covering its inception, key technologies, team development, future research goals, and plans for open-source community building [3][15]
- The project spans the full stack: instruction set architecture (ISA), hardware architecture, compilers, simulators, and verification tools [3][15]

Group 2: GPGPU Design Philosophy and Architecture
- PhD student Ma Mingyuan described the essence of a GPGPU as a hardware multithreaded SIMD processor, discussing core issues in instruction design and how Ventus builds a complete GPGPU based on the RISC-V Vector extension [5][16]
- Key microarchitecture components such as the CTA scheduler, core pipeline, and warp scheduler were introduced [5][16]

Group 3: Cache Subsystem and MMU Design
- PhD student Sun Haonan presented the cache subsystem and memory management unit (MMU) design under the RISC-V RVWMO memory model, which uses a release consistency-guided cache coherence mechanism (RCC) [6][16]
- The design achieved an L1 DTLB hit rate above 95% and an L2 TLB hit rate above 85% while keeping MMU overhead between 15% and 25% [6][16]

Group 4: Multi-Precision Tensor Core Design
- PhD student Liu Wei introduced a new generation of multi-precision reusable tensor cores optimized for AI workloads, supporting data precisions from FP16 to INT4 [7][16]
- Benchmark tests showed improvements of 69.1% in instruction count and 68.4% in execution cycles after integrating the tensor core [7][16]

Group 5: Differential Verification Framework
- Master's student Xie Wenxuan presented the GVM (GPU Verification Model) framework, which addresses the verification challenges posed by out-of-order execution in GPGPUs [8][17]
- The framework effectively identifies bugs and shortens debugging cycles by integrating with the Ventus software stack [9][17]

Group 6: Compiler Design
- Dr. Wu Hualin from Zhaosong Technology discussed the design considerations for the OpenCL compiler and the Triton AI operator library compiler for Ventus GPGPU [10][17]
- Ventus GPGPU supports the OpenCL 2.0 profile and has passed over 85% of the OpenCL conformance tests [10][17]

Group 7: Toolchain Design
- Engineer Kong Li introduced the design of the Ventus GPGPU toolchain, which includes core modules such as the Compiler, Runtime, Driver, and Simulator [11][17]
- The toolchain has demonstrated stable functionality in OpenCL-CTS and Rodinia benchmark tests [11][17]

Group 8: Hands-on Demonstration
- The hands-on demonstration provided an entry-level guide for developers to deploy the Ventus environment and run OpenCL programs [12][17]
- The team showcased a two-tier FPGA verification platform, successfully running key tests such as vector addition and MNIST inference [13][17]
- The tutorial highlighted Tsinghua University's systematic research capabilities at the intersection of RISC-V and GPGPU, marking significant progress in open-source high-performance computing architecture [14][17]
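
The vector-addition test mentioned in the hands-on demonstration, together with the description of a GPGPU as a hardware multithreaded SIMD processor, can be illustrated with a small sketch. The Python below is only an emulation under stated assumptions: the warp size of 32 and the names `warp_vector_add` and `count_warps` are hypothetical choices made for illustration and are not taken from the Ventus code base or the tutorial. It shows threads grouped into warps, with lanes beyond the end of the data masked off in the last warp.

```python
from typing import List, Sequence

WARP_SIZE = 32  # assumed lane count per warp; the article does not state Ventus's value


def count_warps(n: int, warp_size: int = WARP_SIZE) -> int:
    """Number of warps needed to cover n elements (ceiling division)."""
    return (n + warp_size - 1) // warp_size


def warp_vector_add(a: Sequence[float], b: Sequence[float],
                    warp_size: int = WARP_SIZE) -> List[float]:
    """Emulate c = a + b executed warp by warp with a per-lane active mask."""
    if len(a) != len(b):
        raise ValueError("input vectors must have the same length")
    n = len(a)
    c: List[float] = [0.0] * n
    for warp_id in range(count_warps(n, warp_size)):
        base = warp_id * warp_size
        # Lanes whose global index falls past the data are masked off.
        active = [base + lane < n for lane in range(warp_size)]
        for lane in range(warp_size):
            if active[lane]:
                idx = base + lane
                c[idx] = a[idx] + b[idx]
    return c


if __name__ == "__main__":
    x = list(range(70))
    y = [2 * v for v in x]
    print(warp_vector_add(x, y)[:4], count_warps(len(x)))
```

For 70 elements and the assumed warp size of 32, the sketch uses three warps, and only 6 of the 32 lanes in the last warp are active. Real hardware executes the active lanes of a warp in lockstep; the inner loop here only stands in for that behavior.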