CUDA

Continuous Profiling for GPUs — Matthias Loibl, Polar Signals
AI Engineer· 2025-07-22 19:46
GPU Profiling & Performance Optimization
- The industry emphasizes improving performance and saving costs by optimizing software, potentially reducing server usage by 10% [4]
- Sampled profiling balances data volume against continuous monitoring; for example, sampling 100 times per second results in less than 1% CPU overhead and 4 MB of memory overhead [5]
- The industry highlights the importance of profiling in production to observe real-world application performance with low overhead [8]
- The company's solution leverages Linux eBPF, enabling profiling without application instrumentation [9]
Technology & Metrics
- The company's GPU profiling solution uses Nvidia NVML to extract metrics, including overall node utilization (blue line), individual process utilization (orange line), memory utilization, and clock speed [11][12]
- Key metrics include power utilization (with the power limit shown as a dashed line), temperature (important for avoiding throttling at 80 degrees Celsius), and PCIe throughput (negative for receiving, positive for sending, e.g. 10 MB/s) [13][14]
- The solution correlates GPU metrics with CPU profiles collected using eBPF to analyze CPU activity during periods of less than full GPU utilization [14]
GPU Time Profiling
- The company introduces GPU time profiling to measure the time spent in individual CUDA functions, determining the start and end times of CUDA kernels via the Linux kernel [18]
- The solution displays CPU stacks whose leaf nodes represent the functions taking time on the GPU, with colors indicating different binaries (e.g. blue for Python) [19][20]
Deployment & Integration
- The company's solution can be deployed as a binary on Linux, with Docker, or as a DaemonSet on Kubernetes, requiring a manifest YAML and a token [21]
- turbopuffer is interested in integrating the company's GPU profiling to improve the performance of its vector engine [22]
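The correlation step mentioned under Technology & Metrics (looking at CPU activity while the GPU is less than fully utilized) can be illustrated with a small sketch. This is not Polar Signals' implementation: the `GpuSample` and `CpuSample` types, the 90% threshold, and the sample data are all assumptions made for illustration, and a real system would feed in NVML readings and eBPF-collected stacks instead.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class GpuSample:
    timestamp: float    # seconds since start of capture (hypothetical unit)
    utilization: float  # GPU utilization in percent, 0-100


@dataclass
class CpuSample:
    timestamp: float    # seconds since start of capture
    stack: tuple        # call stack as function names, root first


def low_utilization_windows(gpu_samples, threshold=90.0):
    """Return (start, end) intervals in which utilization stays below threshold.

    A window opens at the first sample below the threshold and closes at the
    next sample at or above it. A window still open at the end of the data is
    closed with +infinity.
    """
    windows = []
    start = None
    for sample in sorted(gpu_samples, key=lambda s: s.timestamp):
        if sample.utilization < threshold:
            if start is None:
                start = sample.timestamp
        elif start is not None:
            windows.append((start, sample.timestamp))
            start = None
    if start is not None:
        windows.append((start, float("inf")))
    return windows


def cpu_stacks_during(cpu_samples, windows):
    """Count CPU stacks whose timestamp falls inside any [start, end) window."""
    counts = Counter()
    for sample in cpu_samples:
        if any(lo <= sample.timestamp < hi for lo, hi in windows):
            counts[sample.stack] += 1
    return counts


# Made-up data: the GPU dips between t=1.0 and t=3.0 while the CPU loads data.
gpu_samples = [
    GpuSample(0.0, 98.0),
    GpuSample(1.0, 40.0),
    GpuSample(2.0, 35.0),
    GpuSample(3.0, 97.0),
    GpuSample(4.0, 99.0),
]
cpu_samples = [
    CpuSample(0.5, ("main", "train_step")),
    CpuSample(1.2, ("main", "load_batch", "decode_image")),
    CpuSample(1.8, ("main", "load_batch", "decode_image")),
    CpuSample(2.5, ("main", "load_batch", "read_file")),
    CpuSample(3.5, ("main", "train_step")),
]

idle_windows = low_utilization_windows(gpu_samples)
idle_stacks = cpu_stacks_during(cpu_samples, idle_windows)
for stack, count in idle_stacks.most_common():
    print(" > ".join(stack), count)
```

In this toy data the stacks that dominate the low-utilization window are the data-loading ones, which is the kind of signal the correlation is meant to surface: the GPU is waiting on CPU-side work.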
X @外汇交易员
外汇交易员 (Forex Trader)· 2025-07-16 07:45
Investment & Strategy
- Nvidia plans to continue investing in China [1]
- Openness and stability are key messages received from Chinese leadership [1]
Technology & Compatibility
- Nvidia's CUDA platform is not closed, and the company is open to compatible platforms [1]
Domestic GPU Makers' "Cash Burn and Dreams"
The Economic Observer· 2025-07-11 12:17
Core Viewpoint
- The future defined by "sovereign AI" is a competition in both technology and capital, prompting Chinese GPU companies to seek funding urgently in the secondary market [2][5].
Group 1: Market Dynamics
- NVIDIA's market capitalization surpassed $4 trillion, making it the largest publicly traded company globally, which has created significant market potential for domestic GPU companies [2][3].
- Interest in domestic GPU firms is surging because the A-share market lacks a company comparable to NVIDIA, as industry experts point out [7][8].
- This valuation logic has led to speculative methods such as the "market probability" approach, in which companies like Cambricon are valued on their perceived chances of becoming the Chinese equivalent of NVIDIA [8].
Group 2: Financial Performance
- Both Moore Threads and Muxi Integrated Circuit have reported substantial losses. Moore Threads posted net losses of approximately 1.84 billion yuan, 1.67 billion yuan, and 1.49 billion yuan in 2022, 2023, and 2024, about 5 billion yuan in total [11].
- Muxi Integrated Circuit reported cumulative losses of 3.29 billion yuan from 2022 to the first quarter of 2025, with R&D expenses significantly exceeding revenue [12].
- Cambricon has also faced long-term losses, accumulating over 3.3 billion yuan in losses since its IPO in 2020 [13].
Group 3: Investment and Funding
- Moore Threads has completed eight rounds of financing in less than five years, reaching a valuation of 21.071 billion yuan by March 2025 [5].
- Domestic GPU companies see an IPO as a "lifeline" for securing the capital they need to survive and grow, which explains the urgency [22][23].
- The capital raised is intended for advancing core GPU technologies and improving governance to create value for investors [24][25].
Group 4: Customer Base and Revenue Quality
- The customer base of domestic GPU companies is heavily concentrated, with Moore Threads' top five clients accounting for over 89% of its revenue from 2022 to 2024 [28].
- The reliance on large clients, particularly government projects, raises concerns about the sustainability and quality of revenue [28][32].
- Significant changes in Muxi Integrated Circuit's top clients from 2023 to 2024 point to potential instability in its business relationships [32][33].
Group 5: Competitive Landscape and Challenges
- Domestic GPU companies face challenges in performance and compatibility, particularly in the consumer market, where user experience is critical [36][39].
- The lack of a robust software ecosystem compared with established players like NVIDIA and AMD is a significant barrier to market penetration [39].
- Supply chain vulnerabilities, particularly in access to advanced manufacturing processes, could hinder the development of next-generation chips [41][42].
Group 6: Strategic Considerations
- Industry experts suggest that success for domestic GPU companies depends on maintaining strategic focus and resilience rather than merely chasing rapid growth [42].
- Strong backing from established firms or ecosystems is emphasized as a critical factor for long-term viability [45].
Domestic GPUs: How Many Tough Nuts Are Left to Crack?
Hu Xiu· 2025-07-02 00:46
Core Viewpoint
- The recent IPO applications of two domestic GPU companies, Moore Threads and Muxi Integrated Circuit, have reignited discussion of the challenges and potential of the domestic GPU industry, particularly its high costs and the substantial investment needed to reach profitability [1][3][4][5].
Group 1: IPO Developments
- Both Moore Threads and Muxi Integrated Circuit have had their IPO applications accepted by the Shanghai Stock Exchange, a significant step for the domestic GPU sector [1][3].
- Muxi plans to raise 3.9 billion yuan, while Moore Threads aims to raise 8 billion yuan [4][5].
Group 2: Financial Performance
- Muxi's net losses for 2022, 2023, and 2024 were 777 million yuan, 871 million yuan, and 1.4 billion yuan, with R&D expenditures of 647.8 million yuan, 699 million yuan, and 900 million yuan respectively [4].
- Moore Threads' net losses over the same period were 1.84 billion yuan, 1.673 billion yuan, and 1.492 billion yuan, with R&D costs of 1.116 billion yuan, 1.334 billion yuan, and 1.359 billion yuan [5].
- Despite the losses, both companies are seeing revenue growth, with Muxi's revenue reaching 743 million yuan in 2024 and Moore Threads' reaching 438 million yuan in the same year [7][9].
Group 3: Market Dynamics
- The domestic GPU market is highly competitive, with players adopting different strategies, including ones aligned with NVIDIA and AMD technologies [12][14].
- The GPU industry relies heavily on R&D investment, and companies must keep funding development to remain competitive [21][22].
Group 4: Future Prospects
- The AI sector is identified as a significant growth area for GPUs, with the market for AI chips in China projected to grow substantially, indicating a promising future for domestic GPU manufacturers [25][26].
- The competitive landscape is expected to consolidate as many players vie for market share, suggesting that mergers and acquisitions may become more common [26][27].