Bandwidth

Search documents
X @mert | helius.dev
mert | helius.dev· 2025-08-18 13:33
Network Performance - Latency increases [1] - Bandwidth reduces [1]
What every AI engineer needs to know about GPUs — Charles Frye, Modal
AI Engineer· 2025-07-20 07:00
AI Engineering & GPU Utilization - AI engineering is shifting towards tighter integration and self-hosting of language models, increasing the need to understand GPU hardware [6][7] - The industry should focus on high bandwidth, not low latency, when utilizing GPUs [8] - GPUs optimize for math bandwidth over memory bandwidth, emphasizing computational operations [9] - Low precision matrix matrix multiplications are key to fully utilizing GPU potential [10] - Tensor cores, specialized for low precision matrix matrix multiplication, are crucial for efficient GPU usage [6][37] Hardware & Performance - GPUs achieve parallelism significantly exceeding CPUs, with the Nvidia H100 SXM GPU capable of over 16,000 parallel threads at 5 cents per thread, compared to AMD Epic CPU's two threads per core at approximately 1 watt per thread [20][21] - GPUs offer faster context switching compared to CPUs, happening every clock cycle [23] - Bandwidth improvement increases at the square of latency improvement, favoring bandwidth-oriented hardware [25][26] Model Optimization - Small models can be more hardware-sympathetic, potentially matching the quality of larger models with techniques like verification and multiple generations [32][33] - Multi-token prediction and multi-sample queries can become nearly "free" due to tensor core capabilities [36] - Generating multiple samples or tokens can improve performance by leveraging matrix matrix operations [39]
X @Starlink
Starlink· 2025-07-18 19:36
BANDwidth 🛰️🎸Bob Plankers (@plankers):Band: “Hey, is that a Starlink? Could we connect to it? Cell sucks here and we need to make a call.”Me: “Absolutely.”(I printed a mount for it to use with an Irwin clamp, trying it out) https://t.co/qi7pvQOoHg ...