Inference
X @Ansem
Ansem 🧸💸 · 2026-05-09 13:56
RT Jukan (@jukan05): Why did xAI hand over a 220,000-GPU cluster to Anthropic? The technical backdrop to xAI's decision to hand Colossus 1 over to Anthropic in its entirety is more interesting than it appears. xAI deployed more than 220,000 NVIDIA GPUs at its Colossus 1 data center in Memphis. Of these, roughly 150,000 are estimated to be H100s, 50,000 H200s, and 20,000 GB200s. In other words, three different generations of silicon are mixed together inside a single cluster, a "heterogeneous architecture." For ...
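As a quick back-of-the-envelope on the mix described above, assuming the tweet's approximate estimates (150k H100, 50k H200, 20k GB200):

```python
# Estimated Colossus 1 composition from the tweet (approximate figures).
cluster = {"H100": 150_000, "H200": 50_000, "GB200": 20_000}

total = sum(cluster.values())  # should land at the ~220,000 GPUs cited

for generation, count in cluster.items():
    share = count / total
    print(f"{generation}: {count:,} GPUs ({share:.0%} of {total:,})")
```

Roughly two-thirds of the cluster is the oldest generation, which is what makes the scheduling problem across the three silicon tiers non-trivial.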
CoreWeave CEO Mike Intrator: We are very confident in our revenue for 2026
CNBC Television · 2026-05-08 20:41
But, you know, look, look, the GAAP EPS, $140 loss listed. People are looking for a $118. I look at that and I say, you know what, let's not forget the idea that we want to see that number go down. >> So, so, look, you know, um, I think it's important when you're looking at our financial statements to kind of make sure that you're not, uh, overindexing to any one particular number, and you're looking at what we're building. And so, >> But a loss is a loss. The loss is a loss, but a lot of that loss is an impact of tax on G ...
Starcloud's Philip Johnston: Why the Cheapest Compute Will Be in Space
Sequoia Capital · 2026-05-06 16:49
Thanks so much for having me. Um, my name is Philip Johnston, and I'm the co-founder and CEO of Starcloud, and just like the previous company, we have also been abusing GPUs in ways they were not designed for. >> [laughter] >> Um, so yeah, we're building data centers in space, um, mainly for the energy that we can draw, and I will spend the next 5 minutes, um, explaining why it will soon make much more sense to build data centers in space than it does to build them on Earth, and then I'll take 5 minutes for questions ...
Inference Chips for Agent Workflows
Y Combinator · 2026-05-04 20:11
Most AI chips are designed for a world where inference means prompt in, response out. Agents don't work that way. They loop, calling tools, branching, backtracking, holding context across dozens of steps. That's a completely different hardware problem. Current GPUs hit 30 to 40% of peak utilization on these workloads because the work is bursty, bouncing between memory-bound model calls, IO-bound tool use, and CPU-bound orchestration. That gap is where purpose-built silicon wins. Nvidia ...
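The loop the clip describes can be sketched in Python. The `call_model` and `run_tool` functions below are hypothetical stand-ins, assumed only to illustrate why the workload alternates between memory-bound, IO-bound, and CPU-bound phases rather than being one long GPU kernel:

```python
# Minimal sketch of an agent inference loop. call_model() and run_tool()
# are hypothetical stand-ins for a real LLM call and real tools.

def call_model(context):
    # Memory-bound phase: a real LLM forward pass streams weights/KV-cache.
    # Stand-in logic: keep requesting a tool until three results are in context.
    tool_steps = sum(1 for m in context if m["role"] == "tool")
    if tool_steps < 3:
        return {"action": "tool", "name": "search", "args": tool_steps}
    return {"action": "finish", "answer": f"done after {tool_steps} tool calls"}

def run_tool(name, args):
    # IO-bound phase: a real tool would hit a network API or disk.
    return f"{name} result for {args}"

def agent_loop(task, max_steps=10):
    context = [{"role": "user", "content": task}]  # context held across steps
    for _ in range(max_steps):
        decision = call_model(context)             # memory-bound model call
        if decision["action"] == "finish":
            return decision["answer"]
        observation = run_tool(decision["name"], decision["args"])
        # CPU-bound phase: orchestration - parse, validate, grow the context.
        context.append({"role": "tool", "content": observation})
    return "step budget exhausted"

print(agent_loop("example task"))  # -> done after 3 tool calls
```

Each iteration bounces the hardware between all three regimes, which is why a chip sized for steady prompt-in/response-out throughput sits idle much of the time here.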