Inference

X @Polyhedra
Polyhedra · 2025-09-25 12:00
6/ Currently working on Gemma3 quantization, focusing on:
- Learning the new model architecture
- Adding KV cache support (which accelerates inference)
- Implementing quantization support for some new operators

Full operator support will require 1+ additional day, plus more time for accuracy testing. Stay tuned for more updates 🔥 ...
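The post notes that a KV cache accelerates inference. The idea: during autoregressive decoding, the keys and values of already-processed tokens never change, so they can be stored and reused instead of recomputed at every step. The toy single-head attention sketch below illustrates this; the weight names (W_q, W_k, W_v), shapes, and the dict-of-lists cache are illustrative assumptions, not Polyhedra's implementation.

```python
import numpy as np

# Toy single-head attention decode step with a KV cache. The cache holds the
# K/V rows of all previously processed tokens, so each new token appends one
# row and attends over the cache, instead of recomputing K/V for the prefix.
d = 4
rng = np.random.default_rng(0)
W_q, W_k, W_v = (rng.standard_normal((d, d)) for _ in range(3))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def decode_step(x, cache):
    """x: (d,) embedding of the newest token; cache: past K/V rows."""
    q, k, v = x @ W_q, x @ W_k, x @ W_v
    cache["K"].append(k)                  # append only the new token's K/V
    cache["V"].append(v)
    K, V = np.stack(cache["K"]), np.stack(cache["V"])
    attn = softmax(q @ K.T / np.sqrt(d))  # attend over all cached tokens
    return attn @ V

cache = {"K": [], "V": []}
for t in range(3):                        # three decode steps
    out = decode_step(rng.standard_normal(d), cache)

print(len(cache["K"]))                    # 3 cached key rows after 3 steps
```

Without the cache, step t would recompute K and V for all t tokens, making total decode cost quadratic in sequence length per layer; with it, each step does O(seq_len) attention over stored rows.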
CoreWeave CEO: Building AI infrastructure will require trillions in public-private investment
CNBC Television · 2025-09-22 15:45
AI Infrastructure Investment & Scale
- Building planetary-scale AI infrastructure requires both private and public sector resources [2]
- The AI infrastructure buildout, encompassing energy exploration, power generation, transmission, data centers, supercomputers, and application layers, is estimated to be a multi-trillion dollar investment [5]
- This infrastructure is considered a fundamental component of the future economy for the next 50 years [5]

Demand & Monetization
- The large deals announced by AI labs and hyperscalers indicate significant demand for compute [4]
- The majority of compute being built is to serve inference, which represents the monetization of AI [11]
- Hyperscalers are being paid to build this infrastructure to serve current and projected client demand [12]

Bubble Concerns & Commercial Activity
- The question of whether the large capital investments in AI will produce a significant return is being raised [10]
- The flow of money into AI is supported by broad-based demand across the technology space as businesses integrate AI into their workflows [13]
- The companies deploying capital in AI are among the largest and most successful, differentiating this from the dot-com bubble [14]

Comparison to Dot-com Era
- Unlike the dot-com era, which was a bolt-on to existing infrastructure, AI requires a new layer of power generation due to increased power consumption [7]
- The disruption caused by AI is considered to be on the same order of magnitude as the advent of the internet [8]
X @Avi Chawla
Avi Chawla · 2025-09-22 06:39
Here's a hidden detail about Dropout that many people don't know.

Assume that:
- There are 100 neurons in a layer, and all activation values are 1.
- The weight from each of the 100 neurons to a neuron "A" in the next layer is 1.
- Dropout rate = 50%

Computing the input of neuron "A":
- During training → approx. 50 (since ~50% of values will be dropped).
- During inference → 100 (since we don't use Dropout during inference).

So essentially, during training, the average neuron input is significantly lower than that during infer ...
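The train/inference mismatch described above is exactly why frameworks use "inverted dropout": surviving activations are rescaled by 1/(1 - p) during training so the expected input to the next layer matches inference, with no scaling needed at test time. A minimal NumPy sketch of the post's 100-neuron scenario (the variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Setup from the post: 100 neurons, all activations 1, all weights to "A" are 1.
activations = np.ones(100)
p_drop = 0.5

# Naive dropout: zero out ~50% of activations during training.
mask = rng.random(100) >= p_drop
train_input_naive = (activations * mask).sum()   # ~50: roughly half survive

# Inference uses no dropout, so the input to "A" is 100: a ~2x mismatch.
infer_input = activations.sum()                  # exactly 100

# Inverted dropout: rescale survivors by 1/(1 - p) during training so the
# *expected* input to "A" matches the inference-time input of 100.
train_input_inverted = (activations * mask / (1 - p_drop)).sum()
```

The alternative (used in the original Dropout paper) is to leave training unscaled and instead multiply weights by (1 - p) at inference; inverted dropout is preferred in practice because inference stays untouched.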
Prediction: These AI Chip Stocks Could Soar (Hint: It's Not Nvidia or Broadcom)
Yahoo Finance · 2025-09-20 19:05
Key Points: AMD and Marvell are both currently in the shadows of Nvidia and Broadcom. However, AMD has a nice opportunity as the AI infrastructure market starts to shift toward inference. Marvell, meanwhile, has been winning its own custom AI chip designs with customers. Nvidia and Broadcom have been getting all the headlines lately, and with good reason. Both companies have been seeing huge data center revenue growth as demand for artificial ...
Groq Hits $6.9 Billion Valuation as Inference Demand Surges
Bloomberg Technology · 2025-09-17 18:44
I always like to go back to basics, Jonathan: how Groq stacks up against Nvidia and Google's TPU, and we're largely focused on inference. Why did you raise this round in that context? What is it going to allow you to do to be more competitive against those two giants? Well, the bottom line is that the demand for inference is insatiable. The total amount of capacity that people are trying to deploy is mind-boggling, the numbers that people are putting up. And that's only growing. And in our case, we don't f ...
Equinix CEO: AI inference in business process needs connectivity which we do
CNBC Television · 2025-09-15 19:38
Joined on set by Adaire Fox-Martin, CEO of Equinix. Adaire, thank you for coming in. >> Thank you so much for having me. >> To the wilds of New Jersey. >> Delighted to be here. >> Fantastic. We're delighted you're here. So, you gave me a really good example before the show began of what you guys do. You compared it to an airport. Explain what that means and who your company is. >> Yeah. So, I think in the world of data centers, there's a harmonious view of data centers, but actually there's different types. Um, and Equ ...
X @Avi Chawla
Avi Chawla · 2025-09-12 06:31
Inference/Generation Process
- Autoregressive generation is used step-by-step during inference [1]
- The encoder runs once, while the decoder runs multiple times [1]
- Each step utilizes previous predictions to generate the next token [1]
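The loop described above (encoder once, decoder once per token, each step conditioned on previous predictions) can be sketched as follows. The `encode` and `decode_step` functions are hypothetical toy stand-ins, not a real model API; a real decoder would run a neural network and sample from its output distribution.

```python
# Minimal sketch of encoder-decoder autoregressive generation with greedy
# decoding. Token 0 plays the role of an end-of-sequence marker.
from typing import List

def encode(src: List[int]) -> List[int]:
    # Encoder: runs once per input. Here, a trivial "memory" of the source.
    return src

def decode_step(memory: List[int], generated: List[int]) -> int:
    # Decoder: runs once per output token, conditioned on what was already
    # generated. Toy rule: echo the source token by token, then emit EOS (0).
    i = len(generated)
    return memory[i] if i < len(memory) else 0

def generate(src: List[int], max_len: int = 10) -> List[int]:
    memory = encode(src)                 # encoder: one pass
    out: List[int] = []
    for _ in range(max_len):             # decoder: one pass per token
        tok = decode_step(memory, out)   # uses previous predictions
        if tok == 0:
            break
        out.append(tok)
    return out

print(generate([5, 7, 9]))  # [5, 7, 9]
```

The structure is the point: the cost of `encode` is paid once, while `decode_step` runs in a loop whose length is the output sequence, which is why decoder-side optimizations (like KV caching) dominate inference performance.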
Jensen Huang & Alex Bouzari: CUDA + NIMs Are Accelerating AI
DDN · 2025-09-05 18:41
I mean, the other thing I think is extremely enabling is the CUDA ecosystem, which you fostered and nurtured and helped people embark on. Now, I think it is opening all kinds of possibilities, because people can now tie into this and apply a combination of CUDA and NIMs, you know, the inference part of it, for specific industries: life sciences, financial services, autonomous driving, and so on and so forth. You take all these things, you tie them together with the advances that will be made in t ...
Nvidia wants to be the Ferraris of computing.
Yahoo Finance · 2025-09-03 17:36
Nvidia's sold out, right? So, we go in every quarter thinking, let's see how they did. Well, we know they're sold out. They've sold every single thing that they can make. And it's just kind of a question of where it's going to go and what little pieces are going to go where. The battle going forward, I think, is going to be not about the development, the training of models, but about inference. Inference is going to be such a larger business, if it's not already. Let's use a metaphor of ...