Avi Chawla
X @Avi Chawla
Avi Chawla· 2025-09-23 06:35
Researchers from AssemblyAI built a state-of-the-art model that:
- transcribes speech across 99 languages.
- works even if the audio has many speakers.
- outperforms Deepgram and OpenAI models.
And much more. (2-step setup below) https://t.co/7eg0zpE4pM ...
X @Avi Chawla
Avi Chawla· 2025-09-22 19:59
Dropout Mechanism
- Without rescaling, the average input to a neuron during training is significantly lower than during inference, which can cause numerical instability because the activation scales no longer match [1]
- Dropout addresses this by multiplying the retained inputs during training by a factor of 1/(1-p), where 'p' is the dropout rate [2]
- For example, with a dropout rate of 50%, an input of 50 is scaled to 100 (50 / (1 - 0.5) = 100) [2]
- This scaling keeps the training and inference stages of the neural network consistent [2]
Training vs Inference
- Consider a layer with 100 neurons, each with an activation value of 1, and a weight of 1 from each neuron to neuron 'A' in the next layer [2]
- With a 50% dropout rate, approximately 50 neurons are active during training [2]
- During inference, all 100 neurons are active since Dropout is not used [2]
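The arithmetic in this summary can be checked with a small simulation. The following is a minimal sketch in plain Python, not the author's code: the helper names (`inverted_dropout`, `neuron_input`), the seed, and the number of trials are illustrative assumptions. It reproduces the 100-neuron setup and shows that, with the 1/(1-p) scaling, the input to neuron 'A' averages out to roughly the inference value.

```python
import random


def inverted_dropout(activations, p, training, rng):
    """Drop each activation with probability p and scale survivors by 1/(1-p).

    Hypothetical helper for illustration; at inference it returns the
    activations unchanged.
    """
    if not training:
        return list(activations)
    scale = 1.0 / (1.0 - p)
    return [a * scale if rng.random() >= p else 0.0 for a in activations]


def neuron_input(weights, activations):
    """Weighted sum arriving at neuron 'A'."""
    return sum(w * a for w, a in zip(weights, activations))


rng = random.Random(0)      # fixed seed, chosen arbitrarily
activations = [1.0] * 100   # 100 neurons, each with activation 1
weights = [1.0] * 100       # weight of 1 from each neuron to 'A'
p = 0.5                     # dropout rate

# Inference: no dropout, so all 100 neurons contribute.
inference_input = neuron_input(
    weights, inverted_dropout(activations, p, False, rng)
)

# Training: roughly 50 neurons survive per pass, each scaled from 1 to 2.
trials = [
    neuron_input(weights, inverted_dropout(activations, p, True, rng))
    for _ in range(2000)
]
mean_training_input = sum(trials) / len(trials)

print("inference input:", inference_input)
print("mean training input:", mean_training_input)
```

Any single training pass still varies, because the number of surviving neurons is random. Only the average over many passes lines up with the inference value.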
X @Avi Chawla
Avi Chawla· 2025-09-22 06:39
If you found it insightful, reshare it with your network.
Find me → @_avichawla
Every day, I share tutorials and insights on DS, ML, LLMs, and RAGs. ...
X @Avi Chawla
Avi Chawla· 2025-09-22 06:39
You can also verify this:
- Create a dropout layer in PyTorch
- Compute the dropout on a tensor
- Now set the dropout layer to eval mode
- Compute dropout on the same tensor again
Check this 👇 https://t.co/p5u7Need4G ...
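The linked image is not reproduced here, so the following is only a sketch of what those four steps might look like, assuming a tensor of 100 ones and p = 0.5 (both choices are mine, not taken from the post). It uses `torch.nn.Dropout` with its `train()` and `eval()` modes.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed so the run is repeatable

# Step 1: create a dropout layer (p = 0.5 is an assumed value).
dropout = nn.Dropout(p=0.5)

# Step 2: compute dropout on a tensor while in training mode.
x = torch.ones(100)
dropout.train()
train_out = dropout(x)  # each entry is either dropped (0) or scaled by 1/(1-p)

# Step 3: switch the layer to eval mode.
dropout.eval()

# Step 4: compute dropout on the same tensor again.
eval_out = dropout(x)   # dropout is a no-op in eval mode

print("training-mode values:", train_out.unique())
print("eval-mode values:", eval_out.unique())
```

In training mode the surviving entries come back as 2 rather than 1, which is the 1/(1-p) scaling at work. In eval mode the tensor passes through unchanged.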
X @Avi Chawla
Avi Chawla· 2025-09-22 06:39
Here's a hidden detail about Dropout that many people don't know.
Assume that:
- There are 100 neurons in a layer, and all activation values are 1.
- The weight from 100 neurons to a neuron ‘A’ in the next layer is 1.
- Dropout rate = 50%
Computing the input of neuron ‘A’:
- During training → Approx. 50 (since ~50% of values will be dropped).
- During inference → 100 (since we don't use Dropout during inference).
So essentially, during training, the average neuron input is significantly lower than that during infer ...
X @Avi Chawla
Avi Chawla· 2025-09-21 19:48
RT Avi Chawla (@_avichawla)
PyTorch dataloader has 2 terrible default settings.
Fixing them gave me ~5x speedup.
When you train a PyTorch model on a GPU:
- .to(device) transfers the data to the GPU.
- Everything after this executes on the GPU.
This means when the GPU is working, the CPU is idle, and when the CPU is working, the GPU is idle.
Memory pinning optimizes this as follows:
- When the model is trained on the 1st mini-batch, the CPU can transfer the 2nd mini-batch to the GPU.
- This ensures that the GPU does ...
X @Avi Chawla
Avi Chawla· 2025-09-21 06:33
PyTorch dataloader has 2 terrible default settings.
Fixing them gave me ~5x speedup.
When you train a PyTorch model on a GPU:
- .to(device) transfers the data to the GPU.
- Everything after this executes on the GPU.
This means when the GPU is working, the CPU is idle, and when the CPU is working, the GPU is idle.
Memory pinning optimizes this as follows:
- When the model is trained on the 1st mini-batch, the CPU can transfer the 2nd mini-batch to the GPU.
- This ensures that the GPU does not have to wait for the ne ...
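The post is cut off before it names both settings, so the following is only a sketch of how memory pinning is commonly enabled, not the author's exact fix. `pin_memory` is a real `DataLoader` argument. Pairing it with `non_blocking=True` in `.to(device)` is my assumption about the second setting. The toy dataset, model, and batch size are placeholders.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
use_cuda = device.type == "cuda"

# Placeholder data: 256 samples with 8 features and a binary label.
dataset = TensorDataset(torch.randn(256, 8), torch.randint(0, 2, (256,)))

# pin_memory places batches in page-locked host memory, which allows
# asynchronous host-to-GPU copies. It is only enabled here when a GPU exists.
loader = DataLoader(dataset, batch_size=32, pin_memory=use_cuda)

model = torch.nn.Linear(8, 2).to(device)

seen = 0
for features, labels in loader:
    # Assumption: non_blocking=True lets the copy overlap with GPU work
    # when the source tensor is pinned; it has no effect on CPU-only runs.
    features = features.to(device, non_blocking=use_cuda)
    labels = labels.to(device, non_blocking=use_cuda)
    logits = model(features)
    seen += labels.size(0)

print("samples processed:", seen)
```

Whether this yields a speedup like the one reported depends on the hardware and on how much of the step time is spent on data transfer, so it is worth profiling before and after the change.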
X @Avi Chawla
Avi Chawla· 2025-09-20 19:41
Technology Breakthroughs
- True technology breakthroughs are rare; the hype around KANs serves as a reminder [1]
- Shifts like the success of Transformers come only once in a decade or more [1]
Industry Dynamics
- Transformers aligned with hardware, data, and economics, which made them a significant breakthrough [1]
X @Avi Chawla
Avi Chawla· 2025-09-20 06:33
The ultimate Full-stack AI Engineering roadmap to go from 0 to 100.
This is the exact mapped-out path on what it actually takes to go from Beginner → Full-Stack AI Engineer.
> Start with Coding Fundamentals.
> Learn Python, Bash, Git, and testing.
> Every strong AI engineer starts with fundamentals.
> Learn how to interact with models by understanding LLM APIs.
> This will teach you structured outputs, caching, system prompts, etc.
> APIs are great, but raw LLMs still need the latest info to be effective.
> Learn h ...
X @Avi Chawla
Avi Chawla· 2025-09-19 19:12
Learning Resources
- A free 5-step roadmap to learn Python is available [1]
Python Expertise
- Avi Chawla has been coding in Python for 9 years [1]
Coding Roadmap
- A complete roadmap for learning Python is provided [1]