Avi Chawla
X @Avi Chawla
Avi Chawla· 2025-12-08 06:31
Find it here: https://t.co/hbpdbIZpzT ...
X @Avi Chawla
Avi Chawla· 2025-12-08 06:31
If you need a video guide to Karpathy's nanochat, check out Stanford's CS336! It covers:
- Tokenization
- Resource Accounting
- Pretraining
- Finetuning (SFT/RLHF)
- Overview of Key Architectures
- Working with GPUs
- Kernels and Triton
- Parallelism
- Scaling Laws
- Inference
- Evaluation
- Alignment
Everything you need to prepare for a job at Frontier AI Labs. I have shared the playlist in the replies! ...
X @Avi Chawla
Avi Chawla· 2025-12-07 11:49
If you found it insightful, reshare it with your network. Find me → @_avichawla. Every day, I share tutorials and insights on DS, ML, LLMs, and RAG. https://t.co/Y6ooICWVnO ...
X @Avi Chawla
Avi Chawla· 2025-12-07 06:42
At 128K context, prefilling costs drop from ~$0.65 to ~$0.35 per million tokens, and decoding drops from ~$2.4 to ~$0.8 per million tokens. And the performance stays the same. On some long-context benchmarks, V3.2 actually scores higher. Sparse attention isn’t new. But making it work without losing quality is hard. What are some other techniques to increase the context lengths of LLMs? ...
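A quick back-of-the-envelope check on those numbers, using the per-million-token rates quoted above; the daily token volumes are made-up figures for illustration:

```python
# Hypothetical workload: 10M prefill tokens and 2M decode tokens per day.
prefill_tokens, decode_tokens = 10_000_000, 2_000_000

dense = prefill_tokens / 1e6 * 0.65 + decode_tokens / 1e6 * 2.4   # old rates
sparse = prefill_tokens / 1e6 * 0.35 + decode_tokens / 1e6 * 0.8  # V3.2 rates

print(f"dense: ${dense:.2f}/day, sparse: ${sparse:.2f}/day")
# dense: $11.30/day, sparse: $5.10/day -- roughly a 55% cost cut at this mix
```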
X @Avi Chawla
Avi Chawla· 2025-12-07 06:42
3) DeepSeek Sparse Attention (DSA)
DeepSeek’s new V3.2 model introduces DeepSeek Sparse Attention (DSA), which brings complexity down from O(L²) to O(Lk), where k is fixed.
How it works:
- A lightweight Lightning Indexer scores which tokens actually matter for each query. It uses a small number of heads, runs in FP8, and is computationally cheap.
- A selection mechanism then retrieves only the top-k key-value entries.
The key insight is that only 2048 tokens get selected per query, regardless of context length. So the expensive attent ...
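A minimal PyTorch sketch of that top-k selection idea, under stated assumptions: this is not DeepSeek's implementation. The `index_scores` argument stands in for the Lightning Indexer's output, causal masking and FP8 are omitted, and the toy indexer in the usage example is a plain dot product (which a real indexer would need to be much cheaper than):

```python
import torch
import torch.nn.functional as F

def topk_sparse_attention(q, k, v, index_scores, top_k=2048):
    """Attend each query to only its top_k highest-scoring keys.

    q, k, v: (L, d) tensors; index_scores: (L, L) relevance scores
    from a cheap indexer (stand-in for DSA's Lightning Indexer).
    """
    L, d = q.shape
    k_sel = min(top_k, L)
    # Select the k_sel most relevant key positions per query.
    topk_idx = index_scores.topk(k_sel, dim=-1).indices  # (L, k_sel)
    k_top = k[topk_idx]                                  # (L, k_sel, d)
    v_top = v[topk_idx]                                  # (L, k_sel, d)
    # Standard scaled dot-product attention, restricted to selected keys:
    # O(L * k_sel) instead of O(L^2).
    scores = torch.einsum("ld,lkd->lk", q, k_top) / d**0.5
    weights = F.softmax(scores, dim=-1)
    return torch.einsum("lk,lkd->ld", weights, v_top)

# Toy usage: 4096 tokens, each query attending to 2048 selected keys.
L, d = 4096, 64
q, k, v = (torch.randn(L, d) for _ in range(3))
index_scores = q @ k.T  # a real indexer would be far cheaper than this
out = topk_sparse_attention(q, k, v, index_scores)
print(out.shape)        # torch.Size([4096, 64])
```

The point of the shape arithmetic: every query attends to at most `top_k` keys, so the expensive attention stays O(Lk) even as the context length L grows.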
X @Avi Chawla
Avi Chawla· 2025-12-07 06:42
You're in a Research Scientist interview at OpenAI.
The interviewer asks: "How would you expand the context length of an LLM from 2K to 128K tokens?"
You: "I will fine-tune the model on longer docs with 128K context."
Interview over.
Here's what you missed: ...
X @Avi Chawla
Avi Chawla· 2025-12-06 06:30
Docker explained in 2 minutes!
Most developers use Docker daily without understanding what happens under the hood. Here's everything you need to know.
Docker has 3 main components:
1️⃣ Docker Client: Where you type commands that talk to the Docker daemon via an API.
2️⃣ Docker Host: The daemon runs here, handling all the heavy lifting (building images, running containers, and managing resources).
3️⃣ Docker Registry: Stores Docker images. Docker Hub is public, but companies run private registries.
Here's what happens ...
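That client → daemon → registry flow is easy to see from code. Here is a minimal sketch using the docker-py SDK (pip install docker); it assumes a local Docker daemon is running, and it maps one-to-one onto `docker pull` and `docker run`:

```python
import docker

# Docker Client: open an API connection to the local daemon.
client = docker.from_env()

# The daemon pulls the image from the registry (Docker Hub by default).
client.images.pull("alpine", tag="latest")

# The daemon creates the container, runs it, returns its output, and removes it.
logs = client.containers.run("alpine", 'echo "hello from a container"', remove=True)
print(logs.decode())
```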
X @Avi Chawla
Avi Chawla· 2025-12-05 20:31
RT Avi Chawla (@_avichawla)
An MCP server that detects production-grade code quality issues in real time!
Even though AI is now generating code at light speed, the engineering bottleneck has just moved from writing to reviewing, and devs now spend 90% of their debugging time on AI-generated code.
AI reviewers aren't that reliable either, because they share the same fundamental blind spots as AI generators do:
- They pattern match, not proof check.
- They validate syntax, not system behavior.
- They review code, no ...