Avi Chawla
X @Avi Chawla
Avi Chawla· 2025-12-08 12:08
If you found it insightful, reshare it with your network. Find me → @_avichawla. Every day, I share tutorials and insights on DS, ML, LLMs, and RAGs. https://t.co/X8z3SMqptR

Avi Chawla (@_avichawla): If you need a video guide to Karpathy's nanochat, check out Stanford's CS336! It covers:
- Tokenization
- Resource Accounting
- Pretraining
- Finetuning (SFT/RLHF)
- Overview of Key Architectures
- Working with GPUs
- Kernels and Triton
- Parallelism
- Scaling Laws
- Inference
https://t.co/7oCl2Od1fO ...
Avi Chawla· 2025-12-08 06:31
Since no specific report content was provided, I cannot summarize or analyze it; I can only produce a simulated output based on the framework you supplied. To complete the task properly, please provide the report's actual content.

Financial Performance (Example)
- Company revenue increased by 15% year-over-year [1]
- Gross margin reached 45%, a 2% increase from the previous year [1]

Market Trends (Example)
- The industry is experiencing a shift towards sustainable practices [1]
- Increased demand for electric vehicles is driving growth in the battery sector [1]

Investment Opportunities (Example)
- Emerging markets present significant growth potential for the company [1]
- Investment in renewable energy infrastructure is expected to yield high returns [1]

Potential Risks (Example)
- Regulatory changes could impact the company's operations [1]
- Increased competition may lead to price erosion [1]
Avi Chawla· 2025-12-08 06:31
Educational Resources
- Stanford's CS336 video guide covers topics essential for Frontier AI Lab jobs [1]
- The curriculum includes tokenization, resource accounting, pretraining, and finetuning (SFT/RLHF) [1]
- Key AI architectures, GPU usage, kernels, parallelism, and scaling laws are addressed [1]

AI Development Lifecycle
- The guide also covers inference, evaluation, and alignment in AI models [1]
Avi Chawla· 2025-12-07 19:14
Model Training & Context Expansion
- Fine-tuning on longer documents with 128K context is an insufficient answer in a Research Scientist interview at OpenAI [1]
- The question focuses on expanding the context length of an LLM from 2K to 128K tokens [1]
Avi Chawla· 2025-12-07 11:49
If you found it insightful, reshare it with your network. Find me → @_avichawla. Every day, I share tutorials and insights on DS, ML, LLMs, and RAGs. https://t.co/Y6ooICWVnO

Avi Chawla (@_avichawla): You're in a Research Scientist interview at OpenAI. The interviewer asks: "How would you expand the context length of an LLM from 2K to 128K tokens?" You: "I will fine-tune the model on longer docs with 128K context." Interview over. Here's what you missed: ...
Avi Chawla· 2025-12-07 06:42
At 128K context, prefilling costs drop from ~$0.65 to ~$0.35 per million tokens, and decoding drops from ~$2.4 to ~$0.8. Performance stays the same; on some long-context benchmarks, V3.2 actually scores higher.

Sparse attention isn't new, but making it work without losing quality is hard.

What are some other techniques to increase the context lengths of LLMs? ...
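One answer to that closing question (my addition, not named in the post) is RoPE position interpolation: rather than fine-tuning on raw 128K positions, scale new positions down so they land inside the range the model was pretrained on. A minimal NumPy sketch, assuming a standard RoPE frequency schedule and the 2K → 128K setting from the thread:

```python
import numpy as np

def rope_angles(positions, dim, base=10000.0):
    # Standard RoPE: one rotation frequency per pair of head dimensions.
    inv_freq = 1.0 / (base ** (np.arange(0, dim, 2) / dim))
    return np.outer(positions, inv_freq)  # shape (seq_len, dim // 2)

def interpolated_angles(positions, dim, train_len=2048, target_len=131072):
    # Position interpolation: squeeze the longer position range back into
    # the [0, train_len) range seen during pretraining, instead of asking
    # RoPE to extrapolate to positions it never saw.
    scale = train_len / target_len  # 2K / 128K = 1/64
    return rope_angles(positions * scale, dim)

pos = np.arange(131072)
plain = rope_angles(pos, 64)           # extrapolated angles (unseen range)
interp = interpolated_angles(pos, 64)  # interpolated angles stay in-range
```

A short fine-tune on long documents is then applied *after* this rescaling, which is what the naive interview answer skips.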
Avi Chawla· 2025-12-07 06:42
3) DeepSeek Sparse Attention (DSA)

DeepSeek's new V3.2 model introduces DeepSeek Sparse Attention (DSA), which brings complexity down from O(L²) to O(Lk), where k is fixed.

How it works:
- A lightweight Lightning Indexer scores which tokens actually matter for each query. It has a small number of heads, runs in FP8, and is computationally cheap.
- A selection mechanism then retrieves only the top-k key-value entries.

The key insight is that only 2048 tokens get selected per query, regardless of context length. So the expensive attent ...
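The score-then-select idea can be sketched in a few lines. This is a toy single-query illustration, not DeepSeek's implementation: the `scores` input stands in for the Lightning Indexer's output, and the toy indexer below simply reuses the true query-key scores.

```python
import numpy as np

def topk_sparse_attention(q, K, V, scores, k=2048):
    """One query's attention restricted to the k keys an external
    indexer scored highest -- cost O(k) instead of O(L)."""
    L, d = K.shape
    k = min(k, L)
    idx = np.argpartition(scores, -k)[-k:]  # ids of the top-k tokens
    logits = (K[idx] @ q) / np.sqrt(d)      # attention only over those k
    w = np.exp(logits - logits.max())
    w /= w.sum()                            # softmax over k tokens
    return w @ V[idx]                       # weighted value readout, (d,)

rng = np.random.default_rng(0)
L, d = 8192, 64
q = rng.standard_normal(d)
K = rng.standard_normal((L, d))
V = rng.standard_normal((L, d))
scores = K @ q  # toy indexer: in DSA a cheap FP8 module produces this
out = topk_sparse_attention(q, K, V, scores, k=2048)
```

With k = L this reduces exactly to dense attention; the quality question the post raises is how little you lose when k stays fixed at 2048 while L grows.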
Avi Chawla· 2025-12-07 06:42
You're in a Research Scientist interview at OpenAI. The interviewer asks: "How would you expand the context length of an LLM from 2K to 128K tokens?" You: "I will fine-tune the model on longer docs with 128K context." Interview over. Here's what you missed: ...
Avi Chawla· 2025-12-06 19:13
RT Avi Chawla (@_avichawla): Docker explained in 2 minutes! ...
Avi Chawla· 2025-12-06 06:30
Docker explained in 2 minutes!

Most developers use Docker daily without understanding what happens under the hood. Here's everything you need to know. Docker has 3 main components:

1️⃣ Docker Client: where you type commands that talk to the Docker daemon via API.
2️⃣ Docker Host: the daemon runs here, handling all the heavy lifting (building images, running containers, and managing resources).
3️⃣ Docker Registry: stores Docker images. Docker Hub is public, but companies run private registries.

Here's what happens ...
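The client-to-daemon API call in component 1️⃣ can be made concrete. Roughly, `docker run` becomes two REST calls to the daemon (by default over the Unix socket `/var/run/docker.sock` on Linux): `POST /containers/create` and `POST /containers/{id}/start`. A sketch that only *builds* those requests, so no daemon is needed to follow along:

```python
import json

def create_container_request(image, cmd=None):
    # POST /containers/create -- the daemon reads config from the JSON body.
    body = {"Image": image}
    if cmd:
        body["Cmd"] = cmd
    return ("POST", "/containers/create", json.dumps(body))

def start_container_request(container_id):
    # POST /containers/{id}/start -- no body required.
    return ("POST", f"/containers/{container_id}/start", None)

# Roughly what `docker run nginx` asks the daemon to do:
method, path, body = create_container_request(
    "nginx", ["nginx", "-g", "daemon off;"]
)
```

The real client also negotiates an API version prefix (e.g. `/v1.xx/...`) and streams the registry pull if the image is missing; those details are omitted here.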