Avi Chawla
X @Avi Chawla
Avi Chawla · 2025-07-11 06:31
Model Training
- By default, deep learning models train on only one GPU, even when multiple GPUs are available [1]
- Distributing the training workload across multiple GPUs is an ideal way to train such models [1]
- There are four strategies for multi-GPU training [1]; a sketch of one common strategy (data parallelism) follows
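The four strategies aren't named in the summary above, so as a concrete illustration here is a minimal sketch of data parallelism using PyTorch's DistributedDataParallel (DDP). The model, batch, and hyperparameters are placeholders, and it assumes a machine with one or more CUDA GPUs:

```python
# Minimal data-parallel training sketch with PyTorch DDP.
# Placeholder model/data; assumes one or more CUDA GPUs.
import os
import torch
import torch.distributed as dist
import torch.multiprocessing as mp
from torch.nn.parallel import DistributedDataParallel as DDP

def train(rank: int, world_size: int) -> None:
    # One process drives one GPU; DDP all-reduces gradients across them.
    os.environ.setdefault("MASTER_ADDR", "localhost")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    torch.cuda.set_device(rank)

    model = DDP(torch.nn.Linear(64, 1).cuda(rank), device_ids=[rank])
    opt = torch.optim.SGD(model.parameters(), lr=1e-3)

    for _ in range(10):
        x = torch.randn(32, 64, device=rank)  # stand-in for a real batch
        loss = model(x).pow(2).mean()          # stand-in for a real loss
        opt.zero_grad()
        loss.backward()                        # gradients synced here by DDP
        opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    world_size = torch.cuda.device_count()
    mp.spawn(train, args=(world_size,), nprocs=world_size)
```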
X @Avi Chawla
Avi Chawla · 2025-07-11 06:30
Technical Explanation
- The post explains how GPUs are synchronized in multi-GPU training, using visuals for clarity [1]; a minimal gradient-sync sketch follows
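The visuals themselves aren't reproduced here, but the core of GPU synchronization in data-parallel training is a gradient all-reduce after each backward pass. DDP does this automatically; the manual version below is a hedged sketch for intuition only, and assumes a process group is already initialized, as in the DDP sketch above:

```python
import torch
import torch.distributed as dist

def sync_gradients(model: torch.nn.Module) -> None:
    """Average gradients across all ranks so every model replica applies
    the same update. Assumes dist.init_process_group() was called."""
    world_size = dist.get_world_size()
    for p in model.parameters():
        if p.grad is not None:
            # Sum this parameter's gradient across all GPUs, then average.
            dist.all_reduce(p.grad, op=dist.ReduceOp.SUM)
            p.grad /= world_size
```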
X @Avi Chawla
Avi Chawla · 2025-07-10 20:33
RAG Architectures
- The post highlights the distinction between Naive RAG and Agentic RAG [1]
- Visual explanations of the RAG architectures are emphasized [1]
X @Avi Chawla
Avi Chawla· 2025-07-10 06:30
Overview - The content is a recommendation to reshare insightful information about DS (Data Science), ML (Machine Learning), LLMs (Large Language Models), and RAGs (Retrieval-Augmented Generation) [1] Resource Sharing - Avi Chawla shares tutorials and insights daily on DS, ML, LLMs, and RAGs [1] Topic Focus - The content highlights a clear explanation (with visuals) of Naive RAG vs Agentic RAG [1]
X @Avi Chawla
Avi Chawla · 2025-07-10 06:30
RAG Systems
- Agentic RAG systems enhance robustness by aligning each intermediate outcome with the overall goal [1]; a minimal loop sketch follows this summary
- The accompanying diagram is one of many possible blueprints for an agentic RAG system [2]
- The specific implementation can be adapted to fit a particular use case [2]
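The blueprint diagram isn't reproduced here. As a purely hypothetical sketch of the loop described above (retrieve, generate, check each intermediate outcome against the overall goal, and retry when it falls short), the following uses stand-in functions; `retrieve`, `generate`, and `is_sufficient` are illustrative, not any specific framework's API:

```python
# Hypothetical agentic RAG loop; all helper functions are stand-ins.
from typing import List

def retrieve(query: str) -> List[str]:
    """Stand-in retriever: fetch candidate context passages."""
    return [f"passage about {query}"]

def generate(query: str, context: List[str]) -> str:
    """Stand-in LLM call: draft an answer from the retrieved context."""
    return f"answer to '{query}' using {len(context)} passage(s)"

def is_sufficient(answer: str, goal: str) -> bool:
    """Stand-in critic: does this intermediate outcome serve the goal?"""
    return goal.lower() in answer.lower()

def agentic_rag(goal: str, max_steps: int = 3) -> str:
    # Unlike naive RAG's single retrieve-then-generate pass, keep checking
    # each outcome against the overall goal and retry when it falls short.
    query = goal
    answer = ""
    for _ in range(max_steps):
        context = retrieve(query)
        answer = generate(query, context)
        if is_sufficient(answer, goal):
            return answer
        query = f"{goal} (rephrased after an insufficient answer)"
    return answer  # best effort after max_steps

if __name__ == "__main__":
    print(agentic_rag("naive RAG vs agentic RAG"))
```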
X @Avi Chawla
Avi Chawla · 2025-07-09 06:30
If you found it insightful, reshare it with your network. Find me → @_avichawla. Every day, I share tutorials and insights on DS, ML, LLMs, and RAG. (Quoted tweet: the LMCache announcement, reproduced in full below.)
X @Avi Chawla
Avi Chawla · 2025-07-09 06:30
GitHub repo: https://t.co/f9vvUucTne ...
X @Avi Chawla
Avi Chawla · 2025-07-09 06:30
The fastest serving engine for LLMs is here (open-source)! LMCache is an LLM serving engine designed to reduce time-to-first-token and increase throughput, especially under long-context scenarios. It boosts vLLM with 7x faster access to 100x more KV caches. 100% open-source! https://t.co/IfyZzdnq4z ...
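The post doesn't show LMCache's API, but the core idea behind reducing time-to-first-token, reusing KV caches across requests that share a prompt prefix so the expensive prefill runs only once, can be illustrated with a toy sketch. The hashing scheme and `prefill` stub below are illustrative assumptions, not LMCache's actual implementation:

```python
# Toy illustration of prefix KV-cache reuse (not LMCache's actual code).
import hashlib

kv_store: dict[str, str] = {}  # prefix hash -> precomputed KV states

def prefill(prefix: str) -> str:
    """Stand-in for the expensive attention prefill over the prefix."""
    return f"KV[{len(prefix)} chars]"

def serve(prompt: str, prefix_len: int = 32) -> str:
    prefix = prompt[:prefix_len]
    key = hashlib.sha256(prefix.encode()).hexdigest()
    if key not in kv_store:
        kv_store[key] = prefill(prefix)  # cache miss: pay full prefill once
    kv = kv_store[key]                   # cache hit: time-to-first-token drops
    return f"decode with {kv} + suffix {prompt[prefix_len:]!r}"

if __name__ == "__main__":
    print(serve("Translate the following report into French: Q1 revenue grew."))
    # Shares the first 32 characters, so the cached prefix KV is reused:
    print(serve("Translate the following report into French: Q2 revenue fell."))
```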