X @Avi Chawla
Avi Chawla · 2025-10-15 06:54
A time-complexity cheat sheet of 10 ML algorithms: What's the inference time-complexity of KMeans? https://t.co/8qlDxpDubA ...
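For the curious: KMeans inference just assigns each point to its nearest centroid, so it costs O(n·k·d) for n points, k clusters, and d dimensions. A minimal illustrative sketch (the names here are mine, not from the cheat sheet):

```python
# Why KMeans inference is O(n·k·d): each of n points is compared
# against all k centroids in d dimensions.
import numpy as np

def kmeans_predict(X: np.ndarray, centroids: np.ndarray) -> np.ndarray:
    # (n, k) squared-distance matrix: n·k comparisons over d dims -> O(n·k·d).
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

X = np.random.rand(1000, 8)        # n=1000 points, d=8
centroids = np.random.rand(5, 8)   # k=5 clusters
print(kmeans_predict(X, centroids)[:10])
```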
X @Avi Chawla
Avi Chawla · 2025-10-14 19:08
RT Avi Chawla (@_avichawla): Finally, Python 3.14 lets you disable the GIL! It's a big deal because earlier, even if you wrote multi-threaded code, Python could only run one thread at a time, giving no performance benefit. But now, Python can run your multi-threaded code in parallel. And uv fully supports it! https://t.co/pfqh58En3K ...
X @Avi Chawla
Avi Chawla · 2025-10-14 06:31
Core Feature Update
- Python 3.14 allows disabling the Global Interpreter Lock (GIL) [1]
- This enables true parallel execution of multi-threaded Python code, improving performance [1]

Technology Adoption
- uv fully supports the GIL-disabling feature in Python 3.14 [1]
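To make the change concrete, here is a minimal sketch of CPU-bound threading; the `3.14t` build tag (installable via e.g. `uv python install 3.14t`) and the `sys._is_gil_enabled()` check follow CPython's free-threaded-build conventions and are assumptions of this sketch:

```python
# Minimal sketch: CPU-bound work across threads. On a standard build the
# GIL serializes the threads; on a free-threaded (GIL-disabled) 3.14t
# build they can run on separate cores in parallel.
import sys
import time
import threading

def count_down(n: int) -> None:
    # Pure-Python CPU-bound loop; this is exactly what the GIL used to serialize.
    while n:
        n -= 1

if __name__ == "__main__":
    # sys._is_gil_enabled() exists on free-threaded builds; guard it elsewhere.
    print("GIL enabled:", getattr(sys, "_is_gil_enabled", lambda: True)())

    start = time.perf_counter()
    threads = [threading.Thread(target=count_down, args=(20_000_000,)) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(f"4 threads: {time.perf_counter() - start:.2f}s")
```

On a standard build the four threads take roughly as long as running the work serially; on a free-threaded build the wall-clock time drops roughly with the number of cores.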
X @Avi Chawla
Avi Chawla · 2025-10-13 06:56
If you found it insightful, reshare it with your network. Find me → @_avichawla. Every day, I share tutorials and insights on DS, ML, LLMs, and RAGs.
Avi Chawla (@_avichawla): This should be impossible! You can clean any ML dataset in just three lines of code. Flag outliers, find label errors, and more, across:
- Any data (tabular, text, image, etc.)
- Any task (classification, entity recognition, etc.)
100% open-source, built by MIT researchers. https://t.co/xAaKjK4zIM ...
X @Avi Chawla
Avi Chawla · 2025-10-13 06:55
Cleanlab's GitHub repo: https://t.co/IiAR1sFSdJ (don't forget to star it ⭐) ...
X @Avi Chawla
Avi Chawla · 2025-10-13 06:55
This should be impossible! You can clean any ML dataset in just three lines of code. Flag outliers, find label errors, and more, across:
- Any data (tabular, text, image, etc.)
- Any task (classification, entity recognition, etc.)
100% open-source, built by MIT researchers. https://t.co/xAaKjK4zIM ...
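A minimal sketch of what that three-line cleanup can look like, assuming cleanlab's `Datalab` interface (cleanlab 2.x); the synthetic dataset and scikit-learn model are illustrative, and exact arguments may differ across versions:

```python
# Illustrative sketch: flag label issues and outliers with cleanlab's
# Datalab interface; dataset and model are stand-ins.
import pandas as pd
from cleanlab import Datalab
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
df = pd.DataFrame(X).assign(label=y)

# Out-of-sample predicted probabilities, as cleanlab recommends.
pred_probs = cross_val_predict(
    LogisticRegression(max_iter=1000), X, y, cv=5, method="predict_proba"
)

# The "three lines": build a Datalab, find issues, print the report.
lab = Datalab(data=df, label_name="label")
lab.find_issues(features=X, pred_probs=pred_probs)
lab.report()  # summarizes label errors, outliers, (near-)duplicates, etc.
```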
X @Avi Chawla
Avi Chawla · 2025-10-12 19:29
Core Problem of Traditional RAG
- Most retrieved chunks in traditional RAG setups do not effectively aid the LLM, adding computational cost, latency, and context-processing overhead [1][5]
- Classic RAG fetches similar chunks from a vector database and feeds the retrieved context directly into the LLM [5]

REFRAG Solution by Meta AI
- Meta AI's REFRAG compresses and filters context at the vector level, focusing on relevance [1][2]
- REFRAG combines chunk compression, an RL-trained relevance policy, and selective expansion so that only essential information is processed [2]
- The pipeline encodes documents, finds relevant chunks, selects chunks with the relevance policy, and concatenates token-level representations [3][4]

Performance Metrics of REFRAG
- Outperforms LLaMA on 16 RAG benchmarks [5][7]
- 30.85x faster time-to-first-token [5][7]
- Handles 16x larger context windows [5][7]
- Uses 2-4x fewer tokens [5][7]
- No accuracy loss across RAG, summarization, and multi-turn conversation tasks [7]
X @Avi Chawla
Avi Chawla · 2025-10-12 06:31
Core Innovation - REFRAG
- Meta's REFRAG fundamentally rethinks retrieval in RAG setups by compressing and filtering context at a vector level [1]
- REFRAG compresses each chunk into a single compressed embedding and uses a relevance policy trained via RL to select the most relevant chunks [1][2]
- Only selected chunks are expanded back into full embeddings and passed to the LLM, processing only what matters [2]

Technical Details
- REFRAG encodes documents and stores them in a vector database [2]
- It encodes the full user query, finds relevant chunks, and computes token-level embeddings for both [3]
- A relevance policy, trained via RL, selects chunks to keep [3][5]
- Token-level representations of the input query are concatenated with selected chunks and a compressed single-vector representation of rejected chunks before being sent to the LLM [3]

Performance Metrics
- REFRAG outperforms LLaMA on 16 RAG benchmarks [4][6]
- 30.85x faster time-to-first-token, 3.75x better than the previous state-of-the-art [4][6]
- Handles 16x larger context windows [4][6]
- Uses 2-4x fewer tokens [4][6]
- No accuracy loss across RAG, summarization, and multi-turn conversation tasks [6]
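A schematic sketch of that selective-expansion flow; everything here (toy encoders, a cosine-similarity stand-in for the RL-trained relevance policy, the function names) is hypothetical and illustrative, not Meta's implementation:

```python
# Hypothetical sketch of a REFRAG-style context builder: compress every
# chunk, let a relevance policy pick which chunks to expand back to
# token-level, and keep the rest as single compressed vectors.
import numpy as np

D = 16  # toy embedding dimension

def encode_tokens(text: str) -> np.ndarray:
    # Token-level encoder stand-in: one deterministic vector per word.
    return np.stack([
        np.random.default_rng(sum(map(ord, w))).standard_normal(D)
        for w in text.split()
    ])

def encode_chunk(text: str) -> np.ndarray:
    # Chunk compression: mean-pool token embeddings into a single vector.
    return encode_tokens(text).mean(axis=0)

def relevance_scores(query_emb: np.ndarray, chunk_embs: np.ndarray) -> np.ndarray:
    # Stand-in for the RL-trained policy: cosine similarity to the query.
    q = query_emb / np.linalg.norm(query_emb)
    c = chunk_embs / np.linalg.norm(chunk_embs, axis=1, keepdims=True)
    return c @ q

def refrag_context(query: str, chunks: list[str], k: int = 2) -> np.ndarray:
    chunk_embs = np.stack([encode_chunk(c) for c in chunks])  # compress all chunks
    keep = set(np.argsort(relevance_scores(encode_chunk(query), chunk_embs))[-k:])
    parts = [encode_tokens(query)]  # the query stays token-level
    for i, chunk in enumerate(chunks):
        # Selected chunks expand back to token-level embeddings;
        # rejected chunks contribute one compressed vector each.
        parts.append(encode_tokens(chunk) if i in keep else chunk_embs[i][None, :])
    return np.concatenate(parts, axis=0)

chunks = ["refrag compresses retrieved chunks",
          "completely unrelated trivia",
          "a relevance policy picks what to expand"]
ctx = refrag_context("how does refrag work", chunks, k=2)
print(ctx.shape)  # far fewer rows than expanding every chunk to tokens
```

The point of the sketch is the shape arithmetic: rejected chunks contribute one row each instead of one row per token, which is where the token and time-to-first-token savings come from.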
X @Avi Chawla
Avi Chawla · 2025-10-11 20:06
RT Avi Chawla (@_avichawla): 4 must-know model training paradigms for ML engineers: https://t.co/G3KunNYswt ...