Avi Chawla
X @Avi Chawla
Avi Chawla· 2025-10-20 06:31
Finally, researchers have open-sourced a new reasoning approach that actually prevents hallucinations in LLMs. It beats popular techniques like Chain-of-Thought and has a SOTA success rate of 90.2%.
Here's the core problem with current techniques that this new approach solves:
We have enough research to conclude that LLMs often struggle to assess what truly matters in a particular stage of a long, multi-turn conversation.
For instance, when you give Agents a 2,000-word system prompt filled with policies, tone r ...
X @Avi Chawla
Avi Chawla· 2025-10-19 06:31
Here's a neural net optimization trick that leads to ~4x faster CPU-to-GPU transfers.
Imagine an image classification task:
- We define the network, load the data, and transform it.
- In the training loop, we transfer the data to the GPU and train.
Here's the problem with this. If you look at the profiler:
- Most of the time/resources will be allocated to the kernel (the actual training code).
- However, a significant amount of time will also be dedicated to data transfer from CPU to GPU (this appears under cudaMem ...
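The post is cut off before it names the fix, so the following is only an inference: the standard technique matching this description is pinned (page-locked) host memory combined with asynchronous transfers. A minimal PyTorch sketch under that assumption; the model and data are stand-ins:

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

device = torch.device("cuda")

# Stand-in image-classification data for the example.
dataset = TensorDataset(torch.randn(8_192, 3, 64, 64),
                        torch.randint(0, 10, (8_192,)))

# pin_memory=True keeps batches in page-locked host RAM, so the GPU can
# pull them via DMA instead of going through an extra staging copy.
loader = DataLoader(dataset, batch_size=256, pin_memory=True, num_workers=2)

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 10)).to(device)
opt = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

for x, y in loader:
    # non_blocking=True overlaps the host-to-device copy with GPU compute;
    # it only takes effect when the source tensor is pinned.
    x = x.to(device, non_blocking=True)
    y = y.to(device, non_blocking=True)
    opt.zero_grad()
    loss_fn(model(x), y).backward()
    opt.step()
```

With this change, the cudaMemcpy time the profiler attributes to transfers can largely hide behind the training kernels.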
X @Avi Chawla
Avi Chawla· 2025-10-18 19:20
RT Avi Chawla (@_avichawla)
Keras now lets you quantize models with just one line of code!
You can either quantize your own models or any pre-trained model obtained from KerasHub.
Simply run model.quantize(quantization_mode).
Supports quantization to int4, int8, float8, and GPTQ modes. https://t.co/isWsphiJq5 ...
X @Avi Chawla
Avi Chawla· 2025-10-18 06:31
Model Quantization
- Keras enables model quantization with a single line of code [1]
- Supports quantization to int4, int8, float8, and GPTQ modes [1]
- Can quantize user's own models or pre-trained models from KerasHub [1]
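A minimal sketch of the one-liner, assuming Keras 3; the toy architecture and the commented KerasHub preset name are illustrative, not from the post:

```python
import keras
from keras import layers

model = keras.Sequential([
    layers.Input(shape=(784,)),
    layers.Dense(128, activation="relu"),
    layers.Dense(10),
])

# One-line post-training quantization; other modes mentioned in the post
# include "int4", "float8", and GPTQ.
model.quantize("int8")

# The same call works on a pre-trained KerasHub model (preset illustrative):
# import keras_hub
# lm = keras_hub.models.CausalLM.from_preset("gemma2_instruct_2b_en")
# lm.quantize("int4")
```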
X @Avi Chawla
Avi Chawla· 2025-10-17 19:18
Active Learning Methodology
- Active learning is presented as an efficient method for building supervised models with unlabeled data [1][4]
- The process involves iteratively training a model, identifying low-confidence predictions, and labeling them with human input to improve model performance [2][3][4]
- The methodology emphasizes the importance of accurate confidence measure generation for effective training [5]

Model Building and Refinement
- The initial step involves manually labeling a small percentage of the data to create a seed dataset [2]
- Probabilistic models are recommended for confidence level determination, using the gap between the top probabilities as a proxy [3]
- Cooperative learning, a variant of active learning, utilizes high-confidence data by incorporating the model's predictions as labels [5]

Application and Considerations
- Active learning can save significant time when building supervised models on unlabeled datasets [4]
- The accuracy of confidence measures is critical, as errors can negatively impact subsequent training steps [5]
X @Avi Chawla
Avi Chawla· 2025-10-17 06:31
Active Learning Overview
- Active learning is an efficient method for building supervised models with unlabeled data by incorporating human feedback [1][4]
- The core idea involves iteratively training a model, identifying low-confidence predictions, and using human labels to refine the model [2][4]

Active Learning Process
- The process begins with manually labeling a small portion of the data to create an initial model [2]
- The model then generates predictions on the unlabeled data, along with confidence levels for each prediction [3]
- Low-confidence predictions are prioritized for human labeling and then fed back into the model for retraining [4]
- This iterative process continues until the model achieves satisfactory performance [4]

Key Considerations
- Generating accurate confidence measures is crucial for the success of active learning [5]
- Cooperative learning, a variant of active learning, incorporates high-confidence data by using the model's predictions as labels [5]
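A minimal sketch of the loop both summaries describe, using scikit-learn and the top-two-probability gap as the confidence proxy; the synthetic data, seed size, and query batch size are illustrative assumptions, and indexing into y stands in for the human labeling step:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def margin(probs):
    # Confidence proxy from the post: gap between the top two class probabilities.
    top2 = np.sort(probs, axis=1)[:, -2:]
    return top2[:, 1] - top2[:, 0]

rng = np.random.default_rng(0)
X = rng.normal(size=(5_000, 20))
y = (X[:, :5].sum(axis=1) > 0).astype(int)              # stand-in ground truth

labeled = rng.choice(len(X), size=100, replace=False)   # manually labeled seed set
unlabeled = np.setdiff1d(np.arange(len(X)), labeled)

for _ in range(5):
    # A probabilistic model, so confidence levels come from predict_proba.
    model = LogisticRegression(max_iter=1_000).fit(X[labeled], y[labeled])
    conf = margin(model.predict_proba(X[unlabeled]))
    # Route the lowest-confidence predictions to a human annotator.
    query = unlabeled[np.argsort(conf)[:50]]
    labeled = np.concatenate([labeled, query])          # y[query] = the human's labels
    unlabeled = np.setdiff1d(unlabeled, query)
    # Cooperative-learning variant: additionally keep high-confidence
    # predictions as pseudo-labels instead of discarding them.
```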
X @Avi Chawla
Avi Chawla· 2025-10-16 19:17
AI Engineering Fundamentals
- Coding fundamentals, including Python, Bash, Git, and testing, are the starting point for AI engineers [4]
- Focus on understanding and utilizing LLM APIs for structured outputs, caching, and system prompts [4]
- LLMs must be augmented with additional information through fine-tuning, RAG (Retrieval-Augmented Generation), and prompt/context engineering [4]

Retrieval and RAG Techniques
- Retrieval techniques, including vector databases, hybrid retrieval, and indexing strategies, provide context to LLMs [4]
- Build retrieval and generation pipelines, reranking, and multi-step retrieval using orchestration frameworks [2] (see the sketch after this list)
- After solid retrieval, the path moves into RAG (Retrieval-Augmented Generation) [4]

AI Agents and Production Deployment
- AI Agents: memory, multi-agent systems, human-in-the-loop design, and agentic patterns [4]
- Ship AI systems in production with infrastructure, including CI/CD, containers, model routing, Kubernetes, and deployment at scale [4]
- Prioritize observability, evaluation, and security, including guardrails, sandboxing, prompt injection defenses, and ethical guidelines [3][4]

Advanced AI Workflows
- Advanced workflows include voice & vision agents, CLI agents, robotics, agent swarms, and self-refining AI systems [4]
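To make the retrieve-then-generate step in the roadmap concrete, a dependency-light sketch; the hashing embedder and toy corpus are assumptions for illustration only, since a real pipeline would use a trained embedding model and a vector database:

```python
import numpy as np

corpus = [
    "Keras supports one-line model quantization.",
    "Pinned memory speeds up CPU-to-GPU transfers.",
    "Active learning prioritizes low-confidence samples for labeling.",
]

def embed(text: str) -> np.ndarray:
    # Toy deterministic-per-run embedder; stands in for a real embedding model.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.normal(size=64)
    return v / np.linalg.norm(v)

doc_vecs = np.stack([embed(d) for d in corpus])

def retrieve(query: str, k: int = 2) -> list[str]:
    # Cosine similarity reduces to a dot product on unit vectors.
    scores = doc_vecs @ embed(query)
    return [corpus[i] for i in np.argsort(scores)[::-1][:k]]

question = "How do I make GPU data loading faster?"
context = "\n".join(retrieve(question))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
# `prompt` would then go to an LLM for the generation step; reranking and
# multi-step retrieval slot in between retrieve() and prompt assembly.
```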
X @Avi Chawla
Avi Chawla· 2025-10-16 06:31
If you found it insightful, reshare it with your network.
Find me → @_avichawla
Every day, I share tutorials and insights on DS, ML, LLMs, and RAGs.

Avi Chawla (@_avichawla): A great tool to estimate how much VRAM your LLMs actually need. Alter the hardware config, quantization, etc., and it tells you about:
- Generation speed (tokens/sec)
- Precise memory allocation
- System throughput, etc.
No more VRAM guessing! https://t.co/FlaeMVaWmK ...
X @Avi Chawla
Avi Chawla· 2025-10-16 06:31
Try it here → https://t.co/cnIXhrJliZ ...
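For intuition about what such a calculator accounts for, a back-of-envelope sketch; the formula is the standard weights-plus-KV-cache rule of thumb with an assumed overhead factor, not the tool's actual method, and all defaults are illustrative:

```python
def estimate_vram_gb(params_b: float, bytes_per_param: float = 2,
                     layers: int = 32, kv_heads: int = 8, head_dim: int = 128,
                     seq_len: int = 4096, batch: int = 1,
                     overhead: float = 1.2) -> float:
    # Model weights: parameter count times bytes per parameter
    # (2 for fp16/bf16, 1 for int8, 0.5 for int4).
    weights = params_b * 1e9 * bytes_per_param
    # KV cache stores one key and one value per layer per token.
    kv_cache = 2 * layers * kv_heads * head_dim * seq_len * batch * bytes_per_param
    # Overhead covers activations, CUDA context, and fragmentation.
    return (weights + kv_cache) * overhead / 1e9

# e.g. a 7B model in fp16 with a 4k context: roughly 17 GB.
print(f"{estimate_vram_gb(7):.1f} GB")
```

Quantizing to int4 (bytes_per_param=0.5) drops the same model to roughly a quarter of that, which is why the tool exposes quantization as a config knob.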