Avi Chawla
X @Avi Chawla
Avi Chawla· 2025-11-12 06:31
GitHub repo: https://t.co/r9Y8dKjtaX (don't forget to star it ⭐) ...
X @Avi Chawla
Avi Chawla· 2025-11-12 06:31
Agent Learning & Development
- Current agents lack continual learning, hindering their ability to build intuition and expertise through experience [1][2]
- A key challenge is enabling agents to learn from interactions and develop heuristics, similar to how humans master skills [1][2]
- Composio is developing infrastructure for a shared learning layer, allowing agents to evolve and accumulate skills collectively [3]
- This "skill layer" gives agents an interface for interacting with tools and building practical knowledge [4]

Industry Trends & Alignment
- Anthropic is exploring similar approaches, codifying agent behaviors as reusable skills [4]
- The industry is moving toward a design pattern where agents progressively turn experience into composable skills [4]

Composio's Solution
- Composio's collective AI learning layer enables agents to share knowledge, letting them handle API edge cases and develop real intuition [5]
- This approach supports continual learning: agents accumulate skills through interaction rather than mere memorization [5]
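The "shared skill layer" pattern described above can be sketched as a registry that multiple agents write learned heuristics into and read back from. This is a minimal illustrative sketch, not Composio's actual API; the class and method names (`SkillRegistry`, `record`, `lookup`) are assumptions made up for this example.

```python
from dataclasses import dataclass

@dataclass
class Skill:
    """A reusable heuristic an agent has learned from experience."""
    name: str
    instructions: str
    uses: int = 0  # how often agents have reinforced this skill

class SkillRegistry:
    """Shared store where agents deposit and retrieve learned skills."""

    def __init__(self):
        self._skills = {}

    def record(self, name, instructions):
        # Create the skill on first sight, then count each reinforcement,
        # so frequently useful heuristics accumulate evidence over time.
        skill = self._skills.setdefault(name, Skill(name, instructions))
        skill.uses += 1
        return skill

    def lookup(self, name):
        return self._skills.get(name)

# Two different agents hitting the same API edge case converge on one skill.
registry = SkillRegistry()
registry.record("retry_on_429", "Back off exponentially when the API returns 429.")
registry.record("retry_on_429", "Back off exponentially when the API returns 429.")
print(registry.lookup("retry_on_429").uses)  # 2
```

The key design point is that the store is shared: a skill learned by one agent is immediately available to every other agent that queries the registry.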
X @Avi Chawla
Avi Chawla· 2025-11-11 20:14
Mixture of Experts (MoE) Architecture
- MoE is a popular architecture that leverages multiple experts to enhance Transformer models [1]
- MoE differs from the standard Transformer in the decoder block: it uses experts (smaller feed-forward networks) instead of a single feed-forward network [2][3]
- During inference, only a subset of experts is selected, leading to faster inference [4]
- A router, which is a multi-class classifier, selects the top K experts by producing softmax scores [5]
- The router is trained jointly with the network to learn the best expert selection [5]

Training Challenges and Solutions
- Challenge 1: Some experts may become under-trained because a few experts are selected far more often than others [5]
- Solution 1: Add noise to the router's feed-forward output and set all but the top K logits to negative infinity, giving other experts a chance to train [5][6]
- Challenge 2: Some experts may be exposed to more tokens than others, leaving the rest under-trained [6]
- Solution 2: Limit the number of tokens an expert can process; once the limit is reached, the token is passed to the next-best expert [6]

MoE Characteristics and Examples
- Text passes through different experts across layers, and the chosen experts differ between tokens [7]
- MoEs have more parameters to load, but only a fraction are activated during inference, resulting in faster inference [9]
- Mixtral 8x7B and Llama 4 are examples of popular MoE-based LLMs [9]
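The noisy top-K routing described above (Solution 1) can be sketched in a few lines of NumPy: add Gaussian noise to the router logits during training, mask all but the top K logits to negative infinity, and softmax the result so non-selected experts receive exactly zero gate weight. This is a minimal sketch of the routing mechanism only, with assumed shapes and names, not a full MoE layer.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_top_k_router(x, w_router, k=2, noise_std=1.0, train=True):
    """Return softmax gate weights over experts, zero outside the top K.

    x        : (d,) token embedding
    w_router : (d, n_experts) router weight matrix
    """
    logits = x @ w_router
    if train:
        # Noise lets experts outside the usual top K occasionally win,
        # mitigating the over-selection problem during training.
        logits = logits + rng.normal(0.0, noise_std, logits.shape)
    # Mask all but the top K logits to -inf before the softmax,
    # so non-selected experts get exactly zero weight.
    top_k = np.argsort(logits)[-k:]
    masked = np.full_like(logits, -np.inf)
    masked[top_k] = logits[top_k]
    exp = np.exp(masked - masked[top_k].max())  # subtract max for stability
    return exp / exp.sum()

d, n_experts = 8, 4
x = rng.normal(size=d)
w = rng.normal(size=(d, n_experts))
gates = noisy_top_k_router(x, w, k=2, train=False)
print(np.count_nonzero(gates))  # exactly k=2 experts receive nonzero weight
```

In a full MoE layer, the token's output would be the gate-weighted sum of the selected experts' feed-forward outputs; the capacity limit from Solution 2 would additionally cap how many tokens per batch each expert may accept.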
X @Avi Chawla
Avi Chawla· 2025-11-10 19:24
RT Avi Chawla (@_avichawla): 25 most important mathematical definitions in data science. P.S. What else would you add here? https://t.co/iMNFip5kIC ...
X @Avi Chawla
Avi Chawla· 2025-11-10 06:31
25 most important mathematical definitions in data science. P.S. What else would you add here? https://t.co/iMNFip5kIC ...
X @Avi Chawla
Avi Chawla· 2025-11-09 19:41
Project Overview
- An open-source MCP (Model Context Protocol) server is available for controlling Jupyter notebooks from Claude [1]

Functionality
- The server enables the creation of code cells within Jupyter notebooks [1]
- The server enables the execution of code cells within Jupyter notebooks [1]
- The server enables the creation of markdown cells within Jupyter notebooks [1]
X @Avi Chawla
Avi Chawla· 2025-11-09 06:33
Resources
- A GitHub repository is available [1]
- A free visual guidebook for learning MCPs from scratch, including 11 projects, is offered [1]
X @Avi Chawla
Avi Chawla· 2025-11-09 06:33
Project Overview
- Aims to control Jupyter notebooks from Claude [1]
- 100% open-source [1]

Functionality
- Enables creation of code cells [1]
- Enables execution of code cells [1]
- Enables creation of markdown cells [1]
X @Avi Chawla
Avi Chawla· 2025-11-08 18:58
AI Tools & Technologies
- Six no-code LLM/RAG/Agent builder tools are available for AI engineers [1]
- The tools are production-grade and 100% open-source [1]
X @Avi Chawla
Avi Chawla· 2025-11-08 12:21
If you found it insightful, reshare it with your network. Find me → @_avichawla. Every day, I share tutorials and insights on DS, ML, LLMs, and RAGs. https://t.co/SvYt7PiJQx Avi Chawla (@_avichawla): 6 no-code LLM/RAG/Agent builder tools for AI engineers. Production-grade and 100% open-source! (find the GitHub repos in the replies) https://t.co/It07fQRBL7 ...