Avi Chawla
X @Avi Chawla
Avi Chawla· 2025-11-12 06:31
GitHub repo: https://t.co/r9Y8dKjtaX (don't forget to star it ⭐) ...
X @Avi Chawla
Avi Chawla· 2025-11-12 06:31
Agent Learning & Development
- Current agents lack continual learning, hindering their ability to build intuition and expertise through experience [1][2]
- A key challenge is enabling agents to learn from interactions and develop heuristics, similar to how humans master skills [1][2]
- Composio is developing infrastructure for a shared learning layer, allowing agents to evolve and accumulate skills collectively [3]
- This "skill layer" gives agents an interface for interacting with tools and building practical knowledge [4]

Industry Trends & Alignment
- Anthropic is exploring similar approaches, codifying agent behaviors as reusable skills [4]
- The industry is moving toward a design pattern where agents progressively turn experience into composable skills [4]

Composio's Solution
- Composio's collective AI learning layer enables agents to share knowledge, letting them handle API edge cases and develop real intuition [5]
- This approach supports continual learning: agents accumulate skills through interaction rather than mere memorization [5]
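The "shared skill layer" pattern described above can be sketched as a registry that multiple agents write learned heuristics into and read back from. This is a minimal illustrative sketch, not Composio's actual API; the class and method names (`SkillRegistry`, `record`, `lookup`) are assumptions made up for this example.

```python
from dataclasses import dataclass

@dataclass
class Skill:
    """A reusable heuristic an agent has learned from experience."""
    name: str
    instructions: str
    uses: int = 0  # how often agents have reinforced this skill

class SkillRegistry:
    """Shared store where agents deposit and retrieve learned skills."""

    def __init__(self):
        self._skills = {}

    def record(self, name, instructions):
        # Create the skill on first sight, then count each reinforcement,
        # so frequently useful heuristics accumulate evidence over time.
        skill = self._skills.setdefault(name, Skill(name, instructions))
        skill.uses += 1
        return skill

    def lookup(self, name):
        return self._skills.get(name)

# Two different agents hitting the same API edge case converge on one skill.
registry = SkillRegistry()
registry.record("retry_on_429", "Back off exponentially when the API returns 429.")
registry.record("retry_on_429", "Back off exponentially when the API returns 429.")
print(registry.lookup("retry_on_429").uses)  # 2
```

The key design point is that the store is shared: a skill learned by one agent is immediately available to every other agent that queries the registry.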
X @Avi Chawla
Avi Chawla· 2025-11-11 20:14
Mixture of Experts (MoE) Architecture
- MoE is a popular architecture that leverages multiple experts to enhance Transformer models [1]
- MoE differs from the standard Transformer in the decoder block: it uses experts (smaller feed-forward networks) instead of a single feed-forward network [2][3]
- During inference, only a subset of experts is selected, leading to faster inference [4]
- A router, which is a multi-class classifier, selects the top K experts by producing softmax scores [5]
- The router is trained jointly with the network to learn the best expert selection [5]

Training Challenges and Solutions
- Challenge 1: Some experts may become under-trained because a few experts are selected far more often than others [5]
- Solution 1: Add noise to the router's feed-forward output and set all but the top K logits to negative infinity, giving other experts a chance to train [5][6]
- Challenge 2: Some experts may be exposed to more tokens than others, leaving the rest under-trained [6]
- Solution 2: Limit the number of tokens an expert can process; once the limit is reached, the token is passed to the next-best expert [6]

MoE Characteristics and Examples
- Text passes through different experts across layers, and the chosen experts differ between tokens [7]
- MoEs have more parameters to load, but only a fraction are activated during inference, resulting in faster inference [9]
- Mixtral 8x7B and Llama 4 are examples of popular MoE-based LLMs [9]
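The noisy top-K routing described above (Solution 1) can be sketched in a few lines of NumPy: add Gaussian noise to the router logits during training, mask all but the top K logits to negative infinity, and softmax the result so non-selected experts receive exactly zero gate weight. This is a minimal sketch of the routing mechanism only, with assumed shapes and names, not a full MoE layer.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_top_k_router(x, w_router, k=2, noise_std=1.0, train=True):
    """Return softmax gate weights over experts, zero outside the top K.

    x        : (d,) token embedding
    w_router : (d, n_experts) router weight matrix
    """
    logits = x @ w_router
    if train:
        # Noise lets experts outside the usual top K occasionally win,
        # mitigating the over-selection problem during training.
        logits = logits + rng.normal(0.0, noise_std, logits.shape)
    # Mask all but the top K logits to -inf before the softmax,
    # so non-selected experts get exactly zero weight.
    top_k = np.argsort(logits)[-k:]
    masked = np.full_like(logits, -np.inf)
    masked[top_k] = logits[top_k]
    exp = np.exp(masked - masked[top_k].max())  # subtract max for stability
    return exp / exp.sum()

d, n_experts = 8, 4
x = rng.normal(size=d)
w = rng.normal(size=(d, n_experts))
gates = noisy_top_k_router(x, w, k=2, train=False)
print(np.count_nonzero(gates))  # exactly k=2 experts receive nonzero weight
```

In a full MoE layer, the token's output would be the gate-weighted sum of the selected experts' feed-forward outputs; the capacity limit from Solution 2 would additionally cap how many tokens per batch each expert may accept.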
X @Avi Chawla
Avi Chawla· 2025-11-10 19:24
RT Avi Chawla (@_avichawla): 25 most important mathematical definitions in data science. P.S. What else would you add here? https://t.co/iMNFip5kIC ...
X @Avi Chawla
Avi Chawla· 2025-11-10 06:31
25 most important mathematical definitions in data science. P.S. What else would you add here? https://t.co/iMNFip5kIC ...
X @Avi Chawla
Avi Chawla· 2025-11-09 19:41
Project Overview
- An open-source MCP (Model Context Protocol) server is available for controlling Jupyter notebooks from Claude [1]

Functionality
- The server enables the creation of code cells within Jupyter notebooks [1]
- The server enables the execution of code cells within Jupyter notebooks [1]
- The server enables the creation of markdown cells within Jupyter notebooks [1]
X @Avi Chawla
Avi Chawla· 2025-11-09 06:33
Resources
- A GitHub repository is available [1]
- A free visual guidebook for learning MCPs from scratch, including 11 projects, is offered [1]
X @Avi Chawla
Avi Chawla· 2025-11-09 06:33
Project Overview
- Aims to control Jupyter notebooks from Claude [1]
- 100% open-source [1]

Functionality
- Enables creation of code cells [1]
- Enables execution of code cells [1]
- Enables creation of markdown cells [1]
X @Avi Chawla
Avi Chawla· 2025-11-08 18:58
AI Tools & Technologies
- Six no-code LLM/RAG/Agent builder tools are available for AI engineers [1]
- The tools are production-grade and 100% open-source [1]
X @Avi Chawla
Avi Chawla· 2025-11-08 12:21
If you found it insightful, reshare it with your network. Find me → @_avichawla. Every day, I share tutorials and insights on DS, ML, LLMs, and RAGs. https://t.co/SvYt7PiJQx Avi Chawla (@_avichawla): 6 no-code LLM/RAG/Agent builder tools for AI engineers. Production-grade and 100% open-source! (find the GitHub repos in the replies) https://t.co/It07fQRBL7 ...