Avi Chawla
X @Avi Chawla
Avi Chawla· 2025-10-03 06:52
Data Analysis Tools
- Skimpy is recommended as an open-source alternative to pandas' describe() method for more comprehensive data summarization [1]
- Skimpy reports detailed data insights, including data shape, column data types, summary statistics, and distribution charts [1]
X @Avi Chawla
Avi Chawla· 2025-10-03 06:33
Feature scaling is not always necessary in ML. Logistic regression (trained using SGD), SVM, MLP, and kNN classifiers usually do better with feature scaling. Tree-based models, Naive Bayes, and Gradient Boosting are unaffected. https://t.co/vzy2RzLBW8 ...
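To make the claim above concrete, here is a minimal sketch using only the Python standard library. The data, labels, and helper names (`fit_scaler`, `scale`, `nearest_label`, `rank_order`) are hypothetical and chosen for illustration; the toy set is deliberately built so that one large-range feature dominates the Euclidean distance. It shows a 1-nearest-neighbour prediction changing after z-score scaling, while the per-feature ordering that a threshold-based tree split relies on stays the same.

```python
import math
from statistics import mean, pstdev

# Hypothetical toy data: (age in years, income in dollars) -> label.
# The label depends on age only; income is noise with a much larger range.
train = [(25, 50_000), (30, 90_000), (55, 52_000), (60, 88_000)]
labels = ["young", "young", "old", "old"]
query = (28, 52_500)


def fit_scaler(rows):
    """Return per-column means and population standard deviations."""
    cols = list(zip(*rows))
    return [mean(c) for c in cols], [pstdev(c) or 1.0 for c in cols]


def scale(row, mus, sigmas):
    """Apply z-score scaling to one row."""
    return tuple((v - m) / s for v, m, s in zip(row, mus, sigmas))


def nearest_label(points, point_labels, q):
    """1-nearest-neighbour prediction using Euclidean distance."""
    dists = [math.dist(p, q) for p in points]
    return point_labels[dists.index(min(dists))]


def rank_order(rows, feature):
    """Indices of rows sorted by one feature (what a threshold split sees)."""
    return sorted(range(len(rows)), key=lambda i: rows[i][feature])


mus, sigmas = fit_scaler(train)
scaled_train = [scale(r, mus, sigmas) for r in train]

# Distance-based model: the unscaled income column dominates the distance.
pred_raw = nearest_label(train, labels, query)
pred_scaled = nearest_label(scaled_train, labels, scale(query, mus, sigmas))

# Tree-style splits depend only on ordering, which scaling preserves.
same_order = all(
    rank_order(train, f) == rank_order(scaled_train, f) for f in range(2)
)

print(pred_raw, pred_scaled, same_order)  # old young True
```

On this toy set the unscaled neighbour search is driven almost entirely by income, so the 28-year-old query lands next to a 55-year-old. After scaling, both features contribute comparably and the nearest neighbour is the 25-year-old.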
X @Avi Chawla
Avi Chawla· 2025-10-03 04:47
@semanticbeeng on bi-temporal 👆 ...
X @Avi Chawla
Avi Chawla· 2025-10-02 19:40
RT Avi Chawla (@_avichawla) RAG can’t keep up with real-time data. Airweave builds live, bi-temporal knowledge bases so that your agents always reason on the freshest facts. Supports fully agentic retrieval with semantic and keyword search, query expansion, and more across 30+ sources. 100% open-source. https://t.co/0ne2MeCLbY ...
X @Avi Chawla
Avi Chawla· 2025-10-02 18:03
Looks like Grok missed. Let me explain with an example.
Suppose a company changes its parental leave policy from 12 weeks to 16 weeks, effective Jan 1 2025.
If you query “What was the policy on Dec 15 2024?”, the database says 12 weeks.
If you query “What is the policy on Feb 1 2025?”, it says 16 weeks.
Temporal lets you see what the actual policy is at different times.
Now imagine the HR team only updated the database on Jan 20 2025 to reflect the Jan 1 change mentioned above.
If you query “What did we believe t ...
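The post is cut off before it finishes the second half of the example, so the following is only a general sketch of a bi-temporal lookup, not the author's or Airweave's implementation. It assumes each record carries two dates: when the policy takes effect (`valid_from`) and when the database learned about it (`recorded_on`). The class, the function `policy_as_of`, the `date.min` placeholders for the original 12-week policy, and the Jan 10 2025 query date are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PolicyVersion:
    weeks: int
    valid_from: date   # when the policy takes effect in the real world
    recorded_on: date  # when the database was updated with this fact


# The start dates of the 12-week policy are not given in the post,
# so date.min is used as a placeholder (an assumption).
HISTORY = [
    PolicyVersion(12, valid_from=date.min, recorded_on=date.min),
    PolicyVersion(16, valid_from=date(2025, 1, 1), recorded_on=date(2025, 1, 20)),
]


def policy_as_of(history, valid_on, known_on=date.max):
    """Weeks of leave in effect on `valid_on`, using only records that
    had been written to the database by `known_on`."""
    visible = [
        v for v in history
        if v.recorded_on <= known_on and v.valid_from <= valid_on
    ]
    if not visible:
        return None
    # The latest-effective visible record wins.
    return max(visible, key=lambda v: (v.valid_from, v.recorded_on)).weeks


# Valid-time queries, answered with everything known today.
print(policy_as_of(HISTORY, date(2024, 12, 15)))  # 12
print(policy_as_of(HISTORY, date(2025, 2, 1)))    # 16

# Hypothetical query date: the policy for Jan 10 2025 as the database
# saw it on Jan 10 2025, before the Jan 20 update.
print(policy_as_of(HISTORY, date(2025, 1, 10), known_on=date(2025, 1, 10)))  # 12

# The same day, answered with today's knowledge.
print(policy_as_of(HISTORY, date(2025, 1, 10)))  # 16
```

The last two calls ask about the same real-world day but return different answers, because only the second one is allowed to see the record that was written on Jan 20.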
X @Avi Chawla
Avi Chawla· 2025-10-02 06:31
Technology & Innovation
- Airweave builds live, bi-temporal knowledge bases for agents to reason on the freshest facts [1]
- Supports fully agentic retrieval with semantic and keyword search, query expansion, and more across 30+ sources [1]
- Airweave is 100% open-source [1]
Real-time Data Challenge
- RAG (Retrieval-Augmented Generation) struggles with real-time data [1]
X @Avi Chawla
Avi Chawla· 2025-10-02 06:31
GitHub repo: https://t.co/iU6P0KoaRf (Don't forget to star 🌟) ...
X @Avi Chawla
Avi Chawla· 2025-10-02 06:31
Technology & Innovation
- Airweave addresses the limitations of RAG (Retrieval-Augmented Generation) in handling real-time data [1]
- Airweave constructs live, bi-temporal knowledge bases to ensure agents utilize the most up-to-date information [1]
- The system supports comprehensive agentic retrieval, incorporating semantic and keyword search, query expansion, and integration across 30+ sources [1]
- Airweave is 100% open-source [1]
X @Avi Chawla
Avi Chawla· 2025-10-01 19:16
RT Avi Chawla (@_avichawla) Here's a common misconception about RAG!
When we talk about RAG, it's usually thought: index the doc → retrieve the same doc.
But indexing ≠ retrieval.
So the data you index doesn't have to be the data you feed the LLM during generation.
Here are 4 smart ways to index data:
1) Chunk Indexing
- The most common approach.
- Split the doc into chunks, embed, and store them in a vector DB.
- At query time, the closest chunks are retrieved directly.
This is simple and effective, but large or nois ...
X @Avi Chawla
Avi Chawla· 2025-10-01 07:02
Here's a common misconception about RAG!
When we talk about RAG, it's usually thought: index the doc → retrieve the same doc.
But indexing ≠ retrieval.
So the data you index doesn't have to be the data you feed the LLM during generation.
Here are 4 smart ways to index data:
1) Chunk Indexing
- The most common approach.
- Split the doc into chunks, embed, and store them in a vector DB.
- At query time, the closest chunks are retrieved directly.
This is simple and effective, but large or noisy chunks can reduce precisi ...
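The chunk-indexing steps described in the post (split, embed, store, retrieve the closest chunks) can be sketched roughly as follows. This is a toy illustration under loud assumptions: the "embedding" is a plain bag-of-words count rather than a real embedding model, the "vector DB" is an in-memory Python list, and the sample document, the 8-word chunk size, and all function names are made up for the example.

```python
import math
import re
from collections import Counter


def chunk(text, size=8):
    """Split text into consecutive chunks of `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(doc, size=8):
    """Index step: store each chunk next to its vector (list as a fake DB)."""
    return [(c, embed(c)) for c in chunk(doc, size)]


def retrieve(index, query, k=1):
    """Query step: return the k chunks closest to the query vector."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# Made-up sample document; each sentence happens to be 8 words long.
DOC = (
    "Parental leave is sixteen weeks for all employees. "
    "Expense reports are due by the fifth monthly. "
    "Laptops are refreshed every three years by IT."
)

index = build_index(DOC)
top = retrieve(index, "How many weeks of parental leave?")
print(top[0])  # Parental leave is sixteen weeks for all employees.
```

Here the text that is indexed and the text that is returned are the same chunk, which is exactly the coupling the post says is optional: the stored vector could just as well point to some other payload.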