LLMs

X @Avi Chawla
Avi Chawla· 2025-08-11 06:31
That's a wrap! If you found it insightful, reshare it with your network. Find me → @_avichawla. Every day, I share tutorials and insights on DS, ML, LLMs, and RAGs. Avi Chawla (@_avichawla): Let's fine-tune OpenAI gpt-oss (100% locally): ...
The Future of Evals - Ankur Goyal, Braintrust
AI Engineer· 2025-08-09 15:12
[Music] [Applause] Awesome. Uh, so today we're going to talk a little bit about evals to date and where we think evals are going to be going in the future. Also, for those of you who saw my brother earlier, um, I'm going to do my best to live up to his energy and, uh, charisma. But, um, yeah, you know, it's been an amazing almost two-year journey for us at Braintrust. We have had the opportunity to work with some of the most amazing companies building, um, I think the best AI products in the world. Uh I'm blown aw ...
X @Avi Chawla
Avi Chawla· 2025-08-08 06:34
That's a wrap! If you found it insightful, reshare it with your network. Find me → @_avichawla. Every day, I share tutorials and insights on DS, ML, LLMs, and RAGs.
Avi Chawla (@_avichawla): Enterprises build RAG over 100s of data sources, not one!
- Microsoft ships it in M365 products.
- Google ships it in its Vertex AI Search.
- AWS ships it in its Amazon Q Business.
Let's build an MCP-powered RAG over 200+ sources (100% local): ...
X @Avi Chawla
Avi Chawla· 2025-08-08 06:34
In this demo, we used mcp-use. It lets us connect LLMs to MCP servers & build local MCP clients in a few lines of code.
- Compatible with Ollama & LangChain
- Stream Agent output async
- Built-in debugging mode, etc.
Repo: https://t.co/PWcuwMFvzi (don't forget to star ⭐) ...
X @Avi Chawla
Avi Chawla· 2025-08-06 19:13
12 MCP, RAG, and Agents cheat sheets covering:
- Function calling & MCP for LLMs
- 4 stages of training LLMs from scratch
- Training LLMs using other LLMs
- Supervised & Reinforcement fine-tuning
- RAG vs Agentic RAG, and more.
Check the detailed thread below 👇 https://t.co/erWhHLhldq
Avi Chawla (@_avichawla): 12 MCP, RAG, and Agents cheat sheets for AI engineers (with visuals): ...
Evals Are Not Unit Tests — Ido Pesok, Vercel v0
AI Engineer· 2025-08-06 16:14
[Music] My name is Ido. I'm an engineer at Vercel working on v0. If you don't know, v0 is a full-stack vibe coding platform. It's the easiest and fastest way to prototype, build on the web, and express new ideas. Uh, here are some examples of cool things people have built and shared on Twitter. And to catch you up, we recently launched GitHub sync, so you can now push generated code to GitHub directly from v0. You can also, uh, automatically pull changes from GitHub into your chat, and furthermore ...
X @Sam Altman
Sam Altman· 2025-08-05 17:27
RT Eric Wallace (@Eric_Wallace_): Today we release gpt-oss-120b and gpt-oss-20b, two open-weight LLMs that deliver strong performance and agentic tool use. Before release, we ran a first-of-its-kind safety analysis where we fine-tuned the models to intentionally maximize their bio and cyber capabilities 🧵 https://t.co/err2mBcggx ...
SEMrush (SEMR) - 2025 Q2 - Earnings Call Transcript
2025-08-05 13:30
Financial Data and Key Metrics Changes
- Revenue for the quarter was $108.9 million, representing 20% year-over-year growth [4][13]
- Non-GAAP operating margin was 11%, down approximately 240 basis points year-over-year due to a weaker U.S. dollar [16][22]
- Annual recurring revenue (ARR) grew 15.3% year-over-year to $435.3 million, with average ARR per paying customer increasing to $3,756, marking over 15% growth compared to the same quarter last year [17][18]

Business Line Data and Key Metrics Changes
- The Enterprise segment is now the largest contributor to overall company growth, with enterprise SEO solutions growing to 260 customers and an average ARR of approximately $60,000 [4][5]
- The AI Toolkit, launched at the end of Q1, became the fastest-growing product in the company's history, achieving $3 million in ARR within a few months [6][8]
- ARR from enterprise and AI products is expected to approach $50 million by the end of the year [8][19]

Market Data and Key Metrics Changes
- Approximately 116,000 paying customers were reported, down sequentially from the prior quarter, primarily due to softness among freelancers and less sophisticated customer segments [14]
- Dollar-based net revenue retention was 105%, with strong retention in the Enterprise segment consistently above 120% [14][19]

Company Strategy and Development Direction
- The company is focusing on high-growth areas, specifically enterprise and AI search, reallocating resources away from lower-value customer segments [9][20]
- A strategic decision was made not to increase marketing spend in response to rising customer acquisition costs in the lower end of the market, instead prioritizing investments in enterprise and AI products [9][20]
- The company announced a $150 million share repurchase program, reflecting confidence in its business and valuation [25]

Management's Comments on Operating Environment and Future Outlook
- Management expressed optimism about the growth potential in the enterprise and AI segments, despite experiencing softness in the lower end of the market [10][12]
- The company believes that the shift to AI and LLMs (Large Language Models) presents significant opportunities for growth [11][12]
- Management anticipates that the current pressures in the lower end of the market are temporary and expects stabilization in the future [36][64]

Other Important Information
- The company adjusted its full-year 2025 revenue guidance to a range of $443 million to $446 million, reflecting approximately 18% growth at the midpoint [21]
- The non-GAAP operating margin guidance remains at 12%, despite the reduced revenue outlook and foreign exchange headwinds [21][24]

Q&A Session Summary
Question: Pressures in the low-end customer segment
- Management indicated that the pressures are fairly contained to freelancers and less sophisticated customers, primarily impacted by rising cost per click [28][29]
Question: Liquidity of the stock and buyback program
- The share repurchase program is seen as a way to express confidence in the company's future potential and momentum in enterprise and AI [30][32]
Question: Down market weakness and macro factors
- Management believes the weakness is contained to the low-end segment and not reflective of broader macroeconomic conditions [36][38]
Question: Customer acquisition costs and market dynamics
- The increase in customer acquisition costs is primarily affecting the low-end segment, while other segments continue to perform well [51][56]
Question: Future trajectory of the low-end customer base
- Management expects stabilization in the low-end segment, with ongoing strength in the SMB and enterprise segments [62][64]
X @Avi Chawla
Avi Chawla· 2025-08-05 06:35
If you found it insightful, reshare it with your network. Find me → @_avichawla. Every day, I share tutorials and insights on DS, ML, LLMs, and RAGs. Avi Chawla (@_avichawla): Evaluate conversational LLM apps like ChatGPT in 3 steps (open-source). Unlike single-turn tasks, conversations unfold over multiple messages. This means that the LLM's behavior must be consistent, compliant, and context-aware across turns, not just accurate in one-shot output. https://t.co/dugCyqQl6D ...