X @Avi Chawla
Avi Chawla· 2025-08-12 19:30
AI Agent Fundamentals
- The report covers AI Agent fundamentals [1]
- It differentiates LLM, RAG, and Agents [1]
- Agentic design patterns are included [1]
- Building blocks of Agents are discussed [1]
AI Agent Development
- The report details building custom tools via MCP (Model Context Protocol) [1]
- It provides 12 hands-on projects for AI Engineers [1]
Beware of Gross Margin in Early-Stage Investing
Beware of gross margin in the early days. That's a mistake we've made a couple of times. You know, you have a lot of businesses that in the early days have really bad gross margin. All the LLM providers were very clear examples of that. I think if that's the only thing that's holding you up, in most cases I would totally ignore it. We never lose a deal or pass on a deal because of price in the early stage. So, we've been around for 30 years. We've invested 11.5 billion. We've returned close to 30 and we still ...
X @Polyhedra
Polyhedra· 2025-08-11 09:34
7/ Key insight: Don't just naively compile an LLM into a circuit. Exploit structure:
- Linear ops (MatMul, LayerNorm) → custom efficient constraints.
- Nonlinear ops (GELU) → fused constraints to slash complexity.
- Parallel-friendly layout to max out modern prover hardware. ...
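The cost contrast behind the GELU point can be seen in plain Python: exact GELU requires the Gaussian error function, which is expensive to express as circuit constraints, while the standard tanh approximation (which circuit designers reduce further to low-degree polynomial constraints) is already very close. A minimal sketch, not Polyhedra's actual constraint system:

```python
import math

def gelu_exact(x: float) -> float:
    # Exact GELU uses the Gaussian CDF (erf), costly as arithmetic constraints.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu_tanh(x: float) -> float:
    # Common tanh approximation; provers typically replace tanh itself
    # with fused low-degree polynomial constraints.
    c = math.sqrt(2.0 / math.pi)
    return 0.5 * x * (1.0 + math.tanh(c * (x + 0.044715 * x ** 3)))

for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    # The two agree to within ~1e-3 over typical activation ranges.
    assert abs(gelu_exact(x) - gelu_tanh(x)) < 1e-2
```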
KT(KT) - 2025 Q2 - Earnings Call Transcript
2025-08-11 07:00
Financial Data and Key Metrics Changes
- Operating revenue increased by 13.5% year over year, reaching KRW 7,427.4 billion [6]
- Operating profit rose by 105.4% year over year, amounting to KRW 1,014.8 billion, supported by balanced growth in the telco business and one-time gains from real estate sales [6]
- Net income increased by 78.6% year over year to KRW 733.3 billion, driven by higher operating profit [6]
- EBITDA grew by 36.3% year over year, reaching KRW 1,990.7 billion [6]
- Operating expenses rose by 5.9% year over year, totaling KRW 6,412.6 billion [7]
Business Line Data and Key Metrics Changes
- Wireless revenue increased by 0.9% year on year, reaching KRW 1,781.7 billion, with 79.5% of total handset subscribers being 5G subscribers [8]
- Fixed-line broadband revenue grew by 2.1% year over year, reaching KRW 631.4 billion, driven by Giga Internet subscriber growth [9]
- B2B service revenue posted 4.5% year-over-year growth, supported by telecom and AI/IT services [11]
- AI/IT business revenues saw a significant increase of 13.8% year over year [11]
- KT Cloud revenue grew by 23% year over year, driven by increased data center usage [12]
Market Data and Key Metrics Changes
- The company noted that the 5G penetration rate is above 80%, indicating a mature market [21]
- The company observed no overheating of competition following the launch of new flagship handsets, although future competition may arise with new iPhone releases [20]
Company Strategy and Development Direction
- The company is focused on transforming into an AICT company and enhancing corporate value through strategic initiatives [4][13]
- A multi-model strategy is being implemented, including partnerships with global tech firms like Microsoft and Palantir to enhance competitiveness in AI services [17]
- The company plans to invest KRW 1 trillion in information security over five years to improve customer safety in telecom services [5]
Management Comments on Operating Environment and Future Outlook
- Management expressed confidence in sustaining solid service revenue growth into the second half of the year, despite a significant one-off gain from real estate in Q2 [25]
- Concerns were raised about potential increases in commissions and selling-related expenses, but these are linked to earnings performance [26]
- The company is committed to maintaining a shareholder-friendly dividend policy, with a declared dividend of KRW 600 per share, a 20% increase year over year [4][27]
Other Important Information
- The company plans to complete a share buyback of KRW 250 billion and has outlined a future buyback plan totaling KRW 750 billion over the next three years [4][28]
Q&A Session Summary
Question: Future direction of AI business and impact of handset subsidy repeal
- Management highlighted three main strategies for AI: partnerships with global tech firms, a multi-model strategy for AI service development, and leveraging AI capabilities for operational efficiency [17][19]
- Regarding the MNP (mobile number portability) market, management noted that while competition may heat up with new handset launches, it is not expected to be long-lasting given high 5G penetration and longer handset replacement cycles [20][21]
Question: Outlook for the second half of the year and updates on the value plan
- Management expressed optimism for continued strong performance in the second half, driven by solid service revenue and improved cost management [25]
- The company confirmed its commitment to a shareholder-friendly dividend policy and plans for additional share buybacks as part of its value enhancement program [27][28]
X @Avi Chawla
Avi Chawla· 2025-08-11 06:31
Model Fine-tuning
- Fine-tuning enables the LLM to generate reasoning tokens in French before the final English response [1]
- The video demonstrates the LLM's behavior before and after fine-tuning [1]
How to look at your data — Jeff Huber (Chroma) + Jason Liu (567)
AI Engineer· 2025-08-06 16:22
Retrieval System Evaluation
- Industry should prioritize fast and inexpensive evaluations (fast evals) using query and document pairs to enable rapid experimentation [7]
- Industry can leverage LLMs to generate queries, but should focus on aligning synthetic queries with real-world user queries to avoid misleading results [9][11]
- Industry can empirically validate the performance of new embedding models on specific data using fast evals, rather than relying solely on public benchmarks like MTEB [12]
- Weights & Biases chatbot analysis reveals that the original embedding model (text-embedding-3-small) performed the worst, while the voyage-3-large model performed the best, highlighting the importance of data-driven evaluation [17][18]
Output Analysis and Product Development
- Industry should extract structured data from user conversations (summaries, tools used, errors, satisfaction, frustration) to identify patterns and inform product development [28][29]
- Industry can use extracted metadata to find clusters and identify segments for targeted improvements, similar to how marketing uses user segmentation [29][26]
- The Kura library enables summarization, clustering, and aggregation of conversations to compare evals across different KPIs, helping to identify areas for improvement [32]
- Industry should focus on providing the right infrastructure and tools to support AI agents, rather than solely focusing on improving the AI itself [39]
- Industry should define evals, find clusters, and compare KPIs across clusters to make informed decisions on what to build, fix, and ignore [40][41]
- Industry should monitor query types and performance over time to understand how the product is being used and identify opportunities for improvement [45]
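A "fast eval" over query–document pairs can be as small as a recall@k loop. A minimal sketch, with a toy token-overlap scorer standing in for an embedding model (all names and data below are illustrative, not from the talk):

```python
from typing import Callable

def recall_at_k(
    pairs: list[tuple[str, str]],        # (query, id of the one relevant doc)
    docs: dict[str, str],                # doc id -> text
    score: Callable[[str, str], float],  # similarity between query and doc
    k: int = 3,
) -> float:
    """Fraction of queries whose relevant doc appears in the top-k results."""
    hits = 0
    for query, relevant_id in pairs:
        ranked = sorted(docs, key=lambda d: score(query, docs[d]), reverse=True)
        if relevant_id in ranked[:k]:
            hits += 1
    return hits / len(pairs)

# Toy scorer; swap in cosine similarity of real embeddings to compare models.
def overlap(q: str, d: str) -> float:
    return len(set(q.lower().split()) & set(d.lower().split()))

docs = {
    "d1": "resetting your wandb api key",
    "d2": "logging metrics during training",
    "d3": "pricing for team plans",
}
pairs = [("how do I reset my api key", "d1"),
         ("log training metrics", "d2")]
print(recall_at_k(pairs, docs, overlap, k=1))  # → 1.0
```

Running the same labeled pairs through each candidate embedding model gives the data-specific comparison the talk recommends over public leaderboards.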
Practical tactics to build reliable AI apps — Dmitry Kuchin, Multinear
AI Engineer· 2025-08-03 04:34
Core Problem & Solution
- Traditional software development lifecycle is insufficient for AI applications due to non-deterministic models, requiring a data science approach and continuous experimentation [3]
- The key is to reverse-engineer metrics from real-world scenarios, focusing on product experience and business outcomes rather than abstract data science metrics [6]
- Build evaluations (evals) at the beginning of the process, not at the end, to identify failures and areas for improvement early on [14]
- Continuous improvement of evals and solutions is necessary to reach a baseline benchmark for optimization [19]
Evaluation Methodology
- Evaluations should mimic specific user questions and criteria relevant to the solution's end goal [7]
- Use Large Language Models (LLMs) to generate evaluations, considering different user personas and expected answers [9][11]
- Focus on the details of each evaluation failure to understand the root cause, whether it lies in the test definition or the solution's performance [15]
- Experimentation involves changing models, logic, prompts, or data, and continuously running evaluations to catch regressions [16][18]
Industry-Specific Examples
- For customer support bots, measure the rate of escalation to human support as a key metric [5]
- For text-to-SQL or text-to-graph database applications, create a mock database with known data to validate expected results [22]
- For call center conversation classifiers, use simple matching to determine whether the correct rubric is applied [23]
Key Takeaways
- Evaluate AI applications the way users actually use them, avoiding abstract metrics [24]
- Frequent evaluations enable rapid progress and reduce regressions [25]
- Well-defined evaluations lead to explainable AI, providing insights into how the solution works and its limitations [26]
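The mock-database idea for text-to-SQL evals can be sketched in a few lines with an in-memory SQLite database: seed known rows, run the generated SQL, and compare to the expected result set. The table schema and test case here are hypothetical, not from the talk:

```python
import sqlite3

def check_sql(generated_sql: str, expected_rows: list[tuple]) -> bool:
    """Run generated SQL against a mock database with known contents
    and compare against the expected result set (order-insensitive)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "EU", 100.0), (2, "EU", 50.0), (3, "US", 75.0)],
    )
    try:
        rows = conn.execute(generated_sql).fetchall()
    except sqlite3.Error:
        return False  # malformed SQL counts as a failed eval case
    finally:
        conn.close()
    return sorted(rows) == sorted(expected_rows)

# Eval case: "total order amount per region"
sql = "SELECT region, SUM(amount) FROM orders GROUP BY region"
print(check_sql(sql, [("EU", 150.0), ("US", 75.0)]))  # → True
```

Because the data is fixed, every prompt or model change can be re-checked against the same expected rows, which is what makes regressions cheap to catch.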
Hacking the Inference Pareto Frontier - Kyle Kranen, NVIDIA
AI Engineer· 2025-08-01 13:45
Challenges in LLM Inference
- LLM inference systems face challenges related to latency, cost, and output quality, impacting user experience, profitability, and applicability [1]
- The trade-offs between cost, throughput, latency, and quality define a Pareto frontier, limiting the successful application of LLM systems [1]
NVIDIA Dynamo and Inference Techniques
- NVIDIA Dynamo, a datacenter-scale distributed inference framework, aims to improve the Pareto frontier of inference systems [1]
- Techniques employed include disaggregation (separating LLM generation phases), speculation (predicting multiple tokens per cycle), KV routing, storage, and manipulation (avoiding redundant work), and pipelining improvements for agents (accelerating workflows) [1]
Key Inference Optimization Strategies
- Disaggregation enhances efficiency by separating the phases of LLM generation [1]
- Speculation predicts multiple tokens per cycle to improve throughput [1]
- KV routing, storage, and manipulation prevent redoing work, optimizing resource utilization [1]
- Pipelining improvements for agents accelerate workflows by leveraging agent information [1]
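The speculation technique can be illustrated with a greedy toy: a cheap draft model proposes several tokens, and the target model accepts the longest prefix it agrees with plus one token of its own, so multiple tokens can land per target-model step. This is a sketch with stand-in "models" (production systems like Dynamo accept or reject draft tokens probabilistically, not by exact match):

```python
def speculative_step(draft_next, target_next, context: list[str], k: int = 4) -> list[str]:
    """One round of greedy speculative decoding."""
    # 1. The cheap draft model proposes k tokens.
    proposal, ctx = [], list(context)
    for _ in range(k):
        tok = draft_next(ctx)
        proposal.append(tok)
        ctx.append(tok)

    # 2. The target model accepts the longest prefix it agrees with.
    accepted, ctx = [], list(context)
    for tok in proposal:
        if target_next(ctx) != tok:
            break  # first disagreement: stop accepting draft tokens
        accepted.append(tok)
        ctx.append(tok)
    accepted.append(target_next(ctx))  # target always contributes one token
    return accepted

# Toy "models": both continue a fixed sentence; the draft goes off-script.
SENTENCE = "the quick brown fox jumps over the lazy dog".split()
def target_next(ctx):
    return SENTENCE[len(ctx)] if len(ctx) < len(SENTENCE) else "<eos>"
def draft_next(ctx):
    return target_next(ctx) if len(ctx) < 3 else "cat"  # diverges after 3 tokens

print(speculative_step(draft_next, target_next, []))
# → ['the', 'quick', 'brown', 'fox']: 3 draft tokens accepted + 1 target token
```

Four tokens emerge from a single verification pass of the target model, which is where the throughput gain comes from.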
Building a Smarter AI Agent with Neural RAG - Will Bryk, Exa.ai
AI Engineer· 2025-07-29 07:01
Core Problem & Solution
- The presentation introduces Exa, a search engine designed for AI, addressing the limitations of traditional search engines built for human users [5][23]
- Exa aims to provide an API that delivers any information from the web, catering to the specific needs of AI systems [22][41]
- Exa uses transformer-based embeddings to represent documents, capturing meaning and context beyond keywords [11][12]
AI vs Human Search
- Traditional search engines are optimized for humans, who use simple queries and want a few relevant links, while AIs require complex queries, vast amounts of knowledge, and precise, controllable information [23][24]
- AI agents need search engines that can handle multi-paragraph queries, search with extensive context, and provide comprehensive knowledge [31][32][33]
- Exa offers features like adjustable result counts (10, 100, 1000), date ranges, and domain-specific searches, giving AI systems full control [44]
Market Positioning & Technology
- Exa launched in November 2022 and gained traction for its ability to handle complex queries that traditional search engines struggle with [15]
- The company recognized the need for AI-driven search after the emergence of ChatGPT, realizing that LLMs need external knowledge sources [17][18]
- Exa combines neural and keyword search methods to provide comprehensive results, allowing agents to use different search types based on the query [47][48]
Future Development
- Exa is developing a "research endpoint" that uses multiple searches and LLM calls to generate detailed reports and structured outputs [51]
- The company envisions a future where AI agents have full access to the world's information through a versatile search API [48]
- Exa aims to handle a wider range of queries, including semantic and complex ones, turning the web into a controllable database for AI systems [38][39][40]
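One common way to merge neural and keyword result lists is reciprocal rank fusion (RRF); the talk does not say Exa fuses results this way, so treat this as a generic illustration of hybrid search rather than Exa's method:

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal Rank Fusion: merge ranked lists from different
    retrievers into one ranking. Each list contributes 1/(k + rank)
    per document; k=60 is the conventional damping constant."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

neural_hits = ["d2", "d1", "d3"]    # ranked by embedding similarity
keyword_hits = ["d2", "d4", "d1"]   # ranked by keyword match
print(rrf_fuse([neural_hits, keyword_hits]))  # → ['d2', 'd1', 'd4', 'd3']
```

Documents that both retrievers rank highly rise to the top, while documents found by only one retriever are still kept, which matches the "comprehensive results" goal described above.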