LLM
X @Polyhedra
Polyhedra· 2025-08-11 09:34
Core Idea
- The industry should not naively compile an LLM (Large Language Model) into a circuit [1]
- The industry should exploit structure for efficiency [1]

Optimization Strategies
- The industry should use custom efficient constraints for linear operations like MatMul and LayerNorm [1]
- The industry should use fused constraints for nonlinear operations like GELU to reduce complexity [1]
- The industry should adopt a parallel-friendly layout to maximize utilization of modern prover hardware [1]
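To illustrate the fused-constraint idea (a hypothetical sketch, not Polyhedra's actual circuit design), a nonlinearity like GELU can be committed as a single precomputed lookup over quantized inputs, rather than compiling its erf/mul/add sub-operations into separate constraint gadgets. The fixed-point scale and input range below are illustrative choices:

```python
import math

def gelu(x: float) -> float:
    # Exact GELU via the error function.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

# Naive circuit view: erf, mul, and add each become separate constraint gadgets.
# Fused view: precompute one lookup table over quantized inputs, so the whole
# nonlinearity is a single table-membership constraint per activation.
SCALE = 64  # fixed-point scale (hypothetical choice)
TABLE = {q: round(gelu(q / SCALE) * SCALE) for q in range(-8 * SCALE, 8 * SCALE + 1)}

def fused_gelu(q: int) -> int:
    # One constraint: the (input, output) pair must appear in the committed table.
    return TABLE[max(-8 * SCALE, min(8 * SCALE, q))]

print(fused_gelu(64))  # quantized GELU(1.0)
```

The same lookup-fusion trick is what makes the nonlinear layers cheap relative to a naive gate-by-gate compilation.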
KT (KT) - 2025 Q2 - Earnings Call Transcript
2025-08-11 07:00
Financial Data and Key Metrics Changes
- Operating revenue increased by 13.5% year over year, reaching KRW 7,427.4 billion [6]
- Operating profit rose by 105.4% year over year, amounting to KRW 1,014.8 billion, supported by balanced growth in the telco business and one-time gains from real estate sales [6]
- Net income increased by 78.6% year over year to KRW 733.3 billion, driven by higher operating profit [6]
- EBITDA grew by 36.3% year over year, reaching KRW 1,990.7 billion [6]
- Operating expenses rose by 5.9% year over year, totaling KRW 6,412.6 billion [7]

Business Line Data and Key Metrics Changes
- Wireless revenue increased by 0.9% year over year, reaching KRW 1,781.7 billion, with 5G subscribers accounting for 79.5% of total handset subscribers [8]
- Fixed-line broadband revenue grew by 2.1% year over year, reaching KRW 631.4 billion, driven by Giga Internet subscriber growth [9]
- B2B service revenue posted 4.5% year over year growth, supported by telecom and AI/IT services [11]
- AI/IT business revenue saw a significant increase of 13.8% year over year [11]
- KT Cloud revenue grew by 23% year over year, driven by increased data center usage [12]

Market Data and Key Metrics Changes
- The company noted that the 5G penetration rate is above 80%, indicating a mature market [21]
- The company observed no overheating of competition in the market following the launch of new flagship handsets, although future competition may arise with new iPhone releases [20]

Company Strategy and Development Direction
- The company is focused on transforming into an AICT company and enhancing corporate value through strategic initiatives [4][13]
- A multi-model strategy is being implemented, including partnerships with global tech firms like Microsoft and Palantir to enhance competitiveness in AI services [17]
- The company plans to invest KRW 1 trillion in information security over five years to improve customer safety in telecom services [5]

Management Comments on Operating Environment and Future Outlook
- Management expressed confidence in sustaining solid service revenue growth into the second half of the year, despite a significant one-off gain from real estate in Q2 [25]
- Concerns were raised about potential increases in commissions and selling-related expenses, but these are linked to earnings performance [26]
- The company is committed to maintaining a shareholder-friendly dividend policy, with a declared dividend of KRW 600 per share, a 20% increase year over year [4][27]

Other Important Information
- The company plans to complete a share buyback of KRW 250 billion and has outlined a future buyback plan totaling KRW 750 billion over the next three years [4][28]

Q&A Session Summary
Question: Future direction of AI business and impact of handset subsidy repeal
- Management highlighted three main strategies for AI: partnerships with global tech firms, a multi-model strategy for AI service development, and leveraging AI capabilities for operational efficiency [17][19]
- Regarding the MNP (mobile number portability) market, management noted that while competition may heat up with new handset launches, it is not expected to be long-lasting due to high 5G penetration and longer handset replacement cycles [20][21]

Question: Outlook for the second half of the year and updates on the value plan
- Management expressed optimism for continued strong performance in the second half, driven by solid service revenue and improved cost management [25]
- The company confirmed its commitment to a shareholder-friendly dividend policy and plans for additional share buybacks as part of its value enhancement program [27][28]
X @Avi Chawla
Avi Chawla· 2025-08-11 06:31
Model Fine-tuning
- Fine-tuning enables the LLM to generate reasoning tokens in French before the final English response [1]
- The video demonstrates the LLM's behavior before and after fine-tuning [1]
How to look at your data — Jeff Huber (Chroma) + Jason Liu (567)
AI Engineer· 2025-08-06 16:22
Retrieval System Evaluation
- Industry should prioritize fast and inexpensive evaluations (fast evals) using query and document pairs to enable rapid experimentation [7]
- Industry can leverage LLMs to generate queries, but should focus on aligning synthetic queries with real-world user queries to avoid misleading results [9][11]
- Industry can empirically validate the performance of new embedding models on specific data using fast evals, rather than relying solely on public benchmarks like MTEB [12]
- Weights & Biases chatbot analysis reveals that the original embedding model (text-embedding-3-small) performed the worst, while the voyage-3-large model performed the best, highlighting the importance of data-driven evaluation [17][18]

Output Analysis and Product Development
- Industry should extract structured data from user conversations (summaries, tools used, errors, satisfaction, frustration) to identify patterns and inform product development [28][29]
- Industry can use extracted metadata to find clusters and identify segments for targeted improvements, similar to how marketing uses user segmentation [29][26]
- The Cura library enables summarization, clustering, and aggregation of conversations to compare evals across different KPIs, helping to identify areas for improvement [32]
- Industry should focus on providing the right infrastructure and tools to support AI agents, rather than solely focusing on improving the AI itself [39]
- Industry should define evals, find clusters, and compare KPIs across clusters to make informed decisions on what to build, fix, and ignore [40][41]
- Industry should monitor query types and performance over time to understand how the product is being used and identify opportunities for improvement [45]
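A fast eval over query-document pairs can be as small as a recall@k loop. Below is a minimal hypothetical harness; the bag-of-words `embed` is a toy stand-in for a real embedding model (e.g. voyage-3-large via its API), and the sample docs and pairs are invented:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; swap in a real embedding model here.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def recall_at_k(pairs, docs, k=3):
    # pairs: list of (query, relevant_doc_id); docs: {doc_id: text}.
    doc_vecs = {d: embed(t) for d, t in docs.items()}
    hits = 0
    for query, relevant in pairs:
        qv = embed(query)
        ranked = sorted(doc_vecs, key=lambda d: cosine(qv, doc_vecs[d]), reverse=True)
        hits += relevant in ranked[:k]
    return hits / len(pairs)

docs = {"a": "how to log metrics with wandb", "b": "pricing for teams", "c": "resetting your password"}
pairs = [("log metrics wandb", "a"), ("reset password", "c")]
print(recall_at_k(pairs, docs, k=1))
```

Running the same pairs against two candidate embedding functions gives the apples-to-apples model comparison the talk describes, without waiting on a public benchmark.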
Practical tactics to build reliable AI apps — Dmitry Kuchin, Multinear
AI Engineer· 2025-08-03 04:34
Core Problem & Solution
- Traditional software development lifecycle is insufficient for AI applications due to non-deterministic models, requiring a data science approach and continuous experimentation [3]
- The key is to reverse engineer metrics from real-world scenarios, focusing on product experience and business outcomes rather than abstract data science metrics [6]
- Build evaluations (evals) at the beginning of the process, not at the end, to identify failures and areas for improvement early on [14]
- Continuous improvement of evals and solutions is necessary to reach a baseline benchmark for optimization [19]

Evaluation Methodology
- Evaluations should mimic specific user questions and criteria relevant to the solution's end goal [7]
- Use Large Language Models (LLMs) to generate evaluations, considering different user personas and expected answers [9][11]
- Focus on the details of each evaluation failure to understand the root cause, whether it's the test definition or the solution's performance [15]
- Experimentation involves changing models, logic, prompts, or data, and continuously running evaluations to catch regressions [16][18]

Industry Specific Examples
- For customer support bots, measure the rate of escalation to human support as a key metric [5]
- For text-to-SQL or text-to-graph database applications, create a mock database with known data to validate expected results [22]
- For call center conversation classifiers, use simple matching to determine if the correct rubric is applied [23]

Key Takeaways
- Evaluate AI applications the way users actually use them, avoiding abstract metrics [24]
- Frequent evaluations enable rapid progress and reduce regressions [25]
- Well-defined evaluations lead to explainable AI, providing insights into how the solution works and its limitations [26]
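The "simple matching" eval style mentioned for the call-center classifier can be sketched in a few lines. This is a hypothetical illustration: `classify_call` is a rule-based stand-in for what would be an LLM call in practice, and the transcripts and rubrics are invented:

```python
def classify_call(transcript: str) -> str:
    # Stand-in for the real classifier (an LLM call in practice).
    if "refund" in transcript.lower():
        return "billing"
    if "password" in transcript.lower():
        return "account"
    return "general"

EVALS = [
    # (transcript, expected rubric) — reverse engineered from real scenarios.
    ("I want a refund for last month", "billing"),
    ("I forgot my password again", "account"),
    ("What are your opening hours?", "general"),
]

def run_evals():
    failures = [(t, e, classify_call(t)) for t, e in EVALS if classify_call(t) != e]
    # Inspect each failure individually: is the test definition wrong,
    # or is the solution underperforming?
    return len(EVALS) - len(failures), failures

passed, failures = run_evals()
print(f"{passed}/{len(EVALS)} passed")
```

Running this on every change to the model, prompt, or logic is what catches regressions early, per the talk's advice to build evals at the beginning rather than the end.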
Hacking the Inference Pareto Frontier - Kyle Kranen, NVIDIA
AI Engineer· 2025-08-01 13:45
Challenges in LLM Inference
- LLM inference systems face challenges related to latency, cost, and output quality, impacting user experience, profitability, and applicability [1]
- The trade-offs between cost, throughput, latency, and quality define a Pareto frontier, limiting the successful application of LLM systems [1]

NVIDIA Dynamo and Inference Techniques
- NVIDIA Dynamo, a datacenter-scale distributed inference framework, aims to improve the Pareto frontier of inference systems [1]
- Techniques employed include disaggregation (separating LLM generation phases), speculation (predicting multiple tokens per cycle), KV routing, storage, and manipulation (avoiding redundant work), and pipelining improvements for agents (accelerating workflows) [1]

Key Inference Optimization Strategies
- Disaggregation enhances efficiency by separating phases of LLM generation [1]
- Speculation predicts multiple tokens per cycle to improve throughput [1]
- KV routing, storage, and manipulation prevent redoing work, optimizing resource utilization [1]
- Pipelining improvements for agents accelerate workflows by leveraging agent information [1]
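The speculation technique can be sketched as a draft-then-verify loop (a toy illustration, not Dynamo's implementation): a cheap draft model proposes several tokens, the target model verifies them, and tokens are accepted up to the first disagreement, so one verification cycle can commit multiple tokens. Both "models" below are deterministic stand-ins:

```python
def target_next(prefix: list[str]) -> str:
    # Toy "target model": deterministic next-token rule (stand-in for an LLM).
    vocab = ["the", "cat", "sat", "on", "mat", "<eos>"]
    return vocab[len(prefix) % len(vocab)]

def draft_next(prefix: list[str]) -> str:
    # Toy "draft model": agrees with the target except at every 4th position.
    return "dog" if len(prefix) % 4 == 3 else target_next(prefix)

def speculative_step(prefix: list[str], k: int = 4) -> list[str]:
    # Draft k tokens cheaply, then verify them against the target model,
    # accepting up to the first disagreement (plus the target's correction).
    drafted = []
    for _ in range(k):
        drafted.append(draft_next(prefix + drafted))
    accepted = []
    for tok in drafted:
        expected = target_next(prefix + accepted)
        if tok != expected:
            accepted.append(expected)  # target's token replaces the bad draft
            break
        accepted.append(tok)
    return accepted

print(speculative_step([]))  # several tokens emitted per verification cycle
```

In a real system the verification of all drafted tokens happens in a single batched forward pass, which is where the throughput gain comes from.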
Building a Smarter AI Agent with Neural RAG - Will Bryk, Exa.ai
AI Engineer· 2025-07-29 07:01
Core Problem & Solution
- The presentation introduces Exa, a search engine designed for AI, addressing the limitations of traditional search engines built for human users [5][23]
- Exa aims to provide an API that delivers any information from the web, catering to the specific needs of AI systems [22][41]
- Exa uses transformer-based embeddings to represent documents, capturing meaning and context beyond keywords [11][12]

AI vs Human Search
- Traditional search engines are optimized for humans who use simple queries and want a few relevant links, while AIs require complex queries, vast amounts of knowledge, and precise, controllable information [23][24]
- AI agents need search engines that can handle multi-paragraph queries, search with extensive context, and provide comprehensive knowledge [31][32][33]
- Exa offers features like adjustable result numbers (10, 100, 1000), date ranges, and domain-specific searches, giving AI systems full control [44]

Market Positioning & Technology
- Exa launched in November 2022 and gained traction for its ability to handle complex queries that traditional search engines struggle with [15]
- The company recognized the need for AI-driven search after the emergence of ChatGPT, realizing that LLMs need external knowledge sources [17][18]
- Exa combines neural and keyword search methods to provide comprehensive results, allowing agents to use different search types based on the query [47][48]

Future Development
- Exa is developing a "research endpoint" that uses multiple searches and LLM calls to generate detailed reports and structured outputs [51]
- The company envisions a future where AI agents have full access to the world's information through a versatile search API [48]
- Exa aims to handle a wider range of queries, including semantic and complex ones, turning the web into a controllable database for AI systems [38][39][40]
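The neural-plus-keyword combination can be sketched as a hybrid scorer. This is hypothetical code, not Exa's system: the character-trigram `neural_score` is a toy stand-in for transformer embedding similarity, and the blending weight `alpha` is an invented knob:

```python
import math
from collections import Counter

def keyword_score(query: str, doc: str) -> float:
    # Fraction of query terms that appear verbatim in the document.
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0

def neural_score(query: str, doc: str) -> float:
    # Toy semantic score: character-trigram cosine, a stand-in for
    # transformer embedding similarity.
    def grams(s: str) -> Counter:
        s = s.lower()
        return Counter(s[i:i + 3] for i in range(len(s) - 2))
    a, b = grams(query), grams(doc)
    dot = sum(a[g] * b[g] for g in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def hybrid_search(query: str, docs: list[str], alpha: float = 0.5, limit: int = 2) -> list[str]:
    # alpha blends keyword precision with semantic recall; an agent could
    # pick alpha (or a pure mode) per query type.
    scored = [(alpha * keyword_score(query, d) + (1 - alpha) * neural_score(query, d), d)
              for d in docs]
    return [d for _, d in sorted(scored, reverse=True)[:limit]]

docs = ["transformer embeddings capture meaning", "cats are cute", "pricing plans for teams"]
print(hybrid_search("transformer embeddings", docs, limit=1))
```

Exposing the blend (and knobs like result count, date range, and domain) to the calling agent is what the talk means by giving AI systems full control over search.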
X @Avi Chawla
Avi Chawla· 2025-07-26 06:30
Tool Calling Overview
- The LLM has access to a set of tools for completing tasks; the tools themselves are defined by humans [1]
- The LLM decides when to use these tools and with what arguments [1]
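The division of labor above can be sketched in a few lines (hypothetical code; `fake_llm` is a deterministic stand-in for a real model's function-calling output):

```python
# Human-defined tools: name -> (callable, description shown to the LLM).
TOOLS = {
    "add": (lambda a, b: a + b, "Add two numbers"),
    "upper": (lambda s: s.upper(), "Uppercase a string"),
}

def fake_llm(task: str) -> dict:
    # Stand-in for the model: it chooses the tool and its arguments.
    if "sum" in task:
        return {"tool": "add", "args": {"a": 2, "b": 3}}
    return {"tool": "upper", "args": {"s": task}}

def run(task: str):
    call = fake_llm(task)            # the LLM decides when/what to call
    fn, _desc = TOOLS[call["tool"]]  # humans defined the tool itself
    return fn(**call["args"])

print(run("sum of 2 and 3"))  # 5
```

The key point is the boundary: humans own the tool registry and its semantics; the model only emits structured tool-call decisions against it.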
X @Avi Chawla
Avi Chawla· 2025-07-26 06:30
Core Concept
- The router pattern involves human definition of the paths/functions in a flow [1]
- The LLM makes basic decisions about which function or path to select [1]
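A minimal sketch of the router pattern (hypothetical names; `route` is a rule-based stand-in for the cheap LLM classification call that would pick a path in practice):

```python
# Human-defined paths in the flow.
def handle_billing(msg: str) -> str: return f"billing team will review: {msg}"
def handle_tech(msg: str) -> str: return f"opening a tech ticket: {msg}"
def handle_other(msg: str) -> str: return f"forwarding to support: {msg}"

PATHS = {"billing": handle_billing, "tech": handle_tech, "other": handle_other}

def route(msg: str) -> str:
    # Stand-in for the LLM's basic decision: pick one predefined path.
    if "invoice" in msg or "charge" in msg:
        return "billing"
    if "error" in msg or "crash" in msg:
        return "tech"
    return "other"

def handle(msg: str) -> str:
    return PATHS[route(msg)](msg)

print(handle("my invoice is wrong"))
```

Unlike open-ended tool calling, the router constrains the model to a single choice among human-authored paths, which keeps the flow predictable.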
How to build Enterprise Aware Agents - Chau Tran, Glean
AI Engineer· 2025-07-24 09:22
Thanks, Alex, for the introduction. That was a very impressive LLM-generated summary of me; I've never heard it before, but nice. Today I'm going to talk to you about something that has been keeping me up at night, and probably some of you too: how to build enterprise-aware agents. How to bring the brilliance of AI into the messy, complex realities of how your business operates. So let's jump straight to the hottest question of the month for AI builders: should I build workflows or ...