Small Language Models
India’s AI Ambition, Energy & Talent Pool in Focus | Insight with Haslinda Amin 02/19/2026
Bloomberg Television· 2026-02-19 06:58
Live from New Delhi, this is Insight with Haslinda Amin, where we will dig into India's fast-rising artificial intelligence ambitions and the shockwaves hitting the country's storied IT giants. As India hosts one of the world's biggest AI summits, we speak live with Schneider Electric CEO Olivier Blum, ServiceNow president Amit Zavery, and Fractal Analytics co-founder and CEO Srikanth Velamakanni about how this technology is reshaping the world. And we bring you more from our conversations wi ...
A CPU-CENTRIC PERSPECTIVE ON AGENTIC AI
2026-01-22 02:43
Summary of Key Points from the Conference Call

Industry and Company Overview
- The discussion revolves around **Agentic AI** frameworks, which enhance traditional Large Language Models (LLMs) by integrating decision-making orchestrators and external tools, transforming them into autonomous problem solvers [2][4].

Core Insights and Arguments
- **Agentic AI Workloads**: The paper profiles five representative agentic AI workloads: **Haystack RAG**, **Toolformer**, **ChemCrow**, **LangChain**, and **SWE-Agent**. These workloads are analyzed for latency, throughput, and energy, highlighting the significant role CPUs play in these metrics relative to GPUs [3][10][20].
- **Latency Contributions**: Tool processing on CPUs can account for up to **90.6%** of total latency in agentic workloads, indicating a need for joint CPU-GPU optimization rather than a sole focus on GPU improvements [10][34].
- **Throughput Bottlenecks**: Throughput is bottlenecked by both CPU factors (coherence, synchronization, core over-subscription) and GPU factors (memory capacity and bandwidth). This dual limitation constrains the performance of agentic AI systems [10][45].
- **Energy Consumption**: At large batch sizes, CPU dynamic energy consumption can reach up to **44%** of total dynamic energy, underscoring the inefficiency of CPU parallelism compared to GPUs [10][49].

Important but Overlooked Content
- **Optimizations Proposed**: The paper introduces two key optimizations:
  1. **CPU- and GPU-Aware Micro-batching (CGAM)**: improves performance by capping batch sizes and using micro-batching to optimize latency [11][50].
  2. **Mixed Agentic Workload Scheduling (MAWS)**: adapts scheduling strategies for heterogeneous workloads, balancing CPU-heavy and LLM-heavy tasks to enhance overall efficiency [11][58].
- **Profiling Insights**: Profiling of agentic AI workloads reveals that tool processing, rather than LLM inference, is the primary contributor to latency, a critical insight for future optimizations [32][34].
- **Diverse Computational Patterns**: The selected workloads span a variety of applications and computational strategies, showcasing the breadth of agentic AI systems and their real-world relevance [21][22].

Conclusion
- The findings underscore the importance of a CPU-centric perspective in optimizing agentic AI frameworks, highlighting the need for comprehensive strategies that address both CPU and GPU limitations to enhance performance, efficiency, and scalability in AI applications [3][10][11].
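The micro-batching idea behind CGAM can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function names, the batch cap, and the two-stage cost model (CPU tool work feeding GPU inference) are assumptions made for the example.

```python
# Illustrative sketch of CPU- and GPU-aware micro-batching: cap the batch,
# cut it into micro-batches, and estimate latency when CPU tool processing
# overlaps with GPU inference across consecutive micro-batches.

def split_into_microbatches(requests, cap, micro_size):
    """Cap the batch size, then split the capped batch into micro-batches."""
    batch = requests[:cap]  # cap to limit GPU memory pressure
    return [batch[i:i + micro_size] for i in range(0, len(batch), micro_size)]

def pipelined_latency(n_micro, cpu_time, gpu_time):
    """Two-stage pipeline estimate: after the first micro-batch fills the
    pipeline, each remaining micro-batch costs only the slower stage."""
    return cpu_time + gpu_time + (n_micro - 1) * max(cpu_time, gpu_time)
```

With 3 micro-batches at 2 s of CPU tool work and 1 s of GPU inference each, the overlapped estimate is 7 s versus 9 s fully sequential, which is the latency benefit micro-batching targets.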
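The MAWS idea of balancing heterogeneous tasks can likewise be sketched. This is a toy interleaving policy under assumed inputs (a per-task CPU fraction and a 0.5 classification threshold); the paper's actual scheduler is not reproduced here.

```python
# Illustrative sketch of mixed agentic workload scheduling: classify tasks as
# CPU-heavy (tool-dominated) or GPU-heavy (LLM-dominated), then alternate
# between the two queues so neither resource sits idle.

from collections import deque

def schedule_mixed(tasks):
    """tasks: list of (name, cpu_fraction). Returns an interleaved order."""
    cpu_heavy = deque(t for t in tasks if t[1] >= 0.5)  # assumed threshold
    gpu_heavy = deque(t for t in tasks if t[1] < 0.5)
    order = []
    while cpu_heavy or gpu_heavy:
        if cpu_heavy:
            order.append(cpu_heavy.popleft())
        if gpu_heavy:
            order.append(gpu_heavy.popleft())
    return order
```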
KPMG and Uniphore form AI agent collaboration for regulated industries
Yahoo Finance· 2026-01-20 09:25
Core Insights - KPMG has formed a strategic partnership with Uniphore to implement AI agents utilizing industry-specific small language models (SLMs) in regulated sectors such as banking, insurance, energy, and healthcare [1][4] - The collaboration aims to enhance KPMG's global workforce with AI-enabled delivery models, integrating AI into core business processes to achieve operational value [3][4] Group 1: Partnership and Technology - The platform developed under this partnership is built on a sovereign, composable, and secure architecture, designed to integrate with KPMG's existing enterprise systems while meeting governance and compliance requirements [2] - KPMG plans to utilize Uniphore's Business AI Cloud to encode institutional knowledge, regulatory frameworks, and process playbooks into industry-specific SLMs [4] Group 2: AI Deployment and Applications - The initiative aims to deploy governed AI agents across various functions, including procurement, workforce optimization, finance, claims, and customer experience [5] - An SLM factory model is central to the collaboration, converting knowledge work traditionally performed by humans into scalable, reusable AI systems [5] - KPMG is developing an AI-enabled procurement and contracting capability that classifies high-value contracts, compares terms, extracts obligations, identifies risks, and routes exceptions for human review [6]
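The contract-review flow described above (classify by value, extract obligations, flag risks, route exceptions to a human) can be sketched as a simple rules pipeline. The threshold, the risk keywords, and the function name are illustrative stand-ins for the SLM-based components the partnership describes, not KPMG's or Uniphore's actual system.

```python
# Toy sketch of an AI-assisted contract triage step: classify value, flag
# risky clauses, and route exceptions for human review. All rules here are
# hypothetical placeholders for model-driven classification and extraction.

RISK_TERMS = ("unlimited liability", "auto-renewal")  # assumed risk markers

def review_contract(contract):
    high_value = contract["value_usd"] >= 1_000_000  # assumed threshold
    risks = [clause for clause in contract["clauses"]
             if any(term in clause.lower() for term in RISK_TERMS)]
    # Exceptions (high-value contracts with flagged risks) go to a human.
    return {"high_value": high_value,
            "risks": risks,
            "route_to_human": high_value and bool(risks)}
```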
Straker Limited (ASX: STG) Announces Extension and Expansion of IBM Partnership
Prnewswire· 2025-10-30 07:29
Core Insights - Straker Limited has renewed and expanded its strategic partnership with IBM for an additional three years, effective January 1, 2026, with a contract value of approximately NZ$28 million (US$16.1 million) over the initial term [2][3]. Agreement Details - The renewed agreement allows IBM to extend the contract for an additional year beyond the initial three years and is based on customer usage, which may lead to revenue fluctuations [2][3]. - The agreement maintains core terms from the previous contract but emphasizes deploying Straker's AI-driven solutions across IBM's global operations, where 10,000 users are already utilizing Straker's AI-driven Slack translation application [4]. Expanded Strategic Partnership - Straker is now recognized as part of the IBM Ecosystem Partner network, with the collaboration managed primarily through IBM Japan, enhancing Straker's integration within IBM's technology ecosystem [5]. - A significant focus of the partnership includes the joint development of customized small language models using IBM's watsonx AI technology and Straker's Tiri platform, which has shown promising early results [6][7]. CEO Commentary - The CEO of Straker highlighted that the renewal and expansion of the partnership with IBM validate the company's strategy and provide a strong foundation for future growth, emphasizing the transformation of translation services and broader enterprise AI opportunities [8].
X @Solana
Solana· 2025-10-14 19:04
RT Sam Hogan 🇺🇸 (@0xSamHogan): I'm excited to announce @inference_net's $11.8M Series Seed funding round, led by @multicoincap & @a16zcrypto CSX, with participation from @topology_vc, @fdotinc, and an incredible group of angels. The next wave of AI adoption will be driven by companies building AI natively into their products at scale. As scaling laws continue to demand larger models and more compute, margins become thin, and operating at scale becomes untenable. We're taking a different approach -- training task ...
X @The Economist
The Economist· 2025-09-14 14:40
Market Trends - Corporate demand for small language models is projected to grow twice as fast as demand for large models [1] - However, small-model demand is growing from a much lower base [1]
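A faster growth rate from a lower base takes years to close the gap, which is why both halves of the claim matter. A quick compounding check makes this concrete; the starting bases and growth rates below are made-up illustrative numbers, not the article's data.

```python
# Illustrative compounding check for "growing twice as fast from a lower base".
# All inputs are hypothetical; the point is the shape of the catch-up curve.

def years_to_catch_up(small_base, large_base, small_rate, large_rate):
    """Years until the smaller market, compounding faster, overtakes the larger."""
    years = 0
    while small_base < large_base:
        small_base *= 1 + small_rate
        large_base *= 1 + large_rate
        years += 1
    return years

# A market 10x smaller growing 40%/yr vs 20%/yr still needs 15 years to catch up.
print(years_to_catch_up(1.0, 10.0, 0.4, 0.2))  # → 15
```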