Enterprise AI Application Development: From Technology Selection to Production Deployment
Alibaba Cloud (阿里云) · 2025-11-28 13:53
Investment Rating
- The report does not explicitly state an investment rating for the industry.

Core Insights
- The report emphasizes the transition from traditional application development to Serverless AI application development, highlighting the need for a new infrastructure paradigm that supports AI agents and their unique requirements [10][27]
- It identifies the importance of dynamic elasticity and task-driven orchestration in AI-native architectures, which allow for efficient resource allocation and management [19][24]
- The report discusses the advantages of Serverless AI runtimes, including a reduced operational burden, cost efficiency, and the freedom for developers to focus on business innovation rather than infrastructure [26][34]

Summary by Sections

01 Enterprise AI Application Development: Runtime Selection
- AI-native paradigms impose new infrastructure requirements, centering on agent-centric services rather than traditional user-centric models [13][15]
- The infrastructure must support state persistence and low-latency access, enabling agents to maintain memory and personality [17]
- Embracing uncertainty is crucial: the infrastructure should be designed to lower the risks associated with the non-deterministic outputs of large language models (LLMs) [21]
- The transition from traditional architectures to AI-native architectures is necessary for effective application development [26]

02 Key Technologies of the Serverless AI Runtime
- Serverless platforms provide heterogeneous computing capabilities, integrating multiple programming languages and event-driven architectures [33][34]
- The report highlights the importance of security isolation and automatic disaster recovery in Serverless AI runtimes [38][42]
- Serverless GPU services are emphasized for their rapid cold starts and efficient resource utilization, which significantly reduce costs [43][49]

03 Customer Cases: Serverless + AI Simplifying Application Development
- The report presents internal case studies from Alibaba, showcasing successful use of Serverless runtimes in building models and AI tools [87][90]
- It illustrates how Serverless AI runtimes have become core to Alibaba Cloud's AI-native applications, enhancing performance and reducing operational costs [90][92]
- The case studies demonstrate the ability to meet the high-concurrency, low-latency requirements of real-time AI applications [93][99]
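The agent-centric requirements summarized in section 01 (state persistence, low-latency memory access, event-driven invocation on stateless compute) can be sketched in a few lines. This is a hedged illustration only: `AgentState`, `handle_event`, and the in-process dictionary standing in for a low-latency state service are hypothetical names, not part of any Alibaba Cloud API.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for an external low-latency state store;
# in a real serverless runtime this would be a KV cache or database,
# since function instances themselves are ephemeral.
_STATE_STORE: dict = {}

@dataclass
class AgentState:
    agent_id: str
    memory: list = field(default_factory=list)  # conversation/task memory

def load_state(agent_id: str) -> AgentState:
    """Fetch persisted agent state, or create a fresh one (first invocation)."""
    return _STATE_STORE.get(agent_id, AgentState(agent_id=agent_id))

def save_state(state: AgentState) -> None:
    """Persist state so the agent keeps its memory across stateless invocations."""
    _STATE_STORE[state.agent_id] = state

def handle_event(agent_id: str, event: str) -> int:
    """Event-driven entry point: the compute layer holds no state between
    calls; continuity comes entirely from the external store."""
    state = load_state(agent_id)
    state.memory.append(event)
    save_state(state)
    return len(state.memory)
```

Each call to `handle_event` could land on a different (freshly cold-started) instance; because memory lives outside the function, the agent's "personality" survives regardless.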
An Average 40% Cut in GPU Costs: What Is the Shortcut to Large-scale Agent Deployment and Operations? | Livestream Preview
AI前线 (AI Frontline) · 2025-10-28 09:02
Core Insights
- The article discusses the challenges of, and solutions for, large-scale deployment and operation of AI agents in enterprises, emphasizing the need for innovation in this area [2].

Group 1: Event Details
- The live broadcast is scheduled for October 28, 2025, from 19:30 to 20:30 [5].
- The theme of the live broadcast is "Accelerating Hundredfold Startup: What Are the Shortcuts for Large-scale Agent Deployment and Operation?" [3][7].

Group 2: Guest Speakers
- The live broadcast features key speakers including Yang Haoran, head of Alibaba Cloud's Serverless Computing, and Zhao Yuying, chief editor of Geekbang Technology [4].

Group 3: Key Topics
- The discussion will cover the technological transition from "Cloud Native" to "AI Native" [8].
- It will highlight the AgentRun platform, which claims a hundredfold acceleration and an average 40% reduction in GPU costs [9].
- The session will address the full-lifecycle governance of AI agents, from development to operation [9].
- The future evolution of Serverless AI will also be discussed [9].
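The claimed "average 40% reduction in GPU costs" can be made intuitive with a back-of-envelope comparison between an always-on dedicated GPU instance and pay-per-use serverless GPU billing. Every number below (hourly rate, utilization, serverless price premium) is an illustrative assumption, not Alibaba Cloud's actual pricing; the point is only that the savings hinge on how often the GPU would otherwise sit idle.

```python
# Back-of-envelope: dedicated (always-on) GPU vs pay-per-use serverless GPU.
# All figures are illustrative assumptions, NOT actual cloud pricing.

HOURS_PER_MONTH = 730

def dedicated_cost(hourly_rate: float) -> float:
    """An always-on instance bills for every hour, busy or idle."""
    return hourly_rate * HOURS_PER_MONTH

def serverless_cost(hourly_rate: float, utilization: float,
                    premium: float = 1.5) -> float:
    """Serverless bills only for busy hours, typically at a per-unit premium."""
    return hourly_rate * premium * utilization * HOURS_PER_MONTH

rate = 2.0   # assumed $/GPU-hour for a dedicated instance
util = 0.40  # assumed fraction of time the GPU is actually busy

dedicated = dedicated_cost(rate)            # 2.0 * 730            = 1460.0
serverless = serverless_cost(rate, util)    # 2.0 * 1.5 * 0.4 * 730 ≈ 876.0
savings = 1 - serverless / dedicated        # ≈ 0.40 under these assumptions
```

At 40% utilization and a 1.5x per-hour premium, serverless comes out roughly 40% cheaper; at near-100% utilization the comparison flips, which is why workload shape matters more than the headline figure.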