Chinese open-source AI routing system stages a comeback! Matches Gemini-2.5-Pro performance at just 19% of the cost
量子位 (QbitAI) · 2025-08-20 04:33
Core Viewpoint
- The article covers the launch of Avengers-Pro, a multi-model scheduling and routing solution that balances performance and cost for users of large models, making advanced AI capabilities more accessible [3][12].

Group 1: Performance and Cost Efficiency
- Avengers-Pro integrates eight leading large models and achieves superior performance on six challenging datasets, surpassing GPT-5-medium by 7% and Gemini-2.5-Pro by 19% [5].
- It matches the performance of GPT-5-medium at 27% lower cost, and matches Gemini-2.5-Pro at only 19% of the cost [5][20].
- Avengers-Pro is Pareto-optimal: it delivers the highest accuracy at any given cost level and the lowest cost for any specified accuracy target [5][23].

Group 2: Technical Mechanism
- The core mechanism embeds and clusters user requests, then dynamically matches each task to the most suitable model [15][25].
- The framework has three main steps: embedding user requests into high-dimensional vectors, clustering similar tasks, and scoring models on a combined performance-cost evaluation [16][25].
- A single parameter α shifts the system between performance-first and cost-first routing, so it can serve diverse application needs [17][30].

Group 3: Competitive Landscape
- Avengers-Pro outperforms every single model in its pool, reaching an average accuracy of 0.66 compared with 0.62 for GPT-5-medium [20].
- The solution delivers significant cost savings while maintaining performance, demonstrating its effectiveness in the current large-model ecosystem [30][32].
- Intelligent routing is expected to lead to further breakthroughs in large-model applications [32].
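
The three steps under Technical Mechanism (embed, cluster, score) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration, not taken from Avengers-Pro itself: the two-dimensional embeddings, the centroid names, the model names, the accuracy and cost figures, and the exact scoring formula `alpha * accuracy + (1 - alpha) * (1 - cost / max_cost)`. The article only states that models are scored on performance and cost and that α trades one against the other.

```python
import math

# Hypothetical cluster centroids, standing in for the output of the
# embedding and clustering steps. A real system would embed many queries
# with a text-embedding model and cluster them (for example with k-means).
CENTROIDS = {
    "math": [1.0, 0.0],
    "chat": [0.0, 1.0],
}

# Hypothetical per-cluster statistics: model -> (accuracy, cost per query).
CLUSTER_STATS = {
    "math": {"big-model": (0.90, 10.0), "small-model": (0.60, 1.0)},
    "chat": {"big-model": (0.80, 10.0), "small-model": (0.78, 1.0)},
}


def nearest_cluster(embedding):
    """Return the name of the centroid closest to the query embedding."""
    return min(CENTROIDS, key=lambda name: math.dist(embedding, CENTROIDS[name]))


def route(embedding, alpha):
    """Pick the model with the best performance-cost score in the query's cluster.

    alpha = 1.0 considers accuracy only; alpha = 0.0 considers cost only.
    The scoring formula is an assumed one, chosen for simplicity.
    """
    stats = CLUSTER_STATS[nearest_cluster(embedding)]
    max_cost = max(cost for _, cost in stats.values())

    def score(model):
        accuracy, cost = stats[model]
        cheapness = 1 - cost / max_cost  # 0 for the priciest model in the cluster
        return alpha * accuracy + (1 - alpha) * cheapness

    return max(stats, key=score)


if __name__ == "__main__":
    for alpha in (0.0, 0.9, 1.0):
        print(alpha, route([0.9, 0.1], alpha), route([0.1, 0.9], alpha))
```

With these made-up numbers, α = 0.9 routes the math-like query to the larger model, where its accuracy lead is big, and the chat-like query to the smaller model, where the two are nearly tied. That per-task split is the behavior the article attributes to cluster-aware routing.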