Agentic Coding Performance Hits a New High: The New KAT Series Models Make the SWE-Bench Leaderboard

Core Insights
- The Kuaipilot team has launched two models in the code intelligence field: the open-source, 32B-parameter KAT-Dev-32B and the closed-source flagship KAT-Coder, both of which show strong performance on coding tasks [2][26].

Model Performance
- KAT-Dev-32B resolved 62.4% of tasks on SWE-Bench Verified, ranking fifth among open-source models of all sizes [2].
- KAT-Coder resolved 73.4% of tasks on the same benchmark, comparable to the top closed-source models worldwide [2][11].

Model Accessibility
- KAT-Dev-32B is available on Hugging Face for further research and development [7].
- API keys for KAT-Coder can be applied for on the "Kuaishou Wanqing" enterprise-level model service and development platform, so that users can access the model directly from coding tools [7].

Training Innovations
- The KAT series models went through several training phases: Mid-Training, Supervised Fine-Tuning (SFT), Reinforcement Fine-Tuning (RFT), and large-scale Agentic Reinforcement Learning (RL) [9][12].
- Mid-Training strengthened the "LLM-as-Agent" capabilities of the model, improving tool use, multi-turn interaction, and instruction following [10][12].
- SFT used real-world requirement-delivery trajectories annotated by human engineers to improve end-to-end delivery capability [13].
- RFT introduced ground truth to guide trajectory exploration, improving the efficiency and stability of the reinforcement learning phase [15].

Advanced Techniques
- The team applied entropy-based tree pruning to learn efficiently from non-linear trajectory histories, maximizing throughput while minimizing cost [19].
- The SeamlessFlow framework was developed to manage trajectory trees and sustain high-throughput training by decoupling RL training from the agent's internal logic [21][22].

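The article does not spell out how the entropy-based pruning works, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes that each node of a trajectory tree stores the next-token probability distributions of the tokens generated at that step, and that subtrees rooted at low-entropy (highly confident) nodes carry less learning signal and can be dropped. The `Node` class, the `prune` function, the threshold value, and the direction of the criterion are all hypothetical and are not taken from the KAT implementation.

```python
import math
from dataclasses import dataclass, field


def token_entropy(probs):
    """Shannon entropy (in nats) of one next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


@dataclass
class Node:
    """One step in a trajectory tree (hypothetical structure)."""
    name: str
    token_dists: list  # one probability distribution per generated token
    children: list = field(default_factory=list)

    def mean_entropy(self):
        """Average token entropy of this step; 0.0 if the step has no tokens."""
        if not self.token_dists:
            return 0.0
        total = sum(token_entropy(d) for d in self.token_dists)
        return total / len(self.token_dists)


def prune(node, threshold):
    """Return a copy of the tree without low-entropy subtrees.

    A child is kept only if its mean entropy is at least `threshold`
    (an assumed criterion). The node passed in is always kept.
    """
    kept = [prune(c, threshold) for c in node.children
            if c.mean_entropy() >= threshold]
    return Node(node.name, node.token_dists, kept)


def count_nodes(node):
    """Number of nodes in the tree rooted at `node`."""
    return 1 + sum(count_nodes(c) for c in node.children)


# Toy tree: one confident branch and one uncertain branch with a child.
confident = Node("confident", [[1.0, 0.0]])
uncertain_child = Node("uncertain_child", [[0.9, 0.1]])
uncertain = Node("uncertain", [[0.5, 0.5]], [uncertain_child])
root = Node("root", [], [confident, uncertain])

pruned = prune(root, threshold=0.3)
```

In this toy tree the confident branch has zero entropy and is removed, while the uncertain branch and its child are kept, so fewer nodes need to be processed during training. A real system would also have to decide how pruning interacts with rewards and shared prefixes, which this sketch ignores.
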
Emergent Capabilities
- Post-training analysis revealed two emergent behaviors: a 32% reduction in dialogue rounds compared with SFT models, and the ability to call multiple tools in parallel [33][35].
- The model's preference for efficiency and its parallel calling ability were attributed to the implicit optimization pressure created by the trajectory tree structure [33].

Future Prospects
- The Kuaipilot team aims to explore the frontiers of code intelligence, including deeper tool integration, broader language support, and collaborative coding systems [35].
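
To make the parallel tool calling described under Emergent Capabilities concrete, here is a small illustrative sketch using Python's `asyncio`. It is not the KAT agent's code: `fake_tool` and the tool names are hypothetical stand-ins, and the delays only simulate I/O latency. The point is simply that independent tool calls issued together finish in roughly the time of the slowest call, which is one way fewer dialogue rounds can translate into less wall-clock time.

```python
import asyncio
import time


async def fake_tool(name, delay):
    """Stand-in for a real tool call (hypothetical); sleeps to simulate I/O."""
    await asyncio.sleep(delay)
    return f"{name}: done"


async def run_sequential(calls):
    """Issue one tool call per round, waiting for each before the next."""
    return [await fake_tool(name, delay) for name, delay in calls]


async def run_parallel(calls):
    """Issue all tool calls in a single round; results keep the input order."""
    return await asyncio.gather(*(fake_tool(name, delay) for name, delay in calls))


def timed(coro):
    """Run a coroutine to completion and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro)
    return result, time.perf_counter() - start


# Hypothetical tool names with simulated latencies in seconds.
calls = [("read_file", 0.05), ("grep", 0.05), ("list_dir", 0.05)]
sequential_result, sequential_time = timed(run_sequential(calls))
parallel_result, parallel_time = timed(run_parallel(calls))
```

Both variants return the same results in the same order; only the elapsed time differs, because the parallel version overlaps the waiting.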