The End of Text-Based AI? Agent Collaboration Can Directly "Copy Thoughts," Sending Token Efficiency Soaring
机器之心·2025-12-05 04:08

Core Insights
- The article discusses the emergence of multi-agent systems (MAS) in the Agentic AI era, emphasizing the shift from individual models to collaborative problem-solving among AI agents [2][5]
- It introduces a new framework, LatentMAS, in which agents collaborate in latent space rather than by exchanging text, improving both efficiency and performance [5][14]

Group 1: LatentMAS Framework
- LatentMAS lets agents exchange internal hidden-layer representations and KV-cache working memory, resulting in higher performance and reduced token usage [5][10]
- The framework is designed to support richer latent reasoning and lossless communication between agents, with significantly lower computational complexity than text-based MAS [15][16]

Group 2: Experimental Results
- Comprehensive experiments on nine benchmark tasks show that LatentMAS outperforms both single models and text-based MAS, with accuracy improvements of up to 14.6% and token usage reductions of 70.8% to 83.7% [6][20][22]
- LatentMAS runs end-to-end inference 4× to 4.3× faster than traditional methods [21][25]

Group 3: Efficiency and Performance
- The framework supports complex reasoning while significantly reducing the number of tokens used, achieving higher accuracy with fewer output tokens [28][29]
- LatentMAS is still 2.6× to 7× faster than text-based MAS even when the latter is accelerated with vLLM serving [25][28]

Group 4: Semantic Richness
- The latent representations generated by LatentMAS are shown to be semantically rich and diverse, surpassing the expressiveness of the discrete tokens used in text-based systems [30][31]
- The study indicates that the latent reasoning captured in LatentMAS is not only effective but also carries more nuanced internal representations than traditional methods [31][32]
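
The contrast drawn in Group 1 and Group 4 between passing discrete tokens and passing hidden representations can be illustrated with a toy sketch. This is not the LatentMAS implementation: the embedding table, the dimensions, and the helper names (`nearest_token`, `text_handoff`, `latent_handoff`) are all hypothetical, chosen only to show why collapsing a continuous hidden state into one token loses information while handing over the vector itself does not.

```python
import random


def nearest_token(hidden, embeddings):
    """Return the index of the embedding row closest to `hidden` (squared Euclidean)."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(embeddings)), key=lambda i: sqdist(hidden, embeddings[i]))


def text_handoff(hidden, embeddings):
    """Text-style handoff (toy): collapse the hidden state to one token, then re-embed it."""
    return list(embeddings[nearest_token(hidden, embeddings)])


def latent_handoff(hidden):
    """Latent-style handoff (toy): pass the hidden vector to the next agent unchanged."""
    return list(hidden)


def reconstruction_error(original, received):
    """Euclidean distance between what was sent and what the next agent receives."""
    return sum((x - y) ** 2 for x, y in zip(original, received)) ** 0.5


# Hypothetical toy sizes; a real model has thousands of dimensions and tokens.
rng = random.Random(0)
dim, vocab = 8, 16
embeddings = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(vocab)]
hidden = [rng.gauss(0, 1) for _ in range(dim)]

text_err = reconstruction_error(hidden, text_handoff(hidden, embeddings))
latent_err = reconstruction_error(hidden, latent_handoff(hidden))
print(f"text handoff error:   {text_err:.3f}")
print(f"latent handoff error: {latent_err:.3f}")
```

In this toy setting the latent handoff reproduces the vector exactly, while the text handoff can only return the nearest entry of a finite vocabulary. The real framework additionally shares KV-cache working memory between agents, which this sketch does not model.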