2025 Perplexity Comet E-commerce Shopping Task Test Report
Sou Hu Cai Jing·2025-08-15 04:06

Core Insights
- The report evaluates how several AI tools perform on e-commerce shopping tasks, focusing on Perplexity Comet, OpenAI Agent, Manus, and Genspark [1][2].

Summary by Sections

Testing Overview
- The report runs to 51 pages and was completed on August 12, 2025, by a team led by Lang Hanwei and Maomao Head [1][6].
- Five tasks were tested: purchasing and repurchasing a product on Amazon, finding the bicycle with the fastest shipping, purchasing party supplies, selecting a windbreaker within a budget, and buying a refrigerator under specified conditions [1][2].

Performance Results
- Perplexity Comet had the shortest average completion time at 318 seconds, while OpenAI Agent had the longest at 1193 seconds, about 3.75 times as long [1][2].
- On accuracy, Perplexity Comet and Genspark each achieved a correct/incorrect ratio of 5/0, outperforming OpenAI Agent and Manus, which each scored 4/1 [1][2].

Task-Specific Outcomes
- In the Amazon repurchase task, Perplexity Comet and Genspark succeeded, while OpenAI Agent and Manus failed [2].
- In the fastest-shipping bicycle task, only OpenAI Agent partially succeeded, with Perplexity Comet completing it in just 20 seconds [2].
- All four tools completed the budget windbreaker task, while Genspark was the only one to succeed in the refrigerator purchase task [2].

Capability Assessment
- All four tools met the standard for capability levels 1 to 7 (from intent parsing to real-time interaction) [2].
- At levels 8 to 10 (from shopping cart operations to payment completion), Manus showed weaknesses, while Perplexity Comet was judged likely capable of completing payment operations [2][9].

User Experience Feedback
- Team members rated Perplexity Comet the most capable, followed by Genspark and OpenAI Agent, with Manus the weakest [2][10].
- Perplexity Comet excelled in efficiency and full-process operation, while Genspark was noted for its information integration and attention to execution details [2][10].

Additional Insights
- The report also includes traffic analysis and update timelines for the AI tools, giving a fuller view of their capabilities and characteristics in the e-commerce sector [3].