AI Operating System
FlowithOS, the World's First AI Agent Operating System, Outscores Atlas in Benchmarks; Netizens: It "Killed" the Competition
机器之心 · 2025-10-29 09:25
Core Viewpoint
- Flowith has launched FlowithOS, the world's first operating system specifically designed for AI Agents, which is expected to revolutionize human interaction with networks, information, and services [2][4].

Product Overview
- FlowithOS exists as a standalone application that combines an agentic workspace and web browser, eliminating the boundaries previously defined by applications and web pages [3][4].
- It operates with a 97.7% success rate in executing tasks, surpassing any existing AI Agent capabilities [3].

Unique Features
- Unlike traditional AI tools, FlowithOS autonomously searches across multiple web pages based on user instructions, understanding visual and coded content to perform various actions [3][4].
- The system is designed to evolve and improve its memory and skills with each user interaction, providing increasingly personalized services [14].

Performance Metrics
- FlowithOS achieved an average accuracy rate of 95.4% in benchmark tests, outperforming top competitors, including ChatGPT Atlas, which scored 75.7% [12].

User Experience
- Users have described FlowithOS as a system that can think for itself, capable of automating tasks such as content creation and social media interactions [5][6][17].
- The system integrates visual reasoning and execution in real time, transforming user intentions into actions seamlessly [6].

Market Position
- FlowithOS is seen as a significant competitor to OpenAI's ChatGPT Atlas, with the potential to redefine the AI application landscape [6][14].
- The product is currently in public beta and is compatible with both macOS and Windows, unlike ChatGPT Atlas, which is limited to macOS [19].

Company Background
- Flowith was founded in 2023 by a team of ten young entrepreneurs, with Derek as the founder, who has nine years of entrepreneurial experience [20].
- The company has previously launched several successful AI products, including Flowith and Flowith Neo, which have received positive feedback from users [21][22].
How Far Off Is a Mobile AGI Assistant? Benchmark and Scheduling System for Composite Long-Range Mobile Agent Tasks Released
机器之心 · 2025-07-26 09:32
Core Insights
- The article discusses the transition from atomic task automation to complex long-range task management in mobile agents, highlighting the challenges faced by current systems in handling composite tasks that require multi-application interaction and information synthesis [4][6][10].

Group 1: Current State of Mobile Agents
- Multimodal large language models (MLLMs) have shown promising results in single-screen actions and short-chain tasks, indicating initial maturity in edge task automation [4].
- Existing mobile GUI agents exhibit significant capability gaps when faced with complex long-range tasks, struggling with generalization from atomic to composite tasks [6][10].

Group 2: Proposed Solutions
- Researchers introduced a dynamic evaluation benchmark called UI-Nexus, which covers complex long-range tasks across 50 applications, designed with 100 task templates averaging 14.05 optimal steps [7][21].
- The multi-agent task scheduling system, AGENT-NEXUS, was proposed to facilitate instruction distribution, information transfer, and process management without modifying the underlying agent models [7][19].

Group 3: Task Complexity and Types
- The article categorizes composite tasks into three types based on subtask dependencies: Independent Combination, Context Transition, and Deep Dive, each presenting unique challenges for mobile agents [11][13][21].
- A detailed analysis of error cases revealed that mobile agents often fail due to poor progress management and information handling, leading to issues like context overflow and information transfer failures [16][32].

Group 4: Experimental Findings
- Testing across various mobile agents showed that task completion rates were below 50%, with AGENT-NEXUS improving completion rates by 24% to 40% while only increasing inference costs by about 8% [27][30].
- The performance of agents improved significantly when given manually split atomic instructions, particularly for UI-TARS, which increased its completion rate from 11% to 60% [29].

Group 5: Future Outlook
- The article envisions a new generation of AI operating systems capable of efficiently coordinating and managing complex task demands, transforming mobile devices into intelligent personal assistants [34][36].
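
The scheduling idea summarized under Group 2 (instruction distribution, information transfer, and process management layered on top of an unmodified agent) can be illustrated with a minimal Python sketch. This is not the AGENT-NEXUS implementation: the names `Subtask`, `run_composite_task`, and `fake_executor`, and the use of `{placeholder}` templates to carry information between subtasks, are all assumptions made here for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Subtask:
    """One atomic instruction plus the names of earlier results it needs (hypothetical)."""
    name: str
    instruction: str  # may contain {placeholders} filled from earlier results
    needs: List[str] = field(default_factory=list)


# Assumed executor interface: takes one atomic instruction, returns a short result.
Executor = Callable[[str], str]


def run_composite_task(subtasks: List[Subtask], executor: Executor) -> Dict[str, str]:
    """Dispatch subtasks in order, handing each only the results it declares it needs."""
    results: Dict[str, str] = {}
    for task in subtasks:
        # Process management: refuse to run a subtask whose inputs are not ready.
        missing = [n for n in task.needs if n not in results]
        if missing:
            raise ValueError(f"{task.name} depends on unfinished subtasks: {missing}")
        # Information transfer: fill the instruction from stored results only.
        context = {n: results[n] for n in task.needs}
        instruction = task.instruction.format(**context)
        # Instruction distribution: the executor sees a single atomic instruction.
        results[task.name] = executor(instruction)
    return results


def fake_executor(instruction: str) -> str:
    # Stand-in for a real GUI agent; it only echoes what it was asked to do.
    return f"done: {instruction}"


plan = [
    Subtask("lookup", "Find the meeting time in the calendar app"),
    Subtask("message", "Send a message saying: {lookup}", needs=["lookup"]),
]
out = run_composite_task(plan, fake_executor)
print(out["message"])
```

Because each call to the executor receives only one atomic instruction and the specific earlier results it needs, the executor's context stays small, which is one plausible way to avoid the context overflow and information transfer failures noted under Group 3.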