Browser Architecture Restructuring
The Pinnacle of the "Wrapper": OpenAI Reveals OWL, the Architecture Behind the Atlas Browser
36Kr · 2025-10-31 03:34
Core Insights
- OpenAI has launched the AI browser Atlas, which is built on a restructured architecture called OWL, differentiating it from a simple Chromium skin [1][5][19]

Group 1: Foundation of Atlas
- Atlas is fundamentally based on Chromium, but OpenAI aims to decouple it from the main application process to achieve faster startup speeds, smooth performance with multiple tabs, and a robust foundation for agent scenarios [2][5]
- Chromium is highlighted as a natural building block due to its advanced web engine, security model, performance, and compatibility, supported by a global developer community [2][5]

Group 2: Redefining Browser Experience
- OpenAI emphasizes a complete redesign of the user interface for Atlas, utilizing modern native frameworks like SwiftUI and AppKit, rather than merely re-skinning Chromium [4]
- The engineering team has optimized Chromium to ensure quick startup and the ability to handle hundreds of tabs without lag, while maintaining a straightforward integration approach to facilitate rapid development [4][5]

Group 3: OWL Architecture
- OWL (OpenAI's Web Layer) is the new architecture that allows Chromium to run independently from the Atlas main application process, enhancing application simplicity and performance [5][7]
- This architecture enables faster startup, crash isolation, reduced merge conflicts, and a quicker development pace, allowing new engineers to contribute immediately [7][8]

Group 4: Functionality of OWL
- Atlas operates as the OWL client, while the Chromium process serves as the OWL host, communicating through Mojo, which is Chromium's inter-process communication system [8]
- OWL provides a simplified Swift API for managing sessions, user profiles, rendering, and input events, facilitating efficient interaction between the client and Chromium [8][10]

Group 5: Rendering and Input Handling
- The rendering mechanism allows for dynamic content exchange between tabs, ensuring that the visual output is correctly displayed [11][13]
- Input events are captured and forwarded by OpenAI's Swift client, maintaining security by ensuring that agent-generated events bypass the privileged browser layer [14][18]

Group 6: Agent Mode
- The Agent mode in Atlas presents unique challenges, requiring a complete screen image for input, and it manages UI elements like dropdowns by integrating them into the main page context [18]
- Each Agent session operates in a temporary context, ensuring user privacy by creating independent memory storage that clears data after each session [18]

Conclusion
- OpenAI reiterates the importance of the global Chromium community in enabling the development of OWL, which aims to decouple the engine from the application, combining top web platforms with modern native frameworks for a faster and more flexible architecture [19]
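
The input-handling point in Group 5 (agent-generated events bypass the privileged browser layer) can be illustrated with a small routing model. This is a minimal sketch under stated assumptions, not OpenAI's implementation: `Origin`, `InputEvent`, `Router`, and both log lists are hypothetical names, and the premise that user events do pass through the privileged layer before reaching the renderer is an assumption made for contrast.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Origin(Enum):
    """Who produced the event (hypothetical classification)."""
    USER = "user"
    AGENT = "agent"


@dataclass(frozen=True)
class InputEvent:
    kind: str       # e.g. "click" or "keydown"
    origin: Origin


class Router:
    """Toy router: records which layer sees each event."""

    def __init__(self) -> None:
        self.browser_layer_log: List[InputEvent] = []  # privileged layer
        self.renderer_log: List[InputEvent] = []       # page renderer

    def dispatch(self, event: InputEvent) -> None:
        # Assumption: only user events are shown to the privileged layer.
        if event.origin is Origin.USER:
            self.browser_layer_log.append(event)
        # Every event, user or agent, ends up at the renderer.
        self.renderer_log.append(event)


router = Router()
router.dispatch(InputEvent("click", Origin.USER))
router.dispatch(InputEvent("click", Origin.AGENT))
```

After the two dispatches, the renderer has seen both clicks while the privileged layer has seen only the user's, which is the property the summary describes.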
The Pinnacle of the "Wrapper": OpenAI Reveals OWL, the Architecture Behind the Atlas Browser
机器之心 (Jiqizhixin) · 2025-10-31 03:01
Core Viewpoint
- OpenAI's new AI browser, Atlas, is built on a restructured architecture called OWL, which separates the Chromium runtime from the main application process, aiming to enhance browser performance and user experience [1][3][11].

Group 1: Foundation and Architecture
- OpenAI emphasizes that Chromium serves as a foundational building block, providing advanced web engine capabilities, security models, performance, and compatibility, supported by a global developer community [5].
- The OWL architecture allows Chromium's browser process to run independently from the Atlas main application process, enhancing modularity and performance [12][14].
- OpenAI's approach involves a complete redesign of the Chromium integration, focusing on rapid development and maintaining engineering culture [10][11].

Group 2: User Experience Enhancements
- Atlas aims to redefine the browser experience with features like instant startup speed, smooth performance even with multiple tabs, and a strong foundation for agent scenarios [7].
- The user interface of Atlas is almost entirely rebuilt from scratch, incorporating modern native frameworks rather than merely re-skinning the open-source Chromium interface [9][10].
- The architecture allows for faster loading times, crash isolation, and reduced merge conflicts, facilitating a quicker development cycle [18].

Group 3: Technical Implementation
- Atlas operates as an OWL client, while the Chromium browser process acts as the OWL host, communicating through Mojo, an inter-process communication system [17].
- The OWL client library provides a simplified Swift API for key functionalities, ensuring a clean codebase and modern application design [18].
- Input events are captured and forwarded efficiently, maintaining a seamless interaction between the Atlas interface and the Chromium rendering engine [30][32].
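
The client/host split and the crash isolation it is credited with can be modelled in a few lines. The sketch below is illustrative only: it runs in a single Python process and does not use Mojo or any real OWL API. `HostProcess`, `Client`, the message dictionaries, and the `"crash"` operation are all hypothetical stand-ins chosen to show one idea, namely that a host failure reaches the client as a failed call rather than taking the client down.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class HostProcess:
    """Stand-in for the engine side (the 'host'); not a real process."""
    alive: bool = True
    tabs: Dict[int, str] = field(default_factory=dict)

    def handle(self, message: dict) -> dict:
        if not self.alive:
            raise ConnectionError("host is gone")
        if message["op"] == "navigate":
            self.tabs[message["tab"]] = message["url"]
            return {"ok": True, "tab": message["tab"]}
        if message["op"] == "crash":  # hypothetical op to simulate a crash
            self.alive = False
            raise ConnectionError("host crashed")
        return {"ok": False, "error": "unknown op"}


class Client:
    """Stand-in for the application side (the 'client')."""

    def __init__(self) -> None:
        self.host = HostProcess()
        self.restarts = 0

    def send(self, message: dict) -> dict:
        try:
            return self.host.handle(message)
        except ConnectionError:
            # The client survives the host failure and starts a fresh host.
            self.host = HostProcess()
            self.restarts += 1
            return {"ok": False, "error": "host restarted"}

    def navigate(self, tab: int, url: str) -> dict:
        # Small typed wrapper, in the spirit of a simplified client API.
        return self.send({"op": "navigate", "tab": tab, "url": url})


client = Client()
first = client.navigate(1, "https://example.com")
crashed = client.send({"op": "crash"})
after = client.navigate(1, "https://example.com")
```

In a real two-process design the failure would arrive as a broken IPC connection instead of a Python exception, but the recovery path has the same shape: detect the loss, relaunch the host, keep the application running.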
Group 4: Agent Mode and Security
- The Agent mode in Atlas presents unique challenges, requiring complete screen images for input while ensuring security through sandboxing and session isolation [36][37].
- Each Agent session operates independently, clearing all cookies and data upon completion, allowing multiple concurrent sessions without interference [37].

Conclusion
- OpenAI reiterates the critical role of the global Chromium community in enabling these advancements, with OWL paving the way for a decoupled engine and application architecture that combines top-tier web platforms with modern native frameworks [38].
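
The session-isolation behaviour described in Group 4 (independent sessions whose cookies and data are cleared on completion) might look like the following. This is a hedged sketch, not Atlas code: `AgentSession`, `agent_session`, and the cookie dictionary are hypothetical, and a plain in-memory dict stands in for whatever storage the browser actually uses.

```python
from contextlib import contextmanager
from typing import Dict, Iterator


class AgentSession:
    """Toy session with its own in-memory cookie store."""

    def __init__(self, session_id: str) -> None:
        self.session_id = session_id
        self.cookies: Dict[str, str] = {}  # never shared between sessions
        self.closed = False

    def set_cookie(self, name: str, value: str) -> None:
        if self.closed:
            raise RuntimeError("session already closed")
        self.cookies[name] = value

    def close(self) -> None:
        # Drop everything the session accumulated.
        self.cookies.clear()
        self.closed = True


@contextmanager
def agent_session(session_id: str) -> Iterator[AgentSession]:
    session = AgentSession(session_id)
    try:
        yield session
    finally:
        session.close()  # runs even if the agent task raises


# Two sessions open at once, each writing the same cookie name.
with agent_session("a") as a, agent_session("b") as b:
    a.set_cookie("token", "alpha")
    b.set_cookie("token", "beta")
    seen = (a.cookies["token"], b.cookies["token"])
```

Because each session owns its store, the two `token` values never collide, and the `finally` clause guarantees cleanup when a session ends for any reason.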