Graphical User Interface (GUI)
Goodbye, GUI: A Chinese Academy of Sciences Team Introduces an "LLM-Friendly" Computer-Use Interface
36Kr · 2025-10-27 07:31
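Having an LLM agent operate your computer for you sounds great in theory, but the reality falls short. Nearly all existing LLM agents run into two core pain points. Low success rate: on tasks of even moderate complexity the agent fails, often getting stuck at some step with no idea how to proceed. Poor efficiency: to finish a simple task, the agent needs dozens of rounds of back-and-forth with the system, which takes a long time and is frustrating to watch. So where does the problem lie? Are today's large models simply not smart enough? New research from a team at the Institute of Software, Chinese Academy of Sciences gives an unexpected answer: the real bottleneck is the graphical user interface (GUI) that we have used for more than 40 years and know so well. Yes, the same GUI that became popular in the 1980s and transformed human-computer interaction. It has always been tailored to humans, and its design philosophy runs directly counter to the capability model of LLMs. For example, GUI controls are hidden behind layers of menus, tabs, and dialogs, and reaching a control means navigating by clicking through menus, drop-down lists, and so on until the control appears on screen. In addition, using many controls (such as scroll bars and text selection) requires repeated adjustment while watching the feedback, which creates a high-frequency "observe-act" loop. The research team points out that this imperative design of the GUI rests on four "key assumptions" about human users: Converting the "imperative" GUI into a "declarative" one. The research team identified the GUI's core ...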
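The contrast between the two interface styles can be made concrete with a toy model. The sketch below is purely illustrative and is not the research team's interface: `ToyApp`, `CONTROL_PATHS`, the control name `font_size`, and the idea of counting one "round" per interaction are all assumptions invented for this example. It shows an imperative agent that must click through each menu level before it can set a value, next to a declarative call that states only the desired end state.

```python
from dataclasses import dataclass, field

# Hypothetical toy UI: each control sits at the end of a menu path.
CONTROL_PATHS = {
    "font_size": ["Format", "Font", "Size"],
}


@dataclass
class ToyApp:
    state: dict = field(default_factory=dict)
    visible_path: list = field(default_factory=list)
    rounds: int = 0  # number of agent <-> system interactions

    def click(self, label: str) -> None:
        """Imperative step: one click costs one observe-act round."""
        self.rounds += 1
        self.visible_path.append(label)

    def type_value(self, control: str, value) -> None:
        """Imperative step: only valid once the control is on screen."""
        self.rounds += 1
        if self.visible_path != CONTROL_PATHS[control]:
            raise RuntimeError(f"{control} is not visible yet")
        self.state[control] = value
        self.visible_path = []  # the dialog closes after the edit

    def set_declaratively(self, control: str, value) -> None:
        """Declarative call: name the end state; navigation is internal."""
        self.rounds += 1
        self.state[control] = value


def imperative_agent(app: ToyApp, control: str, value) -> None:
    # Navigate level by level until the control appears, then edit it.
    for label in CONTROL_PATHS[control]:
        app.click(label)
    app.type_value(control, value)


def declarative_agent(app: ToyApp, control: str, value) -> None:
    # A single request that says what the result should be.
    app.set_declaratively(control, value)


imperative_app = ToyApp()
imperative_agent(imperative_app, "font_size", 14)

declarative_app = ToyApp()
declarative_agent(declarative_app, "font_size", 14)

print(imperative_app.rounds, declarative_app.rounds)
```

In this toy setup both agents reach the same final state, but the imperative one spends a round on every menu level, which mirrors the efficiency problem described above.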
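Karpathy's Prediction Comes True! A Team of Chinese Researchers Open-Sources a Fully AI Operating System: A Neural Network Simulates Windows and Predicts the Next Screen Frame
QbitAI (量子位) · 2025-07-15 06:28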
Core Viewpoint - The article discusses the development of NeuralOS, a neural network-driven operating system that can simulate a graphical user interface (GUI) similar to Windows, predicting the next frame of screen images based on user interactions [1][2][4].
Group 1: NeuralOS Development
- NeuralOS was inspired by a prediction from expert Karpathy about the future of AI-driven GUIs, which will be fluid, magical, and interactive [4][5].
- The research team from the University of Waterloo and the National Research Council of Canada created a demo version of NeuralOS [5][6].
Group 2: Technical Mechanism
- NeuralOS utilizes two core components: Recurrent Neural Networks (RNN) for tracking computer state changes and a Renderer for generating corresponding screen images [7][8].
- The training process involved using extensive video recordings of user interactions with the Ubuntu XFCE system, including both random and realistic user behaviors [10][11].
Group 3: Performance Evaluation
- The model demonstrated high accuracy in predicting screen states, with most predictions aligning closely with actual states, although it struggled with rapid keyboard inputs [14][15].
- The interface changes generated by NeuralOS during continuous operations appeared nearly indistinguishable from a real system, showcasing its potential for realistic simulations [15].
Group 4: Research Team
- The research team consists of five members, with four being of Chinese descent, highlighting a diverse background in AI and machine learning [17][19][21][23][27][29].
Group 5: Future Implications
- The development of NeuralOS suggests a shift towards dynamic, AI-generated operating systems, moving away from traditional static interfaces [37].
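The two-component mechanism described under Technical Mechanism (a recurrent network that tracks state, followed by a renderer that produces the next frame) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not NeuralOS's actual architecture: the weights are random and untrained, the toy sizes are arbitrary, the `(x, y, click)` event encoding is invented for the example, and the renderer here is a plain linear map with a sigmoid standing in for whatever renderer the authors use.

```python
import math
import random

# Toy sizes, chosen arbitrarily for illustration.
HIDDEN, INPUT, ROWS, COLS = 8, 3, 4, 4

rng = random.Random(0)  # fixed seed so the sketch is reproducible


def rand_matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Untrained, random weights: assumptions standing in for learned parameters.
W_h = rand_matrix(HIDDEN, HIDDEN)      # hidden-to-hidden
W_x = rand_matrix(HIDDEN, INPUT)       # input-to-hidden
W_out = rand_matrix(ROWS * COLS, HIDDEN)  # hidden-to-pixels


def matvec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def rnn_step(hidden, event):
    """State tracker: h' = tanh(W_h h + W_x x)."""
    pre = [a + b for a, b in zip(matvec(W_h, hidden), matvec(W_x, event))]
    return [math.tanh(p) for p in pre]


def render(hidden):
    """Stand-in renderer: map the state to a ROWS x COLS frame in (0, 1)."""
    flat = [1.0 / (1.0 + math.exp(-z)) for z in matvec(W_out, hidden)]
    return [flat[r * COLS:(r + 1) * COLS] for r in range(ROWS)]


def simulate(events):
    """Feed user events one at a time and emit a predicted frame after each."""
    hidden = [0.0] * HIDDEN
    frames = []
    for event in events:
        hidden = rnn_step(hidden, event)
        frames.append(render(hidden))
    return frames


# Hypothetical input encoding: (cursor x, cursor y, click flag).
events = [(0.1, 0.2, 0.0), (0.1, 0.2, 1.0), (0.5, 0.5, 0.0)]
frames = simulate(events)
print(len(frames), len(frames[0]), len(frames[0][0]))
```

Because the hidden state is carried from one step to the next, each predicted frame depends on the whole interaction history rather than on the latest event alone, which is the role the summary assigns to the recurrent component.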