Windows Copilot
Why Are US Tech Giants Collectively Steering Clear of the "Lobster"?
阿尔法工场研究院 · 2026-03-26 00:04
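Source: 世界模型工场 (author: 世界模型工场).

Lead: In the US market, "safe computer control" tools such as Claude Code have become standard equipment at mainstream US enterprises.

Late on the night of March 23, Anthropic dropped a bombshell: a key update to its products. This time, AI has truly learned to operate a computer. The updated Claude Code can directly open local files, use the browser, and operate the IDE. More striking still, when the system has no ready-made interface, it takes over the mouse, keyboard, and screen directly, as a person would, with no complex setup required.

Claude Cowork extends this capability from programmers to ordinary knowledge workers. Grant it access to a folder in the Mac desktop app and it can autonomously read and write files, organize materials, and execute multi-step workflows.

The industry erupted at once. The update is seen as the most aggressive capability leap in the Claude lineup to date: AI has evolved from being able to chat to being able to act. Overnight, developer communities worldwide were flooded with posts declaring that "AI has finally grown hands."

On the contrary, they have long regarded this underlying paradigm as the core competitive advantage of next-generation AI, and are firmly building it into their own models and product loops ...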
Getting RL-Level Results from SFT? Microsoft Co-Proposes an Efficient Post-Training Algorithm
机器之心· 2026-03-25 07:44
Core Insights
- The article discusses the importance of Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) in the post-training phase of large models, highlighting their respective strengths and weaknesses [2]
- A new approach, "Towards On-Policy SFT," is proposed to combine the advantages of SFT and RL by generating On-policy data and training efficiently [3]

Group 1: On-Policy Data and Its Measurement
- On-policy data is defined as data generated by the model using its current capabilities, contrasting with Off-policy data, which is derived from external sources [4]
- Traditional metrics like Perplexity (PPL) and Log-Likelihood are insufficient for measuring the distribution shift between On-policy and Off-policy data due to noise from problem difficulty [6]
- The article introduces a new quantification metric, Centered Log-Likelihood (CLL), which separates the noise and provides a clearer distinction between data sources [7]

Group 2: Challenges of Supervised Fine-Tuning
- SFT operates under the assumption that every word in the training set is an absolute truth, leading to severe penalties for prediction errors, which can cause catastrophic forgetting [12][13]
- The article proposes In-Distribution Fine-Tuning (IDFT) as a solution to mitigate the issues caused by rigid fitting and noise in training data [14][17]

Group 3: Hinted Decoding and Data Transformation
- Hinted Decoding is introduced as a method to convert datasets into On-policy versions by allowing the model to rewrite examples while maintaining its style [20]
- The approach involves switching between Self-distillation and normal training based on the entropy of the Teacher model, which improves the model's distribution metrics [22]

Group 4: Experimental Results
- The new methods proposed in the article outperform well-known Offline RL algorithms while using significantly fewer resources [25]
- The results indicate that the adaptive switching mechanism based on entropy is crucial for achieving better performance [25]

Group 5: Broader Implications
- The work has potential applications across various fields, including CoT completion and On-policy Distillation, indicating its relevance beyond the immediate context [28]
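The entropy-based switching mentioned under Hinted Decoding can be illustrated with a minimal sketch. The summary does not state the direction of the switch, the entropy threshold, or how targets are built, so everything here is an assumption for illustration: the function names `entropy` and `choose_target`, the threshold of 1.0 nats, and the rule that a confident (low-entropy) Teacher triggers self-distillation while an uncertain one falls back to the dataset token. It is not the paper's actual algorithm.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def choose_target(teacher_probs, gold_index, threshold=1.0):
    """Pick a per-token training target (hypothetical rule, see lead-in).

    Assumption: if the Teacher's next-token distribution has entropy below
    `threshold`, train on the Teacher's distribution (self-distillation);
    otherwise train on the one-hot dataset token (normal training).
    """
    if entropy(teacher_probs) < threshold:
        return "self_distillation", list(teacher_probs)
    one_hot = [0.0] * len(teacher_probs)
    one_hot[gold_index] = 1.0
    return "normal_training", one_hot

# Toy 4-token vocabulary: one confident and one maximally uncertain Teacher.
confident = [0.97, 0.01, 0.01, 0.01]
uncertain = [0.25, 0.25, 0.25, 0.25]

mode_a, target_a = choose_target(confident, gold_index=2)
mode_b, target_b = choose_target(uncertain, gold_index=2)
print(mode_a, mode_b)
```

The point of the sketch is only that a single scalar (Teacher entropy) can gate which supervision signal each token receives; the real threshold and direction would have to come from the paper itself.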
AI Upstart Just Landed $2.3 Billion in Funding! Microsoft Comes Knocking, and Without Real Substance It Won't Hold Out
Sou Hu Cai Jing· 2025-11-20 11:24
Core Insights
- The recent significant news in the AI sector is Cursor's successful fundraising of $2.3 billion, raising its valuation to $29.3 billion, a nearly 12-fold increase in just over a year [3]
- The competitive landscape is intensifying as major players like Microsoft launch advanced tools such as Copilot, which threaten the market positions of emerging AI companies like Cursor and Gamma [2][8]

Company Highlights
- Cursor, founded in 2023, has quickly become a "programming savior" for millions of developers, addressing common pain points like writing boilerplate code and debugging [5][7]
- Gamma, with a small team of 52, has developed a tool that simplifies the creation of presentations, achieving an annual revenue exceeding $100 million and nearly 30 million monthly visits [5][7]

Competitive Threats
- Microsoft's Copilot can generate documents and presentations with voice commands, posing a significant threat to Gamma, which relies on users installing additional software [8][10]
- The entry of domestic giants into the programming tool market, such as Volcano Engine's low-cost programming model, intensifies the competition for Cursor, which may struggle to maintain its market share amid price wars [12][14]

Market Dynamics
- The success of Cursor and Gamma is attributed to their ability to address existing inefficiencies in traditional tools, but they lack substantial barriers to entry against larger competitors [7][14]
- The current landscape suggests that companies relying solely on AI for efficiency improvements without building a robust user ecosystem or unique technology may face significant challenges [14][15]
Windows AI Assistant Gets a Free Upgrade: It Can Operate the Computer, Log In to Web Pages, and Generate Code
36Kr· 2025-10-31 02:54
Core Insights
- Microsoft has officially updated Windows Copilot, making the AI assistant available for free to all users, enhancing the capabilities of Microsoft 365 Copilot with a new feature called "Researcher" that includes "Computer Use" capabilities [1]

Group 1: New Features and Capabilities
- The "Computer Use" capability allows for smarter research, deeper insights, and more comprehensive reports by securely accessing enterprise internal data that requires login authentication [1]
- The Researcher agent can generate PowerPoint presentations, spreadsheets, or applications using code [1]
- Users can enhance work reports by utilizing private meeting notes, documents, and chat records [1]

Group 2: Technical Implementation
- The Researcher capability is supported by a series of new tools that can be orchestrated by the Researcher agent, connecting to a sandbox environment that provides screenshots of each operation [2]
- When an operation is required, a virtual machine running on Windows 365 is initiated, isolated from the internal network and user devices, ensuring security [4]
- The virtual machine operates in a temporary sandbox environment, with all necessary components pre-installed, and user credentials are not stored or transmitted outside the sandbox [4]

Group 3: Performance Testing
- The Researcher with Computer Use was evaluated using GAIA and BrowseComp benchmark tests, showing a 44% performance improvement in complex multi-step browsing tasks compared to the current version [6]
- In the GAIA test, the performance improved by 6%, demonstrating the model's ability to find, verify, and reason with real-world data [6]
- Specific examples of tasks completed by the Researcher include piecing together information from various web pages to answer complex questions [6]

Group 4: Competitive Context
- Microsoft has not disclosed the original scores of the tests, making it difficult to assess the absolute performance improvements [7]
- The performance metrics can be indirectly compared to OpenAI's DeepResearch results, with recent data from Qwen providing a reference point [7]
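The point about undisclosed original scores can be made concrete with a short arithmetic sketch. It assumes the reported 44% is a relative gain over the previous score (the summary does not say whether it is relative or in percentage points), and the baseline values below are purely hypothetical, chosen only to show how much the absolute result depends on the unknown starting point.

```python
def score_after_relative_gain(baseline, relative_gain):
    """Return the new score implied by a relative improvement over a baseline."""
    return baseline * (1.0 + relative_gain)

REPORTED_GAIN = 0.44  # 44% improvement, read here as a relative gain (assumption)

# Hypothetical baselines: the real pre-update scores were not disclosed.
for baseline in (0.10, 0.30, 0.50):
    new_score = score_after_relative_gain(baseline, REPORTED_GAIN)
    gain_points = (new_score - baseline) * 100
    print(f"baseline {baseline:.0%} -> new {new_score:.1%} (+{gain_points:.1f} points)")
```

Under these assumed baselines the same 44% figure corresponds to anywhere from a few points to more than twenty points of absolute accuracy, which is why the relative number alone does not settle how strong the updated agent is.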
Windows AI Assistant Gets a Free Upgrade! It Can Operate the Computer, Log In to Web Pages, and Generate Code
量子位· 2025-10-31 00:58
Core Viewpoint
- Microsoft has officially updated Windows Copilot, making the AI assistant available for free to enhance computer interface usage through Microsoft 365 Copilot's Researcher agent [1]

Group 1: Features and Capabilities
- The Researcher agent now includes a "Computer Use" capability, allowing for smarter research, deeper insights, and more comprehensive reports [1][2]
- The AI assistant has evolved from merely "speaking" to "doing," utilizing a series of new tools orchestrated by the Researcher [3]
- The orchestration layer connects to a sandbox environment, providing screenshots of each operation step [4]

Group 2: Security and Data Access
- Secure access requires authentication for enterprise internal data, enabling the generation of presentations, spreadsheets, or applications [5]
- When the model determines an action is needed, it initiates a virtual machine running on Windows 365, isolated from the internal network and user devices [7]
- The virtual machine operates in a temporary sandbox environment, with a default browser and all necessary components for executing model predictions [8]

Group 3: Operation and User Interaction
- Instructions from the intelligent agent are sent through a secure channel, ensuring no user credentials are permanently stored or transmitted outside the sandbox [9]
- All intermediate reasoning steps include screenshots and terminal outputs, allowing real-time monitoring of the agent's operations [10]
- When user confirmation or password entry is required, a secure screen-sharing connection can be used to control the sandbox [11]

Group 4: Performance Testing
- The Researcher with Computer Use was evaluated using GAIA and BrowseComp benchmark tests, showing a 44% performance improvement in complex multi-step browsing tasks compared to the current version [12]
- In the GAIA test, the model's performance improved by 6%, successfully answering questions by accessing and processing real-world data [12]