General-purpose Agent
Is Manus the future of the general-purpose Agent, or an unrepeatable one-off?
Tai Mei Ti APP· 2025-12-31 02:54
Core Insights
- Meta announced the acquisition of Chinese AI company Butterfly Effect for several billion dollars, marking its third-largest acquisition in history, following WhatsApp and Scale AI [1]
- The product Manus, launched less than a year ago, achieved annual recurring revenue (ARR) of over $100 million within 270 days, showcasing rapid growth and market validation [2]
- The acquisition raises questions about the future of the Agent sector and whether Manus represents a model for AI commercialization or merely a fortunate exception [1][2]

Group 1: Manus's Growth and Business Model
- Manus's rapid growth is characterized by a strategic exit rather than a miraculous success, balancing product capability, revenue structure, and market timing [1]
- The product operates on a "large model + cloud virtual machine" architecture, enabling it to autonomously understand tasks and deliver complex outputs, distinguishing it from traditional chatbots [2] (a minimal sketch of such an agent loop follows this summary)
- Despite its success, Manus faces high operational costs due to the reliance on substantial computational resources, raising concerns about its long-term sustainability [3]

Group 2: Meta's Strategic Acquisition
- Meta's acquisition of Manus is a strategic move to fill a gap in its AI capabilities, as it seeks a commercially viable and well-engineered Agent model [4]
- Competitors like OpenAI, Google, and Microsoft have successfully established commercial applications, while Meta has struggled to convert its AI model capabilities into revenue [5]
- Manus serves as a ready-made solution for Meta, providing a subscription model and a potential platform for future AI applications [5]

Group 3: Industry Implications and Future Outlook
- The acquisition of Manus has shifted market perceptions regarding the value of AI application companies, challenging the notion that Agent products lack intrinsic value [7]
- The success of Manus may not lead to a widespread boom in the Agent sector, as major companies may prefer to develop their own solutions rather than acquire existing ones [8]
- The high valuation of Manus is attributed to its global user base, engineering capabilities, and venture capital backing, suggesting that similar companies may become rare in the market [9]
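The "large model + cloud virtual machine" pattern described above is, at its core, a plan-act-observe loop: the model proposes an action, a sandboxed VM executes it, and the observation is fed back into the context until the task is done. The Python sketch below is only an illustration of that loop under assumed names; `call_llm` and `Sandbox` are stand-ins for whichever model API and VM or container backend an implementation actually uses, not Manus's real internals.

```python
import json
from dataclasses import dataclass, field

@dataclass
class Sandbox:
    """Stand-in for a cloud VM / container that runs shell commands for the agent."""
    history: list = field(default_factory=list)

    def run(self, command: str) -> str:
        # A real backend would execute `command` inside an isolated VM and return
        # stdout/stderr; this stub only records the command and returns a placeholder.
        self.history.append(command)
        return f"(stub output of: {command})"

def call_llm(messages: list[dict]) -> str:
    """Stand-in for a chat-completion API call.
    This placeholder finishes immediately so the sketch runs end to end;
    swap it for a real model call to get actual planning behaviour."""
    return json.dumps({"action": "finish",
                       "answer": "replace call_llm with a real model call"})

def run_agent(task: str, max_steps: int = 20) -> str:
    """Plan-act-observe loop: the model emits either a shell action or a final answer."""
    sandbox = Sandbox()
    messages = [
        {"role": "system", "content":
            'Respond with JSON: {"action": "shell", "command": ...} '
            'or {"action": "finish", "answer": ...}'},
        {"role": "user", "content": task},
    ]
    for _ in range(max_steps):
        reply = call_llm(messages)
        messages.append({"role": "assistant", "content": reply})
        step = json.loads(reply)
        if step["action"] == "finish":
            return step["answer"]
        observation = sandbox.run(step["command"])  # act inside the VM
        messages.append({"role": "user", "content": f"Observation: {observation}"})
    return "step budget exhausted"

if __name__ == "__main__":
    print(run_agent("Collect pricing data from three suppliers and summarize it"))
```

The design point the summary gestures at is that the expensive part is not the loop itself but keeping a full VM and long model context alive per task, which is where the high operating costs come from.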
What is a multi-billion-dollar acquisition betting on? N key takeaways from Manus going to Meta
36Kr· 2025-12-30 10:38
On December 30, Beijing time, Meta announced it had completed its acquisition of Manus (Butterfly Effect), the general-purpose autonomous AI agent company, for roughly several billion US dollars.

The word Chinese tech circles used most often about this deal was "shock." After speaking with several investors and creators in the Agent space, Tencent Technology found the surprise concentrated on a few points:

- Manus was recently reported to be seeking its next funding round at a $2 billion valuation; few expected it to be acquired by Meta this quickly;
- "several billion dollars" leaves a lot of room for imagination, and if the figure is above $4 billion, Meta is showing enormous sincerity (and anxiety);
- why a startup founded in China ended up being acquired by Meta.

But there was also consensus on several points:

- it sold at a very good price, the Manus founding team made several excellent key decisions along the way, and people are happy for Manus;
- general-purpose Agents sit on the "main channel" of large models, so being acquired by a major player was always the destined outcome;
- Manus's product strength, and its reputation and popularity in Silicon Valley, are the solid foundation of the valuation;
- this is the best ending, at the best possible time.

At the end of 2025, the MM (Manus-Meta) union may, like Manus's company name "Butterfly Effect," send ripples through the tech world.

01 How much is Manus worth?

According to official announcements from Meta and Manus and other public information, the deal is worth several billion dollars, making ...
Manus has been sold to Meta! Red-hot at the start of the year, acquired for several billion dollars at its end
QbitAI· 2025-12-30 00:02
Heng Yu, reporting from Aofeisi. QbitAI | WeChat official account QbitAI

WHAAAT, you wake up and Manus has been acquired by Meta!

Meta and the Manus website published the news simultaneously: "Manus will join Meta." According to the announcements, Meta's main goal in acquiring Manus is to strengthen its general-purpose Agent capabilities. The post on Manus's official site says Manus will continue to provide its product and subscription service to users through its app and website, and the company will continue to operate in Singapore.

This is without doubt another blockbuster acquisition for Meta after Scale AI. Citing Bloomberg: the investment has caught the attention of CEO Mark Zuckerberg and has become the company's top priority. Meta's AI chief Alexandr Wang has already posted a welcome message for the Manus team and co-founder Xiao Hong; it is not yet clear what the reporting line between the two will be once the teams start working together. After the acquisition closes, Manus founder Xiao Hong will take up a vice president role at Meta.

Meta's third-largest acquisition ever

According to LatePost, Meta is paying several billion dollars for Manus. The price does not sound all that "astronomical" (only yesterday we went through the inside story of Jensen Huang's $20 billion acquisition of Groq), but in Meta's acquisition history it ranks in the top three. Number one is WhatsApp. Keyword: the mobile internet ship ...
AI heavyweight Zhang Xiangyu: the Transformer can't carry the Agent era
Di Yi Cai Jing· 2025-12-18 10:52
The human brain is a master at compressing an "infinite stream"; a large model cannot learn human-style memory just by stacking layers, and it becomes unusable past about 80,000 tokens.

"But we quickly discovered a huge side effect," Zhang Xiangyu said. The real difficulty is that the model's intelligence drops rapidly as the text gets longer. "Today's Transformers, no matter how many tokens they claim to support at launch, basically become unusable at around 80,000."

The problem points to a flaw in the Transformer: its one-way information flow. However long the input sequence (the context) is, the information that determines the model's effective "depth of thought" can only be passed one way, from shallow layers to deep layers; there is no feedback or compression path from deep layers back to shallow ones. This is fundamentally different from the "infinite stream" memory mechanism of the human brain.

"Every sentence I have said today is a function of all the information I have ever seen," Zhang Xiangyu said by way of analogy. "Can that function be represented by a network with a fixed number of layers? Certainly not." (A short formal sketch of this contrast follows after this excerpt.) The human brain, he said, can dynamically compress and selectively revisit the vast experience accumulated from childhood onward, whereas the current Transformer architecture cannot deliver this kind of "infinite stream" intelligence, and that constrains AI's evolution toward highly autonomous, continuously learning general-purpose Agents.

In fact, researchers have already begun to debate whether the Transformer has fundamental limits. Just this October, Transformer co-creator Llion Jones ...
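Zhang's "function of everything I have ever seen" point can be made precise. The notation below is my own, not from the talk: in a Transformer with $L$ layers, the output at any position depends on the whole context, but only through $L$ stacked blocks, so the amount of sequential computation per prediction is bounded no matter how long the context grows; a recurrent summary, by contrast, can in principle keep folding new input into a bounded state indefinitely.

```latex
% Transformer: strictly bottom-up readout with fixed depth L
h_t^{(\ell)} = f^{(\ell)}\!\left(h_{1:t}^{(\ell-1)}\right), \quad \ell = 1,\dots,L,
\qquad y_t = g\!\left(h_t^{(L)}\right)
% -> y_t sees the whole context, but only through L layers of processing;
%    there is no path from deep layers back to shallow ones.

% "Infinite-stream" compression: a bounded state folded forward through time
s_t = \phi\!\left(s_{t-1}, x_t\right), \qquad y_t = g\!\left(s_t\right)
% -> effective sequential depth grows with t, while memory stays bounded.
```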
Kunlun Wanwei's Fang Han: the general-purpose Agent is a false proposition, AI Office still has room to exist | MEET2026
QbitAI· 2025-12-15 05:57
Compiled by the editorial team from MEET2026. QbitAI | WeChat official account QbitAI

From general-purpose large models to executable agents, AI is passing a technical inflection point along the path of process automation. With Agents iterating rapidly, Kunlun Wanwei chairman and CEO Fang Han offered a brand-new perspective grounded in industry practice: the Agent is not an embryo of artificial general intelligence but an automation system for verifiable processes. It is not good at creating new paradigms, but it is extremely good at replicating, at scale, processes that humans have already validated.

In his judgment, the move from ChatGPT to DeepSeek marks large models' key leap from "memorizing answers" to "memorizing processes": the former opened the entrance to general conversation and generation, while the latter, through more efficient and longer reasoning, pushed large models into a new stage centered on process generalization.

Precisely for this reason, the first place Agents land is not thousands of industries but AI Office, where processes are stable and results are verifiable. So in the long run, foundation models will gradually converge, and specialized Agents built around concrete processes will become the basic unit of how organizations run.

In his view, what the Agent ultimately reshapes is not a particular job but the entire organization: humans will shift from repetitive executors to architects of processes, and that will become the new paradigm for human self-expression in the Agent era.

To present Fang Han's thinking in full, QbitAI has edited and organized the talk without changing its original meaning, in the hope of offering new perspectives and insights ...
A Genshin Impact Agent, made by ByteDance
Yuan Da Xia· 2025-11-16 04:11
Core Viewpoint
- ByteDance has developed a new gaming agent named Lumine, capable of autonomously playing games like Genshin Impact, showcasing advanced skills in exploration, combat, and puzzle-solving [1][4][16]

Group 1: Agent Capabilities
- Lumine can perform complex tasks such as dynamic enemy tracking, precise long-range shooting, and smooth character switching, effectively handling various game scenarios [4][6][10]
- The agent demonstrates strong understanding in boss battles and can solve intricate puzzles, indicating high spatial awareness [6][8][10]
- Lumine is capable of executing GUI operations and can follow complex instructions with clear prior information, enhancing its usability in gaming [12][14]

Group 2: Technical Framework
- Lumine is built on the Qwen2-VL-7B-Base model, leveraging multimodal understanding and generation capabilities acquired from extensive training on web data [16]
- The agent employs a unified language space for modeling operations and reasoning, facilitating seamless integration of perception, reasoning, and action [16][19]
- Three core mechanisms are designed for Lumine: Observation Space for visual input processing, Hybrid Thinking for decision-making efficiency, and Keyboard and Mouse Modelling for operational commands [19][22][23] (a sketch of how text-encoded keyboard and mouse actions might be parsed follows this summary)

Group 3: Training Process
- The training process consists of three phases: pre-training for basic actions, instruction-following training for task comprehension, and decision reasoning training for long-term task execution [25][27][29]
- The Lumine-Base model emerges with core capabilities like object interaction and basic combat, while the Lumine-Instruct model achieves over 80% success in short tasks [26][28]
- The Lumine-Thinking model can autonomously complete long-term tasks without human intervention, showcasing its advanced planning and reasoning abilities [30]

Group 4: Performance Evaluation
- In comparative tests, Lumine-Base shows over 90% success in basic interactions but lacks goal-oriented behavior in untrained areas [39]
- Lumine-Instruct outperforms mainstream VLMs in task completion rates, achieving 92.5% in simple tasks and 76.8% in difficult tasks, demonstrating superior tactical planning [41]
- Lumine-Thinking completes main story tasks in Genshin Impact with a 100% completion rate in 56 minutes, significantly outperforming competitors like GPT-5 [44][45]

Group 5: Industry Implications
- The development of gaming agents like Lumine represents a significant step towards creating general-purpose AI capable of operating in complex 3D environments [50][55]
- Companies like Google are also exploring similar paths with their SIMA 2 agent, indicating a broader industry trend towards utilizing gaming scenarios for training AI [52][56]
- The belief in the eventual transition of gaming agents into real-world applications highlights the potential for embodied intelligence in various sectors [56]
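Modeling keyboard and mouse operations "in a unified language space" means the policy emits actions as plain text that a thin controller then turns into real input events. The exact action grammar is not given here, so the sketch below invents a simple one (`move(x, y)`, `click(button)`, `key(name, ms)`) purely to illustrate parsing model-emitted action strings into structured commands; it is not ByteDance's actual format.

```python
import re
from dataclasses import dataclass
from typing import Union

@dataclass
class MouseMove:
    x: int
    y: int

@dataclass
class MouseClick:
    button: str          # e.g. "left" or "right"

@dataclass
class KeyPress:
    key: str
    hold_ms: int         # how long the key is held down

Action = Union[MouseMove, MouseClick, KeyPress]

# Hypothetical action grammar, e.g.: "move(640, 360); click(left); key(w, 500)"
_PATTERN = re.compile(r"(move|click|key)\(([^)]*)\)")

def parse_actions(text: str) -> list[Action]:
    """Turn a model-emitted action string into structured input commands."""
    actions: list[Action] = []
    for name, args in _PATTERN.findall(text):
        parts = [a.strip() for a in args.split(",")]
        if name == "move":
            actions.append(MouseMove(int(parts[0]), int(parts[1])))
        elif name == "click":
            actions.append(MouseClick(parts[0]))
        elif name == "key":
            actions.append(KeyPress(parts[0], int(parts[1])))
    return actions

if __name__ == "__main__":
    # A controller would replay these objects through an OS-level input library.
    print(parse_actions("move(640, 360); click(left); key(w, 500)"))
```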
A Genshin Impact Agent, made by ByteDance
QbitAI· 2025-11-14 12:10
Core Viewpoint
- ByteDance has developed a new gaming agent named Lumine, capable of autonomously playing games like Genshin Impact, showcasing advanced skills in exploration, combat, and puzzle-solving [1][4][9]

Group 1: Agent Capabilities
- Lumine can perform complex tasks in Genshin Impact, including dynamic enemy tracking, precise long-range shooting, and smooth character switching [4][5]
- The agent demonstrates strong understanding in boss battles and can solve various puzzles, such as collecting items based on environmental cues [6][12]
- Lumine is capable of executing GUI operations and can follow complex instructions by understanding prior task information [7][8]

Group 2: Technical Framework
- Lumine is built on the Qwen2-VL-7B-Base model, leveraging multimodal understanding and generation capabilities from extensive web data training [9][10]
- The agent employs three core mechanisms: Observation Space for visual input processing, Hybrid Thinking for decision-making efficiency, and Keyboard and Mouse Modelling for action representation [12][14][15] (a sketch of such a hybrid fast/slow decision loop follows this summary)
- A three-phase training process was implemented, including pre-training for basic actions, instruction-following training, and decision reasoning training, leading to high task completion rates [17][20][23]

Group 3: Performance Metrics
- Lumine-Base shows a stepwise emergence of capabilities, achieving over 90% success in basic interactions but lacking goal-directed behavior [38]
- Lumine-Instruct outperforms mainstream VLMs in short-cycle tasks, achieving a success rate of 92.5% in simple tasks and 76.8% in difficult tasks [33][35]
- Lumine-Thinking demonstrates exceptional performance in long-term tasks, completing the main storyline of Genshin Impact in 56 minutes with a 100% task completion rate, significantly faster than competitors [41][42]

Group 4: Cross-Game Adaptability
- Lumine-Thinking exhibits strong adaptability across different games, successfully completing tasks in titles like Honkai: Star Rail and Black Myth: Wukong, showcasing its general agent characteristics [45][46]
- The agent's ability to navigate unfamiliar environments and execute complex tasks highlights its potential for broader applications beyond gaming [45][46]

Group 5: Industry Implications
- The development of Lumine reflects a trend in the industry where companies like Google are also creating agents capable of operating in 3D game environments, indicating a clear path towards embodied AGI [48][51]
- The belief in the eventual transition of gaming agents into real-world applications underscores the significance of advancements in AI and gaming technology [51]
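"Hybrid Thinking" amounts to letting the model decide, step by step, whether to act directly on the current frame or to emit an explicit reasoning block first (for replanning, puzzle analysis, and similar situations). How Lumine actually triggers this is not detailed here, so the following is only an illustrative control loop: `policy` is a hypothetical function returning an optional `thought` plus an action string, and acting without a thought is the cheap fast path.

```python
from typing import Callable, NamedTuple, Optional

class PolicyOutput(NamedTuple):
    thought: Optional[str]   # None -> fast path: act directly on the observation
    action: str              # text-encoded keyboard/mouse action

def hybrid_step(observation: str,
                policy: Callable[[str], PolicyOutput],
                execute: Callable[[str], None]) -> None:
    """One step of a hybrid fast/slow loop: reason only when the policy asks to."""
    out = policy(observation)
    if out.thought is not None:
        # Slow path: the reasoning text would normally be appended to the context
        # so that later steps can condition on the plan it lays out.
        print(f"[thinking] {out.thought}")
    execute(out.action)

if __name__ == "__main__":
    # Toy policy: think only when a boss is on screen, otherwise just keep moving.
    def toy_policy(obs: str) -> PolicyOutput:
        if "boss" in obs:
            return PolicyOutput("dodge first, then attack between combos", "key(space, 100)")
        return PolicyOutput(None, "key(w, 500)")

    hybrid_step("open field, no enemies", toy_policy, lambda a: print(f"[action] {a}"))
    hybrid_step("boss telegraphing an attack", toy_policy, lambda a: print(f"[action] {a}"))
```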
Reading Meta's latest paper: stop grinding the leaderboards, the next battleground for AI Agents is "mid-training"
36Kr· 2025-10-13 07:19
Core Insights
- The focus of AI competition is shifting from benchmarking to the ability of agents to autonomously complete complex long-term tasks [1][2]
- The next battleground for AI is general agents, but practical applications remain limited due to feedback mechanism challenges [2][4]
- Meta's paper introduces a "mid-training" paradigm to bridge the gap between imitation learning and reinforcement learning, proposing a cost-effective feedback mechanism [2][7]

Feedback Mechanism Challenges
- Current mainstream agent training methods face significant limitations: imitation learning relies on expensive static feedback, while reinforcement learning depends on complex dynamic feedback [4][5]
- Imitation learning lacks the ability to teach agents about the consequences of their actions, leading to poor generalization [4]
- Reinforcement learning struggles with sparse and delayed reward signals in real-world tasks, making training inefficient [5][6]

Mid-Training Paradigm
- Meta's "Early Experience" approach allows agents to learn from their own exploratory actions, providing valuable feedback without external rewards [7][9]
- Two strategies are proposed: implicit world modeling (IWM) and self-reflection (SR) [9][11]
- IWM enables agents to predict outcomes based on their actions, while SR helps agents understand why expert actions are superior [11][15] (a sketch of how such training examples might be constructed follows this summary)

Performance Improvements
- The "Early Experience" method has shown significant performance improvements across various tasks, with an average success rate increase of 9.6% compared to traditional imitation learning [15][17]
- The approach enhances generalization capabilities and lays a better foundation for subsequent reinforcement learning [15][21]

Theoretical Implications
- The necessity of a world model for agents to handle complex tasks is supported by recent research from Google DeepMind [18][20]
- "Early Experience" helps agents build a causal understanding of the world, which is crucial for effective decision-making [21][22]

Future Training Paradigms
- A proposed three-stage training paradigm (pre-training, mid-training, post-training) may be essential for developing truly general agents [23][24]
- The success of "Early Experience" suggests a new scaling law that emphasizes maximizing parameter efficiency rather than merely increasing model size [24][28]
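Concretely, both "Early Experience" strategies turn the agent's own rollouts into supervised text targets with no reward signal required: implicit world modeling (IWM) trains the model to predict the next state given a state and an alternative action it tried, while self-reflection (SR) trains it to explain why the expert action beats the alternatives it explored. The sketch below shows one plausible way to build such training pairs from logged exploration branches; the field names and prompt wording are illustrative, not the paper's exact format.

```python
from dataclasses import dataclass

@dataclass
class Branch:
    """One logged exploration branch: at `state`, the agent tried `action` and saw `next_state`."""
    state: str
    action: str
    next_state: str

def iwm_example(b: Branch) -> dict:
    """Implicit world modeling: predict the outcome of the agent's own action."""
    return {
        "prompt": f"State:\n{b.state}\n\nAction taken:\n{b.action}\n\nPredict the next state.",
        "target": b.next_state,
    }

def sr_example(state: str, expert_action: str, tried: list, rationale: str) -> dict:
    """Self-reflection: contrast explored alternatives with the expert action.
    `rationale` is a model- or rule-generated explanation grounded in the observed outcomes."""
    alternatives = "\n".join(f"- {b.action} -> {b.next_state}" for b in tried)
    return {
        "prompt": (f"State:\n{state}\n\nAlternatives tried and their outcomes:\n{alternatives}\n\n"
                   f"Expert action:\n{expert_action}\n\nExplain why the expert action is better."),
        "target": rationale,
    }

if __name__ == "__main__":
    b = Branch(state="shopping cart is empty",
               action="click 'checkout'",
               next_state="error: cart is empty")
    print(iwm_example(b)["prompt"])
    print(sr_example("shopping cart is empty", "click 'add to cart'", [b],
                     "checking out with an empty cart fails, so the item must be added first")["target"])
```

Both kinds of example can then be mixed into ordinary supervised fine-tuning data, which is why the paper frames this as a "mid-training" stage between pre-training and reward-driven post-training.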
Zhu Xiaohu: moving out of China and pretending not to be a Chinese AI startup is useless
Hu Xiu· 2025-09-20 14:15
Group 1
- The discussion highlights the impact of DeepSeek and Manus on the AI industry, emphasizing the importance of open-source models in China and their potential to rival closed-source models in the US [3][4][5]
- The conversation indicates that the open-source model trend is gaining momentum, with Chinese models already surpassing US models in download numbers on platforms like Hugging Face [4][5]
- The competitive landscape is shifting towards "China's open-source vs. America's closed-source," with the establishment of an open-source ecosystem being beneficial for China's long-term AI development [6][7]

Group 2
- Manus is presented as a case study for Go-to-Market strategies, illustrating that while Chinese entrepreneurs have strong product capabilities, they often lack effective market entry strategies [10][11]
- Speed is identified as a critical barrier for AI application companies, with the need to achieve rapid growth to outpace competitors [11][12]
- Token consumption is discussed as a significant cost indicator, with Chinese companies focusing on this metric due to lower willingness to pay among domestic users [12][13][14]

Group 3
- The AI coding sector is characterized as a game dominated by large companies, with high token costs making it challenging for startups to compete effectively [15][16]
- The conversation suggests that AI coding is not a viable area for startups due to the lack of customer loyalty among programmers and the high costs associated with token consumption [16][18]
- Investment in vertical applications rather than general-purpose agents is preferred, as the latter may be developed by model manufacturers themselves [20]

Group 4
- The discussion on robotics emphasizes investment in practical, value-creating robots rather than aesthetically pleasing ones, with examples of successful projects like a boat-cleaning robot [21][22]
- The importance of combining functionality with sales capabilities in robotic applications is highlighted, as this can lead to a more favorable ROI [22][23]

Group 5
- The conversation stresses the need for AI hardware companies to focus on simplicity and mass production rather than complex features, as successful hardware must be deliverable at scale [28][29]
- The potential for new hardware innovations in the AI era is questioned, with a belief that significant breakthroughs may still be years away [30][31]

Group 6
- The dialogue addresses the challenges of globalization for Chinese companies, noting that successful market entry in the US requires a deep understanding of local dynamics and compliance [36][37]
- The importance of having a local sales team for B2B applications in the US is emphasized, as relationships play a crucial role in sales success [38][39]

Group 7
- The conversation highlights the risks associated with high valuations, which can limit a company's flexibility and increase pressure for performance [42][43]
- The discussion suggests that IPOs for Chinese companies may increasingly occur in Hong Kong rather than the US, as liquidity issues persist in the market [46][48]

Group 8
- The need for startups to operate outside the influence of large companies is emphasized, with a call for rapid growth and innovation in the AI sector [49][53]
- The potential for AI startups to achieve significant scale quickly is acknowledged, but the conversation warns that the speed of evolution in the AI space may outpace traditional exit strategies [52][53]
AutoGLM 2.0 upgrade released; Zhipu: putting a general-purpose Agent on every phone
Xin Lang Ke Ji· 2025-08-20 07:45
Core Viewpoint
- The launch of AutoGLM 2.0 by Zhipu represents a significant upgrade, allowing the AI to operate independently across various devices and scenarios, enhancing user experience and accessibility [1]

Group 1: Product Features
- AutoGLM 2.0 can now function as an executive assistant, autonomously completing diverse tasks in the cloud without hardware limitations [1]
- In daily life scenarios, users can command AutoGLM to perform tasks on popular applications like Meituan, JD.com, Xiaohongshu, and Douyin with simple voice commands [1]
- In professional settings, AutoGLM 2.0 can execute full workflows across websites, including information retrieval, content creation, and social media posting [1]

Group 2: User Experience
- The upgrade allows users to engage with other applications on their devices while AutoGLM 2.0 operates in the background, enhancing multitasking capabilities [1]
- The AI is equipped with dedicated intelligent agents for mobile and computer platforms, enabling it to work independently in the cloud [1]