Agent
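Countdown to billions of AI employees on the job! Cloud computing's top player: "There is no magic, only Agents that actually solve problems"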
Xin Lang Cai Jing· 2025-12-03 13:24
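As the AI hype recedes, who is actually delivering value? Cloud computing's top player opens the battle for real-world value! By Li Shuiqing. The path to realizing AI's value is shifting from "showcasing model capabilities" to "deploying Agents in practice." In his 2025 re:Invent keynote early this morning, Amazon Web Services CEO Matt Garman put it bluntly: "The arrival of Agents has changed our trajectory in AI: from an era of technical marvels to an era of actually capturing value." His judgment rests on a set of sharply contrasting figures: on one hand, generative AI has set off a global frenzy, and Amazon Bedrock already serves more than 100,000 enterprises, more than 50 of which have processed over 1 trillion tokens; on the other hand, many enterprises have yet to see business returns that match their AI investment. ▲ Garman presenting Amazon Bedrock adoption. "Agents are where enterprises get substantive commercial returns from their AI investments." Garman pointed to a key turning point: "I believe that in the future there will be billions of Agents inside every company and in every imaginable field." A race to redefine how AI value is realized has begun. On the stage of Amazon Web Services' 2025 re:Invent, AI chip performance surged 600%, the four technical pillars for building AI Agents were upgraded in step, and the full-stack war over Agent deployment has escalated... So what exactly ...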
DeepSeek fights its way out: the breakthrough of Chinese large models does not depend on luck
36Kr· 2025-12-03 03:21
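As 2025 draws to a close, Google has all but reclaimed the technical spotlight in the global large-model race. Gemini 3 Pro burst onto the scene, surpassing every open-source model on multiple authoritative benchmarks and re-establishing the closed-source camp's technical high ground. For a while, doubts resurfaced across the industry over whether "open-source models have hit their limit" and whether "the Scaling Law has really hit a wall," and a sense of stagnation spread through the open-source community. But at exactly this moment, DeepSeek did not stay silent. On December 1 it released two heavyweight models at once: DeepSeek-V3.2, whose reasoning performance is benchmarked against GPT-5, and the Speciale version, which is exceptionally strong in mathematics, logic, and multi-turn tool calling. This is not only a concentrated display of technical capability but also a head-on response to the closed-source "new ceiling," made without any advantage in compute resources. This is no simple model update. DeepSeek is trying to find an entirely new path in the post-Scaling era: how can architectural redesign make up for the pre-training gap? How can "chain of thought within tool use" deliver agent performance with low token usage and high efficiency? And more importantly, why has the Agent gone from an add-on feature to the core engine of leaps in model capability? This article analyzes three main threads: how did DeepSeek break through under technical bottlenecks? Why was it the first in the open-source camp to bet heavily on Agents? And does this mean open-source models still have a path through the closed-source moat? Behind this ...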
The official release of DeepSeek V3.2 is out; V4 has not arrived yet, but this is already the strongest open-source model for Agent capabilities
Founder Park· 2025-12-01 13:14
Core Insights
- DeepSeek has released the official version of its V3.2 model, which significantly enhances reasoning and agent capabilities compared to previous versions [2][9]
- The V3.2-Speciale version is an open-source model that performs comparably to Gemini-3.0-Pro on mainstream reasoning benchmarks and has achieved gold medal levels in several prestigious competitions [3][11]
- The integration of the DeepSeek Sparse Attention (DSA) technology in V3.2 improves long text processing efficiency and reduces costs by over 50% [3][10]
Model Development
- The V3 series has been iterated over the past year, with V3.2 being the latest release, focusing on unifying thinking and non-thinking models, a trend seen in other closed-source models like Gemini and GPT-5 [6][9]
- The release timeline for DeepSeek models in 2025 includes various versions, each with specific enhancements, such as the introduction of DSA in V3.2 for stability and reasoning improvements [7][8]
Performance Metrics
- DeepSeek-V3.2 has achieved reasoning capabilities on par with GPT-5 and has shown significant improvements in output length and computational efficiency compared to Kimi-K2-Thinking [10][14]
- The V3.2-Speciale version excels in complex tasks, achieving high scores in various academic competitions, including IMO 2025 and ICPC 2025, with notable rankings among human competitors [11][14]
Tool Utilization
- A key advancement in V3.2 is the incorporation of thinking processes into tool calls, allowing the model to support both thinking and non-thinking modes in its operations [15][18]
- DeepSeek has developed a large-scale agent training data synthesis method that enhances the model's generalization capabilities by creating numerous "hard-to-answer, easy-to-verify" tasks [16][18]
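To make the "hard-to-answer, easy-to-verify" idea concrete, here is a toy TypeScript sketch. It is not DeepSeek's synthesis method, which the summary does not describe; the `Task` type and `makeFactoringTask` helper are hypothetical names chosen for illustration. The point it tries to show is only the asymmetry: finding the answer takes search, while checking a proposed answer takes one multiplication.

```typescript
// Toy illustration only (assumed names, not DeepSeek's pipeline):
// a task whose answer is costly to find but cheap to check.
interface Task {
  prompt: string;
  // Cheap verifier: returns true when the proposed answer is correct.
  verify: (answer: readonly [number, number]) => boolean;
}

// Build a factoring task from two secret factors. The solver only sees
// the product; the verifier only needs a multiplication and range checks.
function makeFactoringTask(p: number, q: number): Task {
  const n = p * q;
  return {
    prompt: `Find two integers greater than 1 whose product is ${n}.`,
    verify: ([a, b]) =>
      Number.isInteger(a) &&
      Number.isInteger(b) &&
      a > 1 &&
      b > 1 &&
      a * b === n,
  };
}

// Example task: the product shown to the solver is 101 * 103.
const factoringTask = makeFactoringTask(101, 103);
```

A training loop could then reward an agent whenever `factoringTask.verify(answer)` returns true, without needing a human or a second model to grade the attempt.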
DeepSeek-V3.2 series open-sourced, with performance directly benchmarked against Gemini-3.0-Pro
量子位· 2025-12-01 12:13
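Heng Yu, reporting from Aotesaide. QbitAI | WeChat official account QbitAI. Surprise drop! On the third anniversary of ChatGPT's release, DeepSeek abruptly put out two models: DeepSeek-V3.2 and DeepSeek-V3.2-Speciale. The former focuses on balance and practicality, and suits everyday Q&A, general Agent tasks, and tool calling in real application scenarios. Its reasoning reaches GPT-5 level, slightly below Gemini-3.0-Pro. The figure below shows the scores of DeepSeek-V3.2 and other models on various Agent tool-calling benchmarks; notably, DeepSeek-V3.2 received no special training on the tools in these test sets. Key point: ICPC at the level of second place among human contestants, IOI at the level of tenth place among human contestants. Specifically, DeepSeek-V3.2 focuses on balancing reasoning capability against output length to reduce compute overhead. DeepSeek's official account post states: "The DeepSeek-V3.2 model has reached the highest level among current open-source models in Agent evaluations." Other details of the model are as follows: reasoning capability on par with GPT-5; output length greatly shortened compared with Kimi-K2-Thinking, reducing user wait time; DeepSeek's first model to "integrate thinking into tool calling," supporting tool calls in both thinking and non-thinking modes; based on 1800+ environments and 85000+ complex instructions ...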
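Invite codes for the product from Hogi, a Jinqiu Capital portfolio company, are nearly impossible to get; how far are the animation Agent director's works from "Zootopia"? | Jinqiu Spotlight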
锦秋集· 2025-12-01 11:15
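The following article is from GeekPark, by Jin Guanghao. "Jinqiu Spotlight" tracks every highlight and development of Jinqiu Capital and its portfolio companies, bringing founders frontline industry trends. Jinqiu Capital has completed its investment in Hogi. As an AI fund with a 12-year term, Jinqiu Capital has always held long-termism as its core investment philosophy, actively seeking general AI startups with breakthrough technology and innovative business models. An interesting product has recently appeared in AI circles: "OiiOii" from Hogi, an Agent focused on AI-generated animation. It has been exceptionally popular: its 7,210 closed-beta slots were quickly snapped up, free invite codes were bid up to 30 yuan on Xianyu, the beta group has 20,000 members, and the beta users reportedly even include top creators with 20 million followers across platforms. As for the reasons behind this phenomenon-level spread, the article (original: GeekPark) sums up: On the technology side, Sora2 and nanobanana2 cracked "character consistency," the biggest pain point of AI video animation, officially opening the technology window, and OiiOii became the fastest player to productize frontier capabilities and reap the dividend. On the demand side, in the short-video era everyone has a need for visual expression, and OiiOii fills the gap in professional production capacity with simple tools, taking animation creation from ...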
Dan Bin: AI and Agents will very likely be controlled by just a few companies, whose market caps could be unimaginably large
Xin Lang Zheng Quan· 2025-11-30 04:29
Core Insights
- The 2025 Analyst Conference highlighted the intense competition in the AI sector, with significant R&D investments from major companies like Amazon, Google, and Microsoft, indicating a potential shift in market dynamics [1]
Group 1: Investment Trends
- Amazon's R&D investment over the past year reached $125 billion, while Google invested $90 billion, and Microsoft, in collaboration with OpenAI, announced an investment of approximately $100 billion [1]
- The conference emphasized the potential for AI to create a more monopolistic market structure, similar to trends observed in the internet and mobile internet sectors [1]
Group 2: Market Implications
- The success of AI technologies could challenge existing business models, including those of major players like Tencent and WeChat, suggesting a transformative impact on the industry landscape [1]
- The concentration of market power in a few companies due to AI advancements could lead to unprecedented market valuations for these firms [1]
Why I believe 90% of Chinese ToB companies don't need GEO
Tai Mei Ti APP· 2025-11-26 02:24
Core Viewpoint
- The article discusses the current trends in the ToB (business-to-business) sector, particularly focusing on the concepts of GEO (Generative Engine Optimization) and AI Agents, arguing that while GEO is a trend, it is not yet a viable or effective strategy for most ToB companies [1][20].
Summary by Sections
GEO vs. AI Agents
- The author opposes the idea that GEO is the future of traffic acquisition, stating that 90% of Chinese ToB companies do not need to invest in GEO at this time [2][3].
- GEO is seen as a potential trend but lacks a mature product and commercial model, making it difficult to establish a stable profit loop [3][8].
Current State of GEO
- The current form of GEO resembles a "next-generation SEO," but it has not yet developed a solid commercial framework [4][5].
- The effectiveness of GEO in driving traffic is questioned, as it does not significantly outperform traditional search engines like Baidu in terms of user acquisition [6][10].
User Behavior and Market Dynamics
- User behavior in the ToB sector remains stable, with search engines still being the primary source of traffic, despite the rise of AI models [11][12].
- The article emphasizes that the decline in traffic is a macro issue rather than a result of competitors using GEO to steal market share [12][13].
Challenges in Implementing GEO
- Many ToB companies lack a solid foundation in SEO, which hampers their ability to leverage GEO effectively [17][19].
- The article suggests that companies should focus on strengthening their SEO and content strategies across various platforms before attempting to implement GEO [19][20].
Future Outlook
- The author posits that the future of traffic acquisition lies in AI Agents, which will integrate more seamlessly into user experiences and business needs [21][22].
- Companies should aim to become part of the AI ecosystem, transforming their products into "callable capabilities" within AI models, rather than relying solely on traditional traffic sources [22].
Final week! Submissions for the 2025 China Technology Power List are about to close
AI前线· 2025-11-24 05:52
Core Insights
- The article announces the upcoming registration deadline for the "2025 China Technology Power Annual List," which is set for November 30, 2025 [3]
- This year marks the fifth consecutive year of the InfoQ list evaluation, with participation from over 100 companies, including major industry players and innovative representatives [4]
- The theme for this year's list is "Insight into AI Transformation, Witnessing Intelligent Future," focusing on eight key areas related to AI advancements [4]
Summary by Categories
- The evaluation will cover eight award categories [5]:
  - 2025 AI Infrastructure Excellence Award TOP20
  - 2025 AI Engineering and Deployment Excellence Award TOP20
  - "Artificial Intelligence +" Best Industry Solution TOP20
  - AI Agent Most Productive Product/Application/Platform TOP15
  - Data & AI Most Valuable Product/Platform TOP10
  - AI Coding Most Productive Product TOP5
  - Embodied Intelligence Star Product TOP10
  - AI Open Source Star Project TOP10
Event Details
- The results of the annual list evaluation will be announced on December 19, 2025, during the AICon·Beijing event, which will also feature an award ceremony [8]
- The two-day event will gather industry experts from leading companies and innovative teams to discuss trending AI topics, including Agents, AI Programming, Embodied Intelligence, and Multimodal [8]
Keynote Sessions
- The event will feature various keynote sessions [10][11][12] focusing on topics such as:
  - The revolution in content creation driven by multimodal large models
  - The evolution and implementation of Agent technology
  - New paradigms in software development in the LLM era
  - Practical challenges and experiences in deploying Coding Agents at scale
Participation Invitation
- Companies and teams are encouraged to share their latest achievements and outstanding projects in the AI field, covering areas such as infrastructure development, innovative engineering and deployment, and productivity enhancement through intelligent agents [25]
Break the world down into its smallest units, then reassemble it | 42章经 AI Newsletter
42章经· 2025-11-23 13:01
Core Insights
- The article discusses the strategic shift of Grammarly, which has transformed from a grammar-checking tool into a more comprehensive productivity suite by acquiring Coda and Superhuman, aiming to create a robust AI-driven platform [4][14][28].
Group 1: Grammarly's Strategic Transformation
- Grammarly has achieved over $700 million in annual revenue and surpassed 40 million users, defying expectations of decline in the AI era [4].
- The company rebranded itself as Superhuman after acquiring Coda and Superhuman, with Coda's founder becoming the new CEO [4][5].
- Grammarly's core strength lies in its distribution capabilities, allowing it to integrate AI into over 500,000 applications and websites [11][12].
Group 2: The Concept of Bundling
- The article emphasizes the importance of bundling in business strategy, highlighting that bundling can activate non-essential users and spread user acquisition costs [31][34].
- Shishir Mehrotra, the new CEO, has extensive experience in bundling strategies, having worked with successful companies like Microsoft and Spotify [31][38].
- The best bundling strategy involves ensuring that essential users are as distinct as possible while overlapping non-essential users [40][41].
Group 3: AI and Future Opportunities
- The emergence of AI is expected to lead to a rapid unbundling of tools, followed by a rebundling phase where platforms will integrate various AI components [50][51].
- AI will enable the creation of dynamic bundles tailored to individual user needs, potentially leading to unprecedented levels of customization and efficiency [51][66].
- The article draws parallels between the impact of containerization on global supply chains and the potential of AI to revolutionize knowledge and capability distribution [68][80].
Group 4: Market Dynamics and User Context
- The article argues that user context is highly fragmented, providing opportunities for startups to create neutral, cross-platform AI layers that connect various applications [28][29].
- The competition will likely split into two extremes: specialized component experts and integrators who can effectively bundle these components into cohesive solutions [82].
Agents Gone Wild? Use Tool Call Limits in LangChainJS to Keep Them in Check!
LangChain· 2025-11-20 16:30
Hi, this is Christian from LangChain. Have you ever built an agent that just goes nuts with your API calls? Tools can give an agent incredible power, but they can also cost you a lot of money to run. In this video, I will show you how you can keep your agent under control without any hard-coded guardrails in your system prompt. Today, we're taking a look at the tool call limit middleware in LangChain. It's a clean, declarative way to set credit limits, rate limits, or usage caps on any tools your agent uses. Think ...
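The idea of capping tool usage can be sketched in plain TypeScript without any framework. The snippet below is a hand-rolled illustration, not the LangChain middleware shown in the video: `Tool`, `withCallLimit`, and the stand-in `search` tool are hypothetical names introduced here, and a real agent would more likely rely on the library's own mechanism. It only shows the core behavior, a per-tool counter that refuses further calls once a cap is reached.

```typescript
// Hypothetical helper, NOT the LangChain API: a tool is modeled as an
// async function from arguments to a result.
type Tool<A, R> = (args: A) => Promise<R>;

// Wrap a tool so it can be invoked at most `maxCalls` times.
// Once the cap is reached, further calls reject with an error
// instead of reaching the underlying (possibly paid) API.
function withCallLimit<A, R>(
  toolName: string,
  tool: Tool<A, R>,
  maxCalls: number,
): Tool<A, R> {
  let calls = 0; // number of calls allowed through so far
  return async (args: A): Promise<R> => {
    if (calls >= maxCalls) {
      throw new Error(`Tool "${toolName}" exceeded its limit of ${maxCalls} calls`);
    }
    calls += 1;
    return tool(args);
  };
}

// Stand-in "search" tool (assumed for the example), capped at two calls.
const search: Tool<string, string> = async (query) => `results for ${query}`;
const limitedSearch = withCallLimit("search", search, 2);
```

Keeping the limit in a wrapper rather than in the system prompt means the cap is enforced by code: the model cannot talk its way past it, and the agent loop can catch the error and decide whether to stop or answer with what it already has.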