AI Agents (智能体)
The AI scientist who loves milk tea most wants to build the "agent" that understands you best
36Kr · 2025-11-24 08:02
Core Insights
- The article emphasizes the importance of maintaining an entrepreneurial mindset in AI research and development, focusing on rapid iteration and learning from failures [1][2][4]

Group 1: Innovation and AI Development
- Wu Yi's team developed the AReaL-lite framework, which significantly enhances AI training efficiency and reduces GPU waste [1]
- The shift from traditional supervised learning to reinforcement learning is highlighted as crucial for developing intelligent AI capable of long-term task execution [6][33]
- Wu Yi believes that the future of AI lies in creating intelligent agents that can understand vague human commands and perform complex tasks autonomously [12][13]

Group 2: Entrepreneurial Spirit and Team Dynamics
- Wu Yi stresses the need for innovation and resource creation within entrepreneurial teams, rejecting the notion of waiting for perfect conditions to act [25][26]
- The article discusses the challenges faced by Wu Yi's early startup team, emphasizing the importance of a committed and innovative mindset among team members [25][28]
- Wu Yi's approach to team organization in the AI era involves creating a minimalistic structure that leverages AI to enhance productivity and efficiency [50][52]

Group 3: Future of AI and Robotics
- The concept of embodied intelligence is introduced, where intelligent agents can interact with the physical world and perform tasks based on minimal instructions [13][14]
- Wu Yi envisions a future where multiple intelligent agents can collaborate to complete complex tasks, similar to a coordinated sports team [15][20]
- The transition from digital to physical world applications of AI requires advancements in multi-modal data and training environments [21][22]

Group 4: Learning and Adaptation
- Wu Yi likens his career journey to a reinforcement learning process, emphasizing the value of learning through trial and error [29][30]
- The article highlights the significance of prompt engineering in reinforcement learning, which is essential for effective AI training [35][36]
- Wu Yi advocates for a layered approach in developing intelligent agents, combining low-level control with high-level reasoning capabilities [43][44]
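Yangtze River Delta fintech "carnival" opens, exploring paths to deep integration of AI and finance
International Finance News (Guo Ji Jin Rong Bao) · 2025-11-23 04:13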
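Shuai Shi noted that the Shanghai Financial Association (上海金融业联合会), as a bridge connecting government, financial institutions, and technology companies, actively plays its role as a platform to promote the application of and innovation in artificial intelligence in finance. First, it has set up a Fintech Professional Committee to build a platform for exchange and cooperation among financial institutions across business types, promoting the application and adoption of digital technologies such as artificial intelligence in finance. Second, it is actively advancing the inclusive finance advisor system, guiding financial institutions to step up their service to technology companies, further promoting the conversion of innovation results, and driving the deployment and development of frontier technology in finance. Third, it is working with others to promote high-quality development of regional digital finance: the association actively leverages the Yangtze River Delta Financial Industry Development Alliance and works with peers to chart new paths for high-quality fintech development in the Yangtze River Delta and across the country.

Meng Zhongjie, deputy Party secretary of East China Normal University, said that the forum, themed "AI FOR ALL" and focused on intelligent financial applications in the new technology era, is of far-reaching significance. The university is drawing on its interdisciplinary strengths in economic data science, computer science, artificial intelligence, and other fields to position itself at the fintech frontier. By establishing the Yangtze River Delta Fintech Research Institute and founding the Shanghai School of Artificial Intelligence and Finance, among other steps, it aims to build an "AI + finance" platform for collaborative innovation and a hub for talent development.

Chen Qiwei, chair of the organizing committee of the Yangtze River Delta Fintech Innovation and Application Global Competition and dean of East China Normal University's Yangtze River Delta Fintech Research Institute, said that the Yangtze River Delta, as China's most economically developed region, must deepen the realization of value in the new era. This is a unique task and opportunity, and in this process artificial intelligence plays a unique ...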
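Drink Some VC (喝点VC) | a16z in conversation with AI leaders: How far can AI's "brute force" path go? Only by being fundamentally human can it truly understand what people want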
Z Potentials· 2025-11-22 03:21
Core Insights
- The discussion highlights the rapid advancements in AI technology and its potential to create a new wave of independent entrepreneurs, transforming the software development landscape [5][30].
- There is a divergence in opinions regarding the timeline and feasibility of achieving Artificial General Intelligence (AGI), with some experts expressing optimism about imminent breakthroughs while others remain skeptical [9][19].

AI Development Status and Path to AGI
- Adam D'Angelo emphasizes that there are no fundamental challenges that cannot be solved by the brightest minds in the coming years, citing significant progress in reasoning models and code generation [3][8].
- Amjad Masad compares the current AI evolution to historical revolutions, suggesting that humanity is undergoing a transformative change that may not be easily defined [4][27].
- D'Angelo believes that the next five years will see a drastically different world, contingent on resolving current limitations in AI context and usability [8][10].

Economic Transformation and Future Societal Landscape
- D'Angelo predicts that the economic impact of AI could lead to GDP growth far exceeding 4-5% if AI can perform tasks at a lower cost than human labor [21].
- Masad raises concerns about the second-order effects of AI on the job market, particularly the potential for entry-level jobs to be automated while expert roles remain [22][23].
- The conversation suggests that as AI automates more tasks, the nature of work will shift, with a potential increase in demand for roles that leverage human creativity and emotional intelligence [24][25].

Technological Landscape Evolution and Entrepreneurial Ecosystem Outlook
- D'Angelo expresses excitement about the increase in independent entrepreneurs enabled by AI technologies, which allow individuals to bring ideas to fruition without the need for large teams [28][30].
- The discussion touches on the balance between large-scale companies and new entrants in the market, suggesting that both can coexist and thrive in the evolving landscape [32][36].
- Masad highlights the importance of AI in programming, indicating that as these tools improve, they will democratize software development, allowing more people to create complex applications [44].

Future Challenges and Ultimate Thoughts
- The conversation reflects on the cultural implications of increased reliance on AI, particularly regarding knowledge sharing and collaboration among employees [49].
- D'Angelo and Masad both acknowledge the need for ongoing research and innovation in AI to unlock its full potential and address the challenges that arise from its integration into society [41][42].
Challenging GPT-5.1 at low cost! Musk charges into AI agents
Sohu Finance (Sou Hu Cai Jing) · 2025-11-22 02:41
Core Insights
- xAI has launched two major updates for its xAI API: Grok 4.1 Fast and the Agent Tools API, focusing on fast, low-cost, agent-centric models [2][5]
- Grok 4.1 Fast is xAI's best-performing tool-calling model to date, supporting a context window of 2 million tokens and excelling in customer support and financial applications [2][8]
- The model has risen to sixth place in the Artificial Analysis Intelligence Index (AII) and achieved a top score of 93.3% on the τ²-bench Telecom ranking, outperforming models like GPT-5.1 and Gemini 3 Pro [2][7]

Pricing Structure
- Grok 4.1 Fast is priced at $0.20 per million input tokens, $0.05 per million cached input tokens, and $0.50 per million output tokens, while the Agent Tools API starts at $5 per 1,000 successful calls [5][6]
- Users can try the services for free for two weeks, until December 3 [5][29]

Performance and Features
- Grok 4.1 Fast has shown significant improvements in real-time information retrieval compared to its predecessor, Grok 4 Fast, but has underperformed in classic programming tasks [11][15]
- The model has been trained using reinforcement learning in simulated environments, enhancing its tool-calling capabilities while maintaining cost-effectiveness [7][8]
- The Agent Tools API allows developers to create autonomous agents capable of web browsing, searching X posts, executing code, and retrieving documents with minimal coding effort [20][22]

Competitive Edge
- Grok 4.1 Fast has set a new standard in factual accuracy, reducing hallucination rates by half compared to Grok 4 Fast while maintaining competitive performance in the FactScore evaluation [25][27]
- xAI's focus on integrating real-time data and deep research capabilities positions it favorably in the evolving AI landscape, emphasizing practical applications [30]
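To make the quoted Grok 4.1 Fast prices concrete, here is a minimal cost-estimate sketch. Only the four prices come from the summary; the `Usage` record, the `estimate_cost` function, and the billing model are hypothetical. The sketch assumes cached input tokens are counted separately from regular input tokens, and that tool calls are billed at the flat starting price.

```python
from dataclasses import dataclass

# Prices in USD as quoted in the summary. How they are applied
# (separate cached-token count, flat tool price) is an assumption.
PRICE_PER_M_INPUT = 0.20
PRICE_PER_M_CACHED_INPUT = 0.05
PRICE_PER_M_OUTPUT = 0.50
PRICE_PER_1K_TOOL_CALLS = 5.00  # quoted as a starting price


@dataclass
class Usage:
    """Hypothetical usage record; field names are illustrative only."""
    input_tokens: int = 0
    cached_input_tokens: int = 0
    output_tokens: int = 0
    successful_tool_calls: int = 0


def estimate_cost(usage: Usage) -> float:
    """Return an estimated cost in USD for the given usage."""
    token_cost = (
        usage.input_tokens * PRICE_PER_M_INPUT
        + usage.cached_input_tokens * PRICE_PER_M_CACHED_INPUT
        + usage.output_tokens * PRICE_PER_M_OUTPUT
    ) / 1_000_000
    tool_cost = usage.successful_tool_calls * PRICE_PER_1K_TOOL_CALLS / 1_000
    return round(token_cost + tool_cost, 6)


if __name__ == "__main__":
    # Example workload: a long-context agent session with a few tool calls.
    session = Usage(
        input_tokens=400_000,
        cached_input_tokens=1_600_000,
        output_tokens=50_000,
        successful_tool_calls=20,
    )
    print(f"Estimated cost: ${estimate_cost(session):.4f}")
```

Because cached input is quoted at a quarter of the regular input price, the share of a long context that hits the cache has a large effect on the total under these assumptions.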
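Top ten science popularization buzzwords of 2025 released; large models, humanoid robots, and AI agents make the list
China News Network (Zhong Guo Xin Wen Wang) · 2025-11-21 06:59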
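Humanoid robots are a class of robots that resemble humans as closely as possible in appearance, structure, and movement, and that can imitate human behavior. They typically have a torso, limbs, and other body structures similar to those of humans. In May 2025, the world's first group standard titled "Intelligence Grading Standard for Humanoid Robots" (《人形机器人智能化分级标准》) was officially released, providing a unified technical language and evaluation system for the graded assessment, technical R&D, and application rollout of humanoid robot intelligence.

China News Network, Beijing, November 21 (reporter Sun Zifa): The reporter learned from the China Science Writers Association (中国科普作家协会) on November 21 that, at the 2025 National Science Popularization Creation Conference held that day, the association released the top ten science popularization buzzwords of 2025. Large models, humanoid robots, AI agents, and the science fiction industry were among those selected.

The top ten science popularization buzzwords of 2025 are: National Science Popularization Month, the spirit of scientists, large models, the low-altitude economy, humanoid robots, AI agents, innovation culture, industrial heritage, scenario innovation, and the science fiction industry. From the dimensions of technology, culture, and society, they together outline the overall picture and core directions of China's science popularization work, frontier technology trends, and the integration of science communication with society and culture in 2025. They are the directions and fields that science popularization creators should focus on now and in the period ahead.

Among them, large models are a class of artificial intelligence models, mostly built on deep neural networks, with massive numbers of parameters. They include large language models, large vision models, multimodal large models, and those oriented toward scientific research ...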
Guotai Haitong | Computers: Google's Gemini 3 takes a commanding lead as the large-model competitive landscape reshapes faster
Core Insights
- The launch of Google's Gemini 3 marks a significant leap in large model technology, showcasing breakthroughs in reasoning, multi-modal capabilities, and code generation, while introducing a generative UI and the Antigravity agent platform [1][2][3]

Group 1: Model Performance
- Gemini 3 demonstrates substantial advancements in reasoning abilities, achieving a score of 37.5% in Humanity's Last Exam, up from 21.6% with the previous model, and scoring 31.1% in the ARC-AGI-2 test, nearly doubling the performance of GPT-5.1 [1]
- The model excels in multi-modal understanding, setting new records in complex scientific chart analysis and dynamic video comprehension, laying a solid foundation for practical AI agents [1]
- In mathematical reasoning, Gemini 3 has improved from basic operations to solving complex modeling and logical deduction problems, providing a reliable technical basis for high-level applications in engineering and financial analysis [1]

Group 2: Code Generation and Design
- Gemini 3 shows revolutionary progress in code generation and front-end design, reversing Google's competitive stance in programming contests and paving the way for large-scale commercial applications [2]
- The model leads in LiveCodeBench and ranks first in four categories of the Design Arena, demonstrating its ability to generate functional code and aesthetically intelligent user interfaces that align with modern design standards [2]
- The new architecture of Gemini 3, featuring sparse MoE design, supports a context length of millions of tokens, excelling in long document comprehension and fact recall tests [2]

Group 3: Agent Capabilities
- Gemini 3 achieves a qualitative leap in agent capabilities, becoming the first foundational model to deeply integrate general agent abilities into consumer products [3]
- The model's tool usage capability has improved by 30% compared to its predecessor, excelling in terminal environment tests and long-duration business simulations, enabling it to autonomously plan and execute complex end-to-end tasks [3]
- The introduction of the Antigravity agent development platform allows developers to engage in task-oriented programming at a higher abstraction level, transforming AI from a mere tool to an "active partner" [3]
Liu Debing on the ceiling, Liu Zhiyuan on the inflection point: they have revealed China's ten-year AI script ahead of time
36Kr · 2025-11-20 09:57
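He described the current stage, within the coming decade, as "the eve of entering the climax of the artificial intelligence revolution."

At the 2025 AI+ Conference (2025人工智能+大会) held in Zhongguancun, the key "progress bar" for China's AI over the next ten years is becoming clear.

On the sidelines of the conference, Liu Debing, senior advisor to the 人工智能百人会 and chairman of Zhipu, and Liu Zhiyuan, co-founder and chief scientist of ModelBest (面壁智能) and associate professor at Tsinghua University, gave an exclusive interview to Zhidongxi (智东西). The two long-time frontline practitioners shared their observations and thinking on the next ten years, from foundation models to the evolution of agents.

On foundation model competition, Liu Debing did not shy away from reality: now that open source has become mainstream and results can be publicly verified, gaps in model capability are magnified quickly. "When frontline open-source models are scoring 90, training another model that scores 85 is not very competitive."

He also stressed that it is important to keep doing things that are hard but right, even at great cost, because "foundation models determine the ceiling for the development of the entire AI industry." In his view, the key variables ahead will come more from the maturing of the open-source ecosystem, deep deployment in industry scenarios, and the broad participation that follows as AI gradually becomes a "capability for everyone."

In Liu Zhiyuan's view, one clear inflection point of 2025 is "AI + programming," a capability that is becoming an important pillar of software productivity. On how large models move toward agents, what he emphasizes is not piling on more knowledge but giving models "the ability to grow by learning autonomously in a designated job," like a university graduate, through feedback from real tasks ...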
Advancing the application of artificial intelligence in the financial industry
Tencent Research Institute · 2025-11-20 09:03
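Du Xiaoyu, Chen Chuyi, Tencent Financial Research Institute

In putting AI into practice, financial institutions generally follow three principles, all aimed at improving quality and efficiency. First, controllable risk comes first: applications focus on scenarios where hallucination risk is controllable and information boundaries are clear, with stronger up-front risk identification and protection. Second, internal efficiency comes first: applications land first in middle- and back-office processes such as technology R&D and operations management, which makes it easier to validate results quickly. Third, decision support comes first: the emphasis is on empowering employees rather than replacing positions, using tools to improve the efficiency of analysis and judgment.

The banking sector plays the role of lead goose, staying ahead in both depth and breadth of application. Banks are increasing capital expenditure and R&D investment, leveraging their broad coverage of scenarios, and continuing to strengthen their technical foundation and secure results. In the short term, mature scenarios such as code assistants and intelligent Q&A are quickly releasing efficiency dividends, and at some institutions more than 30% of code is already generated by AI. In the long term, AI applications are expanding into core revenue-generating areas such as robo-advisory and frontline marketing enablement.

Broader access to technology is reshaping the industry's competitive landscape and giving small and medium-sized financial institutions a window to change lanes. With the open-sourcing and spread of high-performance models such as DeepSeek and Tencent Hunyuan, the capital and technical thresholds for AI applications have dropped significantly. Smaller institutions can concentrate resources on specific business scenarios and on mining the value of private-domain data to build differentiated advantages; the key lies in strategic focus and organizational flexibility. For example, drawing on traits such as short decision chains, they can specialize in verticals such as supply chain finance and wealth management for specific customer groups, and with leading large models ...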
Challenging GPT-5.1 at low cost, Musk charges into AI agents
36Kr · 2025-11-20 08:56
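The model jumped four places in the Artificial Analysis Intelligence Index (AII) to sixth, just behind Grok 4 in fifth. On the τ²-Bench Telecom leaderboard, which evaluates agentic tool calling, it took first place with a score of 93.3%, outperforming models such as GPT-5.1 (high) and Gemini 3 Pro at lower cost and improving on Grok 4 Fast by 27 points. xAI also said that Grok 4.1 Fast is more accurate on factuality, with a hallucination rate half that of Grok 4 Fast.

▲ AII index standings (image source: Artificial Analysis)

Zhidongxi (智东西) reported on November 20 that Musk's company xAI today released two major updates to the xAI API: Grok 4.1 Fast, a new fast, low-cost, agent-centric model, and the agent tooling interface xAI Agent Tools API.

Grok 4.1 Fast is xAI's best-performing tool-calling model to date, with a context window that supports 2 million tokens. It can reason accurately and quickly and complete agentic tasks, and is particularly good at complex real-world scenarios such as customer support and finance.

▲ An application built on Grok 4.1 Fast that lets users change reservations (image source: xAI)

The Agent Tools API gives agents access to real-time X data, web search, remote code execution, and other ...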
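Tencent Smart Retail attends the CCFA New Consumption Forum: AI agents become the key link between enterprise efficiency and growth
Jiangnan Times (Jiang Nan Shi Bao) · 2025-11-20 07:55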
Core Insights
- The CCFA New Consumption Forum highlighted the role of AI agents in retail industry upgrades, emphasizing the transition from "AI that answers questions" to "AI that performs tasks" [1][2]
- Over 50% of retailers are utilizing AI across more than six operational scenarios, with over 80% actively testing or deploying generative AI applications [2]

Group 1: AI in Retail
- AI applications are becoming systematic and pervasive in retail, with significant adoption across various business scenarios [2]
- The shift from traditional large model deployment to AI agents addresses challenges like model hallucination and task planning, enhancing efficiency and productivity [2]

Group 2: Core Competitiveness
- Retailers need to build core competitiveness in three areas: products and services, data and knowledge, and organizational culture [2]
- High-quality data governance is essential for maximizing AI value, and organizations must encourage training and experimentation with AI [2]

Group 3: Intelligent Agent Applications
- Tencent's "Enterprise Intelligent Agent Application Planning Compass" categorizes intelligent agent applications into four quadrants: Efficient Assistant, Execution Expert, Decision Expert, and All-round Expert [3][4][5]
- In the Efficient Assistant quadrant, AI enhances personalized service capabilities, significantly improving response times and employee knowledge utilization [3]
- Execution Experts handle complex tasks with low planning dependency, exemplified by AI ordering systems in the restaurant industry [4]

Group 4: Decision-Making and Optimization
- Decision Experts leverage big data and operational insights to assist management in making informed decisions, improving the scientific basis of business expansion [5]
- All-round Experts manage complex tasks and optimize resource integration, leading to substantial improvements in sales performance and conversion rates [5]

Group 5: Strategic Initiatives
- Tencent Cloud is committed to supporting the deployment of intelligent agents by providing a comprehensive development platform and ecosystem [5]
- The goal is to accelerate the release and diffusion of AI productivity in the retail sector, enabling companies to achieve high-quality growth in a competitive landscape [5]