Large Model Agents
A Summer Night Meetup for the RL Community! A 12-Person Chat: When Reinforcement Learning Meets Large Model Agents
机器之心· 2025-07-08 04:09
Core Viewpoint - The article promotes an event titled "Reinforcement Learning New Paradigm Exploration Night," which focuses on integrating reinforcement learning (RL) with large model agents and on why that integration matters in the current technological landscape [2][3].

Event Details
- The event is scheduled for July 26, 2025, from 19:00 to 21:10, near the Shanghai Expo Exhibition Center. It is limited to 12 participants to keep the gathering intimate and allow deep discussion [3][4].
- The event will cover three main topics: the synergy between reinforcement learning and large model agents, the trade-off between exploration and stability in training strategies, and the challenges of aligning and evaluating intelligent agents [4].

Target Audience
- The event is designed for people from academia, industry, and startups, who are encouraged to bring their latest research, practical experience, and product challenges for collaborative discussion [5][6].
- The focus is on lively exchanges of ideas rather than formal presentations [6][7].

Participation Information
- Interested participants are asked to scan a QR code and indicate their background (academic, industry, or entrepreneurial) and the specific RL challenges they wish to discuss; spots are limited [8].
- The article emphasizes meaningful technical discussion and debate, and suggests the event will offer an opportunity for networking and collaboration [9].
How Large Model Agents Can Break Through the Bottleneck to Large-Scale Adoption: The Key Is Agentic ROI
机器之心· 2025-05-30 04:16
Core Viewpoint - The main barrier to the usability of large language model agents (LLM Agents) is not model capability but "Agentic ROI," which has not yet reached the practical threshold needed for widespread application [1][3][4].

Group 1: Agentic ROI Concept
- Agentic ROI (Agentic Return on Investment) is a key metric that measures the ratio of "information yield" to "usage cost" for LLM Agents in real-world scenarios [4].
- Usability is achieved only when the quality of information exceeds a certain threshold and the ratio of time and cost saved by the agent is sufficiently high [4][5].

Group 2: Current Application Landscape
- Most LLM Agents are currently applied in scenarios where human task time is costly, such as research and programming. Because these tasks are labor-intensive, agents can deliver significant efficiency gains [7].
- In everyday applications with high user demand, such as e-commerce and personal assistants, tasks are simpler, so the marginal value of an LLM Agent is lower. The agent may also introduce additional interaction costs and delays, resulting in low Agentic ROI [7].

Group 3: Development Trajectory
- The development path of LLM Agents follows a "zigzag" pattern: first scaling up to improve information quality, then scaling down to reduce time and cost while maintaining quality [9].
- The evolution of foundation models, such as the OpenAI series, illustrates this zigzag trend: larger models bring significant performance improvements, and smaller models are then introduced that maintain performance while reducing inference cost and latency [9].

Group 4: Scaling Up Information Quality
- Pre-training scaling expands model size, data volume, and computational resources to strengthen foundational capabilities in language understanding and reasoning [11].
- Post-training scaling, including supervised fine-tuning and reinforcement learning, aligns the agent's performance with human needs and values, and relies on extensive interaction data for continuous learning [12].
- Test-time scaling focuses on building a world model that supports multimodal interactions and can handle complex tasks while reflecting real-world uncertainties [13].

Group 5: Ensuring Robustness and Security
- Robustness and security are crucial to information quality: agents must be protected against exploitation of reward mechanisms, data contamination, and feedback manipulation [16].

Group 6: Scaling Down to Reduce Time and Cost
- Memory mechanisms allow agents to skip redundant computation by reusing past knowledge, which speeds up processing [18].
- Model compression techniques can significantly reduce computational resources and inference latency without compromising performance [18].
- Optimizing reasoning strategies and infrastructure can further improve the efficiency and responsiveness of LLM Agents [18].

Group 7: Cost Management
- Reducing interaction time by enabling agents to proactively understand user intent can lower the user's cognitive burden and improve the user experience [19].
- Managing operational costs is essential, especially in large-scale deployments, and can be done by optimizing context management and controlling inference complexity [19].
- Agentic ROI serves as a framework for evaluating the real usability of LLM Agents, shifting the focus from model performance alone to practical benefit and overall efficiency [19].
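The ratio of "information yield" to "usage cost" described in Group 1 can be made concrete with a small numeric sketch. The summary above gives no explicit formula, so the form used here is an assumption: yield is taken as the quality margin above the usability threshold multiplied by the time saved, and cost as interaction time multiplied by monetary expense. The `TaskProfile` fields, the `agentic_roi` function, and all example numbers are hypothetical and chosen only to illustrate the contrast drawn in Group 2.

```python
from dataclasses import dataclass


@dataclass
class TaskProfile:
    """Hypothetical measurements for one task; every field name is illustrative."""
    info_quality: float       # quality of the agent's output, on a 0..1 scale
    quality_threshold: float  # minimum quality at which the output is usable
    human_time: float         # minutes a person needs to do the task unaided
    agent_time: float         # minutes the agent needs to produce its answer
    interaction_time: float   # minutes the user spends prompting and checking
    expense: float            # monetary cost of the agent run


def agentic_roi(task: TaskProfile) -> float:
    """Assumed form: (quality margin * time saved) / (interaction time * expense)."""
    if task.interaction_time <= 0 or task.expense <= 0:
        raise ValueError("interaction_time and expense must be positive")
    # Below the quality threshold the output is unusable, so the yield is zero.
    if task.info_quality < task.quality_threshold:
        return 0.0
    # The result is negative when the agent is slower than the person.
    information_yield = (task.info_quality - task.quality_threshold) * (
        task.human_time - task.agent_time
    )
    usage_cost = task.interaction_time * task.expense
    return information_yield / usage_cost


# A long research task: the person would need hours, so the time saved is large.
research = TaskProfile(0.9, 0.7, human_time=240, agent_time=20,
                       interaction_time=10, expense=2.0)
# A quick shopping query: the person is already fast, so little time is saved.
shopping = TaskProfile(0.9, 0.7, human_time=5, agent_time=2,
                       interaction_time=3, expense=0.5)

print(f"research ROI: {agentic_roi(research):.2f}")
print(f"shopping ROI: {agentic_roi(shopping):.2f}")
```

Under these made-up numbers the research task scores well above the shopping task even though output quality is identical, which mirrors the point in Group 2: when the human baseline is already short, interaction overhead dominates the ratio.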
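Exploration Plan (探元计划), Hong Kong Stop | AI-Powered Historical Tracing: Decoding the Chinese Cultural Heritage of Kowloon Walled City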
腾讯研究院· 2025-05-23 07:47
Core Viewpoint - The "Exploration Plan 2024" aims to integrate culture and technology to promote the digital preservation of cultural heritage, with a focus on the "In Kowloon City, Witness Hong Kong" project, which highlights the historical significance of Kowloon City and its cultural narratives [3][10].

Group 1: Project Overview
- The "In Kowloon City, Witness Hong Kong" project is a collaboration between Hong Kong United Publishing Group, Electronic Publishing Co., and Huacui Starlight (Beijing) Intelligent Technology Co., utilizing advanced technologies like large model agents and 3D virtual spaces to recreate the cultural essence of Kowloon City [3][4].
- The project was selected from 81 cultural demand scenarios as one of the six key cultural co-creation scenes under the "Exploration Plan 2024" [4].

Group 2: Technological Innovations
- The project team is developing a multimodal knowledge intelligent agent that supports bilingual and trilingual interactions, enhancing user engagement with Kowloon City's historical culture [4].
- An AI interactive narrative game is being designed to create immersive learning experiences, encouraging public interest in Kowloon City's history [4].
- A 3D virtual space of Kowloon City will be constructed to allow users to experience different historical periods and cultural customs [4].

Group 3: Expert Insights and Discussions
- Experts from various sectors, including cultural institutions and universities, discussed the importance of technology and culture working together to enhance cultural dissemination and user engagement [11].
- The discussions emphasized the need for a shift from one-way cultural output to a collaborative and shared approach, utilizing gamification and user-generated content to stimulate cultural transmission [11].
- The project aims to create sustainable development models by integrating educational and cultural tourism resources, focusing on local schools and Kowloon City Park as pilot sites [11].

Group 4: Future Events and Exhibitions
- The results of the "In Kowloon City, Witness Hong Kong" project will be showcased at the Shenzhen Cultural Expo from May 22 to 26 and at the Hong Kong Book Fair from July 16 to 22 [13].