Workflow
多智能体协作
icon
Search documents
大模型这个坑,还有哪些可以发论文的点?
具身智能之心· 2025-07-05 02:25
Core Insights - The article emphasizes the rapid development of large language models (LLMs) and multimodal models, focusing on enhancing model efficiency, expanding knowledge capabilities, and improving reasoning performance as key research areas in artificial intelligence [1][2]. Course Objectives - The course aims to systematically explore cutting-edge optimization methods for large models, addressing challenges in parameter-efficient computation, dynamic knowledge expansion, and complex reasoning [1][2]. Enrollment Details - The course will accept 6 to 8 participants per session [3]. Target Audience - The course is designed for master's and doctoral students in the field of large models, individuals seeking to enhance their resumes for graduate studies abroad, and professionals in artificial intelligence looking to deepen their understanding of algorithm theory and research skills [4]. Course Outcomes - Participants will gain insights into classic and cutting-edge papers, coding implementations, and methods for writing and submitting research papers, thereby developing a clearer understanding of the subject matter [3][4]. Enrollment Requirements - Basic requirements include familiarity with deep learning/machine learning, basic knowledge of large model algorithms, proficiency in Python, and experience with PyTorch [5]. Course Structure - The course spans 12 weeks of online group research, followed by 2 weeks of paper guidance, and includes a maintenance period of 10 weeks for paper development [10]. Learning Requirements - Participants are expected to engage actively in discussions, complete assignments on time, and maintain academic integrity throughout the course [12]. Course Outline - The curriculum covers various topics, including model pruning, quantization, dynamic knowledge expansion, and advanced reasoning paradigms, with a focus on practical applications and coding [16][18].
下一代大模型高效计算:参数压缩、硬件适配与多模态推理、CoT等方向论文指导班来啦!
自动驾驶之心· 2025-07-04 07:13
Core Insights - The article discusses the rapid development of large language models (LLMs) and multimodal models, focusing on enhancing model efficiency, expanding knowledge capabilities, and improving reasoning performance as core issues in current AI research [1][2]. Course Overview - The course systematically explores cutting-edge optimization methods for large models, emphasizing three key areas: parameter-efficient computation, dynamic knowledge expansion, and complex reasoning [1]. - It addresses core challenges in model optimization, including lightweight methods such as pruning, sparsification, and quantization for parameter compression; dynamic knowledge injection techniques like retrieval-augmented generation (RAG) and parameter-efficient fine-tuning (PEFT) for knowledge expansion; and advanced reasoning paradigms such as chain-of-thought (CoT) and reinforcement learning optimization (GRPO) for reasoning enhancement [1]. Course Objectives - The course aims to help students systematically master key theoretical knowledge in specified directions and develop a clearer understanding of the content [5]. - It seeks to bridge the gap for students who lack direction and practical skills, enabling them to combine theoretical knowledge with coding practice and lay the groundwork for developing new models [5]. - The course also focuses on improving students' academic writing skills, providing guidance on manuscript preparation and submission [5]. Target Audience - The course is designed for master's and doctoral students in the field of large models, those seeking to enhance their resumes for graduate studies abroad, and professionals in the AI field looking to systematically improve their algorithmic theory and writing skills [6]. Admission Requirements - Basic requirements include a foundational understanding of deep learning/machine learning, familiarity with Python syntax, and experience with PyTorch [7]. Course Structure - The course consists of 12 weeks of online group research followed by 2 weeks of paper guidance, culminating in a 10-week paper maintenance period [11]. - Students will analyze classic and cutting-edge papers, understand key algorithms and principles, and develop their research ideas [11]. Weekly Breakdown - The course covers various topics, including model pruning, quantization, dynamic knowledge expansion, advanced reasoning techniques, and multimodal understanding [16][18]. - Each week includes specific themes and outputs, such as determining research ideas, optimizing model size and performance, and enhancing coding capabilities [16][18]. Additional Resources - The course provides access to datasets from public sources and baseline code tailored to specific applications [13][14]. - Essential papers and resources are recommended for foundational knowledge and advanced techniques in model optimization [15][17].
智能体不断进化,协作风险升高:五大安全问题扫描
Core Insights - The year 2025 is anticipated to be the "Year of Intelligent Agents," marking a paradigm shift in AI development from conversational generation to automated execution, positioning intelligent agents as key commercial anchors and the next generation of human-computer interaction [1] Group 1: Development and Risks of Intelligent Agents - As intelligent agents approach practical application, the associated risks become more tangible, with concerns about overreach, boundary violations, and potential loss of control [2] - A consensus exists within the industry that the controllability and trustworthiness of intelligent agents are critical metrics, with safety and compliance issues widely recognized as significant [2] - Risks associated with intelligent agents are categorized into internal and external security threats, with internal risks stemming from vulnerabilities in core components and external risks arising from interactions with external protocols and environments [2] Group 2: AI Hallucinations and Decision Errors - Over 70% of respondents in a safety awareness survey expressed concerns about AI hallucinations and erroneous decision-making, highlighting the prevalence of factual inaccuracies in AI-generated content [2] - In high-risk sectors like healthcare and finance, AI hallucinations could lead to severe consequences, exemplified by a hypothetical 3% misdiagnosis rate in a medical diagnostic agent potentially resulting in hundreds of thousands of misdiagnoses among millions of users [2] Group 3: Practical Applications and Challenges - Many enterprises have found that intelligent agents currently struggle to reliably address hallucination issues, leading some to abandon AI solutions due to inconsistent performance [3] - A notable case involved Air Canada's AI customer service, which provided incorrect refund information, resulting in the company being held legally accountable for the AI's erroneous decision [3] Group 4: Technical Frameworks and Regulations - Intelligent agents utilize various technical bridges to connect with the external world, employing two primary technical routes: an "intent framework" based on API cooperation and a "visual route" that bypasses interface authorization barriers [4] - Recent evaluations have highlighted chaotic usage of accessibility permissions by mobile intelligent agents, raising significant security concerns [5] Group 5: Regulatory Developments - A series of standards and initiatives have emerged in 2024 aimed at enhancing the management of accessibility permissions for intelligent agents, emphasizing user consent and risk disclosure [6] - The standards, while not mandatory, reflect a growing recognition of the need for safety in the deployment of intelligent agents [6] Group 6: Security Risks and Injection Attacks - Prompt injection attacks represent a core security risk for all intelligent agents, where attackers manipulate input prompts to induce the AI to produce desired outputs [7][8] - The emergence of indirect prompt injection risks, particularly with the rise of MCP (Multi-Channel Protocol) tools, poses new challenges as attackers can embed malicious instructions in external data sources [8][9] Group 7: MCP Services and Security Challenges - The MCP service Fetch has been identified as a significant entry point for indirect prompt injection attacks, raising concerns about the security of external content accessed by intelligent agents [10] - The lack of standardized security certifications for MCP services complicates the assessment of their safety, with many platforms lacking rigorous review processes [11] Group 8: Future of Intelligent Agent Collaboration - The development of multi-agent collaboration mechanisms is seen as crucial for the practical deployment of AI, with various companies exploring the potential for intelligent agents to work together on tasks [12][13] - The establishment of the IIFAA Agent Security Link aims to provide a secure framework for collaboration among intelligent agents, addressing issues of permissions, data, and privacy [14]
从 OpenAI 回清华,吴翼揭秘强化学习之路:随机选的、笑谈“当年不懂股权的我” | AGI 技术 50 人
AI科技大本营· 2025-06-19 01:41
Core Viewpoint - The article highlights the journey of Wu Yi, a prominent figure in the AI field, emphasizing his contributions to reinforcement learning and the development of open-source systems like AReaL, which aims to enhance reasoning capabilities in AI models [1][6][19]. Group 1: Wu Yi's Background and Career - Wu Yi, born in 1992, excelled in computer science competitions and was mentored by renowned professors at Tsinghua University and UC Berkeley, leading to significant internships at Microsoft and Facebook [2][4]. - After completing his PhD at UC Berkeley, Wu joined OpenAI, where he contributed to notable projects, including the "multi-agent hide-and-seek" experiment, which showcased complex behaviors emerging from simple rules [4][5]. - In 2020, Wu returned to China to teach at Tsinghua University, focusing on integrating cutting-edge technology into education and research while exploring industrial applications [5][6]. Group 2: AReaL and Reinforcement Learning - AReaL, developed in collaboration with Ant Group, is an open-source reinforcement learning framework designed to enhance reasoning models, providing efficient and reusable training solutions [6][19]. - The framework addresses the need for models to "think" before generating answers, a concept that has gained traction in recent AI developments [19][20]. - AReaL differs from traditional RLHF (Reinforcement Learning from Human Feedback) by focusing on improving the intelligence of models rather than merely making them compliant with human expectations [21][22]. Group 3: Challenges in AI Development - Wu Yi discusses the significant challenges in entrepreneurship within the AI sector, emphasizing the critical nature of timing and the risks associated with missing key opportunities [12][13]. - The evolution of model sizes presents new challenges for reinforcement learning, as modern models can have billions of parameters, necessitating adaptations in training and inference processes [23][24]. - The article also highlights the importance of data quality and system efficiency in training reinforcement learning models, asserting that these factors are more critical than algorithmic advancements [30][32]. Group 4: Future Directions in AI - Wu Yi expresses optimism about future breakthroughs in AI, particularly in areas like memory expression and personalization, which remain underexplored [40][41]. - The article suggests that while multi-agent systems are valuable, they may not be essential for all tasks, as advancements in single models could render multi-agent approaches unnecessary [42][43]. - The ongoing pursuit of scaling laws in AI development indicates that improvements in model performance will continue to be a focal point for researchers and developers [26][41].
百度心响上线iOS版,多智能体协作应用终于卷对地方了
量子位· 2025-05-27 03:53
Core Viewpoint - The article discusses the launch and features of Baidu's new multi-agent collaboration application, Xinxiang APP, highlighting its user-friendly design and comprehensive capabilities in various scenarios, including travel planning and deep research [1][2][3][4][5][6][14]. Group 1: Application Features - Xinxiang APP is available for both iOS and Android users for free, with no usage limits [3][4]. - The app supports a wide range of functionalities, allowing users to create travel itineraries and conduct in-depth research seamlessly [5][6][26]. - Users can generate detailed travel plans, including routes and recommendations, significantly reducing planning time [17][22]. Group 2: Deep Research Capabilities - The app can analyze and present complex technical information, such as the latest 3nm chip from Xiaomi, in a structured and visually appealing format [9][40]. - It employs a multi-step process to gather and analyze data, ensuring comprehensive insights into technology and market impacts [28][30]. Group 3: Professional Consultation Services - Xinxiang APP offers AI-driven health consultation services, mimicking the process of a human doctor by asking detailed questions to provide accurate diagnoses [45][46]. - The app is set to introduce features for interpreting medical reports, enhancing user understanding of health conditions [48]. Group 4: User Experience and Accessibility - The app is designed to be user-friendly, with low interaction barriers and no need for complex prompts, making it accessible to a broader audience [67][70]. - It features a "Inspiration Square" that provides examples and encourages user exploration, further enhancing the user experience [68][70]. Group 5: Market Trends and Future Outlook - The article notes a growing trend in AI applications focusing on multi-agent collaboration, emphasizing the need for reliable execution capabilities in AI products [71][72][74]. - The demand for AI solutions that simplify everyday tasks for ordinary users is increasing, positioning Xinxiang APP as a potential leader in this emerging market [75][79].
百度李彦宏:帮助开发者全面拥抱MCP
Guang Zhou Ri Bao· 2025-04-27 19:06
Core Insights - The annual Create2025 Baidu AI Developer Conference was held on April 25, where Baidu's founder Li Yanhong delivered a speech titled "The World of Models is the World of Applications" [2] - Baidu launched two new large models, Wenxin 4.5 Turbo and Wenxin X1 Turbo, which offer enhanced capabilities at significantly reduced costs, with prices dropping by up to 80% [3][4] - The conference emphasized the importance of applications over models, with Li Yanhong stating that without applications, models and chips hold no value [3] Model Launches - Baidu introduced Wenxin 4.5 Turbo and Wenxin X1 Turbo, featuring multi-modal capabilities, strong reasoning, and low costs [3] - Wenxin 4.5 Turbo's price decreased by 80% compared to its predecessor, while Wenxin X1 Turbo's performance improved alongside a 50% price reduction [4] Developer Support - Li Yanhong highlighted that high model costs have been a barrier for developers, and lowering these costs will encourage more development and deployment of AI applications across industries [4] - The Wenxin models' efficiency improvements are attributed to joint optimization techniques, with Wenxin 4.5 Turbo achieving a training throughput 5.4 times that of Wenxin 4.5 and an inference throughput 8 times higher [4] AI Applications - Multiple AI applications were launched, covering popular sectors such as AI digital humans, code intelligence, and multi-agent collaboration [5] - The high-persuasion digital human is a notable application, designed for e-commerce, gaming, and consumer interactions, showcasing superior performance compared to traditional digital humans [5] New Technologies - The launch of Cangzhou OS, the world's first content domain operating system, allows users to generate structured AI notes and mind maps while watching videos [5] - Baidu's multi-agent collaboration app, Xinxiang, aims to solve complex user problems by coordinating various AI agents [6][7] Talent Development - Baidu plans to cultivate 10 million AI talents over the next five years and has launched the "AI Open Plan" to support developers [8][9] - The third "Wenxin Cup" entrepreneurship competition was announced, with a maximum investment prize of 70 million yuan to support entrepreneurs [9] Industry Collaboration - Baidu's Create Conference featured various sub-forums and announced partnerships to enhance AI applications in cultural heritage [10] - The Baidu Intelligent Cloud's GENERATE ecosystem conference introduced a "Large Model Industry Partner Program" to share opportunities and resources with partners [11]
百度发布通用超级智能体「心响」,要做真正“长在用户手机和心里”的超级有用App
IPO早知道· 2025-04-26 02:16
心响App现已覆盖知识解析、旅游规划、学习办公等场景中200个任务类型。 本文为IPO早知道原创 作者| Stone Jin 微信公众号|ipozaozhidao 据 IPO早知道消息, 百度 在 4月25日举行的 Create2025百度AI开发者大会 正式发布 了 多智能体 协作 App 「 心响 」,其定位 一站式解决用户复杂问题的 "通用超级智能体" 。目前,心响已覆盖 知识解析、旅游规划、学习办公等场景中 200个任务类型。 百度创始人李彦宏在本次开场演讲中 强调, "未来真正统治这个世界的是应用,应用才是王者。" 更 进一步来讲, 多智能体协作是下一个高价值的 AI应用方向。未来的AI应用将从回答问题走向任务交 付,而任何一个复杂任务的交付,都需要多智能体的协作来解析需求、分拆任务、调度资源、规划执 行,最终交付结果。 鲜为人知的是, 这样一款能解决 从信息检索到任务完成 全部流程的 通用超级智能体 , 诞生 于百 度 一支仅有几十人的自发的内部创业团队, 且 这群人大部分都是 95后 。 短短 30天 内,这支小 团队 从零开始 打造出了这款产品 。 当然,鉴于这款产品前期打磨的时间较短,故现 ...
智能体“组团”搞旅行 飞猪上线AI“问一问 ”
"问一问"提供的旅行定制方案可用度有多高,《中国经营报》记者第一时间进行了体验。 9个Agent一起"干活" 与此前各平台上线的旅行AI产品相比,飞猪"问一问"最大的不同在于"多智能体协作"。 包含行程助手、路线定制师、智慧交通顾问、酒店顾问、攻略达人、本地人导游、预算管理师等在内,一共多达9个人设同时在线,帮助用户分头解决机 酒、行程、预算等问题。 "'五一'云南昆明七天深度游。"记者用语音给"问一问"提出要求,系统自动识别指令出发地是"上海", 调用"路线定制师""智慧交通顾问""酒店顾问",1分钟 内给出了一个预算约为3831元"经济酒店+舒适航班"的深度游方案。 方案基本覆盖了昆明市及其周边主要景点,以及交通便利、价格适中、口碑度较好的住宿推荐。优先推荐下午出发直飞航班,推荐理由是方便安排后续行 程,如果考虑到价格因素,也有深夜出发的直飞推荐。 赶在"五一"长假之前,飞猪上线了AI产品"问一问",首批向飞猪F5及以上会员开放,邀请码一度在小红书、闲鱼上形成"二级市场", 一度被炒到40元。 按照官方介绍,"问一问"是一个由多个Agent(智能体)驱动的AI产品,能像专业旅游服务从业者一样思考问题、执行 ...
发布多智能体协作AI Agent,特斯联艾渝:通用智能体引领多智能体协作新变革
IPO早知道· 2024-12-03 03:57
从生成式AI到多智能体协同,通用智能体引领交互变革。 本文为IPO早知道原创 作者|Stone Jin 微信公众号|ipozaozhidao 据IPO早知道消息,有着科技界奥林匹克之称的Web Summit日前在葡萄牙里斯本落下帷幕。本届峰会吸引来自全球超10万余名科技界精英汇聚于此, 微 软 总 裁 布 拉 德 · 史 密 斯 (Brad Smith) 、 高 通 CEO 兼 总 裁 安 蒙 ( Cristiano Amon ) 、 全 球 半 导 体 公 司 Lattice CEO 莎 拉 · 富 兰 克 林 ( Sarah Franklin)、阿里巴巴国际站总裁张阔等科技界代表出席峰会并发表对于科技发展的洞见。 作为中国AI产业界代表,特斯联创始人兼CEO艾渝受邀分享了超级人工智能即服务(IaaS)驱动下的,技术新趋势与产业新路径,并联合国际轻奢品牌 Buttons发布了由特斯联AI技术驱动的多模态多智能体协作AI Agent---"Hali"。 艾渝将视角聚焦于AI与产品美学碰撞下诞生的全新趋势。人机交互的历史正在从数字时代向智能时代革命性跃迁,交互方式也已实现从CLI(Command Line In ...