Knowledge Transfer
FDA Dual Anchors: A New Perspective on Model Knowledge Transfer, from Parameter Space to Input Space
机器之心· 2025-11-14 01:33
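The authors of this work are Kexuan Shi (施柯煊), a PhD student at The Chinese University of Hong Kong; Yandong Wen (温研东), an assistant professor at Westlake University; and Weiyang Liu (刘威杨), an assistant professor in the Department of Computer Science at The Chinese University of Hong Kong.
Task-specific fine-tuning of general-purpose foundation models has become the dominant paradigm. It yields high-performing expert models for individual tasks, but it also raises a new challenge: how can the capabilities of these separately fine-tuned experts be integrated into a single model, without access to the original training data, so that the tasks work together with minimal performance loss?
To address this, the researchers propose FDA (Model Merging with Functional Dual Anchors), a new model-merging framework. Unlike conventional methods that operate in parameter space, FDA projects the knowledge held in each expert's parameters onto synthetic anchors in the input-representation space, and integrates that knowledge more efficiently through functional duality.
The key idea of FDA is to represent the task knowledge stored in the parameters with a set of dual synthetic input points (Dual Anchors) in input space, and then to update the model with the joint gradients induced by those synthetic inputs, thereby integrating knowledge from multiple tasks.
Specifically, in parameter space, task knowledge appears as the difference vector between a model's final parameters and its initial parameters (the task vector). For each expert model, FDA constructs a set of Dual Ancho ...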
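The two ideas named above, the task vector and updating a model with gradients induced by a set of input points, can be illustrated on a toy linear model. This is only a sketch under strong assumptions and is not the authors' algorithm: the summary is cut off before it explains how Dual Anchors are constructed, so the anchors here are simply random inputs, and the loss (squared difference between the model's and each expert's outputs on its anchors) is an illustrative choice. All names (`task_vector`, `anchor_grad`, `experts`, `anchor_sets`) are hypothetical.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def task_vector(theta_ft, theta_0):
    # Task vector: fine-tuned parameters minus initial parameters.
    return [f - p for f, p in zip(theta_ft, theta_0)]

def anchor_loss(theta, theta_ft, anchors):
    # ASSUMED loss: 0.5 * sum of squared output gaps for the linear model y = x . theta
    return 0.5 * sum((dot(x, theta) - dot(x, theta_ft)) ** 2 for x in anchors)

def anchor_grad(theta, theta_ft, anchors):
    # Gradient of anchor_loss with respect to theta.
    grad = [0.0] * len(theta)
    for x in anchors:
        residual = dot(x, theta) - dot(x, theta_ft)
        for i, xi in enumerate(x):
            grad[i] += residual * xi
    return grad

random.seed(0)
theta_0 = [0.0, 0.0, 0.0]                        # hypothetical pretrained weights
experts = [[1.0, -2.0, 0.5], [0.5, 1.0, -1.0]]   # hypothetical fine-tuned experts
# ASSUMPTION: random anchors stand in for the synthesized Dual Anchors.
anchor_sets = [[[random.gauss(0.0, 1.0) for _ in theta_0] for _ in range(8)]
               for _ in experts]

taus = [task_vector(e, theta_0) for e in experts]

# At theta_0, the negative anchor-induced gradient points along each task vector
# (positive inner product), so the inputs carry the direction of the task knowledge.
alignments = [-dot(anchor_grad(theta_0, e, a), t)
              for e, a, t in zip(experts, anchor_sets, taus)]

def joint_loss(theta):
    return sum(anchor_loss(theta, e, a) for e, a in zip(experts, anchor_sets))

# Merge by descending the joint gradient induced by all anchor sets.
theta = list(theta_0)
lr = 0.01
for _ in range(300):
    grads = [anchor_grad(theta, e, a) for e, a in zip(experts, anchor_sets)]
    joint = [sum(g[i] for g in grads) for i in range(len(theta))]
    theta = [t - lr * g for t, g in zip(theta, joint)]

print("task vectors:", taus)
print("alignment with task vectors:", alignments)
print("joint loss before/after:", joint_loss(theta_0), joint_loss(theta))
```

In this toy setting the merged parameters are driven by inputs rather than by direct arithmetic on the task vectors, which is the contrast with parameter-space merging that the summary draws.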
World Artificial Intelligence Conference: 25 Lessons from AI Godfather Hinton
36Kr · 2025-07-29 23:58
Core Insights
- Geoffrey Hinton, a prominent figure in AI, discussed the evolution of AI from symbolic reasoning to neural networks at WAIC 2025, emphasizing the importance of understanding language through large language models (LLMs) [1][2][10]
Group 1: Evolution of AI Understanding
- For over 60 years, there have been two paradigms in AI: the logical heuristic paradigm focusing on symbolic reasoning and the biological paradigm emphasizing neural network learning [1]
- Hinton's early model in 1985 aimed to merge these theories by predicting the next word based on features, which laid the groundwork for modern LLMs [2]
- LLMs have evolved from Hinton's initial models into more complex structures capable of processing vast amounts of input and creating intricate relationships [2][3]
Group 2: Mechanism of Language Understanding
- LLMs and human language understanding share similarities: both convert language into features and integrate them across neural network layers for semantic comprehension [3]
- Hinton uses the analogy of LEGO blocks to describe how words can be combined to form complex semantic structures, highlighting the flexible nature of language [3][4]
- Understanding language is compared to deconstructing a protein molecule rather than creating a clear logical expression [3]
Group 3: Knowledge Transfer and Collaboration
- Knowledge transfer in humans is inefficient, often relying on explanations, while digital intelligence can share vast amounts of information directly [5][6]
- Current technology allows for efficient knowledge migration and collaborative learning across different hardware setups, enhancing the capabilities of models like GPT-4 [6][7]
- If independent intelligent agents can share weights and gradients, they can effectively exchange learned knowledge, leading to significant advancements [6][7]
Group 4: AI's Future and Global Cooperation
- Hinton warns of the potential dangers of AI surpassing human intelligence, emphasizing the need for control and ethical considerations in AI development [7][10]
- The necessity for global cooperation in AI governance is highlighted, with a call for an international organization to ensure AI develops positively [8][9]
- Hinton believes that the challenge of ensuring AI remains beneficial to humanity is one of the most critical issues of the era, requiring collective efforts [9][10]
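The point about agents sharing weights and gradients can be made concrete with a very small sketch. It assumes two identical copies of a toy one-parameter model `y = w * x`, each seeing a different data shard and averaging gradients every step; the model, data, and names (`grad_mse`, `shard_a`, `shard_b`) are hypothetical and are not taken from Hinton's talk.

```python
def grad_mse(w, data):
    # Gradient of mean squared error for the 1-D model y = w * x.
    return sum(2 * (w * x - y) * x for x, y in data) / len(data)

# Two copies of the same model, each with its own data shard (both follow y = 2x).
shard_a = [(1.0, 2.0), (2.0, 4.0)]
shard_b = [(3.0, 6.0), (4.0, 8.0)]
w_a = w_b = 0.0
lr = 0.02

for _ in range(200):
    # Each copy computes a gradient on its own shard, then the copies share them.
    g = (grad_mse(w_a, shard_a) + grad_mse(w_b, shard_b)) / 2
    # Applying the same averaged gradient keeps the two copies identical.
    w_a -= lr * g
    w_b -= lr * g

print(w_a, w_b)
```

Each copy ends up with what the other learned from data it never saw, which is the kind of direct exchange that explanation between humans cannot provide.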
World Artificial Intelligence Conference: 25 Lessons from AI Godfather Hinton
混沌学园· 2025-07-29 12:04
Core Viewpoint
- The article discusses Geoffrey Hinton's insights on the relationship between AI and human intelligence, emphasizing the evolution of AI from symbolic reasoning to large language models (LLMs) and the implications of AI surpassing human intelligence [1][10].
Group 1: Evolution of AI Understanding
- For over 60 years, there have been two distinct paradigms in AI: the logical inference paradigm, which views intelligence as symbolic reasoning, and the biological paradigm, which sees intelligence as rooted in understanding and learning through neural networks [1].
- In 1985, Hinton created a small model to explore how humans understand vocabulary by linking features of words to predict the next word without storing entire sentences [2].
- The development of LLMs is seen as a continuation of Hinton's early work, processing more input words and utilizing complex neural structures to build richer interactions [3].
Group 2: Mechanism of Language Understanding
- LLMs and human language understanding mechanisms are highly similar, transforming language into features and integrating these features across neural network layers for semantic understanding [4].
- Each word in language is likened to a multi-dimensional Lego block, which can flexibly combine to form complex semantic structures, with the shape of words adapting based on context [6].
- Understanding a sentence is compared to deconstructing a protein molecule rather than converting it into a clear, unambiguous logical expression [5].
Group 3: Knowledge Transfer in AI
- The human brain runs on only about 30 watts, but it cannot easily transfer knowledge to another person and must rely on explanation instead [11].
- In contrast, digital intelligence allows for efficient knowledge transfer, directly copying parameters and structures without intermediary language, sharing trillions of bits of information during synchronization [13][14].
- Current technology enables the same model to be deployed across different hardware, facilitating efficient knowledge migration and collaborative learning [15].
Group 4: The Dangers of Advanced AI
- There is a concern that AI could surpass human intelligence, leading to scenarios where AI becomes an active system with its own goals, potentially manipulating humans [18][19].
- Hinton warns that developing AI is akin to raising a tiger; once it grows powerful, losing control could be fatal [20].
- Despite the risks, AI holds significant value in various fields, and eliminating it is not feasible; instead, a method must be found to ensure AI does not threaten humanity [21].
Group 5: Global Cooperation for AI Safety
- No single country desires AI to dominate the world, and if one country discovers a method to prevent AI from going rogue, others will likely follow suit [22][23].
- Hinton proposes the establishment of an international AI safety organization to research technology and create standards to ensure AI develops positively [24].
- The long-term challenge is to ensure that AI remains a supportive tool for humanity rather than a ruler, which is a critical issue for global collaboration [25].