FedVLR Framework
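AAAI 2026 Oral | University of Technology Sydney and Hong Kong Polytechnic University break away from "one-size-fits-all": how can federated recommendation achieve image-text fusion personalized to every user?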
机器之心· 2025-11-25 04:09
Core Insights
- The article introduces FedVLR, a new framework that addresses the challenges of multimodal integration in federated learning environments while preserving data privacy [2][3][19].

Multimodal Integration Challenges
- Current recommendation systems use multimodal information, such as images and text, but privacy concerns make this difficult in federated learning [2][5].
- Existing federated recommendation methods either give up multimodal processing for the sake of privacy or apply a one-size-fits-all fusion strategy that ignores individual user preferences [2][5].

FedVLR Framework
- FedVLR redefines the decision flow for multimodal integration: heavy computation is offloaded to the server, while a lightweight routing mechanism on each client lets the user control how the content is viewed [3][19].
- It employs a two-layer fusion mechanism that decouples feature extraction from preference integration [8][19].

Server-Side Processing
- The first layer is server-side "multi-view pre-fusion": the server processes item data with powerful pre-trained models to create a set of candidate fusion views, without burdening client devices [9][10].
- The server thus prepares several "semi-finished" views that already carry high-quality content understanding [10].

Client-Side Personalization
- The second layer is client-side "personalized refinement": a lightweight local mixture-of-experts (MoE) routing mechanism dynamically computes personalized weights from the user's interaction history [11][12].
- This process runs entirely on the client, so user preference data never leaves the device [12].

Performance and Versatility
- FedVLR is designed as a pluggable layer that integrates with existing federated recommendation frameworks such as FedAvg and FedNCF without increasing communication overhead [16].
- The framework is model-agnostic and significantly improves a variety of baseline models [26].

Experimental Results
- The framework was tested on public datasets from the e-commerce and multimedia domains, showing substantial and stable improvements in core recommendation metrics such as NDCG and HR [26].
- Notably, FedVLR performs especially well in sparse-data scenarios, using limited local data effectively to understand item content [26].

Conclusion
- Beyond improving recommendation systems, FedVLR offers a useful paradigm for deploying federated foundation models: it shows one way to use large cloud-side models while keeping user data private [19].
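
The two-layer fusion described under "Server-Side Processing" and "Client-Side Personalization" can be illustrated with a minimal Python sketch. This is not the authors' implementation. The four fusion operators (image-only, text-only, average, element-wise product), the linear gate, and every name in it (`server_prefuse`, `client_route`, `personalize`, `user_vec`, `gate`) are assumptions made for illustration; the article only states that the server prepares a set of candidate views and that the client computes personalized weights over them.

```python
import math


def server_prefuse(image_emb, text_emb):
    """Server side: build candidate fusion views for one item.

    The four operators below are illustrative assumptions, not the
    views used in the paper.
    """
    avg = [(i + t) / 2 for i, t in zip(image_emb, text_emb)]
    prod = [i * t for i, t in zip(image_emb, text_emb)]
    return [list(image_emb), list(text_emb), avg, prod]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def client_route(user_vec, gate):
    """Client side: a linear gate with one weight row per candidate view.

    `user_vec` stands in for a local summary of the interaction history
    and stays on the device (hypothetical representation).
    """
    scores = [sum(w * u for w, u in zip(row, user_vec)) for row in gate]
    return softmax(scores)


def personalize(views, weights):
    """Client side: weighted sum of the server-provided views."""
    dim = len(views[0])
    return [sum(w * v[d] for w, v in zip(weights, views)) for d in range(dim)]


# Toy example: a 2-dimensional item with orthogonal image and text embeddings.
views = server_prefuse([1.0, 0.0], [0.0, 1.0])
user_vec = [1.0, 0.0]  # hypothetical: this user leans toward visual content
gate = [[2.0, 0.0], [0.0, 2.0], [0.0, 0.0], [0.0, 0.0]]  # hypothetical local gate
weights = client_route(user_vec, gate)
item_emb = personalize(views, weights)
```

In this toy setup the image-only view receives the largest routing weight, because the gate row for that view aligns with `user_vec`. Only `views` would be sent from the server; `user_vec`, `gate`, and `weights` are computed and kept locally, which mirrors the privacy split the article describes.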