RankMixer Model

Douyin's New Large Recommendation Model RankMixer: 70x the Parameters, No Increase in Inference Cost
QbitAI (量子位) · 2025-08-01 09:05
Core Viewpoint - The article discusses RankMixer, a recommendation model architecture developed by ByteDance that improves the efficiency and effectiveness of video recommendation on platforms such as Douyin while keeping inference cost low [2][40].

Group 1: RankMixer Model Overview
- RankMixer is a new recommendation model architecture that scales the parameter count from 16 million (16M) to 1 billion (1B), improving model performance without increasing inference latency [4][26].
- The design is aligned with GPU hardware characteristics: computation is expressed as large matrix multiplications, which run efficiently on GPUs and avoid memory bottlenecks [9][41].
- RankMixer introduces TokenMixing and Per-Token SparseMoE, which improve the model's ability to capture diverse feature interactions and raise parameter efficiency [12][24].

Group 2: Performance Metrics and Improvements
- In the Douyin recommendation scenario, RankMixer-1B has delivered a cumulative gain of more than 0.3% in user active days and more than 1% in average daily usage time, indicating higher user engagement [4][35].
- The parameter count grew 70-fold while inference cost stayed stable, which was achieved through a combination of optimization techniques [26][30].
- On offline metrics, RankMixer-1B outperforms traditional DNN models, with AUC up by more than 0.9% and UAUC up by more than 1% [32].

Group 3: Technical Innovations
- RankMixer uses Automatic Feature Tokenization to map input features into a uniform token sequence, which enables parallel processing and maximizes hardware utilization [15][16].
- The TokenMixing module exchanges information between tokens efficiently, so each token can draw on global information when producing recommendations [19][20].
- The Per-Token SparseMoE architecture models different semantic subspaces separately, substantially increasing parameter capacity while reducing computational overhead [21][24].

Group 4: Future Implications
- The successful deployment of RankMixer across several ByteDance applications demonstrates its potential as a universal ranking model architecture [39].
- The RankMixer work supports the case for co-designing algorithms with infrastructure to optimize machine learning performance and resource utilization [43][44].
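
To make the TokenMixing idea in Group 3 more concrete, here is a minimal sketch of one way tokens can exchange information without any learned parameters. The summary above does not spell out the mechanism, so this assumes a split-and-regroup scheme: each token is cut into equal slices ("heads"), and slices with the same index are concatenated across all tokens. The function name `token_mixing` and the toy values are illustrative assumptions, not ByteDance's implementation.

```python
def token_mixing(tokens, num_heads):
    """Regroup T tokens of width D into num_heads tokens of width T * D / num_heads.

    Assumed scheme: slice h of every input token is concatenated into
    output token h, so each output token sees a part of every input token.
    """
    width = len(tokens[0])
    if any(len(t) != width for t in tokens):
        raise ValueError("all tokens must have the same width")
    if width % num_heads != 0:
        raise ValueError("token width must be divisible by num_heads")

    head = width // num_heads  # size of one slice
    mixed = []
    for h in range(num_heads):
        new_token = []
        for t in tokens:
            # Take the h-th slice of each input token, in token order.
            new_token.extend(t[h * head:(h + 1) * head])
        mixed.append(new_token)
    return mixed


# Two toy feature tokens of width 4 (values are placeholders).
tokens = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
]
mixed = token_mixing(tokens, num_heads=2)
print(mixed)  # [[1, 2, 5, 6], [3, 4, 7, 8]]
```

After the regrouping, every output token contains a slice of every input token, so a per-token network applied afterwards would already be working with global information. In a batched GPU setting the same shuffle would typically be a reshape and transpose rather than a Python loop.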