Recommendation Systems

ICML spotlight | Synthetic data that "evolves": generating high-quality vertical-domain data without uploading private data
机器之心· 2025-07-11 09:22
Core Viewpoint - The article discusses data scarcity in the era of large models and introduces PCEvolve, a framework that generates synthetic datasets while preserving privacy and addressing the specific needs of vertical domains such as healthcare and industrial manufacturing [1][2][10].

Group 1: Data Scarcity and Challenges
- The rapid development of large models has exacerbated data scarcity: predictions indicate that by 2028 the generation of public data will no longer keep pace with the rate at which model training consumes it [1].
- In specialized fields such as healthcare and industrial manufacturing, data is already limited, which makes the scarcity problem even more severe [1].

Group 2: PCEvolve Framework
- PCEvolve is a synthetic data evolution framework that needs only a small number of labeled samples to generate an entire dataset while protecting privacy [2].
- Its evolution process is likened to DeepMind's FunSearch and AlphaEvolve, and it focuses on generating high-quality training data through existing large model APIs [2].

Group 3: Limitations of Existing Large Models
- Existing large model APIs cannot directly synthesize domain-specific data because they fail to account for characteristics unique to vertical domains, such as lighting conditions, sampling device models, and privacy information [4][7].
- Privacy and intellectual property concerns prevent local data from being uploaded, which complicates prompt engineering and reduces the quality of synthetic data [9][11].

Group 4: PCEvolve's Mechanism
- PCEvolve employs a new privacy protection method based on the Exponential Mechanism, designed for the limited-sample setting of vertical domains [11].
- The framework runs an iterative evolution process: it generates a large number of candidate synthetic samples, then eliminates lower-quality ones using privacy-protected scoring [11][19].

Group 5: Experimental Results
- PCEvolve was evaluated in two ways: the effect of the synthetic data on downstream model training, and the quality of the synthetic data itself [21].
- On datasets such as COVIDx and Came17, PCEvolve demonstrated significant improvements in model accuracy, with final accuracy reaching 64.04% on COVIDx and 69.10% on Came17 [22][23].
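The privacy-protected selection step described under "PCEvolve's Mechanism" builds on the Exponential Mechanism. As a rough illustration only, the textbook form of that mechanism samples a candidate with probability proportional to exp(epsilon * score / (2 * sensitivity)). The sketch below shows that standard form; the candidate names, the scores, and the `sensitivity` value are hypothetical, and the summary does not specify how PCEvolve adapts the mechanism to few-sample settings.

```python
import math
import random


def exponential_mechanism_probs(scores, epsilon, sensitivity=1.0):
    """Selection probabilities proportional to exp(epsilon * score / (2 * sensitivity))."""
    if not scores:
        raise ValueError("scores must not be empty")
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    scale = epsilon / (2.0 * sensitivity)
    top = max(scores)  # shifting by the max avoids overflow without changing the result
    weights = [math.exp(scale * (s - top)) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def select_candidate(candidates, scores, epsilon, sensitivity=1.0, rng=None):
    """Draw one candidate at random according to the exponential mechanism."""
    rng = rng or random.Random()
    probs = exponential_mechanism_probs(scores, epsilon, sensitivity)
    return rng.choices(candidates, weights=probs, k=1)[0]


# Hypothetical synthetic candidates with hypothetical quality scores.
candidates = ["synthetic_a", "synthetic_b", "synthetic_c"]
scores = [0.9, 0.5, 0.1]

probs = exponential_mechanism_probs(scores, epsilon=1.0)
chosen = select_candidate(candidates, scores, epsilon=1.0, rng=random.Random(0))
```

Higher-scoring candidates are more likely to survive, but every candidate keeps a nonzero probability, which is what limits how much a single private sample can influence the outcome. A smaller `epsilon` flattens the distribution toward uniform (stronger privacy, noisier selection).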
I really want to talk about Kuaishou's latest change
Hu Xiu· 2025-06-25 00:48
Core Viewpoint - Kuaishou has fully launched OneRec, its AI model-driven recommendation system, which the article describes as the industry's first industrial-grade solution of this kind and as setting a new standard globally [1][15].

Group 1: Technological Advancements
- Kuaishou's technology has reached a top-tier level, particularly in video generation models [2].
- The company's underlying technology has advanced well beyond what its image as merely a short video platform suggests [3].

Group 2: Recommendation System Overview
- Recommendation systems are a major technological innovation of the mobile internet era, used by popular platforms such as Kuaishou, Douyin, and Pinduoduo [4].
- Traditional recommendation systems typically rely on user-based collaborative filtering and content-based collaborative filtering [4][6].

Group 3: Challenges in Traditional Systems
- Traditional multi-stage recommendation systems suffer from low overall GPU utilization and from inefficiencies caused by models that operate independently of one another [10][11].
- The complexity of user interests, together with the conflicting goals of raising click-through rates while maintaining content diversity, reduces recommendation accuracy [9][10].

Group 4: OneRec's Innovations
- OneRec shifts from multi-stage filtering to an end-to-end model that directly generates a list of recommended videos based on user interests [16].
- The system employs a multi-modal semantic tokenizer to understand video content beyond surface-level tags [21][24].

Group 5: User Modeling and Interest Tracking
- OneRec integrates user behavior over time into a comprehensive "interest sequence," allowing recommendations to adapt as user preferences change [28][30].
- The model uses deep neural networks to learn complex interest changes automatically from large datasets, improving recommendation accuracy [30].

Group 6: Recommendation Generation
- The system uses an encoder-decoder structure: the encoder compresses user interest trajectories into vectors, and the decoder generates a sequence of recommended content [32][33].
- A Mixture of Experts (MoE) architecture increases model capacity and efficiency, allowing for personalized recommendations while maintaining content diversity [34][36].

Group 7: Reinforcement Learning Integration
- OneRec incorporates a reinforcement learning reward mechanism to align recommendation outcomes with user preferences [38][44].
- Training includes various reward signals to ensure a balanced distribution of content types and to adapt to real-world business complexities [41][42].

Group 8: Performance Metrics
- During the testing phase, OneRec performed comparably to existing complex systems, with user engagement metrics such as watch time and user lifecycle showing positive growth [46][47].
- In local life scenarios, OneRec achieved a 21% increase in GMV and significant growth in order volume and new customer acquisition [48].

Group 9: Future Considerations
- Despite its advancements, OneRec still faces challenges in inference speed, resource consumption, and further optimization of the reward mechanism [49].
- The introduction of OneRec marks a new phase in recommendation systems, aligning them with the latest advancements in AI and machine learning [49][50].
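The Mixture of Experts idea mentioned under "Recommendation Generation" can be sketched generically: a gate scores every expert, only the top-k experts run, and their outputs are mixed with renormalized gate weights. This is a minimal illustration of sparse top-k routing in general, not OneRec's actual architecture, which the summary does not detail; the scalar input, the toy expert functions, and the gate logits are all hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(x, gate_logits, experts, k=2):
    """Run only the k experts with the largest gate logits and mix their outputs."""
    if len(gate_logits) != len(experts):
        raise ValueError("need exactly one gate logit per expert")
    if not 1 <= k <= len(experts):
        raise ValueError("k must be between 1 and the number of experts")
    # Indices of the k highest-scoring experts (ties keep their original order).
    top_idx = sorted(range(len(experts)), key=lambda i: gate_logits[i], reverse=True)[:k]
    # Renormalize the gate over the selected experts only.
    weights = softmax([gate_logits[i] for i in top_idx])
    output = sum(w * experts[i](x) for w, i in zip(weights, top_idx))
    return output, top_idx


# Hypothetical toy experts operating on a scalar input.
experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: -x, lambda x: x * x]
gate_logits = [0.1, 2.0, -1.0, 2.0]

output, used = moe_forward(3.0, gate_logits, experts, k=2)
```

Because only k of the experts are evaluated per input, total parameter count can grow without a proportional increase in compute per request, which is the capacity-versus-efficiency trade the summary alludes to.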
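Breaking the "information silos" in recommendation systems: USTC and Huawei propose the first generative multi-stage unified framework, outperforming SOTA across the board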
机器之心· 2025-06-20 10:37
Core Viewpoint - The article discusses UniGRF, a framework that unifies the retrieval and ranking tasks of a recommendation system in a single generative model, addressing issues inherent in traditional multi-stage recommendation paradigms [1][3][16].

Group 1: Pain Points of Traditional Recommendation Paradigms
- Traditional recommendation systems typically use a multi-stage approach: a recall phase quickly filters a large item pool, and a ranking phase then scores and orders the candidates. This is efficient, but because each phase is trained independently it often leads to information loss and performance bottlenecks [3][4].
- Separating the tasks can prematurely filter out potential interests that lie outside the user's information bubble, causing cumulative bias and making collaboration between stages difficult [3][4].

Group 2: Advantages of UniGRF
- UniGRF integrates retrieval and ranking into a single generative model, allowing full information sharing and reducing information loss between tasks [7].
- The framework is model-agnostic and can integrate with various mainstream autoregressive generative model architectures [8].
- By maintaining one model instead of two independent ones, UniGRF potentially improves efficiency in both training and inference [9].

Group 3: Key Mechanisms of UniGRF
- A Ranking-Driven Enhancer promotes collaboration between the recall and ranking phases by using the high-precision ranking outputs to guide recall [10][11].
- A Gradient-Guided Adaptive Weighter dynamically adjusts the loss weights of the two tasks according to how quickly each task is learning, keeping their optimization synchronized and improving overall performance [12].

Group 4: Experimental Results
- Extensive experiments on three large public recommendation datasets (MovieLens-1M, MovieLens-20M, Amazon-Books) showed that UniGRF significantly outperforms state-of-the-art (SOTA) models [14][18].
- The improvements are particularly notable in ranking performance, which matters because ranking directly determines the quality of the recommendations shown to users [18].
- Initial tests indicate that UniGRF follows the scaling law, suggesting potential performance gains as model parameters increase [18].

Group 5: Future Directions
- UniGRF offers a novel and efficient approach to generative recommendation that overcomes issues of the traditional multi-stage paradigm. Future research aims to extend the framework to more recommendation stages and to validate its large-scale applicability in real-world industrial scenarios [16][17].
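The information loss attributed to the traditional recall-then-rank pipeline above can be made concrete with a toy example. In the sketch below, a cheap recall stage keeps the top candidates by dot product, and a more precise ranker then orders only those candidates. All vectors, scores, and function names are hypothetical and chosen so that the item the ranker would have preferred is dropped before it is ever scored; this illustrates the pain point, not UniGRF itself.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def recall(user_vec, item_vecs, n_candidates):
    """Cheap first stage: keep the n_candidates items with the highest dot product."""
    ranked_ids = sorted(item_vecs, key=lambda i: dot(user_vec, item_vecs[i]), reverse=True)
    return ranked_ids[:n_candidates]


def rank(candidates, precise_score, top_k):
    """Expensive second stage: order only the recalled candidates."""
    return sorted(candidates, key=precise_score, reverse=True)[:top_k]


# Hypothetical item embeddings used by the recall stage.
item_vecs = {"a": [1.0, 0.0], "b": [0.8, 0.1], "c": [0.2, 0.9]}
user_vec = [1.0, 0.2]
# Hypothetical scores from a richer ranking model trained separately.
precise_scores = {"a": 0.3, "b": 0.6, "c": 0.9}

recalled = recall(user_vec, item_vecs, n_candidates=2)
final = rank(recalled, precise_scores.get, top_k=2)
best_overall = max(precise_scores, key=precise_scores.get)
```

Here `best_overall` is item `c`, yet `c` never appears in `final` because the independently built recall stage filtered it out. A unified model that shares information across both tasks is one way to reduce this kind of mismatch.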
Have large recommendation models arrived? A reading of the OneRec paper: how end-to-end training wins on both effectiveness and cost
机器之心· 2025-06-19 09:30
Core Viewpoint - The article discusses how large language models (LLMs) are transforming recommendation systems, highlighting Kuaishou's "OneRec" system, which aims to improve both the efficiency and the effectiveness of recommendation [2][35].

Group 1: Challenges in Traditional Recommendation Systems
- Traditional recommendation systems face significant challenges, including low computational efficiency, conflicting optimization objectives, and an inability to benefit from the latest AI advancements [5].
- For instance, Kuaishou's SIM model shows a Model FLOPs Utilization (MFU) of only 4.6%/11.2%, far below the 40%-50% that LLMs achieve [5][28].

Group 2: Introduction of OneRec
- OneRec is an end-to-end generative recommendation system that uses an Encoder-Decoder architecture to model user behavior and improve recommendation accuracy [6][11].
- The system delivers a tenfold increase in effective computational capacity and raises MFU to 23.7%/28.8%, cutting operational costs to just 10.6% of those of traditional methods [8][31].

Group 3: Performance Improvements
- OneRec has substantially improved user engagement metrics, with a 0.54%/1.24% increase in app usage duration and a 0.05%/0.08% increase in the 7-day user lifecycle (LT7) [33].
- In local life service scenarios, OneRec has driven a 21.01% increase in GMV and an 18.58% rise in the number of purchasing users [34].

Group 4: Technical Innovations
- The system uses multi-modal fusion, integrating data such as video titles, tags, and user behavior to improve recommendation quality [14].
- OneRec's architecture allows significant computational optimizations, including a 92% reduction in the number of key operators [27][28].

Group 5: Future Directions
- Kuaishou's technical team identifies areas for further improvement, including stronger inference capabilities, a more integrated multi-modal architecture, and a reward system that better aligns with user preferences [38].
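The MFU figures quoted above follow the usual definition of Model FLOPs Utilization: the FLOPs a model actually performs per second divided by the hardware's peak FLOPs per second. The sketch below computes that ratio with hypothetical workload numbers, then compares the MFU values reported in the summary. It assumes that the first and second number of each reported pair refer to the same setting in both pairs, which the summary does not state explicitly.

```python
def model_flops_utilization(flops_per_step, steps_per_second, peak_flops_per_second):
    """MFU = achieved model FLOPs per second / peak hardware FLOPs per second."""
    if peak_flops_per_second <= 0:
        raise ValueError("peak_flops_per_second must be positive")
    return flops_per_step * steps_per_second / peak_flops_per_second


# Hypothetical workload: 2e12 FLOPs per step, 30 steps/s, on a 3e14 FLOP/s accelerator.
mfu = model_flops_utilization(2e12, 30, 3e14)  # 0.2, i.e. 20% utilization

# Reported MFU percentages from the summary (pairing of positions is an assumption).
sim_mfu = (4.6, 11.2)
onerec_mfu = (23.7, 28.8)
improvement = [round(new / old, 2) for old, new in zip(sim_mfu, onerec_mfu)]
```

Under that pairing assumption, the reported figures correspond to roughly a 5.15x and a 2.57x improvement in utilization, which is still below the 40%-50% range cited for LLMs.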
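Feature engineering, model architecture, AIGC: three directions for applying large models in recommendation systems | Book giveaway at the end of the article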
AI前线· 2025-05-10 05:48
Core Viewpoint - The article discusses the impact of large models on recommendation systems, focusing on the tangible benefits these models have already delivered in industry rather than on future possibilities or academic discussion [1].

Group 1: Impact of Large Models on Recommendation Systems
- Large models have changed how knowledge is learned, shifting from a closed system reliant on internal data to an open system that integrates vast external knowledge [4].
- Large models, typically built on the transformer architecture, differ fundamentally in structure from traditional recommendation models, which raises the question of whether they can redefine the recommendation paradigm [5].
- Large models could create a "new world" by enabling personalized content generation, moving beyond recommending existing content to directly creating content tailored to each user [6].

Group 2: Knowledge Input Comparison
- Large models draw knowledge from the open world, while traditional systems rely on internal user behavior data, so the two are complementary [7].
- Large models have advantages over traditional knowledge graph methods in both the quantity of knowledge and the quality of embeddings, suggesting they are the best available option for knowledge input in recommendation systems [8].

Group 3: Implementation Strategies
- Two primary methods for bringing large model knowledge into recommendation systems are identified: generating embeddings from large language models (LLMs) and producing text tokens as input [10][11].
- Integrating multi-modal features through large models gives a more comprehensive representation of item content and strengthens recommendation capabilities [13][15].

Group 4: Evolution of Recommendation Models
- The exploration of large models in recommendation systems has progressed through three stages, from initial toy models to industrialized solutions that significantly improve business metrics [20][24].
- Meta's generative recommendation model (GR) exemplifies a successful application of large models, achieving a 12.4% increase in core business metrics by shifting the focus from click-through rate prediction to predicting user behavior [24][26].

Group 5: Content Generation and Future Directions
- The article argues that the most profound impact of large models on recommendation systems lies in personalized content generation, which brings AI creators into the recommendation process [28][29].
- Current AI-generated content still requires human input, but fully autonomous content generation driven by user feedback is highlighted as a future direction [41][43].

Group 6: Industry Insights and Recommendations
- The search and recommendation industry is viewed as continuously evolving, with the integration of large models presenting new growth opportunities rather than signaling a downturn [45].
- The article suggests that success in the next phase of recommendation systems depends on joint innovation and optimization across algorithms, engineering, and large models [46].
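The first implementation strategy above, using LLM-generated embeddings as item features, can be illustrated with a small sketch. The idea shown is that an item with no interaction history can still be related to existing items through content embeddings derived from open-world knowledge. The three-dimensional vectors below are hypothetical stand-ins for real LLM embeddings (which would come from an embedding model and have far more dimensions), and the helper names are illustrative only.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(target_id, embeddings):
    """Return the id of the item whose embedding is closest to the target's."""
    target = embeddings[target_id]
    others = [item_id for item_id in embeddings if item_id != target_id]
    return max(others, key=lambda item_id: cosine_similarity(target, embeddings[item_id]))


# Hypothetical content embeddings; in practice these would be produced by an LLM.
llm_embeddings = {
    "hiking_boots": [0.9, 0.1, 0.0],
    "trail_shoes": [0.8, 0.2, 0.1],  # new item with no behavior data yet
    "blender": [0.0, 0.1, 0.9],
}

neighbor = most_similar("trail_shoes", llm_embeddings)
```

In a production model such embeddings would more likely be fed in as additional input features alongside behavior-based ID embeddings rather than used for nearest-neighbor lookup alone.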
In the era when "recommendation is everything"
Hu Xiu· 2025-05-08 09:54
Group 1
- The article examines the importance of choice in the age of artificial intelligence and how recommendation systems influence user decisions [2][3].
- Recommendation engines are revolutionizing personalized choices and experiences globally, shaping the future of user interactions [4][5].
- Companies like Netflix and TikTok use advanced algorithms to enhance user engagement and content discovery [6][7].

Group 2
- The rise of recommendation systems parallels the industrial revolution: they have become a driving force in the digital economy [6].
- TikTok's algorithm is recognized for promoting diverse content and enabling high-quality creations to spread rapidly [7].
- Demand for personalized information services is increasing, which puts the focus on metrics such as precision, diversity, novelty, and fairness [8][9].

Group 3
- Fairness has emerged as a critical metric for recommendation systems, addressing biases that may affect different user groups and content creators [9][10].
- "Popularity bias" refers to the tendency of recommendation systems to favor mainstream content over niche offerings [11][12].
- Several factors contribute to unfairness in recommendation systems, including biases in historical data and algorithms that prioritize engagement metrics [12][13].

Group 4
- Companies are beginning to integrate fairness and transparency principles into their recommendation systems to improve user experience [14].
- The evolution of recommendation engines into self-discovery tools emphasizes user agency and self-awareness [15][16].
- Effective recommendation systems can give users greater self-insight by reflecting their preferences and aspirations [17][18].
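One simple way to quantify the popularity bias described above is to measure what share of all recommendation slots goes to a small set of "head" (mainstream) items. The sketch below is one possible proxy, not a metric taken from the article; the item names and the choice of which items count as head items are hypothetical.

```python
def head_exposure_share(recommendation_lists, head_items):
    """Fraction of all recommended slots that are filled by head (mainstream) items."""
    total_slots = sum(len(recs) for recs in recommendation_lists)
    if total_slots == 0:
        return 0.0
    head_slots = sum(1 for recs in recommendation_lists for item in recs if item in head_items)
    return head_slots / total_slots


# Hypothetical recommendation lists shown to three users.
recommendation_lists = [
    ["hit1", "hit2", "niche1"],
    ["hit1", "niche2", "hit2"],
    ["hit1", "hit2", "hit3"],
]
head_items = {"hit1", "hit2", "hit3"}

share = head_exposure_share(recommendation_lists, head_items)  # 7 of 9 slots
```

A share close to 1.0 means niche content receives almost no exposure. Comparing this value with the head items' share of the overall catalog gives a rough sense of how strongly the system amplifies already-popular content.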
Hu Yong: In the era when "recommendation is everything"
Tencent Research Institute · 2025-05-08 08:43
Core Viewpoint - The article discusses the transformative impact of recommendation systems in the digital age and asks whether these systems empower individual choice or dictate user behavior, ultimately shaping personal destinies [2][4].

Group 1: Recommendation Systems and Their Influence
- Recommendation systems are pervasive in daily life, influencing choices in music, movies, and travel through personalized suggestions [3][7].
- Netflix's approach to user experience centers on the idea that "everything is a recommendation," tailoring content to user preferences and viewing history [3][4].
- The rise of recommendation engines is likened to a revolution in personalized choice, raising questions about autonomy and the nature of decision-making in the age of AI [4][5].

Group 2: The Role of Algorithms
- Algorithms are crucial to user experience: tailored recommendations can increase engagement and satisfaction [6][7].
- The effectiveness of a recommendation system depends on the volume and quality of the data it processes, with more data leading to better algorithm performance [6][7].
- TikTok's recommendation algorithm is recognized for promoting diverse content, allowing lesser-known creators to gain visibility alongside popular ones [8][12].

Group 3: Evaluation Metrics for Recommendations
- Key metrics for assessing recommendation systems include precision, diversity, novelty, serendipity, explainability, and fairness [9][10].
- Precision measures how relevant the recommended content is to user interests, while diversity ensures that a broad range of topics is covered [9][10].
- Fairness has emerged as a critical metric, addressing biases in recommendations that may disadvantage certain groups or content creators [10][11].

Group 4: Addressing Fairness and Bias
- The concept of "responsible recommendation" has gained traction, focusing on eliminating systemic biases in recommendation systems and ensuring equitable treatment across demographics [14][15].
- Companies such as Amazon, Netflix, and Spotify are working to incorporate fairness and transparency into their algorithms to avoid bias and promote diverse content [17][18].
- Transparency in recommendation logic is emphasized: letting users understand the basis for a recommendation fosters trust in the system [14][17].

Group 5: From Recommendation to Self-Discovery
- Recommendation systems are evolving into self-discovery engines through which users gain deeper insight into their preferences and identities [19][20].
- Empowerment through better choices and the ability to explore new interests is a key aspect of this transformation, enhancing user engagement and self-awareness [20][21].
- Ultimately, understanding oneself and one's aspirations may increasingly depend on interactions with intelligent recommendation systems [21].
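Two of the evaluation metrics listed above, precision and diversity, have simple and widely used formulations. The sketch below computes precision@k (the share of the top-k recommendations that are relevant) and an intra-list diversity proxy (the share of item pairs in a list that belong to different categories). The item ids, relevance set, and category labels are hypothetical, and category-based diversity is only one of several possible definitions.

```python
from itertools import combinations


def precision_at_k(recommended, relevant, k):
    """Share of the top-k recommended items that are in the relevant set."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / k


def intra_list_diversity(recommended, category_of):
    """Share of item pairs in the list whose categories differ (0.0 for fewer than 2 items)."""
    pairs = list(combinations(recommended, 2))
    if not pairs:
        return 0.0
    differing = sum(1 for a, b in pairs if category_of[a] != category_of[b])
    return differing / len(pairs)


# Hypothetical recommendation list, ground-truth interests, and item categories.
recommended = ["v1", "v2", "v3", "v4"]
relevant = {"v1", "v3", "v9"}
category_of = {"v1": "music", "v2": "music", "v3": "travel", "v4": "cooking"}

precision = precision_at_k(recommended, relevant, k=4)      # 2 hits out of 4
diversity = intra_list_diversity(recommended, category_of)  # 5 of 6 pairs differ
```

The two metrics often pull in opposite directions: filling a list with the safest items tends to raise precision while lowering diversity, which is why the article treats them as separate evaluation axes.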
Meta Platforms CEO Zuckerberg: Improvements to recommendation systems over the past six months have increased time spent on Facebook by 7%, on Instagram by 6%, and on Threads by 35%.
news flash· 2025-04-30 21:12
Core Insights - Meta Platforms CEO Mark Zuckerberg said that improvements to recommendation systems over the past six months have increased user time spent by 7% on Facebook, 6% on Instagram, and 35% on Threads [1].

User Engagement Metrics
- Facebook: 7% increase in user time spent [1]
- Instagram: 6% increase in user time spent [1]
- Threads: 35% increase in user time spent [1]