Recommendation Systems
ISSCC Blockbuster: A 28nm CiM Chip Boosts Energy Efficiency 181x. How Big Is the Market?
是说芯语· 2026-03-02 02:41
At the 2026 International Solid-State Circuits Conference (ISSCC 2026), Tsinghua University, Huawei, and partners jointly unveiled a 28nm hybrid compute-in-memory (CiM) chip built on the HYDAR framework. With RRAM as its core storage medium, and through key optimizations such as DL-ADC early termination and a PPSP scheduling pipeline, the chip balances high throughput, high energy efficiency, and high accuracy: a single chip delivers 390K QPS of throughput at 1,574K QPS/W energy efficiency, and multi-chip scaling yields a 66x QPS improvement, offering a new way past the compute bottleneck of recommendation systems. Compared with the inherent drawbacks of traditional DRAM and NAND TCAM accelerators, this breakthrough not only fills an industry gap but also precisely matches the digital economy's core demand for efficient compute, giving it broad market potential and industry-enabling value.

Precisely Matching High-Compute-Demand Sectors
Recommendation systems, the core hub connecting users with massive volumes of content, have deeply penetrated key sectors such as e-commerce, streaming media, social networking, and advertising. The efficiency of similar vector search (SVS) directly determines both the recommendation experience and operating costs, making SVS the CiM chip's primary deployment scenario. Combining the chip's performance advantages with industry needs, its market applications can focus on three core areas, all highly feasible to deploy.

Breaking the Compute Bottleneck of Large-Scale Recommendation
Internet platforms today face surging user numbers and explosive content growth. E-commerce, short-video, and livestreaming platforms, for example, must respond within milliseconds, selecting from tens of ...
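The SVS workload the article centers on is easy to state concretely: given a query embedding, rank a corpus of item embeddings by similarity and keep the top k. A brute-force baseline is sketched below; the function name, cosine metric, and toy vectors are illustrative choices, but this O(N·d) scan over stored vectors is exactly the kind of kernel in-memory compute accelerates.

```python
import math

def top_k_similar(query, corpus, k=2):
    """Brute-force similar vector search: rank corpus vectors by cosine
    similarity to the query and return the indices of the top k."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb)

    ranked = sorted(range(len(corpus)), key=lambda i: cos(query, corpus[i]), reverse=True)
    return ranked[:k]

# Toy corpus of 2-d item embeddings; index 2 is closest in direction to the query.
ids = top_k_similar([1.0, 0.0], [[0.9, 0.1], [0.0, 1.0], [1.0, 0.05]], k=2)
```

At production scale the corpus holds hundreds of millions of vectors, which is why moving this scan into the memory array, rather than shuttling vectors to a CPU or GPU, pays off in both QPS and QPS/W.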
Meta Earnings Call: Recommendation Systems Are Being Rewritten by Large Models, and a Future Without Smart Glasses Is "Hard to Imagine"
Hua Er Jie Jian Wen· 2026-01-29 02:28
Core Insights
- Meta reported strong Q4 earnings, with revenue and Q1 guidance exceeding analyst expectations, driven by the advertising business [4]
- The company is shifting its growth strategy from "better advertising systems" to "restructuring products and infrastructure centered around personal superintelligence" [4]
- Meta's AI smart glasses are positioned as the next core computing device, akin to the transition from feature phones to smartphones [5]

Financial Performance
- Q4 total revenue reached $59.9 billion, a 24% year-over-year increase, with advertising revenue at $58.1 billion, also up 24% [19][20]
- The average price per ad increased by 6% on stronger advertiser demand, alongside 18% growth in ad impressions across all regions [19]
- Q4 operating income was $24.7 billion, with net profit of $22.8 billion, or earnings per share of $8.88 [21]

Strategic Focus
- Meta is prioritizing AI and wearable devices, with Reality Labs shifting focus from the metaverse to AI wearables and self-developed models [4][5]
- The company plans to release new AI models and products in the coming months, aiming to enhance user experience and business growth [6]
- Meta's growth strategy includes improving interaction and monetization efficiency through upgraded recommendation systems [7]

User Engagement and Content Optimization
- Instagram Reels viewing time in the U.S. increased by 30% year-over-year, while Facebook video viewing time maintained double-digit growth [8]
- Views of Facebook's organic feed and video posts increased by 7% in Q4, the largest quarterly boost in two years [8]
- 75% of recommended content on Instagram is now original, reflecting a focus on content freshness and originality [8]

AI and Technology Development
- Meta is building a scalable recommendation system, similar in spirit to large language models (LLMs), to lift user engagement and ad performance [9]
- The company is investing heavily in AI infrastructure, with capital expenditures projected at $115 billion to $135 billion in 2026 [11][33]
- Meta's AI initiatives are expected to significantly improve operational efficiency and user experience, with a focus on personalized content delivery [13][30]

Future Outlook
- Meta anticipates Q1 2026 total revenue between $53.5 billion and $56.5 billion, with a positive outlook driven by strong demand and improved ad performance [32]
- The company expects to maintain robust cash flow to support ongoing investments in AI and infrastructure [49]
- Meta's long-term strategy includes diversifying revenue beyond advertising, focusing on enterprise solutions and integrated tools for businesses [52]
Just In: Musk Open-Sources the Grok-Based X Recommendation Algorithm, with a Transformer Taking Over Ranking at the Hundred-Million Scale
Sou Hu Cai Jing· 2026-01-20 20:23
Core Viewpoint
- Elon Musk's company has open-sourced the X recommendation algorithm, which powers the "For You" feed by combining in-network and out-of-network content using a Grok-based Transformer model [1][9][12]

Group 1: Algorithm Functionality
- The algorithm draws candidates for a user's main feed from two primary sources: content from accounts they follow (in-network) and other posts discovered on the platform (out-of-network) [3][4]
- It filters out low-quality, duplicate, or inappropriate content so that only valuable candidates are passed to ranking [4][6]
- At its core, a Grok-based Transformer model scores each candidate post against user behavior such as likes, replies, and shares, predicting the probability of each interaction [4][20]

Group 2: Historical Context
- This is not the first time Musk has open-sourced the X recommendation algorithm; a previous release on March 31, 2023 included parts of the Twitter source code [9][11]
- Musk frames the algorithm's transparency as a response to criticism of the platform's content-distribution mechanisms, which have been accused of bias [12][18]

Group 3: User Reactions
- Users on X have summarized key points about the algorithm, noting that engagement metrics like replies significantly boost visibility, while links in posts can reduce exposure [14][15]
- Some users observed that while the architecture is open-sourced, certain elements remain undisclosed, making the release more a framework than a complete engine [17]

Group 4: Importance of Recommendation Systems
- Recommendation systems are central to the business models of major tech companies, driving large shares of user engagement: roughly 35% at Amazon, 80% at Netflix, and 70% at YouTube [18]
- The complexity of traditional recommendation systems has fueled the desire for a unified model that handles multiple tasks, a goal large language models (LLMs) may help achieve [21][22]

Group 5: Technical Insights
- The release omits specific weights and internal model parameters, limiting insight into the model's decision-making [20]
- Introducing LLMs into recommendation systems enables a more abstract approach to feature engineering, letting the model understand and process user preferences without explicit instructions [22][23]
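Mechanically, the pipeline described above (two candidate sources, a quality filter, then a model that predicts per-interaction probabilities and combines them into one score) can be sketched as follows. This is a toy illustration, not the released code: the `Post` fields, the `ACTION_WEIGHTS` values, and the `predict` callback standing in for the Grok-based Transformer are all hypothetical, since the actual weight parameters were not published.

```python
from dataclasses import dataclass

@dataclass
class Post:
    post_id: int
    author_followed: bool  # True => in-network (from a followed account)
    is_low_quality: bool   # flagged by upstream quality filters

# Hypothetical per-action weights; users report replies matter most, but the
# real values are not part of the open-source release.
ACTION_WEIGHTS = {"like": 1.0, "reply": 13.5, "share": 1.0}

def score(post, action_probs):
    """Combine predicted interaction probabilities into one ranking score,
    standing in for the Grok-based Transformer ranker's output head."""
    return sum(ACTION_WEIGHTS[a] * p for a, p in action_probs.items())

def rank_feed(in_network, out_of_network, predict):
    """Merge both candidate sources, drop filtered posts, rank by score."""
    candidates = [p for p in in_network + out_of_network if not p.is_low_quality]
    return sorted(candidates, key=lambda p: score(p, predict(p)), reverse=True)

# Toy usage: one followed-account post, one discovered post, one filtered post.
p1, p2, p3 = Post(1, True, False), Post(2, False, False), Post(3, False, True)
predict = lambda p: {"like": 0.5 if p.post_id == 2 else 0.1, "reply": 0.0, "share": 0.0}
feed = rank_feed([p1], [p2, p3], predict)
```

The key design point the summary highlights survives even in this sketch: the final ordering is a single scalar per post, so anything that shifts the predicted interaction probabilities (a reply-heavy post, a link penalty) moves the post up or down the whole feed.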
Just In: Musk Open-Sources the Grok-Based X Recommendation Algorithm! Experts: The ROI Is Too Low, and Other Platforms Won't Necessarily Follow
AI前线· 2026-01-20 09:36
Core Viewpoint
- Elon Musk has open-sourced the X recommendation algorithm, which combines in-network content from followed accounts with out-of-network content discovered through machine learning, using a Grok-based Transformer model for ranking [3][12][18]

Algorithm Overview
- The open-sourced algorithm powers the "For You" feed on X, integrating content from both followed accounts and broader network sources, ranked by a Grok-based Transformer model [3][5]
- The algorithm fetches candidate posts from two main sources: in-network content (from accounts users follow) and out-of-network content (discovered through machine learning) [9][10]

Algorithm Functionality
- The system filters out low-quality, duplicate, or inappropriate content so that only valuable candidates are processed [7]
- A Grok-based Transformer model scores each candidate post based on user interactions (likes, replies, shares, clicks), predicting the probability of each user action [7][8]

Historical Context
- This is not the first time Musk has open-sourced the X recommendation algorithm; a previous release on March 31, 2023 garnered over 10,000 stars on GitHub [12][14]
- Musk aims to enhance the algorithm's transparency to address criticism of bias in content distribution on the platform [18][19]

User Reactions
- Users on X have summarized key insights about the algorithm, emphasizing the importance of engagement metrics like replies and watch time for content visibility [22][23]

Importance of Recommendation Systems
- Recommendation systems are central to major tech companies' business models, driving large shares of user engagement (e.g., 35% for Amazon, 80% for Netflix) [25][27]
- The complexity of traditional recommendation systems often leads to high maintenance costs and poor cross-task collaboration [28]

Future Implications
- Large language models (LLMs) present new opportunities for recommendation systems, potentially simplifying engineering and enhancing cross-task learning [29][30]
- Open-sourcing the X algorithm may not prompt immediate changes at other platforms, which may lack the resources to build similar systems [39]
Breaking! Kuaishou AI Chief Zhou Guorui Set to Depart, with Word Leaking on His Next Stop
Sou Hu Cai Jing· 2025-12-30 19:13
Core Insights
- Zhou Guorui, head of Kuaishou's large-model division, is set to leave the company, with his future plans currently unknown [2][4]

Company Overview
- Zhou is a significant figure at Kuaishou, having served as Vice President and head of foundational large models and recommendation models [2][4]
- His LinkedIn profile lists bachelor's and master's degrees from Beijing University of Posts and Telecommunications, specializing in information and communication engineering [6]

Career Background
- Before joining Kuaishou in 2021, Zhou worked in Alibaba's advertising division, focusing on deep-learning applications in ad ranking and model optimization [7][10]
- At Kuaishou, he advanced from vice president of recommendation algorithms to leading the large-model and recommendation-model teams [10]

Key Contributions
- Zhou was instrumental in developing the OneRec architecture, which significantly restructured the recommendation system, achieving larger models at lower cost [11][12]
- OneRec reportedly cut operating costs to about one-tenth of previous levels while improving performance across core business scenarios, including short-video recommendation and e-commerce [12][14]

Future Implications
- The immediate impact of Zhou's departure on Kuaishou's AI strategy is expected to be limited, given the stability of the OneRec architecture and the company's commitment to self-developed recommendation models [18]
- Over the long term, however, the loss of core talent may slow technology iteration and unsettle the technical direction [18]
NeurIPS 2025 | Language Ranker: Rethinking and Optimizing LLM Decoding from a Recommender-System Perspective
机器之心· 2025-11-30 03:19
Core Insights
- The article offers a new perspective on large language models (LLMs) by comparing their decoding process to the ranking stage of a recommendation system, highlighting the limitations of existing decoding methods and proposing an efficient, lightweight improvement framework called Language Ranker [2][3][33]

Group 1: Understanding LLMs
- An LLM can be viewed as a specialized recommendation system that selects the most suitable responses from a vast candidate-response space based on user input [3]
- The key components of LLMs correspond to those of recommendation systems, which clarifies where current decoding methods fall short [6][11]

Group 2: Language Ranker Framework
- Language Ranker sidesteps the limitations of traditional reward models by reusing features already extracted by the main model, requiring only a small learnable module for candidate-response re-ranking [8][9]
- The framework consists of three steps: candidate recall, feature extraction, and candidate ranking, which together enhance the decoding process [10][14]

Group 3: Experimental Results
- With fewer than 0.5 million parameters, Language Ranker matches the performance of large-scale reward models across a range of tasks [19][20]
- On the MBPP task, Language Ranker trains in just 67 seconds on a CPU, versus over an hour for traditional reward models [21][23]
- The framework shows strong cross-task and cross-model adaptability, allowing a single Ranker to serve different tasks and reducing model-management costs [24][26]

Group 4: Future Outlook
- Language Ranker represents a new paradigm for optimizing the decoding phase of LLMs, emphasizing efficient selection of the best answer rather than merely increasing model size [33]
- The framework supports personalized extensions, enabling the same main model to be paired with different Rankers to meet diverse application needs [15][33]
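The three-step loop (candidate recall, feature reuse, lightweight ranking) can be sketched as below. This is a schematic under stated assumptions, not the paper's implementation: the hidden size, the mean-pooling choice, and the single linear scoring head are hypothetical stand-ins for the sub-0.5M-parameter module, and random vectors play the role of the main model's per-token hidden states.

```python
import random

random.seed(0)
HIDDEN = 16  # hypothetical hidden size of the main model

# Lightweight scoring head: one linear layer's worth of weights, standing in
# for the small learnable re-ranking module described in the paper.
W = [random.gauss(0, 0.02) for _ in range(HIDDEN)]

def extract_features(hidden_states):
    """Step 2: mean-pool per-token hidden states. These states were already
    produced by the main model during generation, so this adds almost no cost."""
    n = len(hidden_states)
    return [sum(tok[d] for tok in hidden_states) / n for d in range(HIDDEN)]

def rerank(candidates):
    """Step 3: score each candidate's pooled features; return indices, best first."""
    scores = [sum(f * w for f, w in zip(extract_features(h), W)) for h in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)

# Step 1 (candidate recall) would sample several responses from the LLM;
# here random vectors stand in for their hidden states (10 tokens each).
cands = [[[random.gauss(0, 1) for _ in range(HIDDEN)] for _ in range(10)] for _ in range(4)]
order = rerank(cands)
```

The efficiency claim follows directly from this structure: only `W` is trained, while everything expensive (generating candidates and their hidden states) is work the main model does anyway.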
When a Recommendation System Truly "Gets You": Kuaishou Team Presents TagCF at NeurIPS 2025
机器之心· 2025-11-27 04:09
Core Insights
- The article discusses TagCF, a new recommendation framework that adds user understanding to content understanding, moving from "knowing what" to "understanding why" [2][43]

Group 1: Research Background and Motivation
- The research highlights a gap in traditional recommendation systems, which often focus solely on content while ignoring user identities and roles [2][5]
- TagCF was developed jointly by Kuaishou's algorithm team, its foundational model and application department, and Wuhan University [2][3]

Group 2: Methodology and Framework
- TagCF introduces two new tasks: User Role Identification, which models user characteristics and social roles, and Behavioral Logic Modeling, which explores the logical relationships between user roles and item topics [9][10]
- The framework comprises three main modules: an MLLM-based video content understanding platform, a behavioral-logic graph exploration platform, and downstream recommendation-system enhancement [16][18][22]

Group 3: Experimental Results
- Experiments showed that user-role-based modeling statistically outperforms traditional topic modeling, yielding more stable and effective recommendations [7][40]
- TagCF delivered significant gains in recommendation accuracy and diversity, with the TagCF-it and TagCF-ut variants achieving notable performance metrics [34][36]

Group 4: Challenges and Solutions
- Deployment faced challenges such as uncontrolled tag expansion and the need for precise scoring mechanisms [23][24]
- Solutions included constructing a cover set of high-frequency tags to ensure stability and generalizability in industrial applications [25][41]

Group 5: Conclusion and Future Directions
- TagCF represents a significant advance by integrating user understanding with content understanding, bridging statistical and symbolic modeling [43][45]
- Future work will refine the tag-logic system and explore applications across business scenarios, including e-commerce and search [44][45]
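The behavioral-logic idea (edges connecting user-role tags to item-topic tags, mined from interactions) can be made concrete with a toy sketch. To be clear about what is assumed: TagCF's real pipeline tags videos with an MLLM and learns the graph at scale, whereas here the tags, the interaction log, and the co-occurrence scoring are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical interaction log: (user_role_tag, item_topic_tag) pairs observed
# when users with a given role engaged with items on a given topic.
interactions = [
    ("new_parent", "parenting"), ("new_parent", "parenting"),
    ("new_parent", "cooking"), ("gamer", "esports"),
]

# Behavioral-logic graph as co-occurrence counts on role -> topic edges.
edge = defaultdict(int)
for role, topic in interactions:
    edge[(role, topic)] += 1

def score(user_roles, item_topics):
    """Score an item for a user by summing the weights of role->topic edges,
    a symbolic 'why' signal layered on top of statistical collaborative filtering."""
    return sum(edge[(r, t)] for r in user_roles for t in item_topics)
```

Even this toy version shows why the cover-set trick matters: without capping the tag vocabulary to high-frequency tags, the edge table grows without bound and most edges are too sparse to score reliably.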
2018 to 2020: The Three Key Years in Which Douyin Overtook Kuaishou | 42章经
42章经· 2025-11-16 12:59
Core Insights
- The article traces the rise of Douyin (TikTok) and the strategic decisions behind its success, as recounted by Yu Beichuan, a former employee who joined in its early days [2][3][11]

Group 1: Douyin's Growth Phases
- Douyin launched officially in 2016, with significant growth starting in mid-2017 and daily active users (DAU) surpassing Kuaishou's by early 2019 [3][11]
- Its growth falls into several phases: initial growth from 2017 to 2018, rapid expansion from 2018 to 2019, and a focus on commercialization after 2020 [12][13][15]
- By the end of 2018, Douyin's DAU reached 30 million, and by early 2019 it had surpassed Kuaishou to become the leading short-video platform [11][21]

Group 2: Key Strategic Decisions
- Douyin's early strategy deliberately avoided funneling users over from Toutiao, allowing it to build a distinct user base [46]
- A youthful, independent brand aesthetic and strong content operations attracted a younger audience [46][49]
- Major marketing pushes included sponsoring the 2019 Spring Festival Gala, during which DAU peaked at 470 million [87][88]

Group 3: Challenges and Learnings
- Despite rapid growth, there were internal concerns about the sustainability of user engagement and a potential DAU ceiling [21][22]
- Attempts to bolt on social features largely failed, underscoring how hard it is to foster user interaction on a primarily content-driven platform [24][27]
- The company learned that balancing rapid growth with user retention was crucial, prompting a focus on enhancing user interaction [81][82]

Group 4: Organizational Culture and Impact
- ByteDance's flat organizational structure allowed direct communication across levels, fostering a culture of ambition and opportunity for young talent [100][106]
- Its emphasis on extreme execution and strategic thinking underpinned its innovative approach and competitive edge [114][121]
- As the company grew, preserving the original culture became a challenge, raising concerns about losing its competitive spirit [108][109]
Behind Xiaohongshu's RecSys 2025 Best Paper Nomination: Cracking the Video Watch-Time Prediction Problem
机器之心· 2025-10-20 04:50
Core Insights
- The article highlights the capabilities of Xiaohongshu's recommendation system, which gained recognition at RecSys 2025 for its innovative research and technology [4][6][7]

Group 1: Xiaohongshu's Recognition
- Xiaohongshu's recommendation-algorithm team received a Best Paper Candidate nomination at RecSys 2025 for its paper on video watch-time prediction [4][6]
- RecSys is a leading academic conference in the recommendation-systems field, attracting top scholars and industry experts worldwide [6][7]
- Xiaohongshu's technology and product drew attention at the conference, with many attendees calling its recommendation capabilities industry-leading [9][10]

Group 2: Research and Methodology
- The paper, "Multi-Granularity Distribution Modeling for Video Watch Time Prediction via Exponential-Gaussian Mixture Network," tackles watch-time prediction, a problem central to user engagement [17][22]
- The research identifies complex user behavior patterns in watch time, including skewed distributions and diverse viewing habits [30][31]
- The proposed Exponential-Gaussian Mixture Network (EGMN) combines classic probability distributions to predict the full probability distribution of watch time rather than a single value [33][35]

Group 3: Performance and Validation
- In offline experiments, EGMN achieved a 14.11% reduction in mean absolute error (MAE) and a 7.76% increase in ranking consistency [39]
- Online A/B tests covering 15 million users over seven days showed significant improvements in key metrics, including a 19.94% drop in KL divergence, indicating strong distribution-fitting ability [40][41]
- Ablation studies confirmed the contributions of both the exponential and Gaussian components to the model's performance [42]

Group 4: Future Directions
- The article emphasizes Xiaohongshu's pragmatic approach to technology, focusing on real user problems and continued exploration of cutting-edge recommendation algorithms [46][47]
- The RecSys 2025 recognition is framed as a starting point for further work, with the team actively recruiting to strengthen its research [47]
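The distributional idea behind EGMN can be written down directly: an exponential component soaks up the heavy mass of short watches (the skew), while Gaussian components model clusters of longer viewing, and a point prediction falls out as the mixture mean. The sketch below is only the mixture arithmetic under that reading of the paper's title; in EGMN the mixture parameters would be the outputs of a neural network conditioned on user and video features, and the exact parameterization here is an assumption.

```python
import math

def egm_pdf(t, w_exp, lam, gauss):
    """Density of an exponential-Gaussian mixture at watch time t >= 0.
    gauss is a list of (weight, mu, sigma); w_exp plus the Gaussian weights sum to 1."""
    p = w_exp * lam * math.exp(-lam * t)  # skewed short-watch mass
    for w, mu, sigma in gauss:            # clusters of longer, habitual viewing
        p += w * math.exp(-0.5 * ((t - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))
    return p

def expected_watch_time(w_exp, lam, gauss):
    """Point prediction derived from the full distribution: the mixture mean
    w_exp / lam + sum_k w_k * mu_k."""
    return w_exp / lam + sum(w * mu for w, mu, _ in gauss)

# Example: half the mass on quick skips (mean 1/0.1 = 10s), half on a ~20s cluster.
mean_t = expected_watch_time(0.5, 0.1, [(0.5, 20.0, 5.0)])
```

Predicting the whole distribution rather than one number is what makes the reported KL-divergence drop a meaningful metric: the model can be judged on how well the predicted density matches observed watch-time histograms, not just on MAE.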
ICML Spotlight | Synthetic Data That "Evolves": Generating High-Quality Vertical-Domain Data Without Uploading Private Data
机器之心· 2025-07-11 09:22
Core Viewpoint
- The article discusses data scarcity in the era of large models and introduces PCEvolve, a framework that generates synthetic datasets while preserving privacy and addressing the specific needs of vertical domains such as healthcare and industrial manufacturing [1][2][10]

Group 1: Data Scarcity and Challenges
- The rapid development of large models has intensified data scarcity; by one prediction, public data generation will fail to keep pace with the consumption rate required for training by 2028 [1]
- In specialized fields like healthcare and industrial manufacturing, data is already limited, making the scarcity problem even more severe [1]

Group 2: The PCEvolve Framework
- PCEvolve is a synthetic-data evolution framework that needs only a small number of labeled samples to generate an entire dataset while protecting privacy [2]
- Its evolution process is likened to DeepMind's FunSearch and AlphaEvolve, focusing on generating high-quality training data from existing large-model APIs [2]

Group 3: Limitations of Existing Large Models
- Existing large-model APIs cannot directly synthesize domain-specific data, because they fail to capture characteristics unique to vertical domains, such as lighting conditions, sampling-device models, and privacy-sensitive information [4][7]
- Since privacy and intellectual-property concerns rule out uploading local data, prompt engineering becomes harder and synthetic-data quality drops [9][11]

Group 4: PCEvolve's Mechanism
- PCEvolve employs a new privacy-protection method based on the Exponential Mechanism, adapted to the small-sample setting of vertical domains [11]
- The framework runs an iterative evolution loop: it generates many candidate synthetic samples, then prunes lower-quality ones using privacy-protected scoring [11][19]

Group 5: Experimental Results
- PCEvolve was evaluated along two axes: the effect of the synthetic data on downstream model training and the quality of the synthetic data itself [21]
- On datasets such as COVIDx and Came17, PCEvolve significantly improved model accuracy, reaching a final accuracy of 64.04% on COVIDx and 69.10% on Came17 [22][23]
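To make "privacy-protected scoring plus pruning" concrete: the Exponential Mechanism selects a candidate c with probability proportional to exp(ε·u(c) / (2·Δu)), where u is a utility score and Δu its sensitivity, so higher-utility candidates are favored without deterministically revealing which private samples drove the score. Below is a generic sketch of that mechanism inside a toy evolution loop. It is not PCEvolve's actual algorithm: the utility function, mutation operator, and loop shape are invented for illustration; in PCEvolve the utility would be computed against the small private labeled set, and candidates would be synthetic samples from a large-model API.

```python
import math
import random

random.seed(0)

def exponential_mechanism(candidates, utility, epsilon, sensitivity=1.0):
    """Differentially private selection: sample one candidate with probability
    proportional to exp(epsilon * utility / (2 * sensitivity))."""
    weights = [math.exp(epsilon * utility(c) / (2 * sensitivity)) for c in candidates]
    return random.choices(candidates, weights=weights, k=1)[0]

def evolve(population, utility, epsilon, keep=2, rounds=3,
           mutate=lambda x: x + random.gauss(0, 0.1)):
    """Toy evolution loop: privately select survivors, then expand them with
    mutations, mirroring the generate-score-prune cycle the article describes."""
    for _ in range(rounds):
        survivors = [exponential_mechanism(population, utility, epsilon)
                     for _ in range(keep)]
        population = survivors + [mutate(s) for s in survivors]
    return population

# Toy run: candidates are numbers, utility rewards closeness to the "private" target 1.0.
pop = evolve([0.0, 5.0, 10.0], lambda x: -abs(x - 1.0), epsilon=2.0)
```

The privacy budget ε controls the trade-off visible even in this sketch: with large ε selection is nearly greedy (fast quality gains, weaker privacy), while small ε makes selection nearly uniform (strong privacy, slow evolution).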