AI Engineer
Netflix's Big Bet: One model to rule recommendations: Yesu Feng, Netflix
AI Engineer· 2025-07-16 18:00
Foundation Model Strategy
- Netflix is leveraging foundation models for personalized recommendations [1]
- The strategy is based on work by Yesu Feng, a staff research scientist/engineer at Netflix, focused on generative foundation models [1]
- Prior to Netflix, Feng worked on feed and marketplace optimization at LinkedIn and Uber, respectively [1]
Industry Focus
- The application of foundation models aims to improve personalized recommendations [1]
- The discussion took place at the AI Engineer World's Fair in San Francisco [1]
360Brew: LLM-based Personalized Ranking and Recommendation - Hamed and Maziar, LinkedIn AI
AI Engineer· 2025-07-16 17:59
[Music] Hi everyone. Very excited to be here. I'm Hamed, and this is Maziar. Today we're going to talk about our journey in leveraging large language models for personalization and ranking, and our path to productionizing such a large model for LinkedIn use cases. Recommendation, ranking, and personalization are deeply integrated into our daily lives: when you go to a feed to read an article, when you're looking for a job, when you're searching for something, when you're buying someth ...
What We Learned from Using LLMs in Pinterest — Mukuntha Narayanan, Han Wang, Pinterest
AI Engineer· 2025-07-16 17:58
[Music] Yeah. Hi everyone. Thanks for joining the talk today. We're super excited to be here and share some of the learnings we have from integrating the LLM into Pinterest search. My name is Han, and today I'll be presenting with Mukuntha. We are both machine learning engineers on the search relevance team at Pinterest. So, to start with a brief introduction to Pinterest: Pinterest is a visual discovery platform where Pinners can come to find inspiration to create a life they love. And there are ...
RL for Autonomous Coding — Aakanksha Chowdhery, Reflection.ai
AI Engineer· 2025-07-16 16:18
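Large Language Models Evolution
- Scaling laws show that increasing compute, data, and parameters improves the performance of Transformer models, and that this generalizes to other domains [2][3]
- Performance keeps improving as models scale up, which shows in solve rates on medium-difficulty math problems, especially when the model is prompted to show its chain of thought [5][7]
- Through reinforcement learning and human feedback, models follow instructions better, enabling applications such as chatbots [10][11]
Inference Time Optimization
- Generating multiple responses and taking a majority vote (self-consistency) improves performance at inference time [15]
- Sequentially revising previous responses can significantly improve performance, especially in domains where answers can be verified, such as math and programming [16][17]
- In domains where answers can be verified, scaling inference-time compute can translate into intelligence [19]
Reinforcement Learning for Autonomous Coding
- Reinforcement learning is the next scaling frontier, especially in domains where outputs can be verified automatically [24]
- The era of experience will build superintelligent systems through reinforcement learning, especially in domains with automatic verification [25]
- Autonomous coding is an excellent domain for scaling reinforcement learning because its outputs can be verified [30][31]
Challenges in Scaling Reinforcement Learning
- Scaling reinforcement learning is more challenging than scaling LLMs because it requires multiple model replicas as well as both training and inference loops [29]
- Designing the reward function for the reward model is a challenge in reinforcement learning [29][30]
Reflection's Mission
- Reflection is committed to building superintelligence, with autonomous coding as the fundamental problem [33]
- The Reflection team consists of 35 pioneers who have done groundbreaking work in LLMs and reinforcement learning [33]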
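The two inference-time techniques summarized above (majority voting over several sampled answers, and sequential revision against a verifier) can be sketched in a few lines of Python. This is only an illustrative sketch, not the speaker's implementation: the function names `majority_vote` and `revise_until_verified`, the `max_rounds` parameter, and the toy arithmetic verifier are all assumptions made for the example, and a real system would replace the toy `revise` and `verify` callables with model calls and a test runner or answer checker.

```python
from collections import Counter
from typing import Callable, Optional


def majority_vote(answers: list[str]) -> str:
    """Self-consistency: return the most frequent final answer among samples."""
    if not answers:
        raise ValueError("need at least one sampled answer")
    # most_common(1) returns [(answer, count)]; ties go to the answer seen first.
    return Counter(answers).most_common(1)[0][0]


def revise_until_verified(
    draft: str,
    revise: Callable[[str], str],   # hypothetical: produces a revised attempt
    verify: Callable[[str], bool],  # hypothetical: automatic answer checker
    max_rounds: int = 3,
) -> Optional[str]:
    """Sequential revision: revise a draft until the verifier accepts it."""
    candidate = draft
    for _ in range(max_rounds):
        if verify(candidate):
            return candidate
        candidate = revise(candidate)
    # Check the last revision too; give up (None) if it still fails.
    return candidate if verify(candidate) else None


# Toy stand-ins for model samples and a verifier (assumptions for illustration).
sampled_answers = ["42", "41", "42", "42", "40"]
print(majority_vote(sampled_answers))  # prints 42

toy_verify = lambda s: int(s) == 6 * 7
toy_revise = lambda s: str(int(s) + 1)
print(revise_until_verified("40", toy_revise, toy_verify))  # prints 42
```

The verifier is what makes the second function useful: in math or coding, `verify` can be an exact-answer check or a unit-test run, which is why the summary singles out those domains.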
Recsys Keynote: Improving Recommendation Systems & Search in the Age of LLMs - Eugene Yan
AI Engineer· 2025-07-16 15:00
Industry Trend
- Recommendation and search systems have been significantly impacted by advances in language modeling, evolving from Word2vec to GRUs, Transformers, and BERT [1]
- The emergence of large language models (LLMs) is driving innovation in model architecture, scalable system designs, and customer experiences within recommendation and search systems [1]
- The industry is exploring real-world implementations and measurable outcomes of LLMs in recommendation and search systems [1]
Technological Advancement
- LLM-driven techniques are expected to shape the future of content discovery and intelligent search [1]
- Amazon is building recommendation systems and AI-powered products using ML/AI [1]
OpenAI's Sean Grove: Code is NOT all you do
AI Engineer· 2025-07-16 07:00
it feels tangible and real, but it's sort of underselling the job that each of you does. Code is sort of 10 to 20% of the value that you bring. The other 80 to 90% is in structured communication. This is going to be different for everyone, but a process typically looks something like this: you talk to users in order to understand their challenges. You distill these stories down and then ideate about how to solve these problems. What is the goal that you want to achieve? You plan ways to achieve those goa ...
OpenAI's Sean Grove: Everything is a Spec: The Universal Language of Intent
AI Engineer· 2025-07-16 00:01
Core Concept
- The industry emphasizes that specifications are a universal concept applicable across various fields, including programming, product management, and lawmaking [1]
- The industry views prompt engineering as a form of specification writing, aligning AI models with intentions and values [1]
Benefits of Specifications
- Specifications enable faster and safer product development and deployment [2]
- Specifications allow for broader contributions from various roles, blurring the lines between traditional roles like PM, lawmaker, engineer, marketer, and programmer [2]
Benchmarks Are Memes: How What We Measure Shapes AI—and Us - Alex Duffy
AI Engineer· 2025-07-15 17:05
Benchmarks as Memes in AI
- Benchmarks are presented as memes that shape AI development, influencing what models are trained and tested on [1][3][8]
- The AI industry faces a problem of benchmark saturation: as models become too good at existing benchmarks, those benchmarks lose their value [5][6]
- There's an opportunity for individuals to create new benchmarks that define what AI models should excel at, shaping the future of AI capabilities [7][13]
The Lifecycle and Impact of Benchmarks
- The typical benchmark lifecycle involves an idea spreading, becoming a meme, and eventually being saturated as models train on it [8]
- Benchmarks can have unintended consequences, such as reinforcing biases if not designed thoughtfully, as seen with the ChatGPT thumbs-up/thumbs-down benchmarking [14]
- The industry should focus on creating benchmarks that empower people and promote agency, rather than treating them as mere data points [16]
Qualities of Effective Benchmarks
- Great benchmarks are multifaceted, reward creativity, are accessible to both small and large models, and are generative, evolutionary, and experiential [17][18][19]
- The industry needs more "squishy," non-static benchmarks for areas like ethics, society, and art, requiring subject matter expertise [34][35]
- Benchmarks can be used to build trust in AI by allowing people to define goals, provide feedback, and see AI improve, fostering a sense of importance and control [37]
AI Diplomacy Benchmark
- AI Diplomacy is presented as an example of a benchmark that mimics real-world situations, testing models' abilities to negotiate, form alliances, and betray each other [20][22][23]
- The AI Diplomacy benchmark revealed interesting personality traits in different models, such as o3 being a schemer and Claude models being naively optimistic [24][25][30]
- The AI Diplomacy benchmark highlighted the importance of social aspects and convincing others, with models like Llama performing well due to their social skills [31]
Small AI Teams with Huge Impact — Vikas Paruchuri, Datalab
AI Engineer· 2025-07-15 17:05
Company Growth & Strategy
- Datalab achieved seven-figure ARR and trained state-of-the-art models with a team of three [1]
- The company has grown revenue 5x since January [2]
- Customers include tier one AI labs, universities, Fortune 500 companies, and AI startups [3]
- The company's philosophy is to hire fewer than 15 generalists and fill in the edges with AI and internal tooling [11][12]
Team Building & Productivity
- Headcount does not equal productivity [3]
- The company aims to maintain a "golden period" of alignment and productivity indefinitely [10][11]
- The company prioritizes hiring senior generalists with maturity and the ability to solve problems independently [21][22]
- The company emphasizes in-person work for small teams to facilitate fast collaboration and tight feedback loops [23]
Technology & Processes
- The company reuses components aggressively and keeps technology simple, avoiding fancy front-end frameworks [23][24]
- The company minimizes bureaucracy and emphasizes high trust and continuous discussions [25]
- The company uses AI to automate low-leverage tasks, allowing the team to focus on higher-level work [20]
Tiny Teams — Grant Lee, Gamma
AI Engineer· 2025-07-15 17:04
Company Vision & Product
- Gamma aims to revolutionize content creation and sharing, positioning itself as an alternative to traditional tools like PowerPoint [1]
- The company focuses on a content-first approach, simplifying the design and formatting process [3]
- Gamma's goal is to provide tools that foster imagination and facilitate the sharing of ideas [4][5]
Team Structure & Management
- Gamma emphasizes a flat organizational structure, moving away from traditional hierarchies [5][6][7]
- The company promotes the "rise of the generalist," valuing employees with diverse skill sets and adaptability [8][10]
- Gamma utilizes the "player coach" model, where leaders contribute to both management and hands-on work [8][16]
Scaling & Culture
- Gamma has reached over 50 million users with a team of 30 [7]
- The company prioritizes brand and culture from the beginning, viewing them as interconnected [24]
- Gamma invests in maintaining a strong company culture through a living culture deck and regular all-hands meetings [26][29]
Hiring Practices
- Gamma seeks individuals who are continuous learners and effective teachers [15][16]
- The company assesses candidates for "high agency" by exploring their problem-solving approaches and depth of understanding [40][41]
- Gamma utilizes work trials to ensure a good fit between the company and new hires [46]