Transforming search and discovery using LLMs — Tejaswi & Vinesh, Instacart
AI Engineer· 2025-07-16 18:01
Hi, good afternoon everyone. My name is Vinesh, and together with Tejaswi we are part of the search and machine learning team at Instacart. Today we'd like to talk to you about how we are using LLMs to transform our search and discovery. First, a little bit about ourselves. As I mentioned, we are part of the search and discovery ML team at Instacart. For those of you who may not be familiar with Instacart, it's the leader in online grocery in North America ...
360Brew: LLM-based Personalized Ranking and Recommendation - Hamed and Maziar, LinkedIn AI
AI Engineer· 2025-07-16 17:59
Hi everyone, very excited to be here. I'm Hamed, and this is Maziar. Today we're going to talk about our journey in leveraging large language models for personalization and ranking, and our path to productionizing such a large model for LinkedIn use cases. Recommendation, ranking, and personalization are deeply integrated into our daily lives: when you go to a feed to read an article, when you're looking for a job, when you're searching for something, when you're buying someth ...
What We Learned from Using LLMs in Pinterest — Mukuntha Narayanan, Han Wang, Pinterest
AI Engineer· 2025-07-16 17:58
Hi everyone, thanks for joining the talk today. We're super excited to be here and share some of the learnings we have from integrating LLMs into Pinterest search. My name is Han, and today I'll be presenting with Mukuntha; we are both machine learning engineers on the search relevance team at Pinterest. To start with a brief introduction to Pinterest: Pinterest is a visual discovery platform where Pinners can come to find inspiration to create a life they love. And there are ...
RL for Autonomous Coding — Aakanksha Chowdhery, Reflection.ai
AI Engineer· 2025-07-16 16:18
Hi everyone, I'm Aakanksha Chowdhery. I was at Google for more than six years, where I led the research for PaLM and was a lead researcher on Gemini. These days I'm working on pushing the frontier for autonomous coding with reinforcement learning. So, just to recap the arc of how we have progressed in large language models, and why autonomous coding and why now: for those of you who don't remember, in 2020 there was a breakthrough paper that came out which talked about ...
Recsys Keynote: Improving Recommendation Systems & Search in the Age of LLMs - Eugene Yan
AI Engineer· 2025-07-16 15:00
Recommendation systems and search have long adopted advances in language modeling, from early adoption of Word2vec for embedding-based retrieval to the transformative impact of GRUs, Transformers, and BERT on predicting user interactions. Now, the rise of large language models (LLMs) is inspiring innovations in model architecture, scalable system designs, and richer customer experiences. In this keynote, we'll dive into cutting-edge industry applications of LLMs in recommendation and search systems, explori ...
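For readers unfamiliar with the embedding-based retrieval the abstract refers to, here is a minimal, illustrative sketch: score items by cosine similarity between a query vector and item vectors and return the top k. The toy random vectors stand in for learned Word2vec-style or two-tower embeddings; the item names and data are hypothetical.

```python
import numpy as np

# Toy item and query embeddings; in practice these come from a trained model
# (e.g. Word2vec-style item embeddings or a two-tower encoder).
item_ids = ["milk", "oat milk", "almond butter", "paper towels"]
item_vecs = np.random.default_rng(0).normal(size=(len(item_ids), 64))
# Query vector deliberately constructed near the "oat milk" item.
query_vec = item_vecs[1] + 0.1 * np.random.default_rng(1).normal(size=64)

def top_k(query_vec: np.ndarray, item_vecs: np.ndarray, k: int = 2) -> list[int]:
    """Return indices of the k items with highest cosine similarity to the query."""
    q = query_vec / np.linalg.norm(query_vec)
    items = item_vecs / np.linalg.norm(item_vecs, axis=1, keepdims=True)
    scores = items @ q
    return np.argsort(-scores)[:k].tolist()

print([item_ids[i] for i in top_k(query_vec, item_vecs)])
```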
Top talent poached one after another; after leaving OpenAI, a startup veteran speaks frankly: Codex was ground out in 7 weeks, with no unified roadmap, just small teams charging ahead
AI前线· 2025-07-16 05:08
Author | Calvin French-Owen; Translator | 核子可乐; Planning | 冬梅, 褚杏娟. According to Wired, citing multiple sources, OpenAI researcher Jason Wei is set to join Meta's newly formed superintelligence lab. Per Jason Wei's personal website, he worked on OpenAI's o3 model and its deep research model. Before joining OpenAI in 2023, he worked at Google, focusing on chain-of-thought research, which centers on training AI models to work through complex queries step by step. While at OpenAI, Wei described himself as a die-hard fan of reinforcement learning, a technique that trains and optimizes AI models through positive or negative feedback; it has become one of the hottest areas in AI research, and several researchers previously hired by Meta's superintelligence team specialize in exactly this direction. Other sources told Wired that another OpenAI researcher, Hyung Won Chung, will join Meta as well. Multiple sources confirmed that both researchers' internal Slack accounts at OpenAI have been deactivated. So far, OpenAI, Meta, and Wei and Chung themselves have not responded to Wired's requests for comment. The departure of these core researchers has quietly turned outside attention to OpenAI's ...
Reshaping memory architecture: LLMs are getting an "operating system"
机器之心· 2025-07-16 04:21
Report by 机器之心, editor: 冷猫. Even models with very long context windows frequently "forget"; memory, too, needs to be managed. As is well known, modern large language models (LLMs) have limited context windows: most can handle only thousands to tens of thousands of tokens (early GPT-3 had only ~2,048), though some recent models have pushed to millions or even tens of millions of tokens (Meta's Llama 4 Scout claims up to 10 million). [Figure: evolution of LLM context window sizes. Note: token counts are approximate maxima; "GPT-4.1" refers to the April 2025 GPT-4 update, and "Scout" is the 17B-parameter Llama 4 variant designed for long context.] LLMs have an inherent "memory defect": their context window is finite, which severely limits their ability to stay consistent across multi-turn, multi-session, long-horizon interactions. As a result, modern LLMs generally struggle to maintain long-term memory. That is bad news for many applications, since memory is key to reflection and planning and an indispensable component of agentic systems. [Figure: overview of LLM-based autonomous agent systems, source: Lil'Log, https://lilianweng.github. ...]
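A minimal sketch of the kind of memory management the article describes, not any specific system's implementation: keep a small window of recent turns in the prompt and recall older turns from an external store only when they look relevant, so the bounded context window acts like RAM backed by larger storage. The keyword-overlap recall and the "budget in turns" are simplifying assumptions; real systems would use embeddings, a vector store, and token counting.

```python
from collections import deque

class SimpleMemory:
    """Toy long-term memory: a bounded working set of recent turns plus a
    searchable archive of older turns (evicted from the recent window)."""

    def __init__(self, recent_limit: int = 5):
        self.recent = deque(maxlen=recent_limit)  # fits in the context window
        self.archive = []                          # spills to external storage

    def add(self, turn: str) -> None:
        if len(self.recent) == self.recent.maxlen:
            self.archive.append(self.recent[0])    # evict oldest turn to archive
        self.recent.append(turn)

    def build_context(self, query: str, budget_turns: int = 8) -> list[str]:
        # Recall archived turns that share words with the query, then append
        # the recent window, trimming to a rough budget measured in turns.
        query_words = set(query.lower().split())
        recalled = [t for t in self.archive if query_words & set(t.lower().split())]
        context = recalled + list(self.recent)
        return context[-budget_turns:]

mem = SimpleMemory()
for i in range(10):
    mem.add(f"turn {i}: user mentioned project Falcon" if i == 1 else f"turn {i}: small talk")
print(mem.build_context("what did I say about project Falcon"))
```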
All because of a single ":", large models go down across the board
量子位· 2025-07-15 08:31
By 鹭羽, from 凹非寺; 量子位 | WeChat account QbitAI. A single colon is enough to make large models collectively fall over? Bogus answers that should have been blocked get waved through by LLMs across the board. The finding comes from a paper titled "one token can fool an LLM", and it hits general-purpose LLMs across the board: GPT-4o, Claude-4, and LLaMA3-70B all fall victim. So what now? With the bug identified, researchers from Tencent AI Lab, Princeton University, and the University of Virginia set about fixing it. They trained a reliable "judge" model, Master-RM, on an augmented dataset, driving the probability of being fooled to essentially zero while leaving normal evaluation ability unaffected. Let's look at the details. A "master key" that can fool LLMs: beyond symbols like colons and spaces, reasoning openers such as "Thought process:" and "解" ("Solution") also pass easily. So a single "Solution" can score points on a math exam, and it can fool an LLM too... Recently, it has become increasingly common to use LLMs as judges to assess answer quality in reinforcement learning with verifiable rewards (RLVR). The LLM judge compares a generated candidate answer against a reference answer and outputs a binary reward signal that guides the policy model's updates. However, the researchers found that ...
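To make the RLVR setup concrete, here is a hedged sketch of the judge-as-reward loop the article describes, with a crude filter against the "master key" responses it warns about. The `judge` callable is a hypothetical stand-in for an actual LLM judge; Master-RM itself is a trained reward model, not a rule-based filter like this.

```python
import re

# Superficial openers the article reports can fool LLM judges on their own.
MASTER_KEYS = {":", " ", "Thought process:", "解", "Solution"}

def is_trivial(candidate: str) -> bool:
    """Reject candidates that are empty, pure punctuation/whitespace, or
    nothing but a reasoning opener with no actual content after it."""
    stripped = candidate.strip()
    if not stripped or re.fullmatch(r"[\s:.,;；：]+", candidate):
        return True
    for key in MASTER_KEYS:
        if stripped == key.strip() or (stripped.startswith(key) and not stripped[len(key):].strip()):
            return True
    return False

def binary_reward(candidate: str, reference: str, judge) -> int:
    """RLVR-style binary reward: 1 if the judge says the candidate matches the
    reference answer, 0 otherwise. `judge` is a hypothetical callable wrapping
    an LLM judge (or a trained reward model such as Master-RM)."""
    if is_trivial(candidate):
        return 0
    return 1 if judge(candidate, reference) else 0

# Example with a dummy judge that does exact matching:
print(binary_reward(":", "42", judge=lambda c, r: c.strip() == r))   # 0, filtered out
print(binary_reward("42", "42", judge=lambda c, r: c.strip() == r))  # 1
```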
The era of AI search is here: "SEO is dead, long live GEO!"
36Kr· 2025-07-14 11:50
[Editor's note] As generative AI rapidly reshapes the search engine landscape, the author makes a deliberately provocative call: "SEO is dead, long live GEO." With ChatGPT passing 500 million monthly active users and Google officially launching AI Mode, the author argues that traditional SEO is shifting toward GEO (Generative Engine Optimization). To make the case, he not only systematically compares traditional search and AI search across user behavior, ranking mechanisms, and content strategy, but also lays out a practical GEO playbook. In May this year, ChatGPT's monthly active users hit an astonishing 500 million. The same month, Google officially released AI Mode, a "ChatGPT-style" conversational search mode integrated into Google Search. It was a decisive moment: even Google has acknowledged that large language models (LLMs) will become the primary interface for future search. I have seen the same trend across many startups: more and more users pick ChatGPT or other LLM tools in "How did you hear about us?" (HDYHAU) surveys, and some customers report that up to 30% of their weekly traffic comes from ChatGPT. The pace of this shift is even more striking. For example, Ver ...
From everyday assistant and architecture partner to "CTO": how to use AI to make your development 10x more productive
36Kr· 2025-07-13 23:11
Core Insights - The article critiques the concept of "universal AI prompts" and emphasizes the importance of selecting AI workflows based on specific tasks, leading to significant improvements in programming efficiency [3][4][5]. Group 1: AI Workflow Optimization - The author has transformed a task that previously took a week into one that can be completed in just a few hours by understanding which AI workflow is best suited for the problem at hand [3][4]. - AI tools like Claude Code and ChatGPT have been instrumental in handling 30% of code reviews and resolving half of the encountered bugs, showcasing their effectiveness in the development process [3][4][5]. - The article introduces three core programming models that optimize cognitive load, allowing developers to focus on critical thinking rather than mechanical tasks [5][12]. Group 2: Daily Coding Partners - Tools such as Windsurf and Cursor are highlighted as effective daily coding partners, enabling developers to maintain focus and streamline the coding process by translating natural language instructions into code [6][8]. - The approach emphasizes that AI acts as an executor of decisions made by the developer, allowing for complete control over architecture and design choices [6][8]. - The method is particularly effective for tasks that are well-understood and can be executed without significant risk [8][9]. Group 3: Macro Thinking and Exploration - For larger projects or system architecture design, the author employs a different workflow that involves using AI as a true thinking partner, allowing for exploration and discovery of unexpected solutions [12][14]. - This method encourages a broad exploration of options before narrowing down to specific solutions, enhancing the overall planning process [15][18]. - The use of multiple AI models simultaneously allows for a diverse range of perspectives and solutions, which can be synthesized into a coherent plan [14][15]. Group 4: CTO Approach - The article discusses a more experimental workflow where multiple AI agents are used in parallel to handle different components of a project, akin to a CTO managing several engineering teams [20][22]. - This approach can significantly reduce the time required to complete tasks, potentially compressing a week's work into a single day [22][26]. - Effective project management skills are essential for this method, as it requires clear specifications and the ability to switch contexts efficiently [23][26]. Group 5: Future of AI in Development - The article concludes that the goal of using large language models (LLMs) is not to automate thinking but to free up cognitive space for deeper thought, ultimately leading to better outcomes [28]. - The author anticipates ongoing developments in AI workflows, suggesting that continuous experimentation and optimization will be key to leveraging these powerful tools effectively [28].
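As an illustration of the "CTO" workflow described in Group 4, here is a minimal sketch of farming out independent component specs to parallel model calls. The `ask_model` function is a hypothetical placeholder for whatever coding agent or LLM API is in use, and the component names are invented; real setups also need per-component context and an integration/review pass afterwards.

```python
from concurrent.futures import ThreadPoolExecutor

def ask_model(component: str, spec: str) -> str:
    """Hypothetical stand-in for a call to a coding agent or LLM API."""
    return f"[draft implementation of {component} based on spec: {spec[:40]}...]"

# Clear, self-contained specs are what make parallel delegation work.
specs = {
    "auth service": "JWT login/refresh endpoints, bcrypt password hashing",
    "billing worker": "consume invoice events from a queue, retry with backoff",
    "admin dashboard": "read-only views over users and invoices",
}

# Dispatch each component to its own "team" and collect the drafts for review.
with ThreadPoolExecutor(max_workers=len(specs)) as pool:
    drafts = dict(zip(specs, pool.map(ask_model, specs.keys(), specs.values())))

for component, draft in drafts.items():
    print(component, "->", draft)
```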