Foundation Models

AI: Inclusive and Transformative | Manish Gupta | TEDxIITGandhinagar
TEDx Talks· 2025-07-28 16:02
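AI Development and Applications
- DeepMind's mission is to build AI responsibly to benefit humanity. Deep learning has become the best approach to problems such as image classification, speech recognition, and machine translation [5][6]
- The Transformer architecture enabled large language models, which are trained on large amounts of public data and can solve a wide range of problems [8]
- Modern foundation models (LLMs) have moved beyond text to become multimodal models that handle text, handwritten text, and images, opening up possibilities for learning approaches such as personalized tutoring [11][12]
- Gemini 1.5 Pro supports a context window of up to 1 million multimodal tokens, so it can take large amounts of information as input [15]
- AI agents are not limited to simple chatbots; they can also interact by voice and even interact in real time in 3D worlds [16]
AI Inclusivity and Accessibility
- The industry is working to close the gap in AI capability between English and other languages (especially Indian languages), with the goal of developing models that understand more than 125 Indian languages [19][20][21][22]
- Project Vani, a collaboration with the Indian Institute of Science, aims to collect speech data from every corner of India, with the goal of gathering data from every region of the country to cover more zero-corpus languages [24][25]
AI Applications in Specific Domains
- The industry is building the foundational layer of a digital agriculture stack, using satellite imagery to identify farm field boundaries, crop types, and water sources so that farmers can be offered personalized services such as crop insurance [26][27][28]
- By predicting protein structures, AlphaFold shortened research that previously took 5 years to a few seconds. It completed predictions for 200 million protein structures in under a year and made the data freely available, greatly accelerating scientific discovery [29][30][31][32]
Future Outlook
- The industry hopes AI can help more people make Nobel Prize-level contributions [35]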
Waymo's EMMA: Teaching Cars to Think - Jyh Jing Hwang, Waymo
AI Engineer· 2025-07-26 17:00
Autonomous Driving History and Challenges
- Autonomous driving research started in the 1980s with simple neural networks and evolved to end-to-end driving models by 2020 [2]
- Scaling autonomous driving is hard because it requires handling long-tail events and rare scenarios [5][7]
- Foundation models such as Gemini show promise in generalizing to rare driving events and responding appropriately [8][9][10][11]
EMMA: A Multimodal Large Language Model for Autonomous Driving
- The company is exploring EMMA, a driving system built on Gemini that takes routing text and camera input and predicts future waypoints [11][12][13][14]
- EMMA is self-supervised, camera-only, and free of high-definition maps, and it achieves state-of-the-art quality on the nuScenes benchmark [15][16][17]
- Chain-of-thought reasoning is incorporated into EMMA, allowing the model to explain its driving decisions and improving performance on a 100k dataset [17]
Evaluation and Validation
- Evaluation is crucial to the success of autonomous driving models and includes open-loop evaluation, simulation, and real-world testing [25]
- Generative models are being explored for sensor simulation, so the planner can be evaluated under varied conditions such as rain and different times of day [26][27][28]
Future Directions
- The company aims to improve generalization and scale autonomous driving by leveraging foundation models [30]
- Training on larger datasets improves the quality of the planner [19][20]
- The company is exploring training on multiple tasks, such as 3D detection and road graph estimation, to create a more generalizable model [21][22][23][24]
Software Tools To Make Robots
Y Combinator· 2025-05-13 05:57
Robotics hasn't had its ChatGPT moment yet, but we think it is almost here. Everyone has known that robots are the future, but that future proved elusive because previous generations of robots were expensive, brittle, and only worked in controlled conditions. With the rapid improvements in foundation models, it is finally possible to make robots that have human-level perception and judgment. That has been the missing piece. While consumer use cases feature heavily in science fiction, some of the overlooked and most imm ...