Sparse Attention

Zhihu Has Accumulated 8.58 Million AI-Related Questions and 20.88 Million Professional AI Answers | Focus on WAIC 2025
Guo Ji Jin Rong Bao· 2025-07-27 12:23
Core Insights
- The rise of AI developers has made Zhihu a primary platform for launching projects and discussing AI advancements, with significant engagement from the community [1][3][4]

Group 1: Community Engagement
- Zhihu has attracted 16 million continuous learners in the technology and AI fields, along with 3.56 million deep creators in these topics, accumulating 8.58 million AI-related questions and 20.88 million professional answers [1]
- Several AI companies have actively engaged on Zhihu, including DeepSeek's exclusive release of a technical article and the launch of the humanoid robot Lingxi X2 by Zhihu user "Zhihui Jun" [3]

Group 2: Events and Interactions
- During WAIC 2025, Zhihu showcased a multi-dimensional interactive exhibition highlighting AI technology discussions and engaging activities such as "Knowledge King PK" [4]
- Zhihu organized a "Developer Recovery Night" event where numerous AI developers shared insights and experiences, emphasizing the transformative impact of large models on embodied intelligence technology [5]

Group 3: Collaborations and Publications
- Zhihu collaborated with 14 AI companies to release the "AI World Handbook," aiming to provide insights into the AI ecosystem [4]
3,700 Pre-Training Runs in Search of the "Linear Attention" Non-Consensus: A MiniMax-01 Developer Recounts a 4-Year Exploration
晚点LatePost· 2025-03-09 12:00
"我们跑的是下半场,赌的就是未来的长文本需求。" MiniMax 在今年 1 月发布了参数为 4560 亿的开源大模型 MiniMax-01,该模型就用到了他们开发的线 性注意力机制 "Lightning Attention"。 我们邀请了这个项目的负责人,MiniMax 高级研究总监钟怡然,来与我们一起聊线性注意力的研发过 程。钟怡然在 MiniMax 负责大模型网络架构设计,目前正开发多模态深度推理模型。 钟怡然曾担任上海人工智能实验室青年科学家,是新架构探索组的 PI(项目负责人);他在澳洲国立大 学获得博士学位,师从李宏东教授和 Richard Hartley 院士。他和他的团队已在一些国际顶级学术会议和 期刊上发表了 20 余篇关于模型新架构的论文,覆盖了当前多类非 Transformer 架构,如线性注意力机制 (线性注意力)、长卷积(Long Convolution)和线性循环网络(Linear RNN)。 在 2021 年,线性注意力还是一个 "看起来很美好的泡泡",怡然和团队就开始探索线性架构的实现。 嘉宾 丨 钟怡然 整理 丨 刘倩 程曼祺 上期播客中, 我们与清华的两位博士生,肖朝军和傅 ...