Mixture of Experts (MoE)
The "One Step" from Language to Consciousness: How Far Does AI Still Have to Go?
Tencent Research Institute (腾讯研究院) · 2025-06-26 07:58
This article is reposted from 追问nextquestion. Written by George Musser; compiled and translated by 张旭晖.

The ultimate dream of artificial intelligence has never been limited to building a game engine that can beat chess grandmasters, or designing a chatbot that sways people with glib talk. Its real mission is to serve as a mirror of human intelligence, helping us understand ourselves more deeply.

Likewise, researchers' goal is not narrow AI. What they pursue is artificial general intelligence (AGI): an intelligent system with human-like adaptability and creativity.

Admittedly, the problem-solving ability of today's large language models (LLMs) has impressed most researchers, but they still have obvious weaknesses, such as the lack of continual learning: once trained on books, web text, and similar material, their knowledge base is frozen and can no longer be "updated". As Ben Goertzel of the AI company SingularityNET vividly puts it: "You can't send a large language model to college; it couldn't even get into kindergarten." They cannot pass the comprehensive exam dubbed the "robot college entrance exam".

Having "mastered" language, how far is that from simulating thought? In language processing, today's LLMs do display what experts call AGI's "formal competence": even if you provide ...
The Legendary Figure Who Has "Always" Held Center Stage in Large-Model Technology
QbitAI (量子位) · 2025-05-10 02:39
Core Viewpoint
- The article highlights the significant contributions of Noam Shazeer in the AI field, particularly in the development of large language models (LLMs) and the Transformer architecture, emphasizing his role as a key figure in the evolution of AI technologies [9][10][12].

Group 1: Contributions to AI Technology
- Shazeer is recognized as one of the most influential authors of the Transformer model, credited with pivotal advancements such as the introduction of the Mixture of Experts (MoE) architecture (see the sketch after this summary) [10][18][24].
- His work on the paper "Attention Is All You Need" in 2017 is considered a foundational moment for LLMs, leading to widespread adoption and further innovations in the field [18][23].
- Shazeer has consistently anticipated technological trends, contributing to various breakthroughs, including the GShard framework for scaling models and the Switch Transformers, which achieved a parameter count of 1.6 trillion [30][33][41].

Group 2: Career and Achievements
- Shazeer has a remarkable academic and professional background, having achieved a perfect score at the International Mathematical Olympiad in 1994 and later studying at Duke University [50][52].
- He joined Google as employee number 200 and made significant contributions to various projects, including Google's search spelling correction and the development of machine learning systems for ad ranking and spam detection [55][56].
- After a brief period away from Google, he co-founded Character.AI, which gained a valuation of $1 billion before being acquired by Google for $2.7 billion, leading to his return to the company [67][69].

Group 3: Impact on the Industry
- Shazeer's innovations have laid the groundwork for current AI models, with many contemporary systems, including GPT-4 and others, building upon his research [41][44].
- His development of the Adafactor optimizer and Multi Query Attention (MQA) has been crucial for enhancing the efficiency of large models [43][44].
- The article concludes that Shazeer's foresight and contributions have positioned him as a defining figure in the current era of AI, with his work continuing to influence the direction of the industry [11][12][40].
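To make the Mixture of Experts idea referenced above concrete, here is a minimal, illustrative routing sketch in PyTorch: a gating network scores a set of expert feed-forward networks for each token, and only the top-k experts are evaluated. This is a toy example under stated assumptions, not Shazeer's GShard or Switch Transformer implementation; the class and parameter names (SimpleMoE, num_experts, top_k) are hypothetical and chosen for demonstration only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleMoE(nn.Module):
    """Illustrative top-k Mixture-of-Experts layer (not the GShard/Switch implementation)."""

    def __init__(self, d_model: int, d_hidden: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        # Gating network: produces one score per expert for every token.
        self.gate = nn.Linear(d_model, num_experts, bias=False)
        # Each expert is a small feed-forward network.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.ReLU(), nn.Linear(d_hidden, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model) -> flatten tokens for routing.
        tokens = x.reshape(-1, x.shape[-1])
        logits = self.gate(tokens)                          # (tokens, num_experts)
        weights, indices = logits.topk(self.top_k, dim=-1)  # keep only the top-k experts per token
        weights = F.softmax(weights, dim=-1)                # normalize the kept scores

        out = torch.zeros_like(tokens)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = indices[:, slot] == e                # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(tokens[mask])
        return out.reshape(x.shape)

# Usage: route a batch of token embeddings through the sparse layer.
layer = SimpleMoE(d_model=64, d_hidden=256)
y = layer(torch.randn(2, 10, 64))
print(y.shape)  # torch.Size([2, 10, 64])
```

The design point this sketch tries to capture is the one the article attributes to sparse MoE models such as Switch Transformers: total parameter count grows with the number of experts, while per-token compute stays roughly fixed because only top_k experts run for any given token.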