DeepSeek: Technical Origins and Frontier Exploration Report
Zhejiang University · 2025-05-22 01:20
Investment Rating
- The report does not provide a specific investment rating for the industry

Core Insights
- The report discusses the evolution of large language models (LLMs) and highlights the significance of DeepSeek technology in bridging the gap between open-source and closed-source AI models, reducing the development lag from 6-12 months to 1-3 months [69]

Summary by Sections

Language Models
- Language models aim to calculate the probability of a sequence of words, enabling machines to understand human language [6]
- The report outlines the basic tasks of language models, including encoding and word embedding, which help in representing words in a way that captures their meanings [13][17]

Transformer
- The Transformer architecture introduced in 2017 revolutionized deep learning with its self-attention mechanism, allowing for parallel computation and better understanding of global context [32]
- The report emphasizes the importance of the Transformer model as a foundational technology for large models, highlighting its ability to capture complex semantic relationships through multi-head attention [33]

DeepSeek
- DeepSeek technology is positioned as a significant advancement in AI, with its architecture allowing for efficient model training and inference, thus addressing the computational demands of large models [70]
- The report details the stages of DeepSeek's development, including supervised fine-tuning and reinforcement learning, which enhance its reasoning capabilities [117][119]

New Generation Agents
- The report discusses the transition from generative models to reasoning models, indicating a shift in focus towards enhancing logical reasoning capabilities in AI systems [107]
- It highlights the integration of LLMs with agent-based systems, where LLMs serve as the brain of agents, enabling them to perform complex tasks through planning and tool usage [133]
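The self-attention mechanism mentioned above can be illustrated with a minimal sketch. This is not DeepSeek's or the report's implementation, just a generic scaled dot-product attention in NumPy (a single head; multi-head attention runs several of these in parallel and concatenates the results). All names and dimensions below are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention.

    X: (seq_len, d_model) token representations.
    Wq, Wk, Wv: (d_model, d_k) learned projection matrices.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    # (seq_len, seq_len): every token scores every other token,
    # which is why attention captures global context in parallel
    scores = Q @ K.T / np.sqrt(d_k)
    weights = softmax(scores, axis=-1)  # rows sum to 1
    return weights @ V                  # (seq_len, d_k)

rng = np.random.default_rng(0)
seq_len, d_model, d_k = 4, 8, 8
X = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv = (rng.normal(size=(d_model, d_k)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (4, 8)
```

Because the score matrix relates all token pairs at once, the whole sequence is processed in one matrix multiplication rather than step by step, which is the parallelism advantage the report attributes to the Transformer.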
Google Chief Scientist's 10,000-word speech reviews a decade of AI: which key technologies shaped today's large-model landscape?
机器人圈 · 2025-04-30 09:10
Google Chief Scientist Jeff Dean delivered a speech on important trends in artificial intelligence at ETH Zurich this April. The talk reviewed a series of key technical milestones underpinning modern AI, including neural networks and backpropagation, early large-scale training, hardware acceleration, the open-source ecosystem, architectural revolutions, training paradigms, model efficiency, and inference optimization, as well as the critical role that compute, data volume, model scaling, and innovation in algorithms and model architectures have played in advancing AI capability.

The following transcript of the speech was translated and edited by the 数字开物 team.

01 AI is changing the computing paradigm at unprecedented scale and with unprecedented algorithmic progress

Jeff Dean: Today I will discuss the important trends in AI with you. We will review: how did this field develop to today's level of model capability? At the current state of the art, what can we do? And how should we shape the future direction of AI?

This work was done together with many colleagues inside and outside Google, so it is not all my own achievement; much of it is collaborative research. Some of it was not even led by me, but I believe it is all very important and worth sharing and discussing here.

Let's start with some observations, most of which may be obvious to everyone here. First, I think the most important point is that machine learning has completely changed our understanding of and expectations for what computers can do. Think back ten years: computer vision was still in its infancy, and computers could hardly ...