Workflow
无监督学习
icon
Search documents
开学了:入门AI,可以从这第一课开始
机器之心· 2025-09-01 08:46
Core Viewpoint - The article emphasizes the importance of understanding AI and its underlying principles, suggesting that individuals should start their journey into AI by grasping fundamental concepts and practical skills. Group 1: Understanding AI - AI is defined through various learning methods, including supervised learning, unsupervised learning, and reinforcement learning, which allow machines to learn from data without rigid programming rules [9][11][12]. - The core idea of modern AI revolves around machine learning, particularly deep learning, which enables machines to learn from vast amounts of data and make predictions [12]. Group 2: Essential Skills for AI - Three essential skills for entering the AI field are mathematics, programming, and practical experience. Mathematics provides the foundational understanding, while programming, particularly in Python, is crucial for implementing AI concepts [13][19]. - Key mathematical areas include linear algebra, probability and statistics, and calculus, which are vital for understanding AI algorithms and models [13]. Group 3: Practical Application and Tools - Python is highlighted as the primary programming language for AI due to its simplicity and extensive ecosystem, including libraries like NumPy, Pandas, Scikit-learn, TensorFlow, and PyTorch [20][21]. - Engaging in hands-on projects, such as data analysis or machine learning tasks, is encouraged to solidify understanding and build a portfolio [27][46]. Group 4: Career Opportunities in AI - Various career paths in AI include machine learning engineers, data scientists, and algorithm researchers, each focusing on different aspects of AI development and application [38][40]. - The article suggests that AI skills can enhance various fields, creating opportunities for interdisciplinary applications, such as in finance, healthcare, and the arts [41][43]. Group 5: Challenges and Future Directions - The rapid evolution of AI technology presents challenges, including the need for continuous learning and adaptation to new developments [34][37]. - The article concludes by encouraging individuals to embrace uncertainty and find their passion within the AI landscape, highlighting the importance of human creativity and empathy in the technological realm [71][73].
微软副总裁X上「开课」,连更关于RL的一切,LLM从业者必读
机器之心· 2025-05-26 01:28
Core Viewpoint - The article discusses the educational series on artificial intelligence initiated by Nando de Freitas, focusing on reinforcement learning (RL) and its applications in large language models (LLMs) [1][2]. Summary by Sections Introduction to AI Education - Nando de Freitas aims to educate readers on AI through a series of posts on X, starting with reinforcement learning and gradually covering diffusion and flow matching technologies [1][2]. Learning Types - The article highlights that there is no ultimate conclusion on unsupervised learning, supervised learning, and reinforcement learning [8][19]. - Supervised learning is described as basic imitation, requiring high-quality expert data for effective learning [9]. - Reinforcement learning focuses on selective imitation, allowing agents to learn from suboptimal experiences and improve their performance [10][11]. Distributed Reinforcement Learning Systems - Modern distributed RL systems consist of two main components: Actors and Learners, where Actors interact with the environment and collect data, while Learners update the policy network based on this data [23][24]. - The importance of measuring operational durations and communication bandwidth in such systems is emphasized [24][27]. Offline Reinforcement Learning - Offline RL has unique value in scenarios like post-training LLMs, where it can leverage historical data for learning [28][29]. Single-step and Multi-step RL - The article differentiates between single-step and multi-step RL problems, with single-step focusing on immediate actions and multi-step involving planning over a series of interactions [35][39]. - The complexity of multi-step RL is noted, particularly in credit assignment issues where multiple decisions affect outcomes [40][41]. Policy Gradient and Techniques - Policy gradient methods are discussed, including the use of baseline subtraction to reduce variance in reward signals [49][56]. - The article also covers the significance of KL divergence in maintaining proximity to supervised fine-tuning strategies during post-training [69]. Importance Sampling and PPO - Importance sampling is introduced as a method to correct off-policy sample bias, with Proximal Policy Optimization (PPO) being a key technique to manage policy updates [73][78]. - The integration of various techniques in training models like DeepSeek-R1 is highlighted, showcasing the complexity of modern RL systems [81]. Future Directions - Freitas plans to expand the discussion from single-step to multi-step RL, indicating ongoing developments in the field [82].
GPT-5 有了雏形;OpenAI 和 Manus 研发 Agent 的经验;中国大公司扩大算力投资丨 AI 月报
晚点LatePost· 2025-03-08 12:17
2025 年 2 月的全球 AI 重要趋势。 文 丨 贺乾明 2025 年 2 月的 AI 月报,你会看到: 硅谷巨头的新共识:推理能力是大模型的一部分 OpenAI 和 Manus 的 Agent 开发经验 DeepSeek 推动中国大公司加大算力投入,阿里、字节两家加起来,今年就超过 2000 亿 3 家售价过亿的 AI 公司和 23 家获得超过 5000 万美元融资的 AI 公司 OpenAI 时薪 100 美元招专家生产数据提高模型能力 这一期月报中,我们开始邀请研究者、创业者和投资人提供一手视角的对每月 AI 趋势和标志性事件的评述和 洞察。 晚点 AI 月报,每月选取最值得你知道的 AI 信号。 以下是我们第 4 期 AI 月报,欢迎大家在留言区补充我们没有提到的重要趋势。 技术丨GPT-5 雏形出现,行业新共识诞生 DeepSeek 带来的冲击波继续扩散,全球大模型公司陷入混战:不论是马斯克用超过 10 万张 GPU 训练 的 Grok 3,还是 OpenAI 可能投入 10 亿美元训练的 GPT-4.5,或是 Anthropic 融合推理(reasoning) 能力的最新模型 Claude 3 ...