Meta Lavishes $200 Million on Him: SJTU Alumnus Ruoming Pang Shares His Latest Paper from Apple
机器之心 (Jiqizhixin) · 2025-07-10 10:49

Core Viewpoint
- The article covers Ruoming Pang's move from Apple to Meta, his work on Apple's foundation models, and AXLearn, a modular large-model training system designed for heterogeneous infrastructure.

Group 1: Ruoming Pang's Transition
- Ruoming Pang, head of Apple's foundation model team, is joining Meta's newly established superintelligence team, with a reported offer of $200 million [2][3].
- Despite the move, Pang is still promoting work from his time at Apple by sharing his research on AXLearn [3][4].

Group 2: AXLearn Overview
- AXLearn is a production-grade system for training large-scale deep learning models, with an emphasis on scalability and high performance [6].
- The system has a modular design and comprehensive support for heterogeneous hardware infrastructure, so features such as Rotary Position Embeddings (RoPE) can be integrated with minimal code [6][8].
- The paper introduces LoC-complexity, a measure of modularity based on lines of code. By this measure, AXLearn's complexity stays constant as the system expands, whereas other systems show linear or quadratic growth [7][23].

Group 3: Performance Evaluation
- AXLearn's training performance is compared with PyTorch FSDP, Megatron-LM, and MaxText across several hardware platforms, and its iteration times and throughput are competitive [26][29].
- In weak-scaling experiments the system scales nearly linearly, indicating that it handles growing workloads robustly [30].

Group 4: Production Use and Impact
- AXLearn has grown from a tool used by a few developers into a large platform on which hundreds of developers train models with billions to trillions of parameters [35].
- It can support more than 10,000 concurrent experiments and is deployed across a range of heterogeneous hardware clusters, contributing to features used by billions of users [36][37].
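
The modularity claim in Group 2 (swapping in a feature such as RoPE with minimal code, at constant LoC-complexity) can be pictured with a small sketch. This is not AXLearn's API: every class and function name below (`LearnedPositionalEmbedding`, `RotaryPositionalEmbedding`, `replace_config`, and so on) is hypothetical, and plain Python dataclasses stand in for whatever configuration system AXLearn actually uses. The point is only that when a model is described as a tree of configs, one generic traversal can replace a component everywhere, so the size of the change does not depend on how many layers or model variants exist.

```python
from dataclasses import dataclass, fields, is_dataclass, replace
from typing import Any, Callable, Tuple


# Hypothetical config classes, for illustration only (not AXLearn's API).
@dataclass(frozen=True)
class LearnedPositionalEmbedding:
    dim: int


@dataclass(frozen=True)
class RotaryPositionalEmbedding:
    dim: int
    theta: float = 10000.0


@dataclass(frozen=True)
class AttentionConfig:
    num_heads: int
    pos_emb: Any


@dataclass(frozen=True)
class LayerConfig:
    attention: AttentionConfig
    hidden_dim: int


@dataclass(frozen=True)
class ModelConfig:
    layers: Tuple[LayerConfig, ...]


def replace_config(node: Any, target: type, make_new: Callable[[Any], Any]) -> Any:
    """Return a copy of the config tree with every `target` instance replaced."""
    if isinstance(node, target):
        return make_new(node)
    if isinstance(node, tuple):
        return tuple(replace_config(child, target, make_new) for child in node)
    if is_dataclass(node) and not isinstance(node, type):
        # Rebuild this dataclass with each field recursively rewritten.
        updates = {
            f.name: replace_config(getattr(node, f.name), target, make_new)
            for f in fields(node)
        }
        return replace(node, **updates)
    return node  # Leaf value (int, float, ...): leave unchanged.


# A 4-layer model that starts with learned positional embeddings.
model = ModelConfig(
    layers=tuple(
        LayerConfig(AttentionConfig(8, LearnedPositionalEmbedding(64)), 512)
        for _ in range(4)
    )
)

# The swap itself is a single call, regardless of the number of layers.
rope_model = replace_config(
    model,
    LearnedPositionalEmbedding,
    lambda old: RotaryPositionalEmbedding(dim=old.dim),
)
```

In this sketch the original `model` is left untouched because the configs are frozen, and the swap stays a single call whether the model has 4 layers or 400. That is the intuition behind a constant LoC-complexity, though the article does not describe how AXLearn achieves it internally.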