On-Device Large Models (端侧大模型)
The Battle for AI-Native Phones: A Showdown Among Three Camps
36Kr · 2025-05-07 12:23
Core Insights
- The smartphone industry is undergoing an AI revolution: manufacturers are increasingly integrating AI features into new products, marking a shift from traditional hardware innovation to AI-driven functionality [2][5][14]
- IDC forecasts dramatic growth in AI smartphone shipments in China, with year-on-year growth of 591% in 2024 and a penetration rate rising from 3% in 2023 to 22% [4]
- Competition among smartphone manufacturers is shifting from hardware specifications to AI capabilities, emphasizing the need for end-to-end AI design from chips to operating systems [8][13]

Group 1: Industry Trends
- The AI smartphone market in China is expected to reach 118 million units by 2025, accounting for 40.7% of the overall market [4]
- High-end smartphones priced above $600 are projected to exceed 30.9% of market share, with AI features contributing 75% of their premium pricing [4]
- The average smartphone replacement cycle has stretched to 51 months, prompting manufacturers to lean on AI features to drive consumer upgrades [5]

Group 2: Technological Developments
- The new generation of smartphones must offer advanced AI capabilities, including on-device large-model compute, system-level AI integration, and proactive service across scenarios [8][16]
- AI's impact on imaging technology is significant: real-time analysis and optimization of images extends capabilities beyond traditional photography [10][11]
- The relationship between hardware manufacturers and AI developers is evolving, with companies like Qualcomm and Huawei building ecosystems that support AI development and deployment [17][22]

Group 3: Competitive Landscape
- Major smartphone manufacturers are split into three camps: Apple, Huawei, and an open ecosystem represented by brands such as Xiaomi and Honor, each pursuing a different AI strategy [20][22]
- Huawei is positioned to lead the AI smartphone market thanks to its strong R&D investment and its technological capabilities in AI chipsets and device-cloud collaboration [22][23]
- The future of smartphones may not rest solely on traditional devices, raising questions about how AI-native smart devices will evolve beyond today's phones [23][24]
ICML 2025 Spotlight | Huawei Noah's Ark Lab proposes MoLE, a new on-device large-model architecture that cuts memory-transfer cost by 1000x
机器之心 · 2025-05-07 00:33
Core Insights
- The article introduces Mixture-of-Lookup-Experts (MoLE), a new architecture that optimizes the deployment of Mixture-of-Experts (MoE) models, particularly in resource-constrained environments [1][28]
- MoLE addresses the high memory usage and transfer delays of traditional MoE inference by replacing the experts' matrix computations with lookup tables [28]

Group 1: MoLE Architecture
- MoLE activates only the small subset of experts needed for each token during inference, sharply reducing computation while keeping a large parameter scale [1]
- The architecture allows the experts' input-output mappings to be precomputed and stored as lookup tables, enabling efficient retrieval during inference [5][6]

Group 2: Training Phase Differences
- During training, MoLE changes the routed experts' input from the previous layer's output to the shallow embedding tokens; this is what makes the lookup tables precomputable and storable [8]
- MoLE activates all routed experts during training, eliminating the need for sparse activation to control computational load [9]
- The loss consists of the language-modeling loss alone, with no additional load-balancing terms [10]

Group 3: Inference Phase Process
- At inference, MoLE constructs lookup tables from the embedding layer's weight matrix, so expert outputs can be retrieved directly by token ID [15]
- The lookup table resides in lower-tier storage; during inference the corresponding expert outputs are retrieved and loaded into memory for computation [16]

Group 4: Performance and Efficiency
- MoLE's inference-time computational complexity is comparable to dense models and traditional MoE models, while transfer overhead drops significantly [17]
- Experimental results indicate that MoLE matches MoE performance while cutting transfer cost by more than a thousand times [20][28]

Group 5: Experimental Results
- Experiments on the Pile dataset show that MoLE matches MoE's performance at the same training-parameter and inference-activation budgets [20]
- MoLE shows lower inference latency than MoE, especially in batch decoding scenarios, highlighting its advantage in high-throughput tasks [28]
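The lookup-table idea above can be illustrated with a toy sketch. This is not the authors' code: the expert shapes, router, and sizes are invented stand-ins. The key point it demonstrates is that once experts take embedding tokens (rather than hidden states) as input, every expert output can be precomputed per vocabulary ID offline, so inference replaces expert matrix multiplications with table lookups keyed on the token ID.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, d_model, n_experts, top_k = 100, 16, 4, 2

# Shallow embedding table: in MoLE, routed experts read these embedding
# tokens instead of the previous layer's output.
embedding = rng.standard_normal((vocab_size, d_model))

# Each "expert" here is a tiny two-layer MLP with random stand-in weights.
def make_expert():
    w1 = rng.standard_normal((d_model, 4 * d_model)) * 0.1
    w2 = rng.standard_normal((4 * d_model, d_model)) * 0.1
    return lambda x: np.maximum(x @ w1, 0.0) @ w2

experts = [make_expert() for _ in range(n_experts)]

# Offline step: precompute every expert's output for every vocabulary ID.
# Shape (n_experts, vocab_size, d_model); this table can sit in slower
# storage (e.g., flash) and be fetched row-by-row at inference time.
lut = np.stack([e(embedding) for e in experts])

w_router = rng.standard_normal((d_model, n_experts)) * 0.1

def mole_layer(hidden, token_ids):
    """Inference: expert matmuls are replaced by lookups keyed on token ID."""
    logits = hidden @ w_router
    probs = np.exp(logits) / np.exp(logits).sum(-1, keepdims=True)
    out = np.zeros_like(hidden)
    for b, tid in enumerate(token_ids):
        top = np.argsort(probs[b])[-top_k:]       # route to top-k experts
        for i in top:
            out[b] += probs[b, i] * lut[i, tid]   # table lookup, no matmul
    return hidden + out

ids = np.array([3, 17, 42])
y = mole_layer(embedding[ids], ids)
print(y.shape)  # (3, 16)
```

The per-token cost at inference is a small router matmul plus `top_k` table reads, which is why the transfer savings reported above are possible: expert weights never need to be moved into memory at all.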
A new speed record for smart cars: in just 10 months, the first purely on-device large model reaches mass production in vehicles!
量子位 · 2025-04-24 10:29
Core Viewpoint
- The article highlights rapid advances in automotive AI, focusing on the on-device large model developed by Mianbi Intelligent, which achieves remarkable speed and efficiency in vehicle applications and raises the industry's bar for in-car AI integration [1][14]

Group 1: Product Launch and Features
- Mianbi Intelligent's cpmGO, an intelligent assistant driven by a purely on-device large model, was showcased at the Shanghai Auto Show, marking a significant milestone in automotive AI [9][12]
- cpmGO offers 91% execution accuracy, local data processing, and robust performance under weak network conditions, making it a pioneering product in the industry [10][28]
- The product integrates multi-modal perception and interaction, letting users control vehicle functions through voice commands with high accuracy [30][31]

Group 2: Technological Innovations
- cpmGO is powered by MiniCPM, which runs entirely on the vehicle's local system, ensuring data privacy and rapid response times [27][28]
- The system's GUI Agent can understand and execute on-screen commands, enhancing user interaction by performing tasks autonomously based on context [33][36]
- Collaboration with major chip manufacturers such as Qualcomm and Intel supports the optimization of cpmGO across platforms, ensuring compatibility and performance [11][13]

Group 3: Industry Impact and Future Trends
- The automotive industry is shifting toward on-device AI models that depend less on cloud services, addressing issues like latency and data security [38][42]
- The partnership between Mianbi Intelligent and Intel aims to define the next generation of in-vehicle AI systems, emphasizing the importance of local processing capability [40][48]
- On-device models are seen as a response to the shortcomings of cloud-based solutions, positioning them as the future of automotive intelligence [44][46]
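To make the on-device value proposition concrete, here is a minimal, entirely hypothetical sketch of local voice-command dispatch. None of these names come from cpmGO: `VehicleState`, the handlers, and the keyword matching are all invented, and the keyword table stands in for the on-device model that would actually map free-form speech to an intent. What it illustrates is the architectural point from the summaries: every step runs locally, so the command path works with zero network connectivity.

```python
from dataclasses import dataclass

@dataclass
class VehicleState:
    """Hypothetical vehicle state; fields are illustrative only."""
    window_open: bool = False
    ac_temp_c: int = 24

# Intent handlers run locally, so commands work with no network at all.
def open_window(state: VehicleState) -> str:
    state.window_open = True
    return "window opened"

def set_ac(state: VehicleState, temp_c: int) -> str:
    state.ac_temp_c = temp_c
    return f"AC set to {temp_c}C"

def route(utterance: str, state: VehicleState) -> str:
    """Map an utterance to a handler. In a real system an on-device model
    would do this step; a keyword table stands in for it here."""
    text = utterance.lower()
    if "window" in text:
        return open_window(state)
    if "air" in text or "ac" in text:
        return set_ac(state, 22)
    return "sorry, I didn't catch that"

state = VehicleState()
print(route("please open the window", state))  # window opened
```

Because the dispatch loop never leaves the vehicle, latency is bounded by local inference rather than network round trips, which is the core argument the article makes for on-device models in cars.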