SmallThinker
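Ten-billion-parameter-scale large models run smoothly on hardware costing a few hundred yuan! Shanghai Jiao Tong University and Zenergize AI open-source on-device-native large models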
QbitAI (量子位) · 2025-07-27 09:01
Core Viewpoint
- The next battleground for AI is shifting from the cloud to mobile devices, emphasizing the need for local computation to ensure user privacy and data security [2][3].
Group 1: Industry Trends
- Major smartphone manufacturers such as Apple, Huawei, Samsung, Xiaomi, and OPPO are integrating large models into mobile devices, indicating a competitive landscape for edge AI [2].
- Running AI smoothly on local devices remains a significant challenge, as evidenced by Apple's delayed launch of its core AI features [2][3].
Group 2: Technological Innovations
- A collaboration between Shanghai Jiao Tong University and the startup Zenergize AI has produced the SmallThinker series, which is designed specifically for edge computing [4].
- The SmallThinker models, including SmallThinker-4B-A0.6B and SmallThinker-21B-A3B, are optimized for local CPU inference without relying on high-end GPUs [5][23].
Group 3: Model Architecture
- SmallThinker employs an architecture that allows efficient inference on devices with limited computational resources, avoiding the need for traditional model compression techniques [6][8].
- The model has three core technical characteristics: expert knowledge activation, preemptive expert routing to minimize I/O overhead, and a hybrid sparse attention mechanism that reduces memory usage by 76% [9][12][17].
Group 4: Performance Metrics
- In an extreme memory-constrained scenario (1GB RAM), the SmallThinker-4B-A0.6B model reaches 19.91 tokens/s, significantly outperforming competitors such as Qwen3-1.7B [26][27].
- On a standard PC configuration (8GB RAM), the SmallThinker-21B-A3B model reaches 20.30 tokens/s, double the speed of Qwen3-30B-A3B [29].
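The preemptive expert routing listed under Model Architecture can be illustrated with a minimal sketch. It assumes, as an interpretation that the summary does not spell out, that the router runs before the attention block so that the selected experts can be fetched from slow storage while attention is computed. Every name here (`route_top_k`, `load_expert`, `attention`, `moe_layer`) is hypothetical, and the toy experts and identity attention are stand-ins rather than SmallThinker's actual implementation.

```python
import math
from concurrent.futures import ThreadPoolExecutor

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_logits, k):
    """Return the ids and renormalized weights of the k highest-scoring experts."""
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return top, [probs[i] / norm for i in top]

def load_expert(expert_id):
    """Hypothetical stand-in for reading one expert's weights from storage."""
    # Each toy expert simply scales its input by (expert_id + 1).
    return lambda x: [(expert_id + 1) * v for v in x]

def attention(x):
    """Hypothetical stand-in for the attention block (identity in this sketch)."""
    return list(x)

def moe_layer(x, router_logits, k=2):
    # 1. Route first, so the required experts are known before attention runs.
    expert_ids, weights = route_top_k(router_logits, k)
    with ThreadPoolExecutor(max_workers=k) as pool:
        # 2. Start loading only the selected experts in the background.
        pending = [pool.submit(load_expert, e) for e in expert_ids]
        # 3. Compute attention while those loads are in flight.
        h = attention(x)
        # 4. Wait for the experts, then mix their outputs by routing weight.
        experts = [f.result() for f in pending]
    out = [0.0] * len(h)
    for w, expert in zip(weights, experts):
        for i, v in enumerate(expert(h)):
            out[i] += w * v
    return expert_ids, out
```

Calling `moe_layer([1.0, 2.0], [0.0, 2.0, 1.0, -1.0], k=2)` selects experts 1 and 2 and never loads the other two, which is the point of sparse activation: only a fraction of the parameters are touched per token. The design choice the sketch highlights is ordering: because routing precedes attention, storage latency can overlap with computation instead of adding to it.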
Group 5: Future Directions
- The development team plans to enhance the model's capabilities by scaling up with more high-quality data and aims to create a personal AI assistant that operates entirely on individual devices [32][33].
- The vision is to integrate AI seamlessly into daily life, providing a secure, private, and intelligent experience for users [34].
Zenergize AI closes a seed round worth tens of millions of yuan to accelerate the full rollout of on-device AI
Tai Mei Ti APP · 2025-07-24 02:31
Core Insights
- Zenergize AI (本智激活, literally "BenZhi Activation") is a startup incubated at Shanghai Jiao Tong University's Institute of Parallel and Distributed Systems (IPADS), which is renowned for its expertise in operating systems and distributed systems and has ranked first globally in CSRankings over the past decade [2].
- The team, led by CEO Mi Zeyu, focuses on edge-native AI solutions and aims to transform the personal AI paradigm by addressing the privacy concerns, high costs, and lack of personalization of cloud-based AI models [2][3].
Group 1: Technological Innovations
- Zenergize AI proposes an "edge-native" full-stack design that rebuilds the software and hardware stack from the ground up, aiming to deliver true edge intelligence without sacrificing model capability [3].
- The team has achieved significant breakthroughs in edge model algorithms and infrastructure, including the release of the PowerInfer edge model infrastructure system, which runs a trillion-parameter model efficiently on a consumer-grade NVIDIA RTX 4090 GPU, reaching 90% of the performance of a data-center-class A100 GPU [4].
- PowerInfer-2, released in June 2024, runs a 47-billion-parameter model smoothly on smartphones, reportedly outperforming comparable international systems by a factor of 29 [4][5].
Group 2: Market Impact and Future Prospects
- The first batch of edge-native models will be released and open-sourced on July 26, 2025, featuring original algorithm architectures designed specifically for edge devices and able to run smoothly on budget hardware [5].
- Industry experts highlight the growing demand for privacy protection and low latency, positioning edge intelligence as a key entry point connecting the virtual and physical worlds, with Zenergize AI leading in low-cost, efficient deployment of large models on mainstream devices [6].
- The company is recognized as one of the few worldwide with both top-tier R&D capabilities and mass-production experience in edge AI, indicating strong potential for future growth and innovation in the sector [6].