Lightweight Models
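The Rise of Domestic Computing Power: A Self-Reliant Ecosystem Breaks Through, Driven by Both Domestic and External Forces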
Guotou Securities· 2026-03-04 10:43
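March 4, 2026 · Electronics · Industry in-depth analysis

Domestic and overseas demand resonate, opening a historic window for domestic computing power
Demand for domestic computing power is being pulled from several directions at once:
- Overseas cloud providers' capital expenditure has entered a new upcycle, and the pace of their compute upgrades is pulling global equipment demand along with it, creating a cyclical opportunity for cloud-computing investment in China.
- US chip controls on China keep escalating, extending from hardware to software, talent, and even cloud services to form a full-chain blockade, which has instead pushed China toward a policy and industry consensus centered on being "self-reliant and controllable".
- China is planning comprehensively from the top down: led by the national "East Data, West Computing" project and combined with local incentives such as "computing power vouchers" and "model vouchers", it has built a complete policy system covering strategic planning, infrastructure construction, and the opening of application scenarios.
- Lightweight model technology, represented by DeepSeek-V2, has made breakthroughs: innovations such as the sparse Mixture-of-Experts (MoE) architecture sharply reduce the compute burden of training and inference while maintaining high performance, clearing a key performance threshold for domestic chips to enter mainstream AI applications.

Independent technology breakthroughs strengthen the supply base of domestic computing power
Facing strong demand pull, the supply side of domestic computing power has achieved a series of substantive breakthroughs in both core technology and ecosystem building. At the hardware level, while domestic advanced process nodes continue to catch up, the Chiplet approach enables "process mixing" by decoupling the compute cores from modules such as I/O, ...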
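To make the compute saving attributed to sparse MoE above more concrete, here is a minimal sketch of top-k expert routing in plain Python. It is an illustration under stated assumptions, not DeepSeek-V2's actual design: the gate scores are random numbers standing in for a learned router, the "experts" are toy scaling functions, and the names `top_k_route` and `moe_forward` are hypothetical.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(gate_logits, k):
    """Pick the k highest-scoring experts and renormalize their weights to sum to 1."""
    probs = softmax(gate_logits)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


def moe_forward(x, experts, gate_logits, k=2):
    """Run only the selected experts and mix their outputs by gate weight."""
    out = [0.0] * len(x)
    for idx, weight in top_k_route(gate_logits, k):
        y = experts[idx](x)  # experts that were not selected are never evaluated
        out = [o + weight * yi for o, yi in zip(out, y)]
    return out


# Toy setup (assumption): 8 experts, each just scales its input by a different factor.
NUM_EXPERTS = 8
TOP_K = 2
experts = [lambda x, s=s: [s * v for v in x] for s in range(1, NUM_EXPERTS + 1)]

random.seed(0)
gate_logits = [random.gauss(0, 1) for _ in range(NUM_EXPERTS)]  # stand-in for a learned router

routes = top_k_route(gate_logits, TOP_K)
output = moe_forward([1.0, -2.0], experts, gate_logits, TOP_K)
active_fraction = TOP_K / NUM_EXPERTS  # share of expert compute actually used per token
```

With 2 of 8 experts active, each token touches a quarter of the expert parameters, which is the sense in which total model capacity and per-token compute are decoupled.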
OpenAI Keeps Pushing Lightweight Models; Unisound (云知声, 09678.HK) Leads Local Innovation with Small On-Device Voice Models
Zhong Jin Zai Xian· 2025-10-09 05:11
Core Insights
- OpenAI has launched GPT-5 Pro and gpt-realtime-mini, underscoring the trend toward lightweight, efficient AI models with strong multimodal interaction capabilities [1]
- Unisound (云知声) has established itself as a significant player in the AI industry through its "chip-cloud integration" strategy and accumulated technical expertise [1]

Group 1: Technological Advancements
- Unisound's 0.5B-parameter on-device voice model, built with distillation technology from its Shanhai large model, is now running in mass-produced vehicles from automakers such as Geely and Zhiji; it lowers hardware requirements and responds in as little as 350 ms [2]
- The company has developed a three-tier technology architecture of "general large models - industry large models - edge-side lightweight models", which has earned it both the "2025 AIEra Enterprise Innovation Award" and the "X Future Business Brand" award [2]

Group 2: Financial Performance
- In the first half of 2025, Unisound reported a 457% year-on-year increase in revenue from large-model-related businesses, surpassing 100 million RMB and accounting for 24.4% of total revenue [3]
- The edge-side voice model is deployed across four major sectors (automotive, healthcare, transportation, and government) and serves tens of millions of terminal devices [3]

Group 3: Global Expansion
- Unisound has partnered with the Nanning Municipal Government on an "ASEAN Headquarters Project" that brings edge-side voice models to transportation and cross-border healthcare scenarios in Southeast Asia [4]
- The launch of GPT-5 Pro validates the market potential for small voice models, while Unisound is pursuing localization and globalization in parallel [4]
- The company's approach reflects the innovation logic of Chinese AI enterprises: full-stack technical capability and differentiated competition through "large model technology downscaling and deep scene adaptation" [4]
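Just 0.27B Parameters: Google Open-Sources Its Smallest Gemma 3 Ever; It Runs on a Phone, and 25 Conversations Use Less Than 1% of the Battery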
36Kr · 2025-08-15 10:15
Core Insights
- Google has launched Gemma 3 270M, its smallest Gemma 3 model to date: an open model with 270 million parameters, designed for task-specific fine-tuning and showing strong instruction-following and text-structuring capabilities [2][5].

Model Performance
- In instruction-following tests, Gemma 3 270M outperformed larger models such as Qwen2.5 0.5B Instruct and matched the performance of Llama 3.2 1B [1].
- On narrowly defined tasks the model reaches performance comparable to larger models, making it suitable for offline and web-based creative tasks [3].

Model Architecture
- Gemma 3 270M has a compact architecture: of its 270 million parameters, 170 million are embedding parameters and 100 million belong to the Transformer blocks, supported by a large vocabulary of 256k tokens [4].
- The model is designed for low power consumption, using only 0.75% of the battery over 25 conversations on the Pixel 9 Pro SoC, which makes it Google's most energy-efficient Gemma model [4].

Instruction Following and Deployment
- An instruction-tuned checkpoint is released alongside the pre-trained one and can follow general instructions "out of the box" [4].
- Quantization-aware training (QAT) checkpoints are available, so the model can run at INT4 precision with minimal performance loss, which is crucial for deployment on resource-constrained devices [4].

Target Use Cases
- Gemma 3 270M is a good fit for users with high-volume, well-defined tasks who need cost-effective, rapid iteration and deployment, or who have privacy concerns [5].
- The launch challenges the misconception that more parameters always mean better performance, showing how effective a small model can be at instruction following once fine-tuned [5].
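The parameter split and the INT4 claim above can be turned into a quick back-of-the-envelope calculation. The sketch below uses only the rounded figures quoted in this digest; it assumes one embedding vector per vocabulary entry (so embedding parameters = vocabulary size × embedding width) and counts weights only, ignoring activations, the KV cache, and quantization scale factors. The helper name `weight_megabytes` is hypothetical.

```python
# Rounded figures as quoted in the digest above.
VOCAB_SIZE = 256_000
EMBEDDING_PARAMS = 170_000_000
TRANSFORMER_PARAMS = 100_000_000

total_params = EMBEDDING_PARAMS + TRANSFORMER_PARAMS

# Share of the model that is just the token-embedding table.
embedding_share = EMBEDDING_PARAMS / total_params

# Assumption: embedding params = vocab size x embedding width,
# so the rounded figures imply a width of roughly this many dimensions.
implied_width = EMBEDDING_PARAMS / VOCAB_SIZE


def weight_megabytes(n_params, bits_per_weight):
    """Approximate storage for the weights alone, in megabytes (10**6 bytes)."""
    return n_params * bits_per_weight / 8 / 1e6


footprint_mb = {
    name: weight_megabytes(total_params, bits)
    for name, bits in [("fp32", 32), ("bf16", 16), ("int8", 8), ("int4", 4)]
}

for name, mb in footprint_mb.items():
    print(f"{name}: about {mb:.0f} MB of weights")
```

Under these assumptions, nearly two thirds of the parameters sit in the embedding table, and moving from 16-bit to 4-bit weights cuts weight storage by a factor of four, which is why INT4 support matters on phones.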
Google's "Pocket Rocket" Goes Open Source: A 0.27B Model with 4 Attention Heads, Built for On-Device Use
QbitAI (量子位) · 2025-08-15 06:44
Core Viewpoint
- Google has released the open model Gemma 3 270M, which is compact and efficient, can run locally in a browser without an internet connection, and outperforms comparable models such as Qwen 2.5 [1][3][4].

Model Features
- The model has 270 million parameters: 170 million in the embedding layer and 100 million in the Transformer blocks [14].
- Its large vocabulary of 256,000 tokens lets it handle specific and rare tokens, making it a strong base for further fine-tuning in specialized domains and languages [15].
- It is designed for extreme energy efficiency, using only 0.75% of the battery over 25 conversation rounds on a Pixel 9 Pro smartphone [17].
- An instruction-tuned checkpoint, released alongside the pre-trained one, follows instructions accurately right out of the box [18].
- It supports quantization and can run at INT4 precision with minimal performance loss, which is crucial for deployment on resource-constrained devices [19].

Application Scenarios
- The lightweight approach has already proven effective in practice: in a collaboration between Adaptive ML and SK Telecom, a specialized version of Gemma 3 was fine-tuned for complex multilingual content moderation [20].
- A fine-tuned 270M model can be deployed on lightweight, low-cost infrastructure, allowing rapid iteration and deployment of customized models for specific tasks [24].
- Because it can run entirely locally without sending data to the cloud, it protects user privacy [24].
- The model suits high-volume tasks such as sentiment analysis, entity extraction, and creative writing, while significantly reducing inference cost and response time in production [27].

Getting Started
- The model is available from Hugging Face, Ollama, Kaggle, LM Studio, or Docker [25].
- It can be fine-tuned with tools such as Hugging Face, Unsloth, or JAX, and then deployed to a local environment or to Google Cloud Run [28].
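For readers unfamiliar with what "running at INT4 precision" involves, the following sketch shows symmetric 4-bit quantization of a handful of weights in plain Python. Note the swap: this is simple post-training round-to-nearest quantization with a single scale, not the quantization-aware training the digests describe (QAT simulates this rounding during training so the model learns to tolerate it). The weight values and function names are made up for illustration.

```python
def quantize_int4(weights):
    """Map floats to integers in [-8, 7] using one shared scale (symmetric quantization)."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 7 if max_abs > 0 else 1.0
    quantized = [max(-8, min(7, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the 4-bit integers."""
    return [q * scale for q in quantized]


# Made-up example weights (assumption, not taken from any real model).
weights = [0.42, -0.13, 0.07, -0.56, 0.31, 0.0, -0.29, 0.18]

quantized, scale = quantize_int4(weights)
restored = dequantize(quantized, scale)

# Rounding to the nearest level bounds the per-weight error by half a step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Each weight is stored as a 4-bit integer plus a shared scale, so the worst-case error per weight is half of one quantization step; production schemes typically use one scale per small group of weights to keep that step small.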
From Better Perception to Lightweight Deployment, Embodied AI Still Has a Long Road Ahead
自动驾驶之心 (Heart of Autonomous Driving) · 2025-07-02 02:05
Core Viewpoint
- The embodied intelligence industry is expected to see explosive growth in 2025, driven by technological advances and application pull, which together shape both the technical roadmap and the path to commercialization [1].

Group 1: Technological Developments
- Better perception and multimodal integration are crucial to embodied technologies, with particular focus on tactile perception in dexterous hands, which improves precision and feedback [1].
- Multimodal sensor fusion lets robots process several types of information at once, significantly improving the accuracy and completeness of environmental perception [1].
- Large-model-driven algorithms are improving robots' understanding of the world, particularly in humanoid robots, by strengthening perception, autonomous learning, and decision-making [1].
- Lightweight model design is becoming a pressing need for real-world deployment, which calls for low-compute, multimodal, cross-platform models [1].

Group 2: Simulation and Data Ecosystem
- Steady improvement of simulation environments and data ecosystems is vital for embodied intelligence, providing efficient training platforms for robots [1].
- Simulations grounded in physical principles help model and analyze a range of phenomena, helping robots understand physical interactions and operations [1].
- Aligning simulation with the real world remains a key challenge that researchers are working to overcome [1].

Group 3: Community and Resources
- The "Embodied Intelligence Heart Knowledge Planet" is a technical exchange platform for stakeholders in the field, with members from well-known universities and leading robotics companies [6].
- The community has compiled over 40 open-source projects and nearly 60 datasets related to embodied intelligence, along with mainstream simulation platforms and various learning pathways [6][12].
- Members can access research reports, technical learning routes, and job opportunities in the embodied intelligence sector [11][14].