Lightweight Models

OpenAI Continues Its Push into Lightweight Models; Unisound (09678.HK) Leads Domestic Innovation in On-Device Small Voice Models
Zhong Jin Zai Xian· 2025-10-09 05:11
Amid this wave, Unisound (09678.HK), a leading Chinese AI company, has been cultivating this field for years; its distinctive "chip-cloud integration" strategy and deep technical accumulation are making it a force that cannot be ignored in this industry transformation.

0.5B-parameter solution already serving front-line smart terminals; commercialization capability earns dual recognition

Just as OpenAI's launch of the GPT-5 Pro small voice model set off a global race in intelligent-interaction technology, domestic AGI pioneer Unisound drew industry attention with recently disclosed deployment results: its 0.5B-parameter on-device voice model, distilled from the Shanhai large model, is running stably in mass-production vehicles from multiple automakers, including Geely and IM Motors. This technical "slimming" sharply lowers the inference-hardware requirements on end devices: even on the 8295 platform, with only 30 TOPS of compute, the model runs smoothly, with measured response latency as low as 350 ms. The company has built a full-stack architecture spanning "general large model - industry large model - on-device lightweight model," and on the strength of "efficient inference + privacy protection + multi-scenario adaptation" has won dual recognition: the "2025 AI Era Enterprise Innovation Award" and the "X Future Business Brand" award.

Commercial lead: large-model-related revenue surged 457% in the first half, covering mainstream markets across multiple industries

October 8 news: the AI field has recently seen another landmark event, with OpenAI officially launching GPT-5 Pro and the lightweight voice model GPT-realtime-mini, aiming to ...
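The article attributes the 0.5B on-device model to distillation of the larger Shanhai model. As a generic illustration of how such knowledge distillation typically works (a minimal sketch of the standard temperature-softened KL objective, not Unisound's actual pipeline), a student model is trained to match the teacher's softened output distribution:

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax: a higher T yields a softer distribution,
    # exposing more of the teacher's "dark knowledge" about wrong classes.
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_kl(teacher_logits, student_logits, temperature=2.0):
    # KL(teacher || student) on temperature-softened distributions,
    # scaled by T^2 as in the classic knowledge-distillation setup.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

# A student whose logits match the teacher's incurs zero loss;
# a mismatched student incurs a positive loss.
teacher = [2.0, 0.5, -1.0]
aligned = distillation_kl(teacher, [2.0, 0.5, -1.0])
mismatch = distillation_kl(teacher, [-1.0, 0.5, 2.0])
```

In practice this loss is combined with an ordinary cross-entropy term on ground-truth labels; the sketch shows only the teacher-matching component.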
At Only 0.27B Parameters, Google Open-Sources Its Smallest-Ever Gemma 3; It Runs on Phones, and 25 Conversations Use Less Than 1% of Battery
36Ke · 2025-08-15 10:15
Core Insights
- Google has launched Gemma 3 270M, its smallest open-source model to date: 270 million parameters, designed for task-specific fine-tuning, with strong instruction-following and text capabilities [2][5].

Model Performance
- In instruction-following tests, Gemma 3 270M outperformed larger models such as Qwen2.5 0.5B Instruct and matched the performance of Llama 3.2 1B [1].
- The model excels at specific tasks, reaching performance comparable to much larger models, which makes it suitable for offline and web-based creative tasks [3].

Model Architecture
- Gemma 3 270M pairs a lightweight yet capable architecture with its 270 million parameters: 170 million embedding parameters and 100 million Transformer-module parameters, backed by a large 256k-token vocabulary [4].
- It is designed for low power consumption, draining only 0.75% of battery over 25 dialogues on the Pixel 9 Pro SoC, making it Google's most energy-efficient Gemma model [4].

Instruction Following and Deployment
- The model ships with a pre-trained checkpoint that follows general instructions "out of the box" [4].
- It provides quantization-aware training (QAT) checkpoints, allowing operation at INT4 precision with minimal performance loss, which is crucial for deployment on resource-constrained devices [4].

Target Use Cases
- Gemma 3 270M is ideal for users with high-volume, well-defined tasks who need cost-effective, rapid iteration and deployment, or who have privacy concerns [5].
- The launch counters the misconception that larger parameter counts always mean better performance, demonstrating how effective small models can be at instruction adherence and fine-tuning [5].
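The summary notes that QAT checkpoints let the model run at INT4 precision with minimal quality loss. A minimal sketch of what symmetric 4-bit (round-to-nearest) quantization does to a weight tensor, shown as generic post-training arithmetic rather than Google's actual QAT recipe:

```python
def quantize_int4(weights):
    # Symmetric quantization to signed 4-bit codes in [-8, 7]:
    # one scale per tensor, derived from the largest magnitude.
    scale = max(abs(w) for w in weights) / 7.0
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    # Recover approximate float weights from the 4-bit codes.
    return [qi * scale for qi in q]

weights = [0.12, -0.7, 0.33, 0.05, -0.21]
q, scale = quantize_int4(weights)
restored = dequantize(q, scale)
# For in-range values the round-trip error is bounded by scale / 2.
max_err = max(abs(w - r) for w, r in zip(weights, restored))
```

QAT improves on this by simulating the rounding during training, so the weights learn to sit where the 4-bit grid loses the least accuracy; the storage win is the same either way, roughly 4x versus 16-bit weights.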
Google's Pocket Powerhouse Goes Open Source: a 0.27B Model with 4 Attention Heads, Built for the Edge
量子位· 2025-08-15 06:44
Core Viewpoint
- Google has launched the open-source model Gemma 3 270M, which is compact and efficient, runs locally in a browser without an internet connection, and outperforms comparable models such as Qwen 2.5 [1][3][4].

Model Features
- The model contains 270 million parameters: 170 million in the embedding layer and 100 million in the Transformer module, a notably lightweight architecture [14].
- Its large 256,000-token vocabulary lets it handle specific and rare terms, making it well suited to further fine-tuning for specialized domains and languages [15].
- It is designed for extreme energy efficiency, consuming only 0.75% of battery after 25 dialogue rounds on a Pixel 9 Pro smartphone [17].
- A pre-trained checkpoint enables precise instruction following right out of the box [18].
- The model supports quantization, running at INT4 precision with minimal performance loss, which is crucial for deployment on resource-constrained devices [19].

Application Scenarios
- The lightweight model has already proven effective in real-world use, for example in a collaboration between Adaptive ML and SK Telecom, where a specialized version of Gemma 3 was fine-tuned for complex multilingual content moderation [20].
- A fine-tuned 270M model can be deployed on lightweight, low-cost infrastructure, allowing rapid iteration and deployment of customized models for specific tasks [24].
- Fully local operation preserves user privacy, with no data sent to the cloud [24].
- The model suits batch tasks such as sentiment analysis, entity extraction, and creative writing, while significantly reducing inference cost and response time in production environments [27].

Getting Started
- The model is available from Hugging Face, Ollama, Kaggle, LM Studio, or Docker [25].
- It can be personalized with tools such as Hugging Face, UnSloth, or JAX, then easily deployed to local environments or Google Cloud Run [28].
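The reported 270M split (~170M embedding, ~100M Transformer) follows almost entirely from the 256k-token vocabulary. A back-of-the-envelope check, assuming a hidden width of 640 (an assumed value, not stated in the article):

```python
# Rough parameter accounting for a Gemma-3-270M-like model.
# vocab_size matches the article's 256k-token vocabulary;
# hidden_dim = 640 is an assumption for illustration.
vocab_size = 262_144   # 256 * 1024 tokens
hidden_dim = 640

embedding_params = vocab_size * hidden_dim   # token-embedding table
transformer_params = 100_000_000             # ~100M per the article
total_params = embedding_params + transformer_params
embedding_share = embedding_params / total_params
```

Under these assumptions the embedding table alone lands near the quoted 170M and makes up over 60% of the model, which helps explain both the emphasis on rare-vocabulary handling and why quantizing the embeddings dominates the memory savings.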
From Better Perception to Lightweight Deployment, Embodied Intelligence Still Has a Long Road Ahead
自动驾驶之心· 2025-07-02 02:05
Core Viewpoint
- The embodied intelligence industry is expected to see explosive growth by 2025, driven by technological advances and application pull, which are shaping both the technical roadmap and commercialization pathways [1].

Group 1: Technological Developments
- Upgrades in perception capability and multimodal integration are crucial for embodied technologies, with a focus on tactile perception, particularly in dexterous hands, enhancing precision and feedback [1].
- Multimodal sensor fusion lets robots process several kinds of information simultaneously, significantly improving the accuracy and completeness of environmental perception [1].
- Large-model-driven algorithms are enhancing robots' understanding of the world, particularly in humanoid robots, by improving perception, autonomous learning, and decision-making [1].
- Lightweight model design is becoming a pressing need for industrial deployment, requiring low-computation, multimodal, cross-platform models [1].

Group 2: Simulation and Data Ecosystem
- The continued maturing of simulation environments and data ecosystems is vital for embodied intelligence, providing efficient training platforms for robots [1].
- Simulation grounded in physical-world principles helps model and analyze a range of phenomena, aiding robots in understanding physical interaction and manipulation [1].
- Aligning simulation with real-world environments remains a key challenge that researchers are working to overcome [1].

Group 3: Community and Resources
- The "Embodied Intelligence Heart Knowledge Planet" serves as a technical exchange platform for stakeholders in the field, including members from well-known universities and leading robotics companies [6].
- The community has compiled over 40 open-source projects and nearly 60 embodied-intelligence datasets, along with mainstream simulation platforms and various learning pathways [6][12].
- Members can access a wealth of resources, including research reports, technical learning routes, and job opportunities in the embodied intelligence sector [11][14].