Multimodal Large Models
StepFun (阶跃星辰) Tech Fellow Duan Nan: Key Technologies Behind the Step-Video Model Series
AI科技大本营· 2025-03-21 06:35
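On April 18-19, the 2025 Global Machine Learning Technology Conference, co-hosted by CSDN and Boolan, will be held in Shanghai at the 上海虹桥西郊庄园丽笙大酒店 hotel. The conference has 12 technical tracks and more than 50 speakers, including academicians, IEEE Fellows, leading scholars, and hands-on technical experts from leading technology companies. They will offer their own perspectives on frontier topics such as agents, federated learning, multimodal large models, and reinforcement learning.
On the afternoon of April 18, Dr. Duan Nan, StepFun Tech Fellow and an expert in multimodal foundation models, will give a talk titled "Video Generation Foundation Models: Progress, Challenges, and Future" in the "Multimodal Large Model Frontiers" session, sharing his latest research results and forward-looking thinking on video generation foundation models.
Dr. Duan has a strong academic background and extensive industry experience. He has long worked on natural language processing, code intelligence, multimodal foundation models, and agents. He is an adjunct doctoral supervisor at the University of Science and Technology of China and at Xi'an Jiaotong University, and an adjunct professor at Tianjin University. Before joining StepFun, he spent twelve years at Microsoft Research Asia as Senior Principal Researcher and Research Manager of the Natural Language Computing group, where he contributed substantially to natural language processing and multimodal technology.
At the 2025 Global Machine Learning Technology Conference, Dr. Duan will focus on the Step-Video series of models open-sourced by StepFun, and in depth ...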
Hikvision: Tracking Report No. 4: Macro Confidence Recovers, Large-Model Deployment at Scale Begins to Monetize - 20250309
EBSCN· 2025-03-08 18:39
Investment Rating:
- The report maintains a "Buy" rating for Hikvision [5][27].
Core Views:
- The company achieved a revenue of 92.49 billion yuan in 2024, a year-on-year increase of 3.52%, while the net profit attributable to shareholders was 11.96 billion yuan, a decrease of 15.23% [3][23].
- The recovery of macroeconomic confidence is indicated by the manufacturing PMI data, which rose to 50.2% in February, entering the expansion zone [1][10].
- The integration of multi-modal large models with smart hardware is expected to drive scalable monetization for Hikvision [2][15].
Summary by Sections
Financial Performance:
- In 2024, the company reported a revenue of 92.49 billion yuan, with a growth rate of 3.52% [4][23].
- The net profit for 2024 was 11.96 billion yuan, reflecting a decline of 15.23% year-on-year [4][23].
- The earnings per share (EPS) for 2024 is projected to be 1.30 yuan, with a forecasted net profit of 14.54 billion yuan for 2025, representing a growth of 21.61% [4][28].
Business Development:
- The company is focusing on innovative business areas such as edge computing, robotics, and smart connected vehicles, with overseas business revenue exceeding 50% [3][23].
- The multi-modal large model technology is being integrated into various products, enhancing the company's competitive edge in the market [2][15].
Market Outlook:
- The report highlights the positive trend in the manufacturing sector, with a significant recovery in demand and production indices, which is expected to benefit Hikvision's performance [1][10].
- The company's strong position in the multi-modal large model space and its extensive user base across various industries are seen as key factors for long-term benefits in the evolving market landscape [3][27].
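As a rough cross-check of the figures quoted in this summary, the sketch below backs out the 2023 bases implied by the stated year-on-year rates and recomputes the 2025 forecast growth. The helper names are illustrative, not from the report, and because the inputs are the rounded values quoted above, small differences from the report's own percentages are expected.

```python
# Figures as quoted in the summary (billion yuan); rounded inputs, so results are approximate.
revenue_2024 = 92.49
revenue_growth_2024 = 0.0352         # +3.52% year on year
net_profit_2024 = 11.96
net_profit_growth_2024 = -0.1523     # -15.23% year on year
net_profit_2025_forecast = 14.54


def prior_year_base(value: float, growth: float) -> float:
    """Back out the prior-year figure implied by a value and its YoY growth rate."""
    return value / (1.0 + growth)


def yoy_growth(current: float, previous: float) -> float:
    """Year-on-year growth rate as a fraction (0.10 means +10%)."""
    return current / previous - 1.0


implied_revenue_2023 = prior_year_base(revenue_2024, revenue_growth_2024)
implied_net_profit_2023 = prior_year_base(net_profit_2024, net_profit_growth_2024)
forecast_growth_2025 = yoy_growth(net_profit_2025_forecast, net_profit_2024)

print(f"Implied 2023 revenue:      {implied_revenue_2023:.2f} bn yuan")
print(f"Implied 2023 net profit:   {implied_net_profit_2023:.2f} bn yuan")
print(f"Forecast 2025 profit growth: {forecast_growth_2025:.2%}")
```

The recomputed 2025 growth lands within a few hundredths of a percentage point of the quoted 21.61%, which is consistent with the report having used unrounded profit figures.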
[Hikvision (002415.SZ)] Macro Confidence Recovers, Large-Model Deployment at Scale Begins to Monetize: Tracking Report No. 4 (Liu Kai / Wang Zhihan)
光大证券研究· 2025-03-07 14:30
Core Viewpoint:
- The company's performance is under short-term pressure: revenue rose slightly while net profit fell sharply, indicating potential challenges ahead [2].
Group 1: Financial Performance
- In 2024, the company achieved operating revenue of 92.486 billion yuan, representing a year-on-year growth of 3.52% [2].
- The net profit attributable to shareholders was 11.959 billion yuan, a year-on-year decrease of 15.23%, indicating short-term performance pressure [2].
Group 2: Macro Environment
- The manufacturing PMI for February came in at 50.2%, returning to the expansion zone with a month-on-month increase of 1.1 percentage points, driven by a rapid post-holiday recovery in demand [3].
- The macro factors that previously suppressed the company's performance and valuation are improving significantly [3].
Group 3: Policy and Security
- The Central Political Bureau emphasized the construction of a safer China, which is expected to accelerate security and digital governance projects, potentially benefiting the company's PBG business directly [4].
Group 4: Innovation and Technology
- The company is launching a series of products based on multi-modal large models, integrating advanced technology with embedded smart hardware, aiming for broader and more efficient applications across various industries [5].
- The focus on innovative business areas such as edge computing, robotics, and smart connected vehicles is expected to catalyze growth, with overseas business now accounting for over half of total operations [6].
Agora (声网) Launches Conversational AI Engine: Letting Any Large Model Speak
36氪· 2025-03-07 09:37
Core Viewpoint:
- The article highlights the launch of Agora's conversational AI engine, which enables any text-based large model to be upgraded into a conversational multimodal model, emphasizing affordability and efficiency in AI voice interaction [2][4].
Group 1: Product Features
- The conversational AI engine supports a wide range of large model providers, including DeepSeek and ChatGPT, allowing developers to choose freely [4].
- It features low latency with a median voice conversation delay of 650ms and an intelligent interruption technology that allows for responses as low as 340ms [5].
- The engine can filter out 95% of environmental noise, ensuring accurate voice recognition, and maintains stable conversations even under poor network conditions [5].
Group 2: Development and Cost Efficiency
- Developers can deploy the AI engine with just two lines of code in about 15 minutes, significantly lowering the development barrier [6].
- The cost for AI voice interaction is set at 0.098 yuan per minute, with an initial bonus of 1000 minutes for new users [7].
- Average conversation costs are calculated to be around 0.03 yuan per interaction, making it highly economical for frequent use [8].
Group 3: Application Scenarios
- The conversational AI engine can be utilized in various applications such as smart assistants, virtual companionship, language practice, customer service, and smart hardware [10].
- It enhances the functionality of smart devices by enabling voice control and personalized services, applicable in AI toys, educational hardware, and home assistants [10].
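The pricing figures quoted for the conversational AI engine can be related with a little arithmetic. The sketch below assumes billing is strictly linear in minutes with no rounding or minimum charge, which the article does not state; the function and variable names are illustrative only.

```python
# Quoted figures from the summary.
PRICE_PER_MINUTE_YUAN = 0.098        # price of AI voice interaction per minute
AVG_COST_PER_INTERACTION_YUAN = 0.03
FREE_MINUTES_NEW_USER = 1000


def interaction_cost(minutes: float, price_per_minute: float = PRICE_PER_MINUTE_YUAN) -> float:
    """Cost of one interaction, ASSUMING purely linear per-minute billing."""
    return minutes * price_per_minute


# Average interaction length implied by the two quoted prices.
implied_minutes = AVG_COST_PER_INTERACTION_YUAN / PRICE_PER_MINUTE_YUAN
implied_seconds = implied_minutes * 60

# What the new-user bonus is worth, and how many average interactions it would cover.
free_bonus_value = interaction_cost(FREE_MINUTES_NEW_USER)
free_interactions = FREE_MINUTES_NEW_USER / implied_minutes

print(f"Implied average interaction length: {implied_seconds:.1f} s")
print(f"Value of free minutes: {free_bonus_value:.2f} yuan")
print(f"Average interactions covered by free minutes: {free_interactions:.0f}")
```

Under that linear-billing assumption, the quoted 0.03 yuan average corresponds to an interaction of well under half a minute.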
Having Collected Two "Dragon Balls", HarmonyOS (鸿蒙) and DeepSeek, iDeepWise (深思考) Delivers "Deep Thinking" for On-Device AI
36氪· 2025-02-27 10:31
Core Viewpoint:
- The integration of AI edge models and hardware modules is set to drive a significant explosion in smart terminal applications, particularly with the introduction of DeepSeek-R1 and its adaptations for various edge scenarios [1][4][5].
Summary by Sections
AI Edge Market Potential:
- The global AI edge market is projected to reach $143.6 billion by 2032, driven by applications in sectors such as medical devices, personal storage, and smart home technologies [6].
DeepSeek and Domestic Innovations:
- The integration of DeepSeek-R1 with WeChat marks a significant innovation in the domestic mobile internet landscape, showcasing the potential of large models in practical applications [4][5].
Technical Innovations by iDeepWise.ai:
- iDeepWise.ai has developed the Dongni-AMDC algorithm, which compresses the DeepSeek R1 model for edge deployment, ensuring low power consumption and high performance [8][11].
- The company has introduced the TinyDongni model, specifically designed for edge scenarios, with parameter sizes of 1.5B, 0.4B, and 0.15B, ensuring rapid response times and data security [19][21].
Collaboration with Domestic Hardware:
- iDeepWise.ai has partnered with leading domestic module manufacturers to create a comprehensive edge AI solution that integrates with various operating systems, including OpenHarmony and Linux [30][32].
- The collaboration with hardware manufacturers has reduced the development cycle for AI smart hardware by 50%, facilitating faster deployment in sectors like automotive and robotics [32].
Performance Metrics:
- The DeepSeek 1B model deployed on the Rockchip RK3588 achieves a processing speed of 10.2 tokens per second, while the TinyDongni model reaches 13.6 tokens per second, demonstrating significant advancements in edge AI performance [34][35].
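To put the two quoted throughput figures side by side, the sketch below converts tokens per second into a relative speedup and an approximate generation time. It assumes the quoted rates are steady-state decoding speeds and ignores prompt processing and time to first token, which the article does not report; the 200-token reply length is an arbitrary illustrative value.

```python
# Throughput figures quoted for the Rockchip RK3588 (tokens per second).
DEEPSEEK_1B_TPS = 10.2
TINYDONGNI_TPS = 13.6


def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Approximate time to generate num_tokens at a constant decoding rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


# Relative throughput of TinyDongni versus the DeepSeek 1B deployment.
speedup = TINYDONGNI_TPS / DEEPSEEK_1B_TPS

reply_tokens = 200  # hypothetical reply length, chosen only for illustration
deepseek_time = generation_seconds(reply_tokens, DEEPSEEK_1B_TPS)
tinydongni_time = generation_seconds(reply_tokens, TINYDONGNI_TPS)

print(f"Relative throughput: {speedup:.2f}x")
print(f"{reply_tokens}-token reply: {deepseek_time:.1f} s vs {tinydongni_time:.1f} s")
```

The ratio works out to roughly one third more tokens per second for TinyDongni, though the article does not say whether the two models were measured under the same settings.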
Real-World Applications:
- iDeepWise.ai's edge models have been successfully implemented in various applications, including AI PCs for local multimodal searches and AI microscopes for medical diagnostics, showcasing their versatility and effectiveness [40][46].
- The company has a strong focus on the healthcare sector, having developed AI solutions that have processed over 30 million cervical cancer screenings, leveraging extensive medical literature for training [47][48].
Future Outlook:
- The company anticipates a surge in AI-enabled smart terminal applications, positioning itself to meet the growing market demand for localized AI solutions that prioritize user privacy and data security [49].