Multimodal Large Models
With Doubao's Daily Token Usage Past 50 Trillion, Volcano Engine Bets Its Main Battlefield on Agents
Tai Mei Ti APP· 2025-12-19 10:05
Core Insights
- The release of Doubao Model 1.8 and Seedance 1.5 pro marks a significant update in AI capabilities, particularly in multi-modal understanding and Agent functionality [2][4]
- Doubao's daily token usage has exceeded 50 trillion, a tenfold increase from the previous year, with more than 100 enterprise clients using over 1 trillion tokens [2][5]
- The advances in Agent capabilities are seen as pivotal, enabling complex applications in enterprise scenarios [4][7]

Group 1: Model Updates
- Doubao Model 1.8 has significantly improved its tool-calling ability: it can use more than 20 tools simultaneously, with planning steps reduced by 37% and execution success rates up by 21% [5]
- The model has stronger visual understanding, long-video comprehension, and document structuring, along with native support for intelligent context management [5][6]
- Seedance 1.5 pro is designed to meet growing demand for video creation, featuring cinematic narrative tension and breakthroughs in audio-visual synchronization [2][5]

Group 2: Industry Trends
- The industry is still in its early stages and technical limitations remain, but demand for multi-modal models is strong [3][7]
- The Agent era is expected to keep growing, with predictions of enterprises running 50 to 200 Agents by 2025, which will require better management and operational capabilities [10]
- Sectors such as internet, retail, automotive, and education are adopting Agent technologies rapidly, while traditional industries are slower but have high potential [7][10]

Group 3: Competitive Landscape
- Major players such as Anthropic, Google, and OpenAI are refining their models for practical applications, with a focus on economic value and real-world utility [8][10]
- Competition among large model vendors is expected to intensify as Agent capabilities become more critical in the market [10]
Volcano Engine President Tan Dai: Too Early to Talk About a Conflict Between Agents and Apps
第一财经· 2025-12-19 06:51
Core Insights
- The article discusses recent advances in AI models from ByteDance's Volcano Engine, highlighting the launch of Doubao Model 1.8 and Seedance 1.5 pro, with Doubao's daily token usage exceeding 50 trillion, up from 30 trillion in September [2].

Group 1: AI Model Developments
- Doubao's daily token usage has increased significantly, indicating growing adoption of and demand for AI solutions [2].
- The industry is still in the early stages of AI implementation, and the transition from the APP era to the Agent era is characterized as a conflict of perspectives rather than a definitive shift [2][3].
- The core value of AI lies in serving unmet needs and improving efficiency, rather than merely replacing existing platforms [2].

Group 2: Challenges and Ecosystem Readiness
- The exploration of AI and Agents is still in a trial phase: market demand is present but models are not yet fully developed, a situation expected to persist for about three more years [3].
- The ecosystem's readiness for comprehensive Agent integration depends on the improvement of Agent tools [3][4].
- Key challenges for Agents include foundational capabilities and real-world application requirements such as stability, scalability, and data security [4].

Group 3: Multi-Modal AI and Future Trends
- Multi-modal capabilities allow AI models to perform tasks in ways similar to humans, marking a shift toward deeper application scenarios [4].
- Rapid model evolution is addressing many issues, with significant advances made since last year [4].
- Competition among AI firms should focus on expanding the market and accelerating AI implementation across industries [4].

Group 4: Cloud Services and Market Dynamics
- Volcano Engine emphasizes the value of cloud services in the AI era, drawing a parallel between the growth of AI cloud services and the GPU market surpassing CPUs [5].
- The shift toward AI-driven cloud services is expected to render traditional private deployment models obsolete as the technology continues to evolve rapidly [5].
- The importance of cloud infrastructure is underscored by the difficulty fixed-capacity machines have in supporting diverse AI applications [5].
Defining a New Paradigm for E-commerce Marketing in the AI Era
Sou Hu Cai Jing· 2025-12-19 03:08
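"Transforming the industry with AI" is a slogan the e-commerce industry has been repeating for years.

In practice, though, efforts have been scattered, limited to single-point capabilities such as "Guess You Like" recommendations and image-based search. They have not gone deep enough to deliver a true system-level AI upgrade.

There are two core bottlenecks. First, AI technology in earlier years was immature and lacked the underlying capabilities to support coordination across the full pipeline. Second, system-level transformation has to reach into a product's core logic, and layering on new features and capabilities inevitably raises the barrier to use. For ordinary merchants, coping with a complex system often demands heavy investment of manpower and resources, which makes it hard to roll out technical upgrades at scale.

It was not until 2025 that Douyin E-commerce took the lead in breaking the deadlock with AI, resolving the dilemma between deep technical upgrades and a lower barrier for merchants.

The secret weapon behind it is Qianchuan Chengfang (千川・乘方), unveiled recently at the Qianchuan conference. With AI and the platform's precise user insight, Qianchuan Chengfang not only simplifies merchant operations as far as possible and improves users' content experience, but also anticipates user needs, stimulates user demand, and tailors a strategy to each user, aiming at a three-way win for merchants, users, and the platform.

So what kind of product is Qianchuan Chengfang? And how will it hold up the impossible triangle of high growth, user experience, and merchant experience for Douyin E-commerce?

Now consider the technology engine. Over the past five years, deep-learning-based recommendation models have been the core of e-commerce search and recommendation, but other technologies, multimodal AI included, have remained at the level of auxiliary tools. The core reason ...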
Volcano Engine President Tan Dai: Too Early to Talk About a Conflict Between Agents and Apps
Di Yi Cai Jing· 2025-12-18 15:26
Core Insights
- ByteDance's cloud platform Volcano Engine has released Doubao Model 1.8 and the Seedance 1.5 pro audio-video creation model, with Doubao's daily token usage exceeding 50 trillion, up from 30 trillion in September [2]
- The industry views the targeted restrictions on internet apps as a conflict between the "Agent era and the APP era," but Volcano Engine President Tan Dai believes the core value for users lies in achieving their goals more conveniently and at lower cost, regardless of the medium [2]
- Tan Dai emphasizes that AI's primary role should be to serve unmet needs more efficiently, suggesting that Web, APP, and Agent will coexist rather than replace one another [2]

Industry Readiness
- The exploration of AI and Agents is still in a trial phase: market demand is present but models are not yet fully developed, a situation expected to last about three more years [3]
- Whether the industry is ready for Agent integration depends mainly on the improvement of Agent tools, and Volcano Engine is investing significant resources to make existing functions recognizable and callable by Agents [3]
- Tan Dai notes that both the Doubao AI assistant and APPs consist of complex collections of Agents, which face challenges in foundational capabilities and real-world application requirements [3]

Multi-Modal Models
- At the end of 2025, leading domestic and international model vendors are intensifying their efforts, with multi-modal models such as Seedance 1.5 pro marking a shift toward deeper AI applications [4]
- Multi-modal capabilities allow models to "see, hear, speak, and act," moving beyond text-based interaction to practical applications such as traffic recognition and quality inspection [4]
- Tan Dai believes that although multi-modal models face data challenges, significant progress has been made compared with last year and models are advancing rapidly [4]

Cloud Services in AI Era
- Volcano Engine continues to highlight the value of cloud services in the AI era; AWS aims for its generative AI platform Bedrock to become the "largest inference engine globally," comparable to its core computing service EC2, which is currently valued at around $40 billion [4]
- Tan Dai acknowledges this trend and compares the development of MaaS (Model as a Service) to the chip business, indicating a shift from GPU training to inference [4]

Future of AI Hardware
- Tan Dai cites the early 2025 AI wave as evidence of the importance of the cloud business, noting that many users ran into problems with fixed-capacity AI hardware because of rapid technological iteration [5]
- Technologies like Agents cannot readily be deployed privately, and the fixed capabilities of all-in-one machines hinder the successful implementation of diverse AI applications [5]
- Consequently, the private all-in-one machine model of the software era is expected to be phased out in the AI era [5]
SenseTime Expects HKD 3.15 Billion Share Placement to Fund Multimodal Large Model R&D and Commercialization in Vertical Scenarios
Ge Long Hui· 2025-12-18 00:55
Group 1
- SenseTime announced the placement of 1.75 billion new Class B shares, expected to raise approximately HKD 3.15 billion, reflecting strong market confidence in the company's long-term value and development prospects [1]
- The proceeds will be used to strengthen SenseTime's leading position in full-stack AI, including building an industry-leading AI cloud and expanding its AI infrastructure [1]
- The funds will also support research and development in generative AI and the commercialization of products derived from multimodal large models, as well as the exploration of AI integration in vertical sectors such as finance and education [1]

Group 2
- SenseTime has achieved breakthroughs in product ecosystem co-construction, launching several applications based on its multimodal large model since the start of the "SenseTime Product Release Week" on December 15 [2]
- New products include Seko 2.0, described as the industry's first intelligent agent integrating creative work and multi-episode generation, which is compatible with domestic AI chips from Cambricon [2]
- Additional products, such as the AI office assistant Xiaohuanxiong 3.0 and the marketing intelligent agent Ruying, showcase SenseTime's continued leadership in integrating AI technology with practical applications [2]
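SenseTime Expects HKD 3.15 Billion Share Placement, Will Keep Expanding Its AI Infrastructure and Raising the Share of Domestic Technology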
Jin Rong Jie· 2025-12-18 00:35
Group 1
- SenseTime announced the placement of 1.75 billion new Class B shares, expected to raise approximately HKD 3.15 billion, reflecting strong institutional confidence in the company's long-term value and growth prospects [1]
- The proceeds will be used to strengthen SenseTime's leading position in full-stack AI, including building an industry-leading AI cloud and expanding its AI infrastructure [1]
- The funds will also support research and development in generative AI and the commercialization of products derived from multi-modal large models, as well as the exploration of AI applications in verticals such as finance and education [1]

Group 2
- SenseTime has made significant breakthroughs in product ecosystem co-construction, launching multiple applications based on its multi-modal large model since the start of the "SenseTime Product Release Week" on December 15 [2]
- New products include Seko 2.0, described as the industry's first intelligent agent integrating creative work and multi-episode generation, which is compatible with domestic AI chips from Cambricon [2]
- Additional products, such as the AI office assistant Xiaohuanxiong 3.0 and the marketing intelligent agent Ruying, along with innovative robots and the Kapi family, showcase SenseTime's ongoing leadership in integrating AI technology with practical applications [2]
SenseTime-W (00020) to Place 1.75 Billion New Class B Shares, Net Proceeds About HKD 3.146 Billion
智通财经网· 2025-12-17 23:19
Core Viewpoint
- SenseTime-W (00020) announced a placement agreement to issue 1.75 billion new Class B shares at HKD 1.80 per share, a discount of approximately 8.63% to the closing price of HKD 1.97 on December 17, 2025 [1]

Group 1: Placement Details
- The placement will involve at least six subscribers and will account for approximately 4.60% of the issued B shares and about 4.52% of the total issued shares as of the announcement date [1]
- Total proceeds from the placement are expected to be approximately HKD 3.15 billion, with net proceeds estimated at approximately HKD 3.146 billion [1]

Group 2: Use of Proceeds
- 30% of the net proceeds will support the company's core business development, including building an industry-leading AI cloud and expanding the scale of its AI infrastructure [1]
- 30% will be allocated to research and development of generative AI and the development of products derived from the company's multimodal large models [1]
- 20% will be used to explore the integration and application of AI in innovative verticals, including but not limited to finance and education [1]
- 20% will be reserved for general working capital [1]
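The placement figures above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch and not part of the announcement: the variable names are illustrative, and the net-proceeds figure of roughly HKD 3.146 billion is taken from the headline (净筹约31.46亿港元), so the per-purpose amounts are approximations.

```python
from decimal import Decimal

# Figures as reported in the summary above.
shares = Decimal("1750000000")         # 1.75 billion new Class B shares
price = Decimal("1.80")                # placement price, HKD per share
prev_close = Decimal("1.97")           # closing price on December 17, 2025
net_proceeds = Decimal("3146000000")   # approximate, from the headline

# Gross proceeds = shares placed x placement price.
gross_proceeds = shares * price

# Discount of the placement price to the previous close, in percent.
discount_pct = (prev_close - price) / prev_close * 100

# Stated use-of-proceeds split, as a percentage of net proceeds.
# The dictionary keys are shorthand labels chosen for this sketch.
split = {
    "core business, AI cloud and infrastructure": 30,
    "generative AI R&D and multimodal products": 30,
    "AI in innovative verticals": 20,
    "general working capital": 20,
}
allocation = {use: net_proceeds * pct / 100 for use, pct in split.items()}

print(f"Gross proceeds: HKD {gross_proceeds / Decimal(10**9):.2f} billion")
print(f"Discount to previous close: {discount_pct:.2f}%")
for use, amount in allocation.items():
    print(f"{use}: about HKD {amount / Decimal(10**6):.1f} million")
```

Run as written, the sketch reproduces the reported gross proceeds of HKD 3.15 billion and the discount of approximately 8.63%. `Decimal` is used instead of floating point so that the share count and price multiply exactly.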
SenseTime-W (00020.HK) to Place 1.75 Billion New Class B Shares, Raising HKD 3.15 Billion in Gross Proceeds
Ge Long Hui· 2025-12-17 23:07
Core Viewpoint
- SenseTime-W (00020.HK) has announced a placement agreement to issue 1.75 billion new Class B shares at HKD 1.80 per share, aiming to raise approximately HKD 3.15 billion in total proceeds [1]

Group 1: Placement Details
- The placement shares represent about 4.60% of the issued B shares and approximately 4.52% of the total issued shares as of the announcement date [1]
- Net proceeds from the placement are expected to be around HKD 3.146 billion [1]

Group 2: Use of Proceeds
- 30% of the net proceeds will support the company's core business development, including building an industry-leading AI cloud and expanding the scale of its AI infrastructure [1]
- 30% will be allocated to research and development of generative AI and the development of derivative products based on the company's multimodal large models [1]
- 20% will be used to explore the integration and application of AI technologies in innovative verticals, including but not limited to finance and education [1]
- 20% will be allocated to the company's general working capital [1]
We've Recently Received Many Questions from Students About Choosing a Direction in Embodied AI......
具身智能之心· 2025-12-17 00:05
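【具身智能之心 paper coaching is now live! One-on-one customized coaching for top-conference directions including multimodal large models, VLA, reinforcement learning, VLN, teleoperation, data collection, robot simulation, real2sim2real, end-to-end, and diffusion】 Coaching range: CCF-A to CCF-C

Let's start with some directions in embodied AI: VLN, VLA, reinforcement learning, and real2sim2real. Many beginners don't know where to start. Reinforcement learning or VLA? Traditional SLAM or VLN? Which directions need substantial compute, and which don't? Beyond that, which robot platform suits your research, what if the budget is short, and is simulation an option?

For those already working on SLAM, both VLN and VLA are good entry points. If you have a robotic arm, taking up VLA is a good choice. Those without hardware can run their experiments in simulation as far as possible, or on low-cost hardware such as the so-100. There are also many low-cost research platforms, such as mobile manipulation platforms. Quadrupeds and humanoids are better suited to reinforcement learning; VLA is too difficult on them.

The rest is a matter of methodology, and a good idea is crucial. For many new researchers, arriving at a good idea takes many stumbles. If you are new and don't know how to get started, take a look at our paper coaching.

Paper coaching is live

We have recently received inquiries from many readers, including students working on large models, traditional robotics, and mechanical engineering. ✅ Top conferences/top journals ...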
SenseTime and Cambricon Achieve Day 0 Adaptation of Multimodal Large Models, Energizing Innovation in Frontier AI Applications
智通财经网· 2025-12-16 11:25
Core Insights
- The collaboration between SenseTime and Cambricon marks a significant milestone for "domestic chips + domestic models," with the Seko series models successfully adapted on the day of their release [1][2]
- The partnership aims to strengthen the domestic AI application ecosystem, making advanced multimodal AI capabilities more accessible and cost-effective for developers and enterprises [1][3]

Group 1: Technological Collaboration
- Cambricon's "Day 0" adaptation of SenseTime's Seko series models demonstrates how quickly domestic chips can respond in support of local AI firms [2]
- The Seko series, which includes models such as SekoIDX and SekoTalk, is the core technology foundation of the Seko 2.0 intelligent agent [2]

Group 2: Ecosystem Development
- The collaboration focuses on creating a more efficient and user-friendly tiered product system to foster innovation in cutting-edge applications [3]
- SenseTime's LightX2V framework incorporates a highly compatible plugin model for domestic hardware adaptation, enabling quick adaptation to a variety of domestic hardware [3]
- The partnership will lead to further optimization work, improving model efficiency and resource utilization while lowering the barriers to using multimodal AI [3]