Multimodal Agent
After Raising 3.5 Billion in Funding, Kimi's Mystery Model Appears in the Arena
量子位· 2026-01-05 05:00
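The Twitter user who discovered the new model asked it to identify itself, and the model answered that it comes from Moonshot AI (月之暗面) Kimi, with training data up to January 2025. Another user noted that Kiwi-do has shown some interesting results, especially in the arena. So what is Kiwi-do's real identity?

克雷西, reporting from 凹非寺
量子位 | WeChat official account QbitAI

After raising 3.5 billion in funding, is Kimi's new model about to follow?!

A mystery model named Kiwi-do has quietly appeared in the large-model arena.

Is the mystery model K2-VL?

The blogger who first spotted Kiwi-do began by comparing it with the already released K2-Thinking on SVG drawing. The subjects were a pelican riding a bicycle and a game controller, and Kiwi-do's renderings differ noticeably from K2-Thinking's.

Beyond that difference in SVG drawing, however, there is no further information from which to infer the model's identity.

Other users speculated that it may be a small-parameter model.

Kenshi AI (@kenshii_ai), January 4: "Small param model?" ...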
Global Big-Company News | Moore Threads Discloses Its GPU Roadmap for the First Time
Wind万得· 2025-12-21 22:35
Group 1
- ByteDance announced the release of the Doubao large model 1.8 and the Seedance 1.5 Pro video generation model, entering the "multimodal agent" field, with enterprise users able to access them via the Volcano Engine API starting December 23 [2]
- Changan Automobile received the first L3-level autonomous driving license in China, marking the country's advancement in commercializing autonomous driving [2]
- Moore Threads unveiled its new GPU architecture "Huagang" at the MUSA Developer Conference, boasting a 50% increase in computing power density and a 10-fold efficiency improvement [3]

Group 2
- SoftBank Group is working to finalize a $22.5 billion investment in OpenAI by year-end, potentially using its stake in Arm as collateral [3]
- Guizhou Bailing faced penalties totaling 25.6 million yuan due to false records in multiple annual reports, with its stock being suspended and then marked as ST [5]
- Alibaba's DingTalk initiated a secret project "D Plan" to enter the AI hardware market, speculated to launch smart hardware products [5]

Group 3
- OpenAI improved its "compute margin" to 70% as of October, significantly up from 52% at the end of 2024 [8]
- Nike projected a low single-digit revenue decline for Q3, reflecting weak consumer demand and increased market competition [8]
- Tesla's CEO Elon Musk had a legal victory restoring his $55-56 billion compensation plan, which may impact the company's governance structure [8]

Group 4
- Samsung Electronics launched the world's first 2nm mobile application processor Exynos 2600, with AI computing power increased by 113% compared to the previous generation [10]
- Toyota launched the new Levin L and Corolla models, with prices starting at 129,800 yuan and 99,000 yuan respectively, while also expanding its hydrogen network in California [10]
- Mitsubishi UFJ Financial Group acquired a 20% stake in Shriram Finance, part of a broader trend of mergers and acquisitions in Japan [10]

Group 5
- BMW Group opened a battery recycling center in Bavaria, capable of processing several tons annually, utilizing innovative direct recycling technology [14]
- LVMH continued to invest in high-end beauty brands to strengthen its competitive position in the beauty market [14]
- Swedish Stegra's green steel plant project has surpassed 50% installation progress of its electrolyzers, aiming for production in 2026 [14]
Volcano Engine FORCE Conference Tracker (1): Doubao 1.8 / Seedance 1.5 Pro Released
Haitong Securities International· 2025-12-21 13:32
Investment Rating
- The report does not explicitly state an investment rating for the industry or specific companies involved.

Core Insights
- The launch of Doubao Large Model 1.8 and Seedance 1.5 Pro at the Volcengine FORCE Conference indicates significant advancements in AI capabilities, particularly in multimodal applications and audio-video synchronization [1][13]
- Doubao's average daily token usage has exceeded 50 trillion, reflecting a more than 10-fold year-over-year increase, and it serves over 100 enterprise customers, indicating successful scaling in production environments [1][14]
- The introduction of the "AI Savings Plan" aims to transition AI model consumption from fragmented trials to centralized procurement, reducing friction costs for enterprises [4][17]

Summary by Sections

Doubao Large Model 1.8
- Doubao 1.8 focuses on solving the "last mile" issue for enterprise Agent deployment, enhancing multi-tool orchestration and reliable execution under complex instructions [2][15]
- The model's capabilities are designed to support high-value scenarios such as quality inspection and retail operations, directly impacting ROI considerations for enterprise clients [2][15]

Seedance 1.5 Pro
- Seedance 1.5 Pro offers high-fidelity audio-visual synchronization and multilingual lip-sync capabilities, addressing common challenges in AI video generation [3][16]
- The "Draft Preview" mechanism introduced in Seedance 1.5 Pro significantly improves creation efficiency by approximately 65%, facilitating standardized production processes in various sectors [3][16]

Enterprise Solutions
- The AgentKit and HiAgent platforms are designed to streamline deployment and integration costs for enterprises, addressing challenges in permission management and system observability [4][17]
- The combination of model capabilities, platform tools, and pricing mechanisms aims to lower the total cost of ownership (TCO) for enterprises, fostering customer loyalty and reducing barriers to AI deployment [4][17]
The Doubao Family Keeps Pushing: Are Agents the Next Battleground?
Zheng Quan Shi Bao Wang· 2025-12-21 07:17
Group 1
- ByteDance has launched the Doubao Model 1.8, marking a significant advancement in AI technology, particularly in the "multimodal agent" sector [1]
- The release of Doubao 1.8 indicates a shift from cognitive capabilities to collaborative functionalities, aiming to create AI as a digital employee with execution power rather than just a knowledge responder [1]
- The introduction of the Seedance 1.5 Pro video generation model further accelerates the integration of AI into core production systems, enhancing the company's position in the video creation market [2]

Group 2
- The Seedance 1.5 Pro model features an innovative native audio-video joint generation architecture, achieving millisecond-level audio-visual synchronization [2]
- A new "Draft Sample" feature will be launched to lower creation costs and barriers, allowing creators to preview low-resolution samples that closely match the final output, improving overall efficiency by 65% and reducing ineffective creation costs by 60% [2]
- Volcano Engine has introduced the "AI Savings Plan," offering tiered discounts on large model products, enabling companies to save up to 47% on costs [3]
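
The summary says only that the "AI Savings Plan" offers tiered discounts with savings of up to 47%; it does not publish the tiers. As a rough illustration of how a tiered discount of this kind could be applied, here is a minimal Python sketch. The thresholds and intermediate rates in `HYPOTHETICAL_TIERS` are invented for the example, and only the 47% ceiling comes from the text.

```python
# Hypothetical tiers: (minimum monthly commitment in yuan, discount rate).
# Only the 0.47 ceiling is taken from the article; every other number is made up.
HYPOTHETICAL_TIERS = [
    (0, 0.00),
    (10_000, 0.10),
    (100_000, 0.25),
    (1_000_000, 0.47),
]


def discount_rate(commitment: float) -> float:
    """Return the rate of the highest tier whose threshold the commitment meets."""
    if commitment < 0:
        raise ValueError("commitment must be non-negative")
    rate = 0.0
    for threshold, tier_rate in HYPOTHETICAL_TIERS:
        if commitment >= threshold:
            rate = tier_rate
    return rate


def discounted_cost(list_price_cost: float, commitment: float) -> float:
    """Apply the tier discount to a bill computed at list price."""
    return list_price_cost * (1.0 - discount_rate(commitment))


if __name__ == "__main__":
    for commitment in (5_000, 50_000, 2_000_000):
        print(commitment, discount_rate(commitment), discounted_cost(commitment, commitment))
```

Under these assumed tiers, a customer only reaches the headline 47% at the top commitment level, which is one plausible reading of "up to" in the announcement.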
Doubao Large Model Daily Calls Surpass 50 Trillion Tokens; Volcano Engine Deepens the Agent Ecosystem Shift for the AI Era
Xin Lang Cai Jing· 2025-12-19 20:27
Core Insights
- The article discusses the advancements in AI technology, particularly focusing on the launch of Doubao Model 1.8 and Seedance 1.5 Pro by Volcano Engine, highlighting their capabilities in multi-modal understanding and content creation [3][4][6].

Group 1: Doubao Model 1.8
- Doubao Model 1.8 has significantly enhanced its multi-modal understanding capabilities, increasing video frame understanding from 640 to 1280 frames, which supports various applications like online education and industrial quality inspection [4][5].
- The model's tool usage and complex instruction adherence capabilities have been improved, making it suitable for enterprise-level tasks that require planning and execution [5][6].
- Doubao Model 1.8 supports a context window of 256K and offers API management for context, optimizing performance while reducing costs [5][6].

Group 2: Seedance 1.5 Pro
- Seedance 1.5 Pro introduces a native audio-video joint generation architecture, allowing for real-time synchronization of audio and visual elements, enhancing the realism of generated videos [6][7].
- The model supports multi-language dialogue and precise lip-syncing, significantly improving the global creative potential of video content [7][8].
- A "Draft Sample" feature will be launched to allow creators to preview low-resolution samples, increasing efficiency by 65% and reducing ineffective production costs by 60% [8].

Group 3: AI Cloud-Native Architecture
- Volcano Engine is transitioning to an AI cloud-native architecture to support the scaling of enterprise Agent applications, addressing challenges in identity management and system integration [9][10].
- The AgentKit platform has been upgraded to cover the entire lifecycle of Agent development, deployment, and management [9].
- The average number of intelligent agents per enterprise is expected to increase from dozens in 2024 to over 200 in 2025, with applications expanding from consumer entertainment to serious production scenarios [10].
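
To give a feel for what moving from a 640-frame to a 1280-frame budget means for long videos, the following Python sketch computes a uniform sampling stride that keeps a clip within a given frame budget. Uniform striding is an assumption made for illustration; the article does not describe how Doubao Model 1.8 actually selects frames, and the function names here are hypothetical.

```python
import math


def sampling_stride(total_frames: int, frame_budget: int) -> int:
    """Smallest stride such that taking every stride-th frame fits the budget."""
    if total_frames <= 0 or frame_budget <= 0:
        raise ValueError("total_frames and frame_budget must be positive")
    return max(1, math.ceil(total_frames / frame_budget))


def sample_indices(total_frames: int, frame_budget: int) -> list[int]:
    """Indices of the frames kept under uniform striding."""
    stride = sampling_stride(total_frames, frame_budget)
    return list(range(0, total_frames, stride))


if __name__ == "__main__":
    # Hypothetical one-hour clip at 30 frames per second.
    total = 60 * 60 * 30
    for budget in (640, 1280):
        stride = sampling_stride(total, budget)
        kept = len(sample_indices(total, budget))
        print(f"budget={budget} stride={stride} kept={kept} seconds_between={stride / 30:.2f}")
```

For the hypothetical one-hour, 30 fps clip (108,000 frames), the stride drops from 169 frames at a 640-frame budget to 85 frames at 1280, so a sampled frame lands roughly every 2.8 seconds instead of every 5.6 seconds.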
Big Tech's Battle over Multimodal Agent Capabilities Is in Full Swing
Zheng Quan Ri Bao· 2025-12-18 15:40
Core Insights
- Volcano Engine officially launched Doubao-Seed-1.8 and Seedance 1.5 Pro at the FORCE conference, marking a significant advancement in the multi-modal agent landscape [1]
- The daily token usage of Doubao model has surpassed 50 trillion, representing over a tenfold increase compared to the same period last year, with more than 100 enterprise clients using over 1 trillion tokens [1]

Group 1: Product Development
- Doubao-Seed-1.8 focuses on enhancing the capabilities of multi-modal agents, optimizing for complex instruction adherence and operational capabilities [2]
- The model's video understanding capability has been upgraded to process 1280 frames per video, enabling high-precision analysis of lengthy visual information [2]
- Seedance 1.5 Pro showcases advanced multi-modal integration, achieving millisecond-level audio-visual synchronization and addressing long-standing issues in AI video generation [2]

Group 2: Industry Trends
- The launch signifies a shift in the large model industry from parameter competition to a focus on multi-modal agents, emphasizing full-chain execution capabilities [3]
- The IT infrastructure is transitioning from function-driven to intelligence-driven paradigms, with Volcano Engine's AI cloud-native architecture indicating a future dominated by agent-centric intelligent networks [3]
- Large model applications are overcoming scalability barriers related to cost and stability [3]

Group 3: Competitive Landscape
- Major cloud vendors are shifting their strategic focus to multi-modal intelligent agent platforms, leading to a multi-dimensional competition encompassing full-stack technology and industry applications [4]
- Alibaba Cloud has upgraded its AI ecosystem, achieving high scores in agent tool invocation capabilities and enhancing development efficiency through new frameworks [4]
- Baidu has also upgraded its AI capabilities, supporting various modalities for creative tasks, indicating a competitive push in the multi-modal space [4]

Group 4: Strategic Initiatives
- Volcano Engine has upgraded its enterprise AI agent platform, AgentKit, covering the entire lifecycle from development to management [5]
- The introduction of HiAgent workstation aims to facilitate scalable management and application of agents for enterprises [6]
- The company has launched an "AI Savings Plan" promising up to 47% cost savings for pay-as-you-go enterprises, reflecting a commitment to enhancing model capabilities and infrastructure [6]
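Doubao Large Model 1.8 Officially Released with Stronger Multimodal Agent Capabilities; Doubao Daily Usage Exceeds 50 Trillion; Cost-Savings Plan Launched with Reductions of Up to 47%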
硬AI· 2025-12-18 14:05
Core Insights
- The article highlights the launch of Doubao Model 1.8 by Volcano Engine, which features enhanced multimodal agent capabilities and a 256K ultra-long context for handling complex tasks [2][3][5]
- Volcano Engine's "AI Savings Plan" aims to optimize user costs, offering savings of up to 47% on AI usage [3][17]
- The company emphasizes the importance of expanding the AI market rather than competing for existing market share, predicting a potential market growth of tenfold in the coming year [4]

Model Capabilities Upgrade
- Doubao Model 1.8 shows significant improvements in multimodal understanding, particularly in long video comprehension and security monitoring scenarios [5]
- The model's context management allows companies to tackle complex tasks and support decision-making processes [5]
- The new image generation model Doubao-Seedream-4.5 offers capabilities such as multi-image combinations, creative photography, and virtual try-ons [5]

Video Generation Enhancements
- The Seedance series includes two versions: Seedance-1.0-Lite focuses on cost and speed, while Seedance-1.0-Pro delivers cinematic quality and native sound effects [7]

Application Scenarios
- The Doubao model has been integrated into smart hardware and voice assistants, covering daily communication, professional services, and online searches [9]

Ecosystem Development
- Volcano Engine introduced the "Volcano Ark" inference outsourcing service, supporting major open-source models for seamless deployment [11]
- The Viking series products enhance user input quality and facilitate the rapid construction of knowledge and memory bases for models and agents [13]
- The company launched an enterprise-level AI Agent platform, AgentKit, which has been adopted by leading clients [15]

Cost Optimization Plan
- The "AI Savings Plan" allows users to join once and benefit from cost reductions across various models, with flexible payment options [17]
- The initiative is expected to enhance performance and reduce costs, particularly for video generation models, and is seen as a potential investment opportunity in the AI application landscape [17]
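[Global News You Need to Know Before Thursday's US Market Open] Inflation slows more than expected! US November core CPI came in at 2.6%, the smallest increase since 2021. US initial jobless claims fell back to 224,000 last week, better than expected. Trump: the new Federal Reserve chair will be announced soon and will be someone who favors low interest rates. A narrow 5-to-4 vote! The Bank of England delivers a "hawkish" 25-basis-point rate cut, saying that further judgments on ...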
Sou Hu Cai Jing· 2025-12-18 14:05
Group 1
- The U.S. November core CPI is reported at 2.6%, marking the lowest increase since 2021 and indicating an unexpected easing of inflation [1]
- Micron Technology's stock surged over 14% in pre-market trading due to strong chip demand, with both performance and guidance exceeding expectations [1]
- Trump Media Group's stock rose over 30% in pre-market trading as the company plans to acquire nuclear fusion company TAE and aims to start construction of a nuclear fusion power plant next year [1]

Group 2
- Patients who switch from Wegovy and Zepbound to Eli Lilly's oral medication can effectively maintain their weight loss results [2]
- Hedge fund giants, including Point72 led by Steve Cohen, are considering entering commodity trading [3]
- The Nikkei 225 index fell by 1%, while the Shanghai Composite Index rose by 0.16%, and the Hang Seng Index increased by 0.12% [4]
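Hong Kong Stocks Strengthen into the Close! Watch Tonight's Major Event; Tomorrow the Bank of Japan May Raise Rates and Stock Index Futures Settle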
Sou Hu Cai Jing· 2025-12-18 09:04
Core Viewpoint
- The recent volatility in technology stocks, with significant fluctuations in prices, raises concerns about market stability and investor sentiment, particularly in light of recent economic announcements and corporate developments [1][2].

Group 1: Economic Announcements
- U.S. President Donald Trump claimed that the government has rapidly reduced high prices and that wage growth is outpacing inflation [1].
- Federal Reserve Governor Waller indicated that there is still room for a 50 to 100 basis point interest rate cut, but aggressive action is not necessary given the current economic outlook [1].

Group 2: Corporate Developments
- Oracle's planned $10 billion data center project has encountered funding issues as its main partner, Blue Owl, has withdrawn financial support, which could delay capital expenditures and impact Oracle's AI infrastructure expansion [1].
- Ford has canceled a $6.5 billion electric vehicle battery contract with LG Energy Solution, which represents 37.5% of LG's total revenue from the previous year, leading to a nearly 9% drop in LG's stock price [3].

Group 3: Market Reactions
- Major U.S. AI stocks, including Oracle, Broadcom, Nvidia, and Google, experienced declines, with Oracle falling over 5% and Nvidia facing significant risk of breaking below key support levels [2][3].
- In the Chinese market, sectors such as optical modules and PCB boards also saw adjustments, with New Yi Sheng dropping over 4% and Industrial Fulian falling over 5% [2].

Group 4: Industry Performance
- The banking, coal, oil and petrochemical, national defense, and light manufacturing sectors showed gains, while the electric equipment, communication, electronics, and machinery sectors faced declines [6].
Doubao 1.8 Multimodal Surpasses Google Gemini 3! ByteDance Rolls Out "Inference Outsourcing" and Aims to Be the Intel of the Model World?
AI前线· 2025-12-18 07:24
Core Insights
- The article discusses the launch of Doubao Model 1.8 by Volcano Engine, which is optimized for multi-modal agent scenarios, featuring a context window of 256k and various token limits for input and output [2][3].

Model Performance
- Doubao 1.8 achieves a processing speed of 5000k tokens per minute (TPM) and 30k requests per minute (RPM), leading to significant improvements in various benchmarks, surpassing competitors like Gemini 3 [3][4].
- In specific benchmarks, Doubao 1.8 scored 94.6 in AIME-25 for mathematics and 85.7 in GPQA-Diamond for reasoning, indicating its strong performance across multiple tasks [4].

Multi-modal Capabilities
- The model has enhanced multi-modal understanding, excelling in visual judgment, spatial understanding, document parsing, and video motion recognition, positioning it among the global leaders in these areas [3][7].
- Doubao 1.8 can efficiently process long videos, quickly identifying critical moments, which has applications in various sectors such as online education and safety inspections [5][7].

Business Applications
- The model's capabilities allow for complex agent construction, which can create significant value across various industries, with a reported daily token usage exceeding 50 trillion, marking a 417-fold increase since its launch [6][16].
- Volcano Engine introduced the "Doubao Assistant API," enabling businesses to utilize core agent capabilities easily, with plans to expand functionalities [16][17].

Cost Efficiency Initiatives
- The "AI Savings Plan" offers unified pricing for enterprises using large models, allowing for cost savings of up to 47% based on usage [17].
- The "Inference Outsourcing" service allows businesses to upload encrypted model parameters without managing GPU infrastructure, potentially halving hardware and operational costs [18][19].

Creative Tools
- The article highlights advancements in Doubao's image and video generation capabilities, including the new Seedream and Seedance models, which enhance creative processes in various applications [8][9].
- Seedance 1.5 Pro introduces features like synchronized audio-visual output and multi-language support, significantly improving content creation efficiency [9][13].
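
The 5000k TPM and 30k RPM figures quoted under Model Performance can be read as two separate throughput ceilings that apply at the same time. That reading is an assumption, since the summary describes them only as processing speed. Under it, the short Python sketch below shows which ceiling binds for a given request size; the constant and function names are hypothetical.

```python
# Figures quoted in the summary, treated here as per-minute ceilings (an assumption).
TPM_LIMIT = 5_000_000  # "5000k tokens per minute"
RPM_LIMIT = 30_000     # "30k requests per minute"


def max_requests_per_minute(tokens_per_request: int) -> int:
    """Requests per minute that satisfy both the token and the request ceiling."""
    if tokens_per_request <= 0:
        raise ValueError("tokens_per_request must be positive")
    return min(RPM_LIMIT, TPM_LIMIT // tokens_per_request)


def binding_limit(tokens_per_request: int) -> str:
    """Name the ceiling that is reached first at this request size."""
    if tokens_per_request <= 0:
        raise ValueError("tokens_per_request must be positive")
    return "RPM" if tokens_per_request * RPM_LIMIT <= TPM_LIMIT else "TPM"


if __name__ == "__main__":
    # 256_000 stands in for a request that fills the 256k context window.
    for size in (100, 1_000, 256_000):
        print(size, binding_limit(size), max_requests_per_minute(size))
```

The crossover sits at about 167 tokens per request (5,000,000 / 30,000). Below that the request ceiling binds; above it the token ceiling does, so requests that fill a 256,000-token context would top out at 19 per minute under this reading.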