Multimodal Interaction - filings, earnings calls, financial reports, news - Reportify

Multimodal Interaction

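客易云 (Keyiyun) Digital Human Integrates the Banana API: A "Technical Bridge" That Opens a New Dimension of Intelligent Interaction Through Seamless Integration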

Sou Hu Cai Jing· 2026-01-10 17:04

Core Viewpoint
- The demand for digital humans has shifted from single-function implementation to deep integration across multiple systems and scenarios, addressing challenges such as fragmented technology stacks and poor interface compatibility [1][10].

Group 1: Banana API Integration
- The Banana API provides a standardized and highly compatible "technical bridge" that integrates the core capabilities of digital humans with the flexibility of existing business systems, allowing for a "one-time access, all-domain invocation" solution [1][10].
- Traditional digital human integration often fails due to mismatched interface protocols and incompatible data formats, but the Banana API eliminates these barriers through a unified technical architecture and protocol standards [2][10].
- The "multi-end adaptation" feature of the Banana API allows digital humans to automatically recognize and adapt to different interaction scenarios, significantly reducing integration costs and enhancing service coverage [2][5].

Group 2: Dynamic Scalability
- The Banana API's low-code configuration engine enables businesses to quickly adapt digital humans to new scenarios without modifying code, allowing for rapid adjustments through a visual interface [5][10].
- For instance, an educational institution can switch a digital human from a "daily Q&A" mode to a "course recommendation" mode by simply uploading new course materials and setting recommendation rules [5][10].

Group 3: Multi-Modal Interaction
- The Banana API enhances user experience through deep collaboration in multi-modal interactions, allowing for comprehensive analysis of user inputs across voice, text, images, and actions [6][10].
- This multi-modal perception and output generation ensure that digital humans can respond in a more natural and human-like manner, breaking the barriers of mechanical interaction [6][10].

Group 4: Security and Control
- Security and controllability are foundational to the long-term application of digital humans, with the Banana API implementing end-to-end encryption and localized deployment options to protect sensitive data [7][10].
- The API adheres to data minimization principles and provides real-time monitoring features to ensure compliance with industry regulations [7][10].

Group 5: Open Ecosystem
- The Banana API fosters collaborative innovation in the digital human ecosystem by providing lightweight access solutions for developers and research institutions [9][10].
- This open ecosystem accelerates the iteration of digital human technology and reduces innovation costs for enterprises, allowing them to explore new scenarios and models without reinventing the wheel [9][10].

digihuman(BJ:835670)

Digital Human Technology

Multimodal Interaction

Artificial Intelligence

客易云 (Keyiyun) Digital Human

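US Pre-Market Highlights | Trump Plans Sharp Increase in Military Budget; Ministry of Commerce Responds on Review of Meta's Acquisition of Manus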

Ge Long Hui· 2026-01-08 12:37

Group 1
- U.S. stock index futures are all down, with Nasdaq futures down 0.25%, S&P 500 futures down 0.17%, and Dow futures down 0.34% [1]
- Major European stock indices are collectively down, with Germany's DAX down 0.08%, UK's FTSE 100 down 0.31%, France's CAC down 0.25%, and the Euro Stoxx 50 down 0.32% [2]
- Chevron is reportedly negotiating with the U.S. government to expand its operating license in Venezuela to increase crude oil exports [7]
- ExxonMobil expects a decrease in fourth-quarter profits by $800 million to $1.2 billion due to falling oil prices [8]
- Lockheed Martin delivered a record 191 F-35 fighter jets to the U.S. and its allies last year [9]
- Morgan Stanley will replace Goldman Sachs as Apple's credit card business partner [10]
- Ford plans to launch an L3 level driving assistance system by 2028, allowing drivers to free their eyes and hands [11]
- Disney leads the global box office with $6.58 billion, according to the 2025 Hollywood box office rankings [12]
- Alibaba Cloud has released a multimodal interaction development kit applicable to AI glasses and robots [15]
- JD.com has established a "Chameleon Business Unit," with the second batch of self-developed AI toys set to launch in mid-January [16]
- XPeng will release four new cars at the beginning of the year and plans to scale production of humanoid robots and flying cars this year [17]
- FF announced the FX Super One three-phase delivery robot strategy, expecting positive cash flow within three years [19]

L3 Driver Assistance System

Multimodal Interaction

Alibaba Cloud Releases New Multimodal Interaction Development Kit for AI Glasses, Robots, and More

Zhi Tong Cai Jing· 2026-01-08 06:22

Core Insights
- Alibaba Cloud has launched a new multimodal interaction development kit that integrates three foundational models: Qianwen, Wanxiang, and Bailing, enabling devices to listen, see, think, and interact with the physical world [1][2]
- The kit is compatible with over 30 mainstream ARM, RISC-V, and MIPS architecture terminal chip platforms, facilitating rapid integration with most hardware devices in the market [1]
- The development kit includes over ten pre-set Agents and MCP tools for various applications in daily life, work efficiency, and entertainment, enhancing user interaction capabilities [1][2]

Group 1
- The multimodal interaction development kit supports full-duplex voice, video, and text interactions, with end-to-end voice interaction latency as low as 1 second and video interaction latency as low as 1.5 seconds [1]
- The kit connects to Alibaba Cloud's Bailian platform ecosystem, allowing users to add third-party Agents and expand application capabilities significantly [2]
- Solutions for smart wearable devices and companion robots have been showcased, including features like real-time anomaly monitoring and keyword-based video search [2]

Group 2
- In the AI glasses sector, the kit enables functionalities such as simultaneous translation, photo translation, multimodal memos, and audio transcription through a complete interaction chain [2]
- The development kit aims to optimize the deployment and inference performance of the Tongyi model family on RISC-V architecture in collaboration with Xuantie RISC-V [1]
- The pre-set travel planning Agent allows users to access route planning, travel guides, and leisure exploration capabilities directly [1]

Multimodal Interaction

Software and Services

Alibaba Cloud Multimodal Interaction Development Kit

Alibaba Cloud Launches Multimodal Interaction Development Kit for AI Hardware

Zheng Quan Shi Bao Wang· 2026-01-08 03:20

Core Viewpoint
- Alibaba Cloud has launched a multimodal interactive development suite that integrates three foundational models, enabling advanced interaction capabilities with physical devices [1]

Group 1: Product Features
- The development suite includes three models: Qianwen, Wanxiang, and Bailing, which enhance its functionality [1]
- It comes preloaded with over ten agents and MCP tools tailored for various fields such as leisure and work efficiency [1]
- The suite is designed to enable devices to listen, see, think, and interact with the physical world, making it applicable for AI glasses, learning machines, companion toys, and smart robots [1]

Multimodal Interaction

Multimodal Interaction Development Kit

Lenovo Launches the Yoga Mini i Mini PC: Compact, Portable, and Powerful

Xin Lang Cai Jing· 2026-01-08 01:30

Core Insights
- Lenovo launched the new Yoga Mini i mini PC during the International Consumer Electronics Show on January 7, 2026, emphasizing portability and structural strength with a cylindrical design weighing 600 grams and a volume of 0.65 liters [2][3]

Product Features
- The device integrates speaker and microphone modules, supporting multimodal interaction experiences. It includes an accelerometer and touch sensing capabilities for intuitive user interaction, along with a fingerprint recognition module for enhanced security and convenience [2][3]
- Hardware configurations go up to an Intel Core Ultra X7 series 358H processor, 32GB of LPDDR5X RAM, and a 2TB PCIe 4.0 solid-state drive [2][3]
- Wireless connectivity supports the Wi-Fi 7 and Bluetooth 6.0 standards and includes user presence detection based on Wi-Fi signal changes, which enables automatic wake-up [2][3]
- The device offers a rich array of ports, including two Thunderbolt 4 ports, two full-featured USB-C Gen 2 ports, USB-A, and HDMI ports, supporting connections for up to four external displays [2][3]

Pricing and Availability
- The expected starting price for the Yoga Mini i is $699.99, with plans for official release in June 2026 [2][3]

LENOVO GROUP(HK:00992)

Multimodal Interaction

Computer Hardware

Yoga Mini i Mini PC

A Sky-High $3 Billion Acquisition of an Israeli Company: What Long Game Is NVIDIA Playing?

Zhong Guo Qi Che Bao Wang· 2026-01-04 08:51

Core Insights
- Major global chip companies, including NVIDIA, are focused on accelerating their expansion into automotive intelligence and electrification for the upcoming year and beyond [2]

Group 1: Acquisition of AI21 Labs
- NVIDIA is in advanced negotiations to acquire Israeli AI startup AI21 Labs for up to $3 billion; the startup was valued at $1.4 billion during a previous funding round in 2023 [2][3]
- AI21 Labs has made significant advancements in natural language processing (NLP) and generative AI, particularly in multimodal interaction and efficient data processing [3]
- The acquisition aims to leverage AI21 Labs' top-tier AI research team and their potential for future development, enhancing NVIDIA's capabilities in automotive AI model training and data processing [4]

Group 2: Strategic Shift
- NVIDIA is transitioning from being a hardware leader to becoming a leader in AI ecosystem construction, integrating AI innovations into its automotive business [5][6]
- The company is expanding its automotive business, showcasing strong growth and aiming to provide comprehensive solutions beyond just high-performance computing chips [6][7]
- NVIDIA's next-generation Thor platform will deliver 2000 TOPS of computing power, facilitating a shift from distributed to centralized electronic architectures in vehicles [7]

Group 3: Competitive Landscape
- The competition in the automotive chip market is intensifying, with NVIDIA's technology offering comprehensive environmental perception through data fusion from multiple sensors [7][8]
- The acquisition signals a shift in the automotive industry towards a full-stack competition involving hardware, algorithms, data ecosystems, and service scenarios [8]
- Despite NVIDIA's current leadership in automotive intelligence, emerging competitors like Tesla are posing significant challenges, necessitating continuous investment in R&D to maintain a competitive edge [8][9]

Group 4: Future Industry Dynamics
- The future of automotive intelligence will revolve around the balance between monopoly and innovation, as well as open versus closed competition [9]
- Companies in the automotive and chip sectors must enhance their technical capabilities and innovation to navigate market changes and challenges effectively [9]

Nvidia(US:NVDA)

Automotive Intelligence

Natural Language Processing (NLP)

Generative AI Technology

Multimodal Interaction

Haier Consumer Finance Releases 11th-Anniversary Report Card: Nearly 20 Million New-Citizen Users Served

Zheng Quan Ri Bao Zhi Sheng· 2025-12-26 11:37

Group 1
- Haier Consumer Finance Co., Ltd. (referred to as "Haier Finance") celebrated its 11th anniversary, reporting a cumulative loan amount of approximately 172.4 billion yuan and 19.45 million real-name "new citizen" users served [1]
- The company has launched the "Smart Home Installment" service, which uses industry subsidies to offer a zero-interest installment model and facilitate the adoption of smart and green home appliances [1]
- The "Smart Home Installment" service has been implemented in over 2,000 Haier specialty stores nationwide, with a cumulative installment amount of 130 million yuan, boosting retail sales by over 30% during special promotional events [1]

Group 2
- Haier Finance has issued nearly 11.2 billion yuan in Asset-Backed Securities (ABS) to optimize financing costs and improve the asset-liability structure, supporting inclusive financial innovation and AI technology development [2]
- The company has submitted over 730 patent applications, with more than 95% of its 600 online operational applications being self-developed, focusing on cutting-edge areas such as AI and multimodal interaction [1]
- In the innovative financing sector, Haier Finance plans to implement the industry's first "Technology Finance + Green Finance" sustainable development-linked syndicated loan of 900 million yuan by 2025 [1]

Multimodal Interaction

Large Model Applications

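Haier Consumer Finance Releases 11th-Anniversary Report Card: 11.2 Billion Yuan in Total ABS Issuance, About 172.4 Billion Yuan in Cumulative Loans to New Citizens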

Jing Ji Guan Cha Wang· 2025-12-26 09:31

Core Insights
- Haier Consumer Finance has achieved a cumulative asset-backed securities (ABS) issuance scale of nearly 11.2 billion yuan, making ABS a key component of its diversified financing matrix [1]
- In 2025, Haier Consumer Finance issued two ABS products in the public market, totaling 3 billion, with the second phase's priority A tranche interest rate dropping to 1.80%, setting a new low for the consumer finance industry that year [1]
- The low-cost and efficient ABS issuances have optimized the company's asset-liability maturity structure, reduced comprehensive financing costs, and injected strong momentum into core business development [1]

Financing and Technology
- Leveraging an AI-driven technology system and a team in which over 70% of staff are technology and risk-control specialists, Haier Consumer Finance has served 19.45 million real-name "new citizen" users, with a cumulative loan amount of approximately 172.4 billion yuan, demonstrating the power of inclusive finance [2]
- The Smart Home Installment business has covered over 2,000 Haier specialty stores nationwide, with a cumulative installment amount of 130 million, where special promotional events can boost retail sales by over 30% in partner stores [2]
- Haier Consumer Finance has submitted over 730 patent applications, with more than 95% of its 600 online operational applications being self-developed, focusing on cutting-edge areas such as AI, multimodal interaction, and large model applications [2]

Multimodal Interaction

Large Model Applications

Asset-Backed Securities (ABS) Products

From "the Same as Douyin" to "the Same as Doubao": Video Cloud Is Entering the Agent Era

Sou Hu Cai Jing· 2025-12-24 17:22

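Author | 凌敏 (Ling Min)

For ordinary people, audio and video is about as "down-to-earth" as technology gets: without any professional background, anyone can directly feel the difference in experience that stronger or weaker technical capability makes. When watching a live World Cup broadcast, for example, a blurry picture, noticeable delay, or laggy interaction all directly affect the fan's viewing experience. In the mobile internet era, what people asked of audio and video technology was actually simple: "see it clearly, enjoy watching it." This was also the key to Volcano Engine Video Cloud breaking through during that period. Volcano Engine took the capabilities that Douyin had honed and validated over a long time in scenarios with hundreds of millions of DAU, packaged them into a series of solutions, and delivered "the same as Douyin" audio and video capabilities to the industry, focusing on image quality, latency, stability, and large-scale distribution to give users a sharper, more real-time, and more immersive video experience.

Core Engine: AI MediaKit Brings "Ace" Atomic Capabilities to Large Models

In the past, the core of a traditional media toolkit was a collection of technologies for media data processing and services, and this classic set of capabilities was long used to develop all kinds of audio and video playback and recording features. In the AIGC era, however, the media value chain has been redefined: content is no longer only shot and played, but generated, analyzed, understood, and consumed; users no longer only watch, but take part in interaction through natural language, voice, images, and other means. This is why Volcano Engine Video Cloud chose to upgrade its classic capabilities into AI MediaKit, as an extreme-efficiency ... for the AI cloud-native era ...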

Multimodal Interaction

Volcano Engine Video Cloud

AIGC Transmission System

QuestMobile: Doubao, DeepSeek, and Yuanbao Rank Top Three by Weekly Active Users

Feng Huang Wang· 2025-12-23 05:17

Group 1
- The active user rankings for AI-native apps have changed significantly, with Doubao, DeepSeek, and Yuanbao leading the list with weekly active users of 155 million, 81.56 million, and 20.84 million respectively [1]
- The second tier includes Antifufu, Qianwen, and Doubao Aixue, with weekly active users of 10.25 million, 8.72 million, and 7.22 million respectively [1]
- From July to November 2025, the industry chain completed 186 financing events, amounting to 33.67 billion yuan, a 20.8% increase compared to the first half of the year [1]

Group 2
- Over 200 AI applications were launched from July to November 2025, with AI application plugins, PC web versions, and AI-native apps accounting for 81.5%, 10.7%, and 7.8% respectively [2]
- Vertical applications that deeply understand user personalization needs and pain points have become breakthroughs, with AI image processing, AI professional consulting, AI efficiency office, AI social interaction, and AI copywriting accounting for 24.9%, 18.5%, 6.8%, 5.9%, and 5.9% respectively [2]
- As of November 2025, eight major manufacturers have launched a total of 409 large models, with single-modal, multi-modal, and full-modal models accounting for 61.4%, 36.7%, and 1.9% respectively [1]

Seek .(US:SKLTY)

Multimodal Interaction

Artificial Intelligence
