Multimodal Intelligent Agents
Leaderboard update: ByteDance's Seed2.0 shines, and we also tested the viral "Lobster" (龙虾) | xbench Monthly Report
红杉汇· 2026-03-04 02:49
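During the Spring Festival holiday, several companies released new models; xbench evaluated all of them and updated its leaderboard. BabyVision, the multimodal understanding benchmark most recently released by xbench, has been incorporated into the evaluation suites of several recently released models: projects including seed-2.0, Qwen3.5, and Kimi K2.5 all cite BabyVision in their publicly released technical reports, reflecting the community's sustained attention to and broad adoption of the benchmark.
xbench uses an evergreen evaluation mechanism and continuously reports the capabilities of the latest models. More leaderboards will be updated over time, and we look forward to your attention. You can follow our work and view the live leaderboard rankings at xbench.org; you are welcome to send feedback to team@xbench.org.
xbench-ScienceQA Leaderboard update

| | Model name | API | Mode | Company | 本视台 | BoN | Average response time | input cost | output cost | Release date | Evaluation date |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ...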
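China Ruyi rises over 5%; the company forms a strategic partnership with Aishi Technology and makes a strategic investment in it, positioning for intelligent film, TV, and gaming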
Zhi Tong Cai Jing· 2026-02-10 02:18
Core Viewpoint
- China Ruyi (00136) rose more than 5% after announcing a strategic partnership with Aishi Technology that includes a strategic investment of $14.2 million [1]
Group 1: Strategic Partnership
- China Ruyi has entered into a strategic cooperation with Aishi Technology to explore the next-generation entertainment ecosystem through "AI + content" [1]
- Aishi Technology will receive high-quality copyright licenses from China Ruyi, which owns works such as "Detective Chinatown 1900," to jointly develop AI tools for professional film production [1]
Group 2: Technological Advancements
- The partnership aims to advance the intelligent upgrade of content creation and operations for China Ruyi's "Pumpkin Movies" [1]
- Both companies will collaborate with "Jingxiu Games" to explore innovative applications of AI video generation technology in gameplay and interactive experiences [1]
Group 3: Research and Development
- The two companies will also jointly develop multimodal intelligent agents, targeting breakthroughs in real-time interaction and emotional interaction [1]
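Hong Kong Stock Movers | China Ruyi (00136) rises over 5%; the company forms a strategic partnership with Aishi Technology and makes a strategic investment in it, positioning for intelligent film, TV, and gaming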
智通财经网· 2026-02-10 02:18
Core Viewpoint
- China Ruyi (00136) has announced a strategic partnership with Aishi Technology, including a strategic investment of $14.2 million, to explore the next-generation "AI + content" entertainment ecosystem [1]
Group 1: Strategic Partnership
- The partnership will combine Aishi Technology's video generation model technology with China Ruyi's extensive IP resources and industry experience [1]
- Aishi Technology will receive high-quality copyright licenses from China Ruyi, covering works such as "Detective Chinatown 1900," to jointly develop AI tools for professional film production [1]
Group 2: Content Creation and Operations
- The collaboration will advance the intelligent upgrade of content creation and operations at China Ruyi's "Pumpkin Movies" [1]
- The two companies will work with "Jingxiu Games" to explore innovative applications of AI video generation technology in gameplay and interactive experiences [1]
Group 3: Research and Development
- A key focus will be the joint development of multimodal intelligent agents, aiming for breakthroughs in real-time interaction and emotional interaction [1]
Will the AI market expand tenfold? Demand for multimodal agents is gradually taking off
21 Shi Ji Jing Ji Bao Dao· 2025-12-19 07:16
Core Insights
- The development trajectory of the Doubao large model reflects the overall trend of China's large model industry moving from enthusiastic exploration to pragmatic implementation [2]
- The Doubao large model has established a unique path by focusing on "Model as a Service" (MaaS), penetrating both enterprise applications and end devices, and building a comprehensive AI capability system covering "cloud-edge-end" [1]
Group 1: Development and Strategy
- Since the AI model boom in 2023, Doubao has evolved from a tool embedded in existing ecosystems to a robust platform, integrating with ByteDance products like Douyin and Toutiao to refine its capabilities [2]
- In 2024, the focus shifted towards making models more user-friendly and affordable, with innovations in service models such as pricing based on input length and intelligent model routing [2]
- The introduction of a "model-centered AI cloud-native architecture" and various supporting infrastructures aims to enable efficient and economical deployment of AI agents for enterprises [3]
Group 2: Market Position and Applications
- The Doubao large model has achieved a daily token usage of over 50 trillion, ranking first in China and third globally, with over 100 enterprises utilizing its platform [1]
- The company has established a strong presence in high-value sectors, serving over 80% of systemically important banks and major securities firms, and covering 90% of mainstream automotive companies [5]
- The strategy of focusing on high-value industries creates a positive feedback loop, where serving leading clients generates complex data that enhances model optimization, attracting more customers [5]
Group 3: Future Outlook
- The market for AI is expected to expand tenfold, with a shift in focus from competition to market growth, as stated by the company president [3]
- The "AI Savings Plan" aims to reduce usage costs by up to 47%, further lowering the barriers for large-scale AI applications [5]
- The company anticipates a shift in the ratio of enterprise to individual developers in the AI era, indicating a potential increase in individual developer participation [5]
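Inside the "Doubao Phone": core technology exploration open-sourced long ago, nearly two years of GUI Agent groundwork, "the world's first true AI phone"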
量子位· 2025-12-09 07:37
Core Insights
- The article discusses the rapid success and technological foundation of the "Doubao Phone" and its assistant, which has gained significant market attention for its advanced ability to automate tasks on mobile devices [1][50].
Group 1: Product Overview
- The "Doubao Phone" sold out its initial stock of 30,000 units, with prices in the second-hand market doubling [1].
- The phone's assistant can automate complex tasks across applications, such as submitting leave requests and booking train tickets [4][5].
- The assistant is built on ByteDance's self-developed UI-TARS model, which has been optimized for mobile use [7][8].
Group 2: Technological Development
- The UI-TARS model has undergone significant iterations, with the initial version released in January 2025, followed by UI-TARS-1.5 and the latest UI-TARS-2, which enhances the agent's capabilities [11][23][34].
- UI-TARS-2 addresses issues related to data scalability and multi-round reinforcement learning, allowing for more autonomous interactions with graphical user interfaces [34][35].
- The model has shown superior performance in various benchmarks compared to competitors like OpenAI's models [27][28].
Group 3: User Experience and Feedback
- Users have reported high satisfaction with the assistant's ability to perform tasks efficiently, with one user describing it as the "world's first true AI smartphone" [69].
- The assistant's design includes a dual-mode system, allowing for both rapid responses and deeper reasoning capabilities [60][62].
- Concerns regarding privacy and security have been raised, but the company has emphasized that user consent is required for high-level permissions [50][51].
Group 4: Market Implications
- The success of the "Doubao Phone" indicates a shift towards AI-driven mobile technology, where devices can autonomously understand and execute user intentions [85].
- The product's development reflects a broader industry trend towards integrating advanced AI capabilities into everyday technology, potentially redefining user interaction with mobile devices [86].
Greater Bay Area Intelligent Computing Power and Large Model Intelligent Agents Forum held in Shenzhen
Zhong Guo Xin Wen Wang· 2025-12-05 02:41
Core Insights
- The "2025 Guangming Science City Forum: Greater Bay Area Intelligent Computing Power and Large Model Intelligent Agents Forum" was held in Shenzhen, focusing on intelligent computing infrastructure, large model technology innovation, and multi-modal intelligent agent applications [1][3]
Group 1: Forum Highlights
- The forum gathered experts and scholars from artificial intelligence, high-performance computing, and multi-modal intelligent agents to discuss cutting-edge topics [1]
- The director of Pengcheng Laboratory, Gao Wen, highlighted the progress of the "Pengcheng Cloud Brain III" large scientific device, which aims to accelerate scientific innovation and industrial technology upgrades [1][3]
Group 2: Industry Development
- Guangming District has attracted nearly 100 high-quality AI enterprises, with an industry scale exceeding 30 billion [3]
- The forum aims to deepen the integration of industry, academia, and research to promote innovation in intelligent computing, large models, and intelligent agent technologies [3]
Group 3: Technological Innovations
- The forum announced several technological advancements, including the open-source model "Pengcheng Brain 2.1" and the AI forecaster assistant "Afu," developed in collaboration with the Shenzhen Meteorological Bureau [5]
- Other innovations included the domestic "FenixCOS" inference engine and a financial intelligent agent based on a domestic full lifecycle model toolkit [5]
Group 4: Collaborations and Partnerships
- Cooperation agreements were signed between Pengcheng Laboratory and various institutions, including the Shenzhen Meteorological Bureau and the National Supercomputing Center in Wuxi [7]
- Prominent academics from Tsinghua University, Hong Kong University, and other institutions delivered keynote speeches at the forum [7]
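Ankai Micro: R&D spending exceeds 30% of revenue in the first three quarters; launches multiple chips to open a new future for multimodal intelligent agents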
Zheng Quan Shi Bao Wang· 2025-10-27 11:56
Core Insights
- Ankai Microelectronics (安凯微) reported stable revenue of 351 million yuan for the first three quarters of 2025, with R&D expenses reaching 105 million yuan, accounting for 30.13% of revenue, indicating a 5.18% year-on-year increase, which lays a solid foundation for the company's long-term development [1]
Group 1: Product Development and Innovation
- The company launched multiple new chip products at the "2025 Ankai Microelectronics Developer Technology Forum," focusing on key areas such as visual processing, audio interaction, and power management, showcasing its technological achievements [1][2]
- Ankai Micro has released a total of 8 chip products since the beginning of the year, covering the entire chain from visual perception to voice interaction and control execution, providing foundational technology support for the "multi-modal + intelligent agent" ecosystem [2]
- The company demonstrated seven categories of application scenarios, including solar-powered smart cameras and AI glasses, expanding the practical application boundaries of edge intelligent systems [2]
Group 2: Strategic Partnerships and Ecosystem Development
- The forum gathered key partners and experts to discuss the integration of multi-modal perception and edge intelligence technologies, clarifying the company's strategic path in the intelligent agent direction [4]
- Ankai Micro has successfully taped out chips that cover major functional forms such as AI audio glasses and AI display glasses, now entering large-scale promotion and customer product development stages [3]
- The company is enhancing its system capabilities from underlying hardware to terminal applications, continuously improving the adaptability and application breadth of edge intelligent solutions [7]
Group 3: Future Directions and Market Trends
- The DINO-X model, developed by the IDEA Research Institute, is highlighted as a leading general visual model with strong capabilities in open-world object detection and understanding, applicable in various fields such as intelligent security and autonomous driving [5]
- Industry experts noted that while multi-modal large models have established numerous application scenarios, there remains significant room for improvement in specialized scenarios, particularly in balancing computing power costs and energy consumption [6]
- Ankai Micro is expected to continue iterating and upgrading its products under the trends of high integration and low power consumption, leveraging its self-developed IP and SoC architecture technology [6]
Grok: xAI Leads the Accelerated Rollout of Agents: In-Depth Research Report on the Computer Industry
Huachuang Securities· 2025-09-23 03:41
Investment Rating
- The report maintains a "Buy" recommendation for the computer industry [3]
Core Insights
- The report details the development and technological advancements of the Grok series, particularly Grok-4, and analyzes the commercial progress of major domestic and international AI model manufacturers, highlighting the transformative impact of large models on the AI industry [7][8]
Industry Overview
- The computer industry consists of 337 listed companies with a total market capitalization of approximately 494.5 billion yuan, representing 4.53% of the overall market [3]
- The circulating market value stands at around 428.3 billion yuan, accounting for 4.98% [3]
Performance Metrics
- Absolute performance over 1 month, 6 months, and 12 months is 6.7%, 17.4%, and 71.5% respectively, while relative performance is 1.3%, 9.1%, and 50.2% [4]
Grok Series Development
- The Grok series, developed by xAI, has undergone rapid iterations, with Grok-1 to Grok-4 showcasing significant advancements in model capabilities, including multi-modal functionalities and enhanced reasoning abilities [11][13][29]
- Grok-4, released in July 2025, features a context window of 256,000 tokens and demonstrates superior performance in academic-level tests, achieving 44.4% accuracy on Humanity's Last Exam (HLE) [30][29]
Competitive Landscape
- The report highlights the competitive dynamics in the AI model market, noting that the landscape has shifted from a single dominant player (OpenAI) to multi-polar competition involving several key players, including xAI, Anthropic, and Google [8][55]
- Domestic models are making significant strides in performance and cost efficiency, with models like Kimi K2 and DeepSeek R1 showing competitive capabilities against international counterparts [8][55]
Investment Recommendations
- The report suggests focusing on AI application sectors, including enterprise services, financial technology, education, healthcare, and security, with specific companies identified for potential investment [8]
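An open-source agent that better understands Chinese apps! Perception, grounding, reasoning, and Chinese-language capabilities improved across the board, and it can teach itself to operate apps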
量子位· 2025-08-31 04:25
Core Viewpoint
- The article discusses the development and capabilities of the open-source multimodal intelligent agent UItron, which can autonomously operate mobile and computer applications, particularly excelling in Chinese app interactions [1][4][20].
Group 1: Technology and Methodology
- UItron is designed for complex multi-step tasks on mobile and computer platforms, showcasing superior performance in real interactions within Chinese app environments [3][4].
- The development of UItron involves a systematic data engineering approach to address the scarcity of operational trajectories and enhance the interactive infrastructure for intelligent agents [6][8].
- UItron employs a three-stage training strategy, including two supervised fine-tuning (SFT) phases for perception and planning tasks, followed by a reinforcement learning (RL) phase [12][14].
Group 2: Performance and Evaluation
- UItron achieved an average score of 92.0 on the ScreenspotV2 benchmark, indicating strong GUI content understanding and task localization capabilities [16].
- In offline planning benchmarks like Android-Control and GUI-Odyssey, UItron reached a maximum average score of 92.9, demonstrating robust task planning and execution abilities [18].
- The agent's performance in the OSWorld benchmark was notable, with a score of 24.9, positioning it as one of the top performers among GUI agents [19].
Group 3: Data Engineering and Infrastructure
- UItron's data engineering includes perception data, planning data, and distilled data, which collectively enhance the training dataset's quality and quantity [8][10].
- The interactive infrastructure established by UItron facilitates the collection of trajectory data and supports online evaluation and reinforcement learning training [10].
- The integration of mobile and PC environments allows for automatic recording of screenshots and coordinates, significantly improving the efficiency of collecting operational trajectories in Chinese contexts [10].
Group 4: Future Implications
- UItron aims to provide a stronger foundational model for the field of multimodal intelligent agents, with an emphasis on usability and reliability, particularly in real-world applications involving Chinese app interactions [20].
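Morning Briefing | Li Qiang: Take forceful measures to consolidate the real estate market's stabilization after its decline; A-share market capitalization tops 100 trillion yuan for the first time in history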
Sou Hu Cai Jing· 2025-08-19 08:19
Company News
- China Shipbuilding announced that the number of valid dissenting shares is 0, and the stock will resume trading [5]
- Midea Group stated on the interactive platform that it has undertaken the first large-scale all-liquid cooling intelligent computing data center project from China Telecom in the Guangdong-Hong Kong-Macao Greater Bay Area [5]
- Tibet Tianluo reported a net loss of 112 million yuan for the first half of the year [5]
- Yanghe Distillery announced a 45% year-on-year decline in net profit for the first half of the year [5]
- Zhifei Biological announced a net loss of 597 million yuan for the first half of the year, marking a transition from profit to loss [5]
- Tongzhou Electronics announced that the information circulating about the company entering the supply chain of Nvidia and other enterprises is untrue [5]
- O-film Technology reported a net loss of 109 million yuan for the first half of the year, transitioning from profit to loss [5]
- Chuangzhong Technology announced that if abnormal trading of the company's stock continues, it may apply for a trading suspension for verification [5]
- Nanya New Materials announced that during the period of abnormal stock trading, board member Zhang Dong and others reduced their holdings of the company's shares [5]
Industry News
- The A-share market's total market capitalization has surpassed 100 trillion yuan for the first time in history, with an increase of 1.45 trillion yuan this year [3]
- The positive performance of the A-share market has led to an increase in brokerage account openings, with most brokerages reporting a growth in new accounts, some reaching new highs for August [3]
- According to a report by the China Automobile Dealers Association, only 30.3% of dealers met their sales targets in the first half of 2025, with 29.0% of dealers failing to meet 70% of their targets [3]
- A new low-altitude flight route connecting Kunshan, Jiangsu, and downtown Shanghai has officially opened, allowing for a 20-minute direct flight between the two locations [3]
- The Shenzhen Stock Exchange has sent a special letter to member units requesting assistance in conducting research on the network voting situation for customer credit trading guarantee securities accounts [4]
- Bicycle prices have significantly decreased, with many brands dropping by around 1,000 yuan, and some high-end imported models seeing price reductions exceeding 50% [4]
- The National Radio and Television Administration has issued measures to enrich television content and improve the supply of broadcasting content [4]