Multimodal Large Models
TRS (拓尔思) 20250820
2025-08-20 14:49
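Summary

In the first half of 2025, TRS's revenue came under pressure from the external environment and seasonality, but its stable public security business and its cost-cutting and efficiency measures, such as a 21.65 million yuan reduction in operating-activity expenditure and lower spending on Xinchuang (信创) procurement, effectively eased that pressure. With more tender projects expected in the second half, the trend of quarter-on-quarter improvement is clear.

The company has kept R&D investment high in the AI and data tracks, with an R&D expense ratio of 27.3% and total R&D spending of 144 million yuan. The focus is on overseas editions of its enterprise big data and intelligent analytics products, multimodal large models, agent applications, and global data collection, all of which are intended to power future growth.

TRS is accelerating the development and application of AI products and upgrading its new-generation intelligent foundation. It is building a multi-agent collaborative operating system around the 拓天 (Tuotian) large model and has deployed more than 300 AI agent projects in finance, public security, government, and other fields, using private deployment combined with cloud.

The company continues to strengthen its data assets and data industrialization services, expanding global multilingual information collection, deepening data annotation, and improving data governance quality. It added more than 100 customers, including Gree Electric Appliances (格力电器) and 新元股份, and its data resources have gained market recognition.

TRS's contract value in the pan-security sector rose 61% year on year. The company is deepening its position in decision support for special industries and expanding two core capabilities, illicit fund tracing with analysis and investigation control, and a cognitive-domain business support platform, which have become new growth drivers.

Q&A

How did TRS perform in the first half of 2025? In the first half of 2025, TRS's ...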
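Official: first batch of speakers announced for the 2025 Global Machine Learning Technology Conference, Beijing, with a heavyweight lineup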
AI科技大本营· 2025-08-11 07:16
Core Viewpoint
- The 2025 Global Machine Learning Technology Conference in Beijing is officially announced, following the successful Shanghai event, focusing on cutting-edge AI topics and featuring top scholars and industry practitioners [1][2].

Group 1: Conference Overview
- The conference will take place on October 16-17, 2025, and is co-hosted by CSDN and Boolan, emphasizing high-quality discussions on AI evolution and industry applications [1].
- It aims to cover 12 key topics that address the most advanced and engineering challenges in AI, focusing on "technological explainability, engineering replicability, and scene applicability" [2][3].

Group 2: Core Topics
- The 12 core topics include:
  - Evolution of large language model technology
  - Practical applications of large models
  - Software development transformation driven by large models
  - Frontiers of multimodal large models
  - Innovation and exploration of GenAI products
  - Infrastructure construction for large models
  - Engineering and architecture of large models
  - Technical analysis of DeepSeek and industry applications
  - AI agents
  - Embodied intelligence and smart hardware
  - Computing power infrastructure and performance optimization
  - Industry application practices of large models [4].

Group 3: Speaker Highlights
- The conference will feature prominent speakers from various leading companies and research institutions, providing deep insights into the future of AI [6][7].
- Notable speakers include:
  - Zhao Jian, Director of Multimedia Cognitive Learning at China Telecom AI Research Institute [8].
  - Zhou Pan, Multimodal Intelligence Lead at Li Auto [10].
  - Tang Rui, Chief Scientist at Qunke Technology [13].
  - Zhang Junlin, Chief Scientist at Sina Weibo [14].
  - Leng Dawei, Vice President of 360 AI Research Institute [15].
  - Wang Zhaode, Technical Expert at Alibaba [16].
  - Jiang Yudong, Head of Intelligent Creation Technology at Bilibili [18].
  - Chen Yingfeng, Head of Robotics Algorithms at NetEase [19].
  - Zhang Heng, Senior Algorithm Expert at Xiaomi [20].

Group 4: Call for Participation
- The conference invites AI community members to contribute by sharing their successful cases, technical insights, and innovative ideas, enhancing the event's value [24][25].
- Companies are encouraged to participate through exhibitions, technical exchanges, and project collaborations to showcase their innovative technologies and expand cooperation opportunities [27].
国新证券 Daily Morning Report - 20250728
Domestic Market Overview
- The domestic market experienced a weak consolidation with a decrease in trading volume, with the Shanghai Composite Index closing at 3593.66 points, down 0.33%, and the Shenzhen Component Index at 11168.14 points, down 0.22% [1][5][10]
- Among the 30 sectors tracked, 9 sectors saw gains, with notable increases in computer, electronics, and light manufacturing, while construction materials, construction, and food and beverage sectors faced significant declines [1][5][10]
- The total trading volume for the A-share market was 181.55 billion yuan, showing a decrease compared to the previous day [1][5][10]

Overseas Market Overview
- The three major U.S. stock indices saw slight gains, with the Dow Jones up 0.47%, S&P 500 up 0.4%, and Nasdaq up 0.24%. Notably, Tesla's stock rose over 3% [2][5]
- The performance of Chinese concept stocks was mixed, with many declining, including a drop of over 10% for Xiaoying Technology [2][5]

Key News Highlights
- The 2025 World Artificial Intelligence Conference was attended by Premier Li Qiang, emphasizing the rapid development of AI technology and its integration into the economy [3][12]
- The establishment of the China Capital Market Society was announced, aiming to enhance research and development in the capital market [3][21]
- A trade agreement was reached between the U.S. and the EU, which includes a 15% tariff on EU goods entering the U.S. and a commitment from the EU to increase investment in the U.S. [3][22][23]

Industrial Insights
- In June, the profit decline of industrial enterprises above designated size narrowed, with total profits amounting to 715.58 billion yuan, a year-on-year decrease of 4.3%, which is an improvement from the previous month [16][17]
- The equipment manufacturing sector showed significant growth, with a 7.0% increase in revenue and a profit increase of 9.6%, contributing positively to overall industrial profits [17][18]
- The manufacturing sector is advancing towards high-end, intelligent, and green production, with notable profit increases in high-end equipment manufacturing and smart products [18][19]

Agricultural Sector Developments
- A new plan to promote agricultural product consumption was released, focusing on optimizing supply, innovating distribution, and enhancing market activation [20]
- The plan aims to meet diverse consumer needs and improve the quality of agricultural products while leveraging e-commerce platforms for better market reach [20]
Large Model Interview Notes - Kuaishou 快Star
自动驾驶之心· 2025-07-20 08:36
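Author | Xiaosen (小森) Editor | 自动驾驶之心

Original link: https://zhuanlan.zhihu.com/p/1928556109037281822

Department and position: MMU - [快Star] Multimodal large models

Round 1
8. Coding: 32. Longest Valid Parentheses
In round one the questions on my papers were quite detailed, and the interviewer asked follow-ups to confirm any details I had not mentioned. The standard knowledge questions were fairly routine; only the probability question was a bit of a nuisance.

Round 2
Round two was again a detailed grilling on my papers, so the interviewer evidently cares a lot about them, while the standard knowledge questions were fairly simple. The scenario question was also tiresome: the interviewer raised unresolved problems in the proposed solution, and I had to refine it step by step.

Round 3

Questions (excerpt):
1. Self-introduction, followed by questions on internships and papers. We discussed the CV paper in depth; the interviewer was especially interested in the introduction of Diffusion, and we walked through it from motivation to method to results, which took quite a long time.
2. Which multimodal large models do you know? Give a brief introduction. What is the current mainstream paradigm for multimodal large models?
3. In BLIP-2 or ...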
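The coding question listed for round one is problem 32, Longest Valid Parentheses (最长有效括号). The source does not say which approach the candidate used, so the following is only a minimal sketch of one common solution, the index-stack method; the function name `longest_valid_parentheses` is chosen here for illustration.

```python
def longest_valid_parentheses(s: str) -> int:
    """Return the length of the longest well-formed parentheses substring.

    Assumption: this is the standard index-stack approach, not necessarily
    the one discussed in the interview.
    """
    # The stack holds indices. Its bottom element is always the index just
    # before the start of the current candidate substring.
    stack = [-1]
    best = 0
    for i, ch in enumerate(s):
        if ch == "(":
            stack.append(i)
        else:
            stack.pop()
            if not stack:
                # Unmatched ")": it becomes the new boundary.
                stack.append(i)
            else:
                # Everything after the index now on top of the stack is valid.
                best = max(best, i - stack[-1])
    return best
```

This runs in O(n) time with O(n) extra space. A two-pass counter scan reaches O(1) extra space, at the cost of being a little harder to explain in an interview.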
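ICCV 2025 | Tsinghua & Tencent Hunyuan X discover the "visual head" mechanism: only 5% of attention heads are responsible for multimodal visual understanding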
机器之心· 2025-07-14 11:33
Core Insights
- The article introduces SparseMM, a method that optimizes KV-Cache allocation based on the identification of visual heads in multimodal large models, significantly improving efficiency and performance in visual understanding tasks [5][30][31]

Group 1: Visual Head Identification
- Multimodal large models extend from large pre-trained language models (LLMs) and can exhibit strong performance in visual tasks after multimodal training [2]
- The study identifies that less than 5% of attention heads, termed "visual heads," are primarily responsible for visual understanding, while most heads focus on text or auxiliary features [2][8]
- A method based on OCR tasks is proposed to quantify the attention of each head towards visual content, revealing the sparse nature of visual heads [2][14]

Group 2: SparseMM Methodology
- SparseMM employs a differentiated cache allocation strategy, dividing the total cache budget into three parts: basic local cache for all heads, uniform distribution, and prioritized allocation for visual heads based on their scores [6][20]
- The method has been tested across various multimodal benchmarks, achieving a decoding speedup of up to 1.87× and reducing peak memory usage by 52% [6][27]

Group 3: Experimental Results
- In OCR-rich datasets like DocVQA and TextVQA, SparseMM demonstrates significant performance advantages, maintaining high accuracy even with limited cache budgets [22][23]
- The method shows robust performance across general visual tasks, maintaining nearly consistent performance with full cache models under constrained budgets [25]

Group 4: Implications for Deployment
- SparseMM effectively reduces inference costs and enhances the deployment efficiency of multimodal large models, particularly in high-resolution image and long-context scenarios [27][31]
- The visualization of identified visual heads indicates their ability to accurately focus on relevant visual information, contrasting with non-visual heads that often miss critical details [28]
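The three-part cache budget described for SparseMM can be illustrated with a small sketch. This is not the authors' implementation: the function name `allocate_kv_budget`, the parameters `local_window` and `uniform_ratio`, their default values, and the score-proportional rule for the third part are all assumptions made here to show the general idea.

```python
from typing import List


def allocate_kv_budget(
    head_scores: List[float],
    total_budget: int,
    local_window: int = 32,      # hypothetical per-head local cache size
    uniform_ratio: float = 0.1,  # hypothetical share of the rest split evenly
) -> List[int]:
    """Split a total KV-cache budget (in tokens) across attention heads.

    Part 1: every head gets a fixed local window.
    Part 2: a fraction of the remaining budget is divided evenly.
    Part 3: the rest is divided in proportion to each head's visual score
            (assumed rule; the source only says allocation is score-based).
    """
    n = len(head_scores)
    if n == 0:
        return []
    local_total = local_window * n
    if local_total > total_budget:
        raise ValueError("total_budget is too small for the local windows")

    remaining = total_budget - local_total
    uniform_each = int(remaining * uniform_ratio) // n
    score_pool = remaining - uniform_each * n
    score_sum = sum(head_scores)

    budgets = []
    for score in head_scores:
        if score_sum > 0:
            share = int(score_pool * score / score_sum)
        else:
            # No head has a visual score: fall back to an even split.
            share = score_pool // n
        budgets.append(local_window + uniform_each + share)
    return budgets
```

Calling `allocate_kv_budget([0.0, 0.0, 0.9, 0.1], total_budget=1000)` gives the two zero-score heads only the local and uniform parts, while the highest-scoring head receives most of the remaining budget. Because shares are rounded down, the allocations may sum to slightly less than the total.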
Forbes China "Top 50 AI Technology Companies" released: innovation clusters rise in tiers
机器人圈· 2025-06-30 13:53
Core Viewpoint
- The article highlights the emergence and growth of the artificial intelligence (AI) industry in China, particularly focusing on the 2025 Forbes China "Top 50 AI Technology Companies" list, which showcases a diverse range of technologies and applications across various sectors [3][4][6].

Group 1: Industry Overview
- The 2025 Forbes China "Top 50 AI Technology Companies" list features a significant number of companies from Shanghai (21), followed by Beijing (14), and highlights the growing innovation in central regions like Wuhan, which has 9 companies on the list [4][5].
- Wuhan's AI industry has seen a compound annual growth rate of over 40% in the past five years, with a core industry scale exceeding 70 billion yuan [5].
- The AI industry in China is characterized by a pyramid structure, with major players like Baidu Cloud and Alibaba Cloud at the top, followed by "hidden champions" in specific fields, and a large base of emerging companies [6][7].

Group 2: Investment Trends
- Approximately 25% of the listed companies are publicly traded, indicating a significant presence of non-listed companies that drive innovation through unique algorithms and specialized applications [7].
- The investment logic has shifted towards companies with clear commercialization pathways, as seen in the examples of companies like Yuanli Wuxian and Blue Technology, which have demonstrated substantial operational efficiencies and market leadership [7].

Group 3: Development Trends
- The article identifies three key trends: the evolution of multi-modal large models towards lightweight and industry-specific applications, the integration of quantum computing with AI chips, and the rise of AI in healthcare and industrial robotics as potential investment hotspots [8].
- The AI industry in China is moving beyond mere technological catch-up to establish a unique industrial ecosystem, supported by the implementation of the "New Generation Artificial Intelligence Development Plan" [8].
Forbes China "Top 50 AI Technology Companies" released: innovation clusters rise in tiers
Core Insights
- The 2025 Forbes China "Top 50 AI Technology Companies" list highlights the diverse technological characteristics of selected companies, showcasing a significant presence of hard technology and internationalization, particularly in Shanghai [1][2].

Group 1: Regional Highlights
- Shanghai leads with 21 companies on the list, focusing on sectors like new energy vehicles, biomedicine, robotics, and semiconductor integrated circuits [2].
- Beijing has 14 companies, with notable contributions from Cambrian's AI chips and Zhipu AI's general models, reflecting the region's emphasis on technological originality [2].
- The central region, particularly Wuhan, shows growing innovation with 9 companies, including Landing's AI cervical cancer screening system serving over 2,000 medical institutions [2][3].

Group 2: Industry Growth and Structure
- Wuhan's AI industry has experienced a compound annual growth rate exceeding 40% over the past five years, with a core industry scale surpassing 70 billion yuan [3].
- The AI industry in China is structured like a pyramid, with major players at the top, "invisible champions" in the middle, and numerous emerging companies at the base, indicating a vibrant ecosystem [4][5].

Group 3: Patent and Innovation Landscape
- The top 50 companies collectively hold over 260,000 patents, with the top five companies accounting for 90% of the total, while the AIGC sector sees a 45% annual growth in software copyrights, primarily from small to medium enterprises [4].
- The coexistence of large established firms and agile startups reflects the unique vitality of the AI industry, which requires both long-term foundational research and rapid scene innovation [4][5].

Group 4: Investment Trends
- Among the selected companies, 20 are publicly listed, indicating that 25% of the firms are established, while 75% are non-listed, suggesting that innovation is not monopolized by large corporations [5].
- The investment logic has shifted, focusing on the commercialization roadmap rather than just technological concepts, as seen in companies like Yuanli Wuxian and Blue Technology [5].

Group 5: Future Development Trends
- The list reveals three key trends: the evolution of multimodal large models towards lightweight and industry-specific applications, the integration of quantum computing with AI chips, and the emergence of AI in healthcare, industrial robotics, and semiconductor equipment as potential investment hotspots [6][7].
- The AI industry in China is moving beyond mere technological catch-up to establish a unique industrial ecosystem, supported by the implementation of the "New Generation Artificial Intelligence Development Plan" [7].
Eight companies from Optics Valley of China make the 2025 Forbes China Top 50 AI Technology Companies list
Jing Ji Guan Cha Bao· 2025-06-27 12:20
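Economic Observer Online (经济观察网), Cheng Jiulong (程久龙), interns Wang Zhen (王震) and Liu Xuanyu (刘轩宇)

On June 27, Economic Observer Online learned at the Optics Valley Artificial Intelligence Innovation Conference that nine Wuhan companies were selected in the 2025 Forbes China Top 50 AI Technology Companies ranking. Eight of them are clustered in the Wuhan East Lake High-tech Development Zone (Optics Valley of China), underscoring the rise of Wuhan and Optics Valley as a hub of China's AI industry.

In 2025, China's AI industry is increasingly moving from the laboratory to the market. As social structures shift and industrial upgrading accelerates, AI technology is profoundly changing how people work and live. With an aging population, rigid demand for intelligent elderly care and wellness has surged; consumption upgrading is pushing the culture and tourism industry toward personalization and intelligence; and AI-enabled smart manufacturing has opened a new chapter... These changes continue to drive AI technology deeper into vertical application scenarios.

Of the nine selected companies, eight are located in Optics Valley, forming a core of AI innovation in central China. The AI cervical cancer screening system from Landing (兰丁股份) already serves more than 2,000 medical institutions. The multimodal large model from 武汉紫东太初 has won several benchmark cases. 声通科技, drawing on Wuhan's location and industrial strengths, has built innovative AI ecosystems such as smart transportation and digital industrial parks and is accelerating technology transfer; its autonomous driving project with 东风 and 金龙 has entered the license review stage.

Data show that the core scale of Wuhan's AI industry has exceeded 70 billion yuan, 70% of which is in Optics Valley. The rise of Optics Valley's AI industry is tied to multiple ...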
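Qiming Venture Partners' Zhou Zhifeng in conversation with Jiang Daxin of Jiyue Xingchen (阶跃星辰): exploring the "no man's land" of AI entrepreneurship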
IPO早知道· 2025-06-23 03:23
Core Viewpoint
- The article discusses the advancements and strategic positioning of Jiyue Xingchen, a leading AI model startup, in the context of the evolving AI landscape, particularly focusing on the development of AI Agents and the pursuit of Artificial General Intelligence (AGI) [2][25].

Group 1: AI Model Development and AGI
- Jiyue Xingchen emphasizes the importance of integrated multimodal models for understanding and generating tasks, which is crucial for the development of AI Agents [2][11].
- The company has set a goal to achieve AGI, defining it as the ability of models to perform 50% of human tasks by 2030, and has outlined a three-phase roadmap: Simulated World, Exploratory World, and Inductive World [7][10].
- The first phase involves imitation learning from vast internet data, while the second phase focuses on problem-solving capabilities through slow thinking and reinforcement learning [8][10].

Group 2: AI Agent and Market Positioning
- The concept of AI Agents is gaining traction, with predictions that 2025 will be a pivotal year for their adoption, driven by the need for strong reasoning capabilities and multimodal understanding [25][26].
- Jiyue Xingchen aims to create a platform for intelligent terminals that can autonomously assist users in complex tasks, highlighting the importance of both automatic and proactive functionalities in AI Agents [27][28].
- The company differentiates itself by focusing on comprehensive multimodal capabilities, which are essential for achieving AGI and enhancing user interaction [12][11].

Group 3: Technological Trends and Future Directions
- The article notes that the AI model landscape is rapidly evolving, with significant advancements in reasoning models and the integration of multimodal capabilities [14][15].
- Jiyue Xingchen is actively working on improving reasoning efficiency and exploring how reinforcement learning can be applied in various domains, including mathematics and coding [16][18].
- The integration of understanding and generation tasks in multimodal models is identified as a critical area for future development, with ongoing efforts to enhance this capability [19][20].
Alibaba Group Vice President Xu Zhuhong (许主洪): multimodal large models are the key path to AGI | Live from MWC Shanghai 2025
Guo Ji Jin Rong Bao· 2025-06-19 10:48
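Xu Zhuhong further explained that multimodal understanding models are mainly built on autoregressive frameworks, whereas multimodal generation models more often adopt diffusion-based frameworks, using architectures such as UNet and DiT together with advanced text encoders such as CLIP and T5.

Xu predicts that multimodal large models will move toward unifying understanding and generation, but he also noted that key technologies such as backbone network design and modality alignment and fusion still need in-depth research. Although the industry as a whole remains at an early stage, he is confident about the prospects for multimodal technology in search, content creation, robotics, and other fields.

"The era of multimodal agent AI has only just begun. To truly reach AGI, we still have to solve a great many technical problems, including the foundational capabilities of multimodal large models, the connection and manipulation of data details, and control and delivery in the physical world. These all pose major technical challenges, but they are also the future opportunities for the multimodal large model industry." On June 19, at Mobile World Congress Shanghai (MWC Shanghai 2025), Xu Zhuhong, Vice President of Alibaba Group and Chief Scientist of its Intelligent Information Business Group, delivered a keynote speech detailing the development trends of multimodal large model technology and its core role in achieving artificial general intelligence (AGI).

In the speech, Xu divided multimodal large model technology into two categories, understanding and generation, and systematically traced its technical evolution. He pointed out that the main difficulties in multimodal understanding tasks include aligning the encodings of different modalities and fused understanding and reasoning; multimodal ...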