Multimodal

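Pre-Market Briefing | NDRC: High-Quality Projects Worth 3 Trillion Yuan to Be Launched This Year; Huawei's First HarmonyOS PC Officially Unveiled
21st Century Business Herald (21世纪经济报道) · 2025-05-09 00:38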
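A-shares yesterday: On May 8, the market opened lower and climbed through the session, with the ChiNext Index leading the gains. Combined turnover on the Shanghai and Shenzhen exchanges was 1.29 trillion yuan, down 174.9 billion yuan from the previous trading day. At the close, the Shanghai Composite was up 0.28%, the Shenzhen Component up 0.93%, and the ChiNext Index up 1.65%. By sector, defense, high-speed copper cable connectivity, brain-computer interfaces, and CPO led the gains, while PEEK materials, agriculture, fertilizers, and gold led the declines.

| Index | Latest level | Change |
| --- | --- | --- |
| Shanghai Composite | 3352.0 | +9.33 (0.28%) |
| Shenzhen Component | 10197.66 | +93.53 (0.93%) |
| ChiNext Index | 2029.45 | +32.94 (1.65%) |

Date: May 8. Chart: 21投资通.

Overnight markets: The three major New York stock indexes rose on May 8. At the close, the Dow Jones Industrial Average was up 254.48 points from the previous trading day at 41368.45, a gain of 0.62%; the S&P 500 rose 32.66 points to 5663.94, a gain of 0.58%; and the Nasdaq Composite rose 189.98 points to 17928.14, a gain of 1.07%. The three major European indexes were mixed on May 8. At the close, the UK's FTSE 100 share index ...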
Evening Report | Theme Preview for May 9
Xuangubao (选股宝) · 2025-05-08 14:44
Group 1: Hongmeng PC
- The Hongmeng (HarmonyOS) PC operating system was unveiled at a communication conference. The first device is set to launch on May 19 and integrates AI capabilities across hardware and software [1]
- The system-level AI assistant, Xiaoyi, will help with tasks such as creating PPTs and summarizing meetings, improving productivity [1]
- Analysts at Dongwu Securities and Zhongtai Securities are optimistic that multimodal AI can cut costs and raise efficiency in enterprises, and they also expect demand for computing power to expand [1]

Group 2: Robotics
- Qianxun Intelligent Technology has taken on new shareholders, including Huawei's Hubble Technology, which is expected to strengthen funding and technical collaboration in embodied intelligent robots [2]
- CITIC Securities forecasts that 2025 will be the first year of mass production for embodied intelligent robots, marking a significant integration of AI and robotics [2]
- Humanoid robot production is expected to reach a scale that eases data scarcity, pushing the industry into a more practical phase [2]

Group 3: Low-altitude Economy
- Bank of China and Zhongyin Financial Leasing have signed strategic agreements with Shanghai Volant Aviation to procure 100 eVTOL aircraft, a significant step for the low-altitude economy [3]
- The partnership aims to use financing and management services to support the eVTOL sector, with a total credit line of no less than 1 billion yuan [3]
- Recent eVTOL orders signal the start of a large-scale development phase for China's low-altitude economy, which is expected to form a trillion-yuan-scale industrial ecosystem [3]

Group 4: Macro and Industry News
- President Xi Jinping and President Putin signed a joint statement on deepening the strategic partnership between China and Russia, and the two sides exchanged more than 20 cooperation documents [4]
- The Ministry of Industry and Information Technology is seeking public input on a mandatory national standard for automotive door handle safety, aiming to improve vehicle safety [4]
- The Ministry of Commerce emphasizes the need to boost domestic demand, particularly consumption, to drive economic growth [4]

Group 5: Market Trends
- The silicon industry is in a downturn, with prices for components and batteries declining on weaker downstream demand [5]
- Chongqing Beer's president is cautiously optimistic about the beer industry in 2025 and anticipates a more favorable development environment [5]
- Baidu Apollo and Shenzhou Car Rental are set to launch an autonomous vehicle rental service, a sign of progress in autonomous driving [5]
- CATL has released the world's first 9MWh energy storage system [5]
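A Conversation with StepFun CEO Jiang Daxin: 16 Multimodal Models Released in Two Years; DeepSeek Proves the Paid-Traffic Model Does Not Hold | TMTPost AGI
TMTPost App (钛媒体APP) · 2025-05-08 08:33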
Core Insights
- Jiang Daxin, CEO of StepFun (阶跃星辰), announced that the full version of the reasoning model Step R1 and a more advanced Step image editing model will be released within the next two to three months [2]
- StepFun treats "integrated multimodal understanding and generation" as a key path toward building a world model and progressing toward Artificial General Intelligence (AGI) [2][3]
- Jiang Daxin said the traditional logic of buying traffic to grow AI products needs to be reevaluated, as the performance of DeepSeek and other AI products has shown [2]

Company Overview
- StepFun, founded in April 2023, is a leading startup focused on general-purpose AI models and has released the Step series of foundation models [5]
- The company has raised several hundred million dollars in its Series B financing, with key investors including Shanghai State-owned Capital Investment Co., Tencent Investment, and Qiming Venture Partners [5]
- StepFun has launched 22 self-developed foundation models, more than 70% of them multimodal, establishing itself as a leader in multimodal AI [5]

Product Development
- The company has made significant advances in multimodal models, covering applications such as image understanding, video generation, and music generation [5][7]
- StepFun has established deep collaborations with industry leaders in the automotive, mobile, and IoT sectors, strengthening its product capabilities [7]
- Recent releases include the Step R-mini reasoning model and open-sourced video models, reflecting a commitment to expanding its model capabilities [7]

Strategic Focus
- StepFun is concentrating on intelligent terminal agents that improve user experience by understanding environmental context [11]
- The company believes that combining pre-trained foundation models with reinforcement learning can significantly improve reasoning capabilities [12]
- Jiang Daxin asserts that achieving AGI requires a multimodal approach, because human intelligence is diverse and relies on multiple modalities [8]

Competitive Positioning
- StepFun differentiates itself from competitors such as OpenAI and Google by focusing on foundation model development and multimodal capabilities [13]
- The company aims to build an ecosystem that integrates models with intelligent agents, bridging cloud and edge computing [13]
Why Do AI Video Tools Look More and More Alike?
36Kr · 2025-05-07 07:50
Core Insights
- Attention in the AI video sector has shifted from OpenAI's Sora to new players like Keke and Jiemeng, and industry players now prioritize narrowing the gap between AI video production and consumption [4][5][6]
- Competition among AI video players is intensifying, with frequent updates and new model releases expected in 2025, indicating rapid evolution in the industry [4][12][26]
- Mid-tier AIGC entrepreneurs are increasingly concerned about the commercial viability of AI video, as production costs remain high while client budgets are shrinking [4][16][18]

Group 1: Industry Dynamics
- The AI video landscape is becoming increasingly crowded, with numerous players emerging and competing for market share [23][26]
- The focus of competition has shifted from model parameters to three key dimensions: consistency, usability, and playability [6][13][14]
- Many AI video products are becoming homogenized in functionality, which pushes competition toward quality, cost, and forms of interaction [5][16]

Group 2: Technological Advancements
- AI video players are improving generation consistency by refining frame transitions and scene realism, both critical for quality [9][11]
- Major players iterate their foundation models regularly, with updates at least every six months to maintain a competitive advantage [11][12]
- New features such as dynamic editing and end-to-end production tools are being developed to improve usability for creators [13][14]

Group 3: Market Challenges
- Despite the proliferation of tools and features, many creators are anxious about rising production costs and shrinking project budgets [16][18][21]
- Pricing strategies in the AI video market are not producing significant cost reductions, and many companies keep prices high for advanced models [20][21]
- The complexity of video creation demands a multi-platform approach, as no single company currently meets all needs in the market [27]
Multimodality and Agents Become the New Match Point for Big Tech AI
Cyzone (创业邦) · 2025-05-01 02:54
Core Viewpoint
- The article discusses the evolution of large models in consumer-facing applications, focusing on improving user interaction and enabling complex task execution through multi-modal capabilities and agent product ecosystems [4][6].

Multi-modal Capabilities
- Major companies such as ByteDance, Baidu, Google, and OpenAI have recently launched advanced multi-modal models, enabling innovative applications [4].
- Alibaba's AI product Quark introduced a new feature called "Photo Ask Quark," which uses multi-modal capabilities for richer user interaction [4][10].
- Multi-modal reasoning is evident in products such as ByteDance's Doubao 1.5 and OpenAI's o3 and o4-mini, which can analyze images and generate content [9][10].

Agent Execution Capabilities
- General agent products aim to execute complex tasks from natural language commands, with recent launches from companies such as ByteDance and Baidu [4][5].
- The article highlights three key capabilities agents need: integration with third-party data and tools, coding ability, and strong task understanding [20][23].
- Manus has set a direction for agent products with a framework that combines user task understanding with tool integration [17].

Future of Agents
- The ultimate form of agents remains uncertain, with exploration of their development and application still under way [7].
- Integrating multi-modal capabilities with agent execution is crucial for creating a foundational entry point for future applications [25].
- OpenAI anticipates that AI agents will surpass ChatGPT in sales by the end of 2025, projecting revenues of $3 billion, with further growth expected by 2029 [25].
Multimodality and Agents Become the New Match Point for Big Tech AI
36Kr · 2025-04-29 23:29
Core Insights
- The article discusses the evolving landscape of AI applications, focusing on multimodal capabilities and agent execution as the two pillars of development in the industry [1][2][3]

Multimodal Capabilities
- Major companies such as ByteDance, Baidu, Google, and OpenAI have recently launched advanced multimodal models, spurring application innovation [1][5]
- Alibaba's AI product Quark introduced a new feature called "Photo Query Quark," which uses multimodal capabilities for user interaction [1][6]
- OpenAI's latest models, o3 and o4-mini, have achieved significant multimodal understanding, allowing for image analysis and generation [5][16]
- Multimodal capabilities are expected to transform user experiences in work, study, and daily life, although current products are still at an early, exploratory stage [2][3]

Agent Execution
- General agent products that execute complex tasks from natural language commands are emerging, with notable examples including ByteDance's Kouzi Space and Baidu's Xinxiang App [1][12]
- The effectiveness of these agents relies on three key capabilities: connecting to third-party data and tools, coding ability, and task understanding [12][16]
- OpenAI is exploring the acquisition of AI programming startup Windsurf to strengthen coding capabilities for agents [16][17]
- Revenue from AI agents is projected to exceed $3 billion by the end of 2025, with a potential contribution of $29 billion by 2029 [17]

Future Directions
- The article suggests that agents may evolve toward a more human-like ecosystem, with agents developed for specific professional roles [17]
- Integrating multimodal capabilities with agent execution is seen as crucial for establishing a foundational entry point for future AI applications [17]
Tongyi Qianwen Qwen3 Released: A Conversation with Alibaba's Zhou Jingren
LatePost (晚点LatePost) · 2025-04-29 08:43
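Source: LatePost Dialogue (晚点对话); author: Cheng Manqi.

Zhou Jingren, CTO of Alibaba Cloud and head of Tongyi Lab: "Large models have moved from the beginning of the early stage into the middle of the early stage. Improving only isolated capabilities is no longer possible."

The Qwen3 flagship, the MoE (mixture-of-experts) model Qwen3-235B-A22B, has 235 billion total parameters and 22 billion activated parameters. On several major benchmarks it surpasses the full-size DeepSeek-R1, which has 671 billion total parameters and 37 billion activated parameters. The smaller MoE model Qwen3-30B-A3B activates only 3 billion parameters in use, less than 1/10 of QwQ-32B, the earlier reasoning-only dense model in the Qwen series, yet performs better. Fewer parameters with better performance means developers can get better results at lower deployment and usage cost. (Image source: the official Tongyi Qianwen blog.) (Note: an MoE model activates only part of its parameters on each use, which makes it more efficient to run, so it is described by two figures: total parameters and activated parameters.)

Before the Qwen3 release, we interviewed Zhou Jingren, the top leader of Alibaba's large-model R&D, CTO of Alibaba Cloud, and head of Tongyi Lab. He is also the main decision-maker behind Alibaba's open-source large models. To date, the Qwen series of large models has been downloaded a cumulative 3 ...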
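The total-versus-activated parameter figures quoted in the Qwen3 excerpt can be checked with a few lines of arithmetic. The sketch below is only an illustration: the parameter counts for Qwen3-235B-A22B and DeepSeek-R1 are the ones stated in the excerpt, while the totals for Qwen3-30B-A3B (30 billion) and QwQ-32B (32 billion) are read off the model names, which is an assumption. The names `MODELS`, `active_fraction`, and `active_ratio` are hypothetical helpers, not part of any library.

```python
# Parameter counts in billions. A dense model activates every
# parameter, so for QwQ-32B "active" equals "total".
# Totals for Qwen3-30B-A3B and QwQ-32B are inferred from the model
# names (an assumption); the other figures come from the excerpt.
MODELS = {
    "Qwen3-235B-A22B": {"total": 235, "active": 22},
    "DeepSeek-R1": {"total": 671, "active": 37},
    "Qwen3-30B-A3B": {"total": 30, "active": 3},
    "QwQ-32B": {"total": 32, "active": 32},
}


def active_fraction(name: str) -> float:
    """Share of a model's parameters that are activated on each use."""
    model = MODELS[name]
    return model["active"] / model["total"]


def active_ratio(a: str, b: str) -> float:
    """Activated parameters of model a relative to those of model b."""
    return MODELS[a]["active"] / MODELS[b]["active"]


if __name__ == "__main__":
    for name in MODELS:
        print(f"{name}: {active_fraction(name):.1%} of parameters active")
    ratio = active_ratio("Qwen3-30B-A3B", "QwQ-32B")
    print(f"Qwen3-30B-A3B vs QwQ-32B activated parameters: {ratio:.3f}")
```

Under these assumptions, Qwen3-30B-A3B activates 3/32 of the parameters that QwQ-32B uses, which is consistent with the excerpt's "less than 1/10" claim.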
Domestic Computing Power Boom Continues; Watch the Ascend Supply Chain
2025-04-28 15:33
Summary of Conference Call Records

Industry Overview
- The conference call primarily discusses the domestic computing power industry and the optical communication sector, highlighting the performance of various companies within these industries [1][4][8].

Key Points and Arguments

Domestic Computing Power Industry
- The Ascend 910C chip has shown performance improvements, narrowing the gap with NVIDIA's H100, and is primarily used in Huawei's cloud infrastructure. Strong demand from downstream internet companies is expected to lead to large-scale shipments by May 2025, using a dual 910B chip packaging solution [1][2].
- The overall performance of domestic graphics cards has improved, with greater customer acceptance and a positive outlook for the upstream supply chain, including connectors, liquid cooling, and servers [2].

Optical Communication Sector
- The optical communication segment has exceeded expectations, with companies like NewEase and Shijia Photon performing strongly. Source Technology's CW light source shipments have significantly improved revenue and profitability, with gross margins on new products exceeding 80% [1][4].
- Domestic optical module companies, such as Guangxun Technology, saw a slight decline in Q1 but a significant improvement in profitability. Demand for domestic optical modules remains high, with production capacity expected to ramp up to 700,000 to 800,000 units per month this year [1][4].

Company Performance Highlights
- NewEase and Shijia Photon have reported strong revenue and profit growth, driven by overseas demand for passive devices and the corresponding chip products. Revenue and gross margins for their AWG, MPO connector, and indoor optical cable products have improved significantly [5].
- In contrast, Invec's liquid cooling segment fell short of expectations, leading to a stock price decline. Revenue met expectations, but the company faces growing margin pressure from intensified competition for domestic temperature control orders [8].

Market Trends and Future Outlook
- Performance across the communication sector has been mixed, with some companies meeting expectations while others, like Invec, have struggled. The industry remains optimistic because heavy investment from major players such as ByteDance, Alibaba, and Tencent is expected to drive growth [8][9].
- Large AI models continue to evolve, with significant increases in computing power demand. For instance, Baidu's new model has reduced costs to about one-fourth per million tokens, indicating a growing need for computing resources [12].
- Investment recommendations focus on three areas: self-controlled supply chains (including high-speed connectors and liquid cooling), domestic computing power and AI data center industry trends, and advances in AI applications, particularly IoT smart modules and controllers [13].

Additional Important Insights
- The optical communication sector is expected to see rapid domestic and international capacity releases over the next few years, particularly in overseas DCI business, which will contribute to significant revenue growth [5].
- Overall sentiment in the communication sector is optimistic, with expectations of continued improvement in profitability and growth for companies releasing new products and increasing shipments [6][7].
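A New Open-Source SOTA in Image Editing, from the Relentless Multimodal Front-Runner StepFun! The Large-Model Industry Is Entering "Multimodal Time"
QbitAI (量子位) · 2025-04-28 03:43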
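Heng Yu, reporting from Aofeisi
QbitAI | WeChat official account QbitAI

Intelligence is emerging in large AI models worldwide, and the field is now entering "multimodal time."

On one hand, technical progress of every kind across the global industry is unfolding around multimodality. On the other hand, among the demands of AI applications and deployment, multimodality is also the most important capability. Without multimodal technology, how can anyone talk about applications and deployment?

In fact, the pioneering consensus and trend around multimodality also become visible when the progress of representative players is connected into a line...

Look at StepFun (阶跃星辰), the industry-recognized relentless front-runner in multimodality. All 3 models it released in the past month are multimodal: an open-source image-to-video model, a multimodal reasoning model, and an open-source image editing model.

Rich modalities, frequent releases, strong performance.

These StepFun releases are read as a connected line partly because StepFun has had a strong deployment and application focus from the very beginning.

At present, 70% of the models StepFun has released are multimodal. Because multimodality is an essential ingredient of agents, StepFun's posture as a "deployment-oriented player" has become increasingly clear this year: it is pushing into intelligent terminal agents.

What has this relentless competitor produced in the past month?

According to QbitAI's review, StepFun has released 3 models in succession over the past month:

They cover several of the most in-demand directions for current multimodal models, and two of them, Step1X-Edit and Step-Video-TI2V, have been open-sourced to developers.

How to put it: this is very StepFun, and it also fits what technologists and industry players expect of "multimodal ...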
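Major Release | Fudan's "Large Language Models: From Theory to Practice (2nd Edition)" Fully Upgraded, Focusing on the AI Frontier
Jiqizhixin (机器之心) · 2025-04-28 01:26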
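Released by Jiqizhixin
Jiqizhixin Editorial Department

"Large Language Models: From Theory to Practice (2nd Edition)" is a professional technical book that gives equal weight to theory and practice, and an indispensable reference for the AI era. Any reader can find their own path of growth in it.

As the wave of artificial intelligence sweeps the globe, large language models are driving technological progress and industrial change at unprecedented speed. From ChatGPT to industry applications of every kind, LLMs have not only reshaped human-computer interaction but also become a key technology driving academic research and industrial innovation.

Faced with this rapidly evolving body of technology, systematically understanding its theoretical foundations and mastering its core algorithms and engineering practice has become required learning for every AI practitioner, researcher, and university student.

In September 2023, the research team of Zhang Qi, Gui Tao, Zheng Rui, and Huang Xuanjing at Fudan University officially released "Large Language Models: From Theory to Practice" to academia and industry worldwide. In just two years, large language models have made important progress in theoretical research, pre-training methods, post-training techniques, and interpretability. Research on large language models has deepened, gradually revealing many characteristics that differ from the traditional deep learning and natural language processing paradigms. For example, a large language model needs only 60 examples to learn and display strong question-answering ability, which shows its striking capacity for generalization. However, the book's authors have also found that large language models are fragile in certain ways. For example, in a model with 13 billion parameters ...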