Multimodal
Communication ETF (515880) Rises Over 1.1% as Edge AI Drives Industry Growth
Mei Ri Jing Ji Xin Wen· 2025-06-18 05:53
Group 1
- The core viewpoint is that the development space for edge AI is vast, with the Volcano Engine showcasing various smart hardware products like AI alarm clocks and AI learning machines, indicating a continuous increase in the categories of large models being implemented on hardware [1]
- The Doubao large model family has been fully launched, covering various types such as language, video, and voice, demonstrating outstanding performance and cost advantages, and has been widely applied in industries like mobile, automotive, and finance, accelerating the process of industrial intelligence [1]
- Multi-modal vertical upgrades are becoming an important path for the implementation of AI, with edge AI accelerating penetration in IoT and other fields, and technological advancements expected to show exponential growth with improvements in computing power and model optimization [1]
Group 2
- The communication industry is performing steadily, with edge AI emerging as a significant growth point [1]
- The Communication ETF (515880) tracks the communication equipment index (931160), which is compiled by China Securities Index Co., Ltd., selecting listed companies involved in communication network equipment, terminal devices, and related services from the A-share market to reflect the overall performance of the communication equipment industry [1]
- Investors without stock accounts can consider the Guotai Zhongzheng All Index Communication Equipment ETF Connect A (007817) and Guotai Zhongzheng All Index Communication Equipment ETF Connect C (007818) [1]
UU Holo Portable AI Makes Its Global Debut: Multimodal Interaction Reshapes the "Ask About Anything You See" Smart Experience
Group 1
- The second "Belt and Road" Technology Exchange Conference was held in Chengdu, Sichuan from June 10 to 12, showcasing cutting-edge technologies and their potential to enhance daily life and future cities [1]
- Koala Youran presented three innovative multimodal AI products, including the UU Holo portable AI, which integrates core multimodal large model technology and offers features such as scene recognition, intelligent explanation, multilingual Q&A, and autonomous task execution [1][2]
- The UU Holo served as a bilingual AI video guide for the conference, providing immersive intelligent service experiences to attendees [1]
Group 2
- The urban traffic video semantic analysis and Youran Smart Central, based on the self-developed Youran Full Modal AI application platform, enable rapid processing and intelligent analysis of massive offline video data, transforming traditional video retrieval methods [2]
- The system can automatically parse video elements, generating structured results such as video summaries, environmental analysis, and behavioral insights, allowing users to perform keyword-based video searches in seconds [2]
- Youran Smart Central enhances urban governance with high precision (covering over 100 types of events with an accuracy rate of over 90%), high efficiency (processing millions of events daily), and localized development capabilities [2]
Group 3
- The company aims to promote technological innovation in multimodal AI and explore new paths for technology to empower human development in collaboration with global partners [3]
- The showcased results reflect the company's deep expertise in the AI field and its contributions to smart city construction [3]
Still Don't Know Which Direction to Publish In? Others Have Already Submitted to CCF-A Venues......
具身智能之心· 2025-06-18 03:03
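Paper coaching from 具身智能之心 has officially launched! Last year's results were fairly good, with several students getting papers accepted at conferences such as CVPR and ICRA. After talking with the mentors, we plan to coach a few more students aiming for top conferences this year. Interested students are welcome to inquire; the coaching directions are listed below.
Main directions: multimodal large models, VLA, robot navigation, robot grasping, embodied generalization, embodied synthetic data, end-to-end embodied agents, 3DGS, and related directions.
About the mentors: all mentors have published papers at top conferences such as CVPR, ICCV, ECCV, ICLR, RSS, ICML, and ICRA, and have considerable mentoring experience.
Student requirements: bring a résumé; school background: a top-100 university in China, or within the QS top 200 abroad.
More information: for details, add WeChat oooops-life. ...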
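Capital Flows into the Gaming Sector: Gaming ETF (516010) Sees Nearly 400 Million Yuan in Net Inflows over the Past 10 Days, with AI-Enabled Commercialization in Focus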
Mei Ri Jing Ji Xin Wen· 2025-06-18 02:22
Group 1
- The core viewpoint is that the gaming industry is expected to see accelerated application and commercialization of AI products, particularly focusing on AI Agents, AI companionship, and AI multimodal technologies [1]
- AI Agents are viewed as productivity tools that enhance efficiency through autonomous decision-making and dynamic interaction, with expectations for continuous optimization throughout the year [1]
- AI companionship addresses personalized interaction needs and falls within the broader entertainment sector [1]
Group 2
- AI multimodal technologies, including audio, video, and 3D models, are undergoing continuous iteration, driving the accelerated implementation of industry applications [1]
- The gaming ETF (code: 516010) tracks the animation and gaming index (code: 930901), which is compiled by China Securities Index Co., Ltd., reflecting the overall performance of listed companies in the Chinese animation and gaming industry [1]
- The index constituents are primarily distributed across cultural media and software development sectors, showcasing both industry concentration and innovative growth characteristics [1]
Behind OpenAI's $6.5 Billion Acquisition of Jony Ive's io: AI-Native Hardware Companies That Combine Software and Hardware Are on the Rise
36Kr · 2025-06-17 23:51
Core Insights
- OpenAI has acquired Jony Ive's company io for $6.5 billion to develop a series of hardware products, indicating a strategic move towards integrating hardware with AI capabilities [1]
- The emergence of AI-native hardware is facing challenges, including slow market penetration and user acceptance due to overly ambitious product designs [2][4]
- The second wave of AI-native hardware is focusing on specific applications, such as meeting transcription and summarization, which have clear user demand and willingness to pay [6][8]
Group 1: AI Hardware Development
- The development of AI-native hardware is driven by advancements in large language models, enabling more sophisticated human-computer interactions [2]
- Initial AI hardware products struggled due to high learning costs and lack of clear application scenarios, leading to poor market performance [4][5]
- Companies are now focusing on refining their products to meet specific user needs, resulting in more mature offerings [9]
Group 2: Market Dynamics
- The pricing of AI hardware, such as the AI Pin at $699 and Apple's Vision Pro at $3,499, limits their market penetration due to high costs compared to traditional smartphones [5]
- The supply chain challenges in Silicon Valley hinder rapid hardware iteration and competitive pricing, making it difficult for these companies to gain market share [5][15]
- Chinese entrepreneurs benefit from a robust AI hardware supply chain and a large market, positioning them well for future growth in this sector [15][16]
Group 3: Future Prospects
- The evolution of AI-native hardware may eventually lead to the replacement of smartphones and tablets, necessitating the development of AI-native operating systems [13][14]
- The potential for AI hardware to penetrate various sectors, including education and healthcare, is significant as capabilities improve and applications expand [12][16]
- Companies are increasingly focusing on specific use cases, such as educational tools and personal companion robots, to drive adoption and revenue [10][12]
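[Announcement Roundup] Brain-Computer Interface + Computing Power + Solid-State Batteries + Robotics + Domestic Chips! Company's Investee Mainly Develops Medical-Grade Fully Implantable Wireless Brain-Computer Interface Systems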
财联社· 2025-06-17 14:09
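① Brain-computer interface + computing power + solid-state batteries + robotics + domestic chips + state-owned enterprise reform! This company's investee mainly develops medical-grade fully implantable wireless brain-computer interface systems; ② Brain-computer interface + edge computing + robotics + AI agents + multimodal AI + cross-border e-commerce! This company's brain-computer technology focuses on three core application scenarios: education, healthcare, and elderly care; ③ Innovative drugs + cellular immunotherapy! The company's innovative drug product has received EU orphan drug designation.
Pushed every Sunday through Thursday: major stock market announcements for the next day! Coverage includes positive and negative announcements on individual stocks, such as "trading suspensions and resumptions, shareholding increases and reductions, investment contract wins, acquisitions, earnings, lock-up expirations, and high stock dividends and transfers." Important announcements are marked in red to help investors spot investment hotspots early, guard against black swan events, and have enough time to evaluate and find suitable listed companies.
Foreword ...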
Toward General Embodied Intelligence: A Survey and Development Roadmap for Embodied AI
具身智能之心· 2025-06-17 12:53
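By 视觉语言导航 | Edited by 视觉语言导航
Authors: Yequan Wang, Aixin Sun. Affiliations: Beijing Academy of Artificial Intelligence, 南洋理 ...
Main contributions: The paper proposes a five-level roadmap, from L1 to L5, for measuring and guiding the development of embodied AGI. Each level is assessed along four core dimensions: Modalities, Humanoid Cognitive Abilities, Real-time Responsiveness, and Generalization Capability.
Research background
- Definition of embodied AGI: the paper defines embodied AGI as an embodied AI system that can complete diverse, open-ended real-world tasks at human-level proficiency, emphasizing its human interaction and task execution abilities.
Roadmap toward general embodied intelligence
- Status quo: most existing embodied AI models (such as vision-language-action models, VLA) support only visual and language inputs, and their outputs are limited to the action space.
- Challenges: insufficient humanoid cognitive abilities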
Four Large Models Released in One Go: Volcano Engine Goes All Out This Time!
Sou Hu Cai Jing· 2025-06-17 09:09
Core Insights
- The recent FORCE conference in Beijing showcased the launch of new AI models by Volcano Engine, including the Doubao Model 1.6 and Seedance 1.0 pro, highlighting advancements in multimodal interaction and content generation capabilities [2][3]
- The global AI model market is highly competitive, with Volcano Engine's new models standing out due to their comprehensive multimodal capabilities and cost-effective pricing strategies [2][3]
Product Launches
- Doubao Model 1.6 supports multimodal understanding and graphical interface operations, excelling in complex reasoning and multi-turn dialogue tasks, ranking among the top globally [3]
- Seedance 1.0 pro generates high-quality 1080P videos with seamless transitions and has achieved top rankings in international assessments for video generation tasks [4]
Industry Applications
- In the automotive sector, Mercedes-Benz has partnered with Volcano Engine to enhance its smart cabin information retrieval and system response speed using the Doubao model [8]
- In finance, Haier Consumer Finance has implemented a tailored large model to meet over 90% of intelligent scenario needs, significantly improving operational efficiency and reducing risks [8]
- In education, collaborations with top universities have led to the development of AI applications that enhance research efficiency and quality [9]
Technological Innovations
- Volcano Engine has introduced an Agent development suite that innovates the entire lifecycle of AI Agent development, enhancing user intent parsing and instruction optimization [5]
- The launch of a multimodal data lake solution addresses challenges in data processing, improving resource efficiency and compatibility with various systems [6]
- AICC's secure computing technology enhances AI security and privacy, reducing data leakage risks through hardware-protected environments [7]
Future Trends
- The development of intelligent Agents is expected to drive digital transformation in enterprises, with trends indicating deeper multimodal integration and enhanced autonomous learning capabilities [12][14]
- Gartner predicts that by 2028, at least 15% of daily work decisions will be made using Agentic AI, highlighting the growing importance of intelligent Agents in business [12]
On the Ground at CVPR: Crowds Pack Chinese Exhibitors' Booths, and Tencent Stands Out with 40+ Accepted Papers
量子位· 2025-06-17 07:41
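Bai Jiao, reporting from Aofeisi
量子位 | WeChat official account QbitAI
CVPR 2025 has wrapped up, and this year the level of attention and social engagement ran very deep. For example, casually running into Kaiming He instantly turned the venue into a fan meet-up. In an exhibition area dominated by international giants such as Google and Meta, Chinese companies showed up at record scale, and the large booths of companies like Tencent and ByteDance were packed with people. The technical demos that people queued up to try at the booths are reliable indicators of where the technology is heading.
To sum up, there were several interesting findings.
First, multimodality and 3D generation were hot directions in both accepted papers and on-site discussions. 3D generation was a particular highlight: Gaussian splatting, the technique behind it, ranked among the five most frequent keywords in paper titles this year.
Second, discussion of foundation models went much deeper than before and extended to industrial deployment. Embodied intelligence and robot AI were given a large section of their own in the workshop agenda.
Finally, Chinese companies participated deeply this year, although for now participation is still concentrated among large companies with mature commercial businesses.
Multimodality became a high-frequency term in accepted paper titles, and 3D is developing quickly with impressive results. An enthusiastic netizen compiled 2,878 paper titles and derived the following high-frequency words.
What other highlights were there? Here is the full rundown.
Touring the CVPR 2025 exhibition
CVPR's prestige is rising
CVPR is undeniably a top conference in computer vision, and it is even slightly better known than ICCV and ECCV, the two other top conferences usually named alongside it ...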
MiniMax Releases Reasoning Model to Rival DeepSeek, with Compute Cost of Only About $530,000
Di Yi Cai Jing· 2025-06-17 07:26
Core Insights
- MiniMax, one of the "Six Little Dragons," has announced significant updates, starting with the release of its first open-source reasoning model, MiniMax-M1 [1]
- MiniMax-M1 has shown competitive performance in benchmark tests, comparable to leading open-source models such as DeepSeek-R1 and Qwen3 [3]
- The model's reinforcement learning training was completed in just three weeks using 512 H800 GPUs, with a total computing cost of only $534,700, which is an order of magnitude lower than initially expected [3][8]
Performance Metrics
- MiniMax-M1's context window length is 1 million tokens, which is eight times that of DeepSeek R1 and matches Google's Gemini 2.5 Pro, allowing superior performance in long-context understanding tasks [5]
- In the TAU-bench evaluation, MiniMax-M1 outperformed DeepSeek-R1-0528 and Google's Gemini 2.5 Pro, ranking just below OpenAI o3 and Claude 4 Opus globally [7]
- The model excels in coding capabilities, significantly surpassing most open-source models, with only a slight gap behind the latest DeepSeek R1 [7]
Innovations and Cost Efficiency
- MiniMax-M1 uses a hybrid architecture based on a lightning attention mechanism, enhancing efficiency in long-text input and deep reasoning tasks [7]
- The introduction of the CISPO reinforcement learning algorithm has resulted in faster convergence than ByteDance's recent DAPO algorithm, contributing to the low training cost [8]
- MiniMax's pricing is tiered by input length, with costs ranging from $0.8 to $2.4 per million tokens for input and $8 to $24 for output, offering competitive pricing against DeepSeek [8]
Competitive Landscape
- Concurrently, another competitor, Moonshot AI, has released its programming model Kimi-Dev-72B, which reportedly achieved the best result among open-source models in SWE-bench tests, surpassing the new DeepSeek-R1 [8]
- However, Kimi-Dev-72B faced scrutiny for potential overfitting, as it generated less code than required for certain tasks, raising questions about its performance reliability [9]
- The AI industry is witnessing renewed competition among the "Six Little Dragons," with MiniMax expected to release further updates in the coming days, potentially impacting the multi-modal AI landscape [9]
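As a rough sanity check on the MiniMax-M1 figures quoted above, the short Python sketch below derives the GPU-hour price and baseline context length that those figures imply. It relies on assumptions that are not in the article: that "three weeks" means exactly 21 days of continuous use of all 512 GPUs, and that "eight times" is an exact ratio. The function names are illustrative only.

```python
# Figures quoted in the summary above.
GPUS = 512                     # H800 GPUs used for training
TRAINING_WEEKS = 3             # reported training duration
TOTAL_COST_USD = 534_700       # reported total computing cost
CONTEXT_TOKENS = 1_000_000     # reported MiniMax-M1 context window
CONTEXT_RATIO = 8              # reported multiple over DeepSeek R1


def gpu_hours(gpus: int, weeks: int) -> int:
    """GPU-hours, assuming every GPU runs 24/7 for the whole period (assumption)."""
    return gpus * weeks * 7 * 24


def implied_rate_usd(total_cost: float, hours: int) -> float:
    """Implied price per GPU-hour given a total cost."""
    return total_cost / hours


def implied_baseline_context(tokens: int, ratio: int) -> int:
    """Context length of the baseline model implied by an exact ratio (assumption)."""
    return tokens // ratio


hours = gpu_hours(GPUS, TRAINING_WEEKS)
rate = implied_rate_usd(TOTAL_COST_USD, hours)
baseline = implied_baseline_context(CONTEXT_TOKENS, CONTEXT_RATIO)

print(f"GPU-hours: {hours:,}")                            # 258,048
print(f"Implied USD per GPU-hour: {rate:.2f}")            # 2.07
print(f"Implied baseline context: {baseline:,} tokens")   # 125,000
```

Under these assumptions the quoted cost works out to roughly $2 per H800 GPU-hour, and the "eight times" claim implies a baseline context window of about 125K tokens. Both are back-of-the-envelope estimates derived from the article's numbers, not figures reported by MiniMax.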