Tencent Research Institute AI Express 20250710
Tencent Research Institute · 2025-07-09 14:49
Group 1: Veo 3 Upgrade
- The Google Veo 3 upgrade allows audio and video generation from a single image, maintaining high consistency across multiple angles [1]
- The new feature is available through the Flow platform's "Frames to Video" option and enhances camera movement capabilities, although the Veo 3 entry in Gemini is currently unavailable [1]
- User tests show natural expressions and convincing performances, marking a significant breakthrough in AI storytelling applicable to advertising and animation [1]

Group 2: Hugging Face 3B Model
- Hugging Face has released the open-source 3B-parameter model SmolLM3, which outperforms Llama-3.2-3B and Qwen2.5-3B and supports a 128K context window and six languages [2]
- The model features a dual-mode system that lets users switch between deep thinking and non-thinking modes [2]
- It uses a three-stage mixed training strategy and was trained on 11.2 trillion tokens, with all technical details, including architecture and data mixing methods, made available [2]

Group 3: Kunlun Wanwei Skywork-R1V 3.0
- Kunlun Wanwei has open-sourced the Skywork-R1V 3.0 multimodal model, which scored 142 in high school mathematics and 76 on the MMMU evaluation, surpassing some closed-source models [3]
- The model uses a reinforcement learning strategy (GRPO) and a key entropy-driven mechanism, achieving high performance with only 12,000 supervised samples and 13,000 reinforcement learning samples [3]
- It excels in physical reasoning, logical reasoning, and mathematical problem-solving, setting a new performance benchmark for open-source models and demonstrating cross-disciplinary generalization [3]

Group 4: Vidu Q1 Video Creation
- Vidu Q1's multi-reference video feature lets users upload up to seven reference images, enabling strong character consistency and video generation without storyboards [4]
- Users can combine multiple subjects with simple prompts; clarity is upgraded to 1080P, and character materials can be stored for repeated use [5]
- Test results show it is suitable for creating multi-character animation trailers, supports frame extraction and quality enhancement, and reduces video production costs to less than 0.9 yuan per video [5]

Group 5: VIVO BlueLM-2.5-3B Model
- VIVO has launched the BlueLM-2.5-3B edge multimodal model, which performs strongly in over 20 evaluations and supports GUI understanding [6]
- The model allows flexible switching between long and short thinking modes and introduces a thinking budget control mechanism to balance reasoning depth against computational cost [6]
- It uses a ViT+Adapter+LLM structure and a four-stage pre-training strategy, improving efficiency and mitigating the loss of text capability that affects multimodal models [6]

Group 6: X-Masters System (Built on DeepSeek-R1)
- The X-Masters system, developed by Shanghai Jiao Tong University and DP Technology, has achieved a score of 32.1 on "Humanity's Last Exam" (HLE), surpassing OpenAI and Google [7]
- The system is built on the DeepSeek-R1 model and moves smoothly between internal reasoning and external tool usage, using code as the interaction language [7]
- X-Masters uses a decentralized-stacked multi-agent workflow, increasing reasoning breadth and depth through collaboration among solvers, critics, rewriters, and selectors; the solution is fully open-sourced [7]

Group 7: Zhihui Jun's Acquisition
- Zhihui Jun's Zhiyuan Robot is acquiring control of the listed company Shangwei New Materials for 2.1 billion yuan, aiming for a 63.62%-66.99% stake [8]
- After the announcement, Shangwei New Materials' stock resumed trading at the limit-up price, reaching a market value of 3.77 billion yuan; the actual controller will change to Zhiyuan CEO Deng Taihua, with core team members including "Zhihui Jun" Peng Zhihui [8]
- The acquisition, conducted through "agreement transfer + voluntary tender offer," is seen as a landmark A-share case for "new quality productive forces" enterprises following the implementation of national policies [8]

Group 8: AI Model Usage Trends
- In the first half of 2025, the Gemini series models captured nearly half of the large model API market, with Google leading at 43.1%, followed by DeepSeek and Anthropic at 19.6% and 18.4% respectively [9]
- DeepSeek V3 has maintained a high user retention rate since its launch, ranking among the top five in usage, while usage of OpenAI's models has fluctuated significantly [9]
- The competitive landscape is differentiated by task: Claude-Sonnet-4 leads in programming (44.5%), Gemini-2.0-Flash excels in translation, GPT-4o leads in marketing (32.5%), and role-playing remains highly fragmented [9]

Group 9: AI User Trends
- A report by Menlo Ventures indicates that there are 1.8 billion AI users globally, with a paid user rate of only 3% and a student usage rate of 85%, while parents are becoming heavy users [10]
- AI is primarily used for email writing (19%), researching topics of interest (18%), and managing to-do lists (18%); no single task accounts for more than one-fifth of use [10]
- The next 18-24 months are expected to see six major trends in AI: the rise of vertical tools, complete process automation, multi-person collaboration, an explosion of voice AI, physical AI in households, and diversification of business models [10]
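The Skywork-R1V 3.0 item above names GRPO without saying what it does. As a rough illustration only, the defining step of GRPO (Group Relative Policy Optimization) is that each sampled answer is scored relative to the other answers drawn for the same prompt, instead of against a learned value function. The sketch below shows just that normalization step. The function name, the example rewards, and the use of the population standard deviation are assumptions made for illustration, not details from the Skywork report.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Score each reward relative to its own sampling group.

    Hypothetical helper: subtracts the group mean and divides by the
    group standard deviation (population form assumed here), which is
    the group-relative advantage used in GRPO-style training.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps keeps the division defined when every reward in the group is equal.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled answers to one prompt; 1.0 = judged correct, 0.0 = incorrect.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Answers that beat their group average receive a positive advantage and the rest a negative one, so a group in which every answer earns the same reward contributes no learning signal.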
Central State-Owned Enterprises Roll Out New Industrial Large Models
Zhong Guo Xin Wen Wang· 2025-07-09 13:48
Group 1
- The "Xiaomiao" industrial model, developed by the Smart Building Materials Research Institute funded by China National Building Material Group, has been publicly launched, with the cement sector as its testing ground [1]
- The model integrates three core technologies: fusion of time-series data with industrial mechanisms, multi-modal scenario collaboration, and decision-making fault tolerance, achieving a reduction of over 1% in cement batching costs [1]
- After more than two years of application, the model has a mature engineering delivery capability and has been implemented in nearly 100 cement enterprises, with data governance cycles as short as 14 days and model deployment within 7 days [1]

Group 2
- China National Building Material Group's chairman believes AI will act as a "super accelerator" for new material research, significantly shortening development cycles and reducing trial-and-error costs [2]
- The group is promoting AI's integration into strategic emerging industries for new materials, having built 231 scenario models covering the entire chain from core manufacturing to R&D and supply chain management [2]
- In 2024, the State-owned Assets Supervision and Administration Commission launched the "AI+" initiative for central enterprises, and several enterprises, including China National Petroleum and State Grid, have since released industrial models [2]
Backed by Black Sesame Technologies: 「深庭纪智能」 Completes Seed+ Financing Round
机器人大讲堂· 2025-07-09 13:38
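机器人大讲堂 has learned that 深庭纪智能, a developer of AI companion robots for outdoor scenarios, recently completed a seed+ financing round worth tens of millions of yuan, with participation from Black Sesame Technologies (黑芝麻智能), 软通高科, 粒子未来基金, and other institutions. The funds will be used to keep strengthening the core capabilities of its self-developed AI brain, complete the closed-loop architecture from perception and cognition to decision-making, and accelerate mass production and rollout of its next-generation home robot products.

On product development and business expansion, 深庭纪智能 uses on-device large AI models, environmental perception, multimodal interaction, and other recent technologies to build an AI partner that moves autonomously, interacts proactively, and helps users with real needs. The goal is to make "companionship" smarter, safer, and more fun. The company has filed multiple original invention patents to keep extending the robot's "intelligence" and "emotional intelligence," for example: a multimodal method and apparatus for autonomously learning family member relationships; a method and apparatus for optimizing the memory of large-model agents; an emotion output method, apparatus, and robot; a tracking-based clustering face recognition method and apparatus; and a continuous gesture recognition method, apparatus, and computer device.

Using these technologies, the company's first prototype is already being tested outdoors. With its algorithms, the robot can accurately distinguish complex terrain and routes, stay faithfully at a person's side, and even play football interactively with people. Its robot dog, a representative product, has appeared at a Su Super League (苏超) stadium and features ...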
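Which Directions Have the Greatest Breakthrough Potential over the Next 50 Years? Scientists Discuss Trends in Scientific Development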
Zheng Quan Shi Bao· 2025-07-09 13:24
Group 1
- The Future Science Prize 10th Anniversary Celebration featured discussions on disruptive scientific changes over the next 20 years and breakthrough potential over the next 50 years [1]
- Zhang Jie from Shanghai Jiao Tong University emphasized that the net energy gain achieved by inertial confinement nuclear fusion in December 2022 is a significant milestone for controlled nuclear fusion, which could move society toward non-carbon-based energy [1]
- Ding Hong, also from Shanghai Jiao Tong University, identified universal quantum computing as the most disruptive technology of the next 20 years, while AI for Science will be a key focus over the next 50 years [1]

Group 2
- Xue Qikun, President of Southern University of Science and Technology, stated that controlled nuclear fusion could permanently solve energy issues and support industrial revolutions in the next 20 years, while room-temperature superconductivity could lead to major scientific and technological changes in the next 50 years [2]
- Chen Xianhui from the University of Science and Technology of China said that core key materials could drive significant human transformations in the next 20 years, with room-temperature superconductivity breaking cost barriers in fields such as medical MRI and quantum computing cooling in the next 50 years [2]
- Shi Yigong from Westlake University discussed how AI technologies like AlphaFold have revolutionized traditional biological research, urging researchers to embrace AI to expand scientific boundaries while maintaining critical thinking and interdisciplinary collaboration [2]

Group 3
- Shen Xiangyang, Chairman of the Board of Hong Kong University of Science and Technology, described large models as spanning technology, business, and governance, with multimodal development being a crucial milestone involving computation, algorithms, and data [3]
- Yang Yaodong from Peking University emphasized the importance of alignment technology for making large models comply with human instructions, noting current weaknesses in reinforcement learning-based alignment and suggesting improvements drawing on computer science and cryptography [3]
Zhipu Secures 1 Billion Yuan in Strategic Investment; Its Commercialization Path Has Yet to Open
Core Insights
- Zhipu has received a strategic investment of 1 billion yuan from Pudong Venture Capital Group and Zhangjiang Group, with the first transaction completed recently [1]
- The CEO of Zhipu announced the release of a new general visual language model, GLM-4.1V-Thinking, which improves multimodal model performance [1][2]
- Zhipu has initiated IPO guidance, becoming the first of the "six little tigers" in the large model sector to pursue a listing [2]

Investment and Financial Activities
- Zhipu has secured multiple strategic investments from state-owned enterprises, including over 1 billion yuan in March from Hangzhou City Investment Industrial Fund and Shangcheng Capital, and additional investments from Zhuhai Huafa Group and Chengdu High-tech Zone [2]
- The company has been shifting its business strategy from "selling models" to "selling services" since early 2025, indicating a move toward application development [4]

Product Development and Technology
- The GLM-4.1V-Thinking model supports various multimodal inputs and is designed for complex cognitive tasks, featuring a chain-of-thought reasoning mechanism and reinforcement learning strategies [2][3]
- The lightweight version, GLM-4.1V-9B-Thinking, maintains performance while optimizing deployment efficiency, achieving top scores in 23 out of 28 authoritative evaluations [3]

Market Position and Competitive Landscape
- Zhipu's GLM model is recognized as a representative large model in China, with strong capabilities in Chinese language understanding and generation, particularly suited to the education, government, and cultural sectors [5][6]
- The company prices its API significantly lower than international models, making it suitable for large-scale commercial use [7]

Challenges and Limitations
- The company faces challenges in commercializing its models, particularly given strong competition from open-source models and the need for higher computational resource utilization [4][9]
- Zhipu's multimodal capabilities are still developing, with plans to launch a new model in 2024, while its English language performance lags behind competitors [7][8]
Special Topic on AI and Large Models: Central and State-Owned Enterprise Technology Innovation Report Series, Part 4
CMS· 2025-07-09 13:00
Group 1: AI Industry Development
- The AI industry follows a "technology-hardware-terminal-application" development model, with a shift from communication networks to large model theoretical research [1]
- Domestic chip manufacturers are accelerating technological breakthroughs, enhancing the application ecosystem, and driving the deep integration of generative AI across multiple industries [2]
- Global large model technology is entering a phase of deep competition, with China and the US following differentiated development paths [2]

Group 2: AI Chip and Hardware Investment
- AI chips are the cornerstone of the large model industry, characterized by long R&D cycles, high technical barriers, and significant investment costs [2]
- China has established a basic layout in GPU, ASIC, and FPGA chips, meeting standards for various application scenarios [2]
- Investment opportunities exist across the AI industry chain, including optical modules, power distribution technology, and liquid cooling technology [2]

Group 3: Market Trends and Opportunities
- The domestic AI industry is undergoing a strategic transformation from "software-hardware decoupling" to "full-stack collaboration" [2]
- The market for AI software ecosystems is still dominated by foreign open-source frameworks, but domestic companies are accelerating their AI ecosystem layout [2]
- The procurement rate of domestic large models in key industries such as finance and telecommunications has exceeded 45% [2]

Group 4: Risks and Challenges
- Risks include slower-than-expected technological iteration, slower-than-expected industry growth, and potential policy risks [2]
- The need for high-quality data and standards in model training remains a challenge for the domestic AI industry [2]
A Senior Labmate Published an Autonomous Driving Large Model Paper on His Own and Got into a TOP2 PhD Program...
自动驾驶之心· 2025-07-09 12:56
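The path to deploying large models in autonomous driving functions is becoming clearer, and companies such as Li Auto and Huawei have begun rolling out their own VLA and VLM solutions. So what should the next generation of large models focus on?

Following the pattern of earlier autonomous driving development, once data and solutions have largely been validated, attention shifts to lightweight design and hardware adaptation, knowledge distillation and quantization acceleration, and efficient fine-tuning of large models.

Beyond that, the currently popular CoT approach is key to achieving spatial perception at a later stage, and advanced reasoning paradigms such as VLA + reinforcement learning are also drawing close industry attention.

These are problems that academia and industry urgently need to solve. Related papers are well received by reviewers, and more and more teams in China and abroad are working in these directions. We have learned that quite a few students, through their own efforts, published a large model paper related to autonomous driving and were admitted to a TOP2 PhD program. We have received many requests from students for guidance on large model paper research, to address the pain points of having no one to lead their publications and lacking mentorship.

自动驾驶之心, together with well-known scholars in the large model field, has launched a 1-on-6 small-class paper mentoring course on large models to address the problems of having no mentor, falling into pitfalls, and not knowing how to write or submit a paper.

I. Course Introduction

With the rapid development of large language models (LLMs) and multimodal models, how to improve model efficiency, extend knowledge capabilities, and enhance reasoning performance has become ...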
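Building a Foundation on a Five-Dimensional Safety System: CATARC, Tsinghua, and Huawei Jointly Release Intelligent Driving Technology White Paper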
财联社· 2025-07-09 12:48
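Companies aim to develop intelligent driving technology and bring it to market, but the "second half" of the new energy vehicle race, centered on intelligence, should not be only about advancing technology. It also requires the whole industry to define it carefully and view it rationally from an objective, impartial standpoint.

On July 8, CATARC, Tsinghua University, and Huawei jointly released the "White Paper on Automotive Intelligent Driving Technology and Industry Development" (hereafter the "Intelligent Driving Technology White Paper"). "The Intelligent Driving Technology White Paper systematically reviews the course of technological evolution, the industry ecosystem landscape, and future trends, with particular attention to safety assurance systems and risk challenges. It is a systematic distillation of industry experience, and it also analyzes development bottlenecks precisely and charts a way forward with innovative thinking. We hope the Intelligent Driving Technology White Paper can serve as a reference for people across government, industry, academia, and research, and help China's autonomous driving industry achieve new breakthroughs in technology, safety assurance, commercialization, and the creation of social value," wrote Li Jun, academician of the Chinese Academy of Engineering and professor at Tsinghua University's School of Vehicle and Mobility, in the preface.

As the only industry representative in the white paper's industry-academia-research writing group, Huawei plays a role whose importance is self-evident. A Kaiyuan Securities research report states that the automotive industry has entered a technology transition cycle, that software's share of value in the industry keeps growing, and that Huawei is building a deep presence in the automotive market on the strength of its capabilities, aiming to create an open platform for automotive intelligence. "Huawei's ADS intelligent assisted driving functions have been upgraded, and its technology ecosystem has taken shape and is deeply empowering the industry, accelerating the evolution of the industry landscape." ...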
Financial Large Models Move Toward Value Creation: How Can Agents Break Through the "Last Mile"?
Di Yi Cai Jing· 2025-07-09 12:41
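Addressing key challenges such as data security and algorithm reliability.

At the recently held "Forum on Financial Applications and Innovation of Large Models," experts from financial institutions, technology companies, and regulators gathered to discuss the current state and future direction of artificial intelligence (AI) and large model technology in finance.

On the foreign bank side, Zhang Fangchang, general manager of the IT Architecture Platform Department at Bank of East Asia, noted that foreign banks face challenges in AI adoption such as limited investment and fierce market competition. However, by combining the group's global solutions with local innovation, Bank of East Asia has put intelligent applications into practice in scenarios such as cross-border document review, improving business efficiency and customer experience.

Data, Security, and Technical Challenges

Despite broad application, the deep deployment of financial large models still faces multiple obstacles. Data security and algorithm reliability are the primary constraints.

Duan Litian, head of Certification Department II at the Beijing National Financial Technology Certification Center, released the "Security Risk Assessment Results for Financial Applications of Large Models" at the forum. He noted that large models in financial scenarios suffer from insufficient security capabilities, a mismatch between reasoning ability and mathematical computation ability, hallucinations, and other problems.

Pan Runhong, Party Committee member and deputy general manager of China Financial Electronization Group, noted that at this stage the application of large models in finance faces risks such as data security and algorithm reliability, unclear implementation paths, functional boundaries that still need to be verified, and insufficient penetration in core scenarios.

The forum focused on how AI technology can move from cost reduction and efficiency gains to value creation, and how to address key challenges such as data security and algorithm reliability. The attendees ...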
Saiyi Information Breaks Ground on Global R&D Center: Gathering Global Talent to Help China's Industrial Software Go Global
Guang Zhou Ri Bao· 2025-07-09 11:43
Core Insights
- The establishment of the global R&D center marks a significant milestone for the company, enhancing its product development and innovation capabilities while supporting its global strategy [2]
- The company aims to recruit over 1,000 R&D personnel within five years to raise the level of industrial software in China, leveraging the country's manufacturing strengths [3]
- The global R&D center will focus on industrial AI, industrial internet platforms, and core industrial software product development, positioning itself as a benchmark in the digital intelligence field [4]

Company Developments
- The global R&D center will cover an area of 11.72 acres with a total investment exceeding 300 million yuan, featuring a forward-looking design that symbolizes the company's ambition in digital intelligence [4]
- The company has accumulated extensive industry experience over 20 years, allowing it to explore AI applications across various sectors, particularly in manufacturing [5]
- The R&D center will support the company's goal of providing comprehensive intelligent upgrade solutions for enterprises, facilitating China's transition to high-end, intelligent, and sustainable manufacturing [5]