Computer Vision

The top spot for most-cited researcher in science changes hands! AI stands at a historic peak
量子位· 2025-08-25 05:54
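By 一水 (Yishui), reporting from Aofeisi. QbitAI | WeChat official account QbitAI. Mirror, mirror, who is the most-cited scientist of all time? The answer: Yoshua Bengio, one of the three giants of deep learning and a Turing Award laureate. The question comes up because the news is currently drawing heated discussion. This time, Bengio's "most-cited" title is not limited to computer science: it spans every discipline, making him "the most-cited living scientist across all fields". Before this, as early as 2018, Bengio was the computer science researcher with the highest number of citations per day worldwide (he won the Turing Award the same year), and in 2022 he became the most-cited computer scientist in the world. His most influential papers, "A Neural Probabilistic Language Model" (2003), "Generative adversarial nets" (2014, the GAN paper), and "Deep learning" (2015), all laid important foundations for deep learning and have profoundly shaped today's booming research in natural language processing, computer vision, and other areas. In the online discussion, the deeper meaning behind the buzz has also become clear: this is a victory for AI. Bengio changed artificial intelligence, and his contributions to deep learning have truly shaped modern AI research. So let us take this opportunity to look back at Be ...
All-rounder ("hexagon warrior") GPU company completes new funding round of about 100 million yuan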
是说芯语· 2025-08-24 01:39
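After eight years of sustained R&D and product iteration, 芯动力科技 has built a complete AI computing product portfolio. Its core technology is the Reconfigurable Parallel Processor architecture ("RPP"), a self-developed processor architecture designed specifically for parallel computing. Its in-house AI chips have been adapted to mainstream open-source large models. August 21: 珠海市芯动力科技有限公司 ("芯动力科技"), a Tsinghua-affiliated compute chip company built on a novel architecture, recently closed a Series B2 round of nearly 100 million yuan, led by 飞图创投. The proceeds will go mainly toward industrializing the RPP chip, upgrading core technology R&D, and accelerating expansion in the edge computing and AI inference chip markets. Founded in 2017, 芯动力科技 has R&D centers in Zhuhai, Shenzhen, Xi'an, and the United States. In March of this year, the company announced a Series B1 round of tens of millions of yuan, led by 长石资本 with participation from 达泰资本, 江门长信, 硕明, and other investors. Its M.2 accelerator card delivers up to 32 TOPS of compute and 60 GB/s of memory bandwidth with dynamically controllable power consumption, and can run large models on laptops and similar devices; it has been adapted to open-source models including DeepSeek, Llama3-8B, Stable Diffusion, 通义千问, and BitNet. With the new round closed, 芯动力科技 plans to pursue its goal of building high-end general-purpose chips with domestically owned intellectual property. Reposted from: 芯东西 ...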
格灵深瞳 (DeepGlint): 2025 Semi-Annual Report
Zheng Quan Zhi Xing· 2025-08-22 16:29
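北京格灵深瞳信息技术股份有限公司 2025 Semi-Annual Report
Stock code: 688207. Stock short name: 格灵深瞳.
Important Notice
1. The Company's board of directors, board of supervisors, directors, supervisors, and senior management warrant that the contents of this semi-annual report are true, accurate, and complete, contain no false records, misleading statements, or material omissions, and bear individual and joint legal liability.
2. Major risk warning: see "IV. Risk Factors" under "Section III: Management Discussion and Analysis" of this report.
3. All directors of the Company attended the board meeting.
4. This semi-annual report is unaudited.
5. Zhao Yong (赵勇), the person in charge of the Company; Wu Meng (吴梦), the person in charge of accounting; and Du Jiafang (杜家芳), the head of the accounting department (chief accountant), declare that they warrant the financial report in this semi-annual report is true, accurate, and complete.
6. Profit distribution plan or plan to convert capital reserves into share capital for the reporting period, as approved by board resolution: None.
7. Special corporate governance arrangements or other important matters: Not applicable.
8. Risk statement on forward-looking statements: Applicable. The forward-looking statements in this report, such as the Company's future plans and development strategy, do not constitute a substantive commitment by the Company to investors. Investors should be aware of investment risks.
9. Non-operating occupation of funds by the controlling shareholder or other related parties: No.
10. Whether there has been any violation of prescribed decision-making ...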
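Latest survey of visual reinforcement learning: a full-field overview (National University of Singapore, Zhejiang University, CUHK)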
自动驾驶之心· 2025-08-16 00:03
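Figure 1: Timeline of representative visual reinforcement learning models. The figure gives a chronological overview of key visual reinforcement learning (Visual RL) models from 2023 to 2025, grouped into four areas: multimodal large language models (Multimodal LLM), visual generation (Visual Generation), unified models (Unified Models), and vision-language-action models (VLA Models). In the world of large language models (LLMs), reinforcement learning (RL), and especially reinforcement learning from human feedback (RLHF), is no longer news. Like a master of deep inner strength, it has given models such as GPT, Qwen, and DeepSeek their "soul", letting their answers align closely with human thinking and values. This RL-led revolution has completely changed how we interact with AI. Yet just when everyone assumed that RL's stage was confined to text, the same wave is sweeping at lightning speed into another, broader field: computer vision (CV). Preface: When RLHF "sweeps into" ...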
After devouring 1.7 billion images, Meta's most powerful behemoth DINOv3 is open-sourced, redefining the ceiling of CV
36Kr· 2025-08-15 07:29
Core Insights
- Meta has developed DINOv3, a 7-billion-parameter self-supervised learning model trained on 1.7 billion images, which has been successfully utilized by NASA for Mars exploration [1][3][26]
- DINOv3 sets a new benchmark in computer vision performance, surpassing specialized solutions in various dense prediction tasks [1][10][19]
- The model is fully open-sourced, including the pre-trained backbone, adapters, and training and evaluation code, and is suitable for commercial use [6][26]
Performance Metrics
- DINOv3 achieved significant improvements over its predecessor DINOv2 on several benchmarks:
  - Segmentation on ADE-20k: 55.9 (up from 49.5 with DINOv2) [2]
  - Depth estimation on NYU (lower is better): 0.309 (improved from 0.372 with DINOv2) [2]
  - Video tracking on DAVIS: 83.3 (up from 76.6 with DINOv2) [2]
  - Instance retrieval on Met: 55.4 (up from 44.6 with DINOv2) [2]
  - Image classification on ImageNet ReaL: 90.4 (up from 86.1 with DINOv2) [2]
Applications and Impact
- DINOv3's self-supervised learning approach lets it work effectively where labeled data is scarce, such as satellite imagery and medical imaging [10][12][15]
- The model has been applied in real-world scenarios, such as monitoring deforestation and supporting ecological restoration efforts by the World Resources Institute [16]
- DINOv3 reduced the measurement error for tree canopy height estimation in Kenya from 4.1 meters to 1.2 meters [17]
Model Flexibility and Deployment
- DINOv3's architecture is efficient and versatile, enabling it to perform multiple visual tasks without fine-tuning [22][24]
- Meta has created a family of models ranging from lightweight to high-performance versions to cover different computational budgets and make deployment practical across applications [26]
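The benchmark figures above use different scales, and one of them (depth estimation) improves by going down. As a rough aid to reading them, here is a minimal Python sketch that converts each DINOv2 to DINOv3 pair into a relative improvement. The scores are copied from the summary; the names `BENCHMARKS`, `relative_improvement`, and `summarize` are illustrative only, and treating the NYU depth score as lower-is-better is an assumption inferred from the summary calling the decrease an improvement.

```python
# Scores copied from the summary above: (benchmark, DINOv2, DINOv3, higher_is_better).
# The higher_is_better flag for NYU depth is an assumption (an error-style metric).
BENCHMARKS = [
    ("ADE-20k segmentation", 49.5, 55.9, True),
    ("NYU depth estimation", 0.372, 0.309, False),
    ("DAVIS video tracking", 76.6, 83.3, True),
    ("Met instance retrieval", 44.6, 55.4, True),
    ("ImageNet ReaL classification", 86.1, 90.4, True),
]


def relative_improvement(old: float, new: float, higher_is_better: bool) -> float:
    """Return the improvement as a fraction of the old score (positive = better)."""
    change = (new - old) / old
    return change if higher_is_better else -change


def summarize(benchmarks) -> dict:
    """Map each benchmark name to its relative improvement in percent, rounded to 0.1."""
    return {
        name: round(100 * relative_improvement(old, new, hib), 1)
        for name, old, new, hib in benchmarks
    }


if __name__ == "__main__":
    for name, pct in summarize(BENCHMARKS).items():
        print(f"{name}: {pct:+.1f}%")
```

Relative change is only one way to compare the pairs; it says nothing about how hard each additional point is to gain on a given benchmark, so the percentages should not be ranked against each other too literally.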
Trading accumulated time for breakthroughs: Moonshot AI focuses on artificial general intelligence
Jing Ji Ri Bao· 2025-08-11 22:12
Core Insights
- Moonshot AI, based in Beijing, is gaining attention for its open-source model Kimi K2, which ranked fifth globally upon its launch in July 2025 [1]
- The company's mission is to explore the limits of intelligence and make AI universally accessible [1]
Company Overview
- Founded in April 2023 by a team with extensive experience in natural language processing (NLP), Moonshot AI aims to discover transformative possibilities in artificial intelligence [1]
- The company has approximately 300 employees, a significant portion of them young people born in the 1990s [2]
Product Development
- Kimi K2, a trillion-parameter model, can handle long texts, supporting up to 200,000 Chinese characters [2][5]
- The Kimi intelligent assistant was launched in October 2023, followed by several product releases, including the Kimi browser assistant and Kimi-Researcher [2]
Technical Innovations
- Kimi K2's architecture handles complex tasks at a lower cost, activating only 32 billion parameters [3]
- The model has excelled in various benchmarks, particularly in programming, tool use, and mathematical reasoning [6]
User Engagement
- Kimi's long-text capability drove a sharp increase in adoption, with user numbers growing from hundreds of thousands to tens of millions in 2024 [5]
- The model is designed to be user-friendly, allowing non-programmers to use its capabilities effectively [7]
Future Aspirations
- Moonshot AI aims to create a general-purpose AI that surpasses human intelligence, focusing on developing versatile skills that reinforce each other [8]
- The company emphasizes building a strong foundation model before releasing products, to ensure robust performance and capabilities [8]
Tested in seconds! AI vision technology makes rapeseed quality testing as simple as scanning a code
Xin Jing Bao· 2025-08-11 06:12
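Xin Jing Bao (The Beijing News) report: According to the website of the Chinese Academy of Agricultural Sciences (CAAS), the Oil Crops Quality Chemistry and Processing Utilization innovation team at the CAAS Oil Crops Research Institute recently used computer vision and artificial intelligence to build a high-quality image database and model library for rapeseed, enabling real-time online quality testing of rapeseed in seconds. The findings were published in Food Chemistry. Traditional rapeseed quality testing relies on precision instruments and laboratory analysis; it tends to destroy samples, is time-consuming and labor-intensive, and cannot meet the need for large-scale, real-time testing. To address this, the researchers proposed a "snap a photo to test" approach: using computer vision, they trained lightweight deep learning models and developed the SeedVision software for both desktop and mobile. Inspectors only need to photograph and upload the relevant images, and within 10 seconds the software reports quality indicators such as oil content and protein content, with accuracy above 88% and an average error within 5%. This provides technical support for real-time online quality testing of rapeseed and of other oil crops such as peanuts and soybeans. The team has applied for 3 invention patents and 1 software copyright based on this work. The research was funded by the National Key R&D Program of the 14th Five-Year Plan, the National Natural Science Foundation of China, the CAAS Science and Technology Innovation Program, and other projects. ...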
Recommending a few hidden gems for embodied intelligence and robotics!
具身智能之心· 2025-08-10 06:54
Core Viewpoint
- The embodied intelligence and autonomous driving industries are experiencing significant growth in production, financing, and recruitment, with a strong emphasis on practical technology and on hiring skilled talent [1][2].
Group 1: Industry Trends
- The autonomous driving sector is seeing a surge in companies scaling up production and hiring, yet the job market is competitive and positions are hard to secure because of high skill requirements [1].
- The emergence of high-level autonomous driving demonstration zones, such as the one in Beijing, is fostering innovation in policy, technology, and commercialization [1].
Group 2: Learning and Community Resources
- Several influential communities focused on embodied intelligence, autonomous driving, computer vision, and AI are recommended for systematic learning and skill building [1].
- The "Autonomous Driving Heart" community is the largest developer community in China for autonomous driving, covering its various technical areas and attracting significant attention from industry professionals [2].
- The "Computer Vision Research Institute" shares the latest research and practical applications in AI, emphasizing technology research and implementation [5].
- The "Embodied Intelligence Heart" community is the first full-stack technical exchange platform in China, covering a wide range of topics related to embodied intelligence [8].
From autonomous driving to embodied intelligence, these communities hold up half the sky!
自动驾驶之心· 2025-08-08 16:04
Core Viewpoint
- The embodied intelligence and autonomous driving industries are experiencing significant growth in production, financing, and recruitment, leading to a highly competitive job market where skilled professionals are in high demand [1].
Group 1: Industry Trends
- The industry is focusing on practical technologies, with companies competing to secure talent with relevant skills [1].
- The job market is described as "highly competitive": openings exist, but candidates still find it difficult to secure positions [1].
Group 2: Recommended Learning Communities
- "Smart Driving Frontier" is a comprehensive media platform dedicated to the autonomous driving sector, providing technical insights and industry news [1].
- "Computer Vision Research Institute" focuses on AI research and practical applications, sharing the latest algorithms and project experience [3].
- "Visual Language Navigation" aims to be a professional platform for navigation technologies, sharing technical insights and industry news [5].
- "Embodied Intelligence Research Lab" emphasizes core areas such as reinforcement learning and multi-agent collaboration, providing research updates and practical case studies [6].
- "Embodied Intelligence Heart" is the largest community for embodied intelligence, covering various technical directions and encouraging collaboration among developers [7].
- "arXiv Daily Academic Express" offers daily updates on academic papers across multiple fields, including AI and robotics, for quick access to relevant research [8].
- "Autonomous Driving Heart" is a community for developers in the autonomous driving field, focusing on various technical aspects and job opportunities [10].