Deep Learning
He Kaiming's NeurIPS 2025 Talk in Review: Thirty Years of Visual Object Detection
机器之心· 2025-12-11 10:00
Core Insights
- The article highlights the significance of the "Test of Time Award" received by the paper "Faster R-CNN," co-authored by renowned researchers, marking its impact on the field of computer vision since its publication in 2015 [1][5][25]
- The presentation by He Kaiming at NeurIPS 2025 summarizes the evolution of visual object detection over the past 30 years, showcasing key milestones and influential works that have shaped the field [6][31]

Historical Development
- The early attempts at face detection in the 1990s relied on handcrafted features and statistical methods, which were limited in adaptability and speed [12]
- The introduction of AlexNet in 2012 demonstrated the superior feature extraction capabilities of deep learning, paving the way for its application in object detection [15]
- The R-CNN model, proposed in 2014, revolutionized object detection by integrating CNNs for feature extraction and classification, although it initially faced computational challenges [17][18]

Technological Advancements
- The development of Faster R-CNN in 2015 addressed the speed bottleneck by introducing the Region Proposal Network (RPN), allowing for end-to-end real-time detection [25]
- Subsequent innovations, such as YOLO and SSD in 2016, further enhanced detection speed by enabling direct output of object locations and categories [32]
- The introduction of Mask R-CNN in 2017 added instance segmentation capabilities, while DETR in 2020 redefined detection using Transformer architecture [32][34]

Future Directions
- The article concludes with reflections on the ongoing exploration in computer vision, emphasizing the need for innovative models to replace outdated components as bottlenecks arise [35][36]
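To make the Region Proposal Network idea above more concrete, here is a minimal sketch of the two building blocks an RPN rests on: a fixed set of anchor boxes at each location, and intersection-over-union (IoU) for matching anchors to ground-truth boxes. The summary itself gives no implementation detail; the three scales and three aspect ratios are defaults recalled from the original Faster R-CNN paper, and the function names `make_anchors` and `iou` are hypothetical, chosen only for illustration.

```python
from itertools import product


def make_anchors(cx, cy, scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return anchor boxes (x1, y1, x2, y2) centred at (cx, cy).

    Assumption: one anchor per (scale, ratio) pair, where ratio is
    height / width and every anchor of a given scale has area scale**2.
    """
    anchors = []
    for scale, ratio in product(scales, ratios):
        width = scale / ratio ** 0.5
        height = scale * ratio ** 0.5
        anchors.append(
            (cx - width / 2, cy - height / 2, cx + width / 2, cy + height / 2)
        )
    return anchors


def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # Clamp at zero so non-overlapping boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    ground_truth = (300, 300, 556, 556)  # made-up box for illustration
    for anchor in make_anchors(428, 428):
        print([round(v, 1) for v in anchor], round(iou(anchor, ground_truth), 3))
```

In a real detector a small network scores and refines every anchor at every feature-map position; this sketch covers only the geometry, not the learned part.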
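Horizon's Su Qing: Over the Next Three Years, the Autonomous Driving Industry Will Leave Behind Its Breakneck Run of Paradigm Shifts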
Zhong Guo Jing Ying Bao· 2025-12-11 04:28
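In Su Qing's view, the core task of this phase is to push the potential of existing technology to its limit. Horizon, for example, will keep raising chip compute and model capacity, use a unified paradigm to merge L2 through L4, bring urban L2 down from 200,000-yuan-class models to the 100,000-yuan-class market, and bring near-L4 systems to the general public at affordable prices. At the same time, it will strengthen its engineering and organizational capabilities to handle the polishing of a vast number of long-tail scenarios; this is the key to making it through the cycle.

"The ultimate goal of autonomous driving is to build a machine that can replace a human driver. After the paradigm revolution, this long-distance race tests the industry's stamina to settle down and do 'fine work.' Within the next few years, we can put L4-level cars in users' hands at affordable prices. That is the whole point of the 20 years all of us have toiled in this field." (Source: 中国经营报)

"Over the next three years, the autonomous driving industry will leave behind its breakneck run of paradigm shifts and enter the 'hard days' of extreme optimization."

On December 9, at the "2025 Horizon Technology Ecosystem Conference," Su Qing, Horizon vice president and chief architect, a 20-year veteran of autonomous driving known for his "contrarian" views, shared his judgment on where the field is heading.

Notably, Su Qing gave a sober assessment of the present: "The industry needs to be clear-headed. Deep learning is already showing signs of hitting a ceiling, there is no sign of a breakthrough in the foundational theory of AGI, and the next round of core reconstruction needs at least another 5 to 20 years of technical accumulation. Over the next three years, the autonomous driving industry will leave behind its breakneck run of paradigm shifts and enter the 'hard days' of extreme optimization ...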
Led by Industry Veterans: Master 3DGS Theory and Practice in Three Months
自动驾驶之心· 2025-12-09 19:00
Core Insights
- The article discusses the rapid advancements in 3D Gaussian Splatting (3DGS) technology, highlighting its applications in fields such as 3D modeling, virtual reality, and autonomous driving simulation [2][4]
- A comprehensive learning roadmap for 3DGS has been developed to help newcomers master both the theory and the practice of the technology [4][6]

Group 1: 3DGS Technology Overview
- The core goal of novel view synthesis in machine vision is to build, from images or videos, 3D models that computers can process, which opens up numerous applications [2]
- 3DGS technology has improved significantly, branching into static reconstruction (3DGS), dynamic reconstruction (4DGS), and surface reconstruction (2DGS) [4]
- The introduction of feed-forward 3DGS has addressed the inefficiency of per-scene optimization methods, making the technology more accessible [4][14]

Group 2: Course Structure and Content
- The course, titled "3DGS Theory and Algorithm Practical Tutorial," covers 2DGS, 3DGS, and 4DGS in detail, along with important research topics in the field [6]
- The course is structured into six chapters, running from foundational knowledge in computer graphics to advanced topics such as feed-forward 3DGS [10][11][14]
- Each chapter includes practical assignments and discussions to reinforce understanding and application of the concepts [10][15]

Group 3: Target Audience and Prerequisites
- The course is designed for people with a background in computer graphics, visual reconstruction, and programming, particularly in Python and PyTorch [19]
- Participants are expected to have a GPU, with 4090-class computing power or higher recommended, to work through the course material effectively [19]
- The course aims to benefit those seeking internships, campus recruitment, or job opportunities in the field of 3DGS [19]
Jensen Huang's Latest Interview: Still Afraid of Going Out of Business, Very Anxious
半导体芯闻· 2025-12-08 10:44
Core Insights
- The discussion highlights the transformative impact of artificial intelligence (AI) and the role of NVIDIA in driving this technological revolution, emphasizing the importance of GPUs in various applications from gaming to modern data centers [2][10].

Group 1: AI and Technological Competition
- The conversation underscores that the world is in a significant technological race, particularly in AI, where the first to reach advanced capabilities will gain substantial advantages [11][12].
- Historical context is provided, indicating that the U.S. has always been in a technological competition since the Industrial Revolution, with AI being the latest frontier [12][13].

Group 2: Energy and Manufacturing
- The importance of energy growth and domestic manufacturing is emphasized as critical for national security and economic prosperity, with a call for revitalizing U.S. manufacturing capabilities [8][9].
- The discussion points out that without energy growth, industrial growth and job creation would be severely hindered, linking energy policies directly to advancements in AI and technology [9][10].

Group 3: AI Development and Safety
- Concerns about the risks associated with AI are acknowledged, particularly regarding its potential military applications and ethical implications [19][20].
- The conversation suggests that AI's development will be gradual rather than sudden, with a focus on enhancing safety and reliability in AI systems [14][15].

Group 4: Future of AI and Knowledge Generation
- The potential for AI to generate a significant portion of knowledge in the future is discussed, with predictions that AI could produce up to 90% of knowledge within a few years [41][42].
- The necessity for continuous verification of AI-generated information is highlighted, stressing the importance of ensuring accuracy and reliability in AI outputs [41][42].

Group 5: Cybersecurity and Collaboration
- The dialogue emphasizes the collaborative nature of cybersecurity, where companies share information and best practices to combat threats collectively [23][24].
- The need for a unified approach to cybersecurity in the face of evolving threats is reiterated, suggesting that cooperation is essential for effective defense [23][24].
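Compute Surged 100,000x in a Decade, Yet He Worries About Bankruptcy Every Day! Jensen Huang in His Own Words: How a "30-Day Sense of Crisis" Powered a Comeback in the Trillion-Dollar AI Market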
AI前线· 2025-12-08 07:18
Core Insights
- The article discusses the pivotal moments in NVIDIA's history, highlighting the company's early struggles, strategic pivots, and the introduction of technologies such as CUDA Toolkit 13.1 and the CUDA Tile programming model [1][2][4][5].

Group 1: NVIDIA's Historical Context
- NVIDIA faced significant challenges in its early days, including near bankruptcy and strategic missteps, which led to a critical reassessment of its technology and direction [8][9].
- The company's turnaround involved a focus on 3D graphics technology, leveraging insights from Silicon Graphics to innovate and compress workstation performance into PC graphics cards [8][9][74].

Group 2: Technological Advancements
- The launch of CUDA Toolkit 13.1 is described as the most comprehensive update in 20 years, introducing the CUDA Tile programming model, which simplifies GPU programming and enhances compatibility across generations [2][4][5].
- Key features of the new toolkit include improved resource management, enhanced precision simulation in cuBLAS, and a complete overhaul of documentation and tools, aimed at increasing usability for developers [7][8].

Group 3: CEO's Vision and Philosophy
- CEO Jensen Huang emphasizes a continuous sense of urgency and fear of failure as driving forces behind NVIDIA's innovation and resilience [8][9].
- Huang's perspective on technology competition highlights the ongoing race in AI development, asserting that technological leadership is crucial for gaining advantages in various fields [13][14][20].

Group 4: Future of AI and Workforce Implications
- Huang discusses the transformative potential of AI, stating that its capabilities have improved 100-fold in the past two years, and emphasizes the importance of guiding AI development towards safety and accuracy [12][16][50].
- The conversation touches on the implications of AI for jobs, suggesting that while some roles may be automated, new opportunities will emerge, and the essence of work will shift towards more meaningful contributions beyond mere task execution [38][45][48].
Jensen Huang's Latest Interview: Still Afraid of Going Out of Business, Very Anxious
半导体行业观察· 2025-12-06 03:06
Core Insights
- The discussion highlights the transformative impact of artificial intelligence (AI) and the role of NVIDIA in driving this technological revolution, emphasizing the importance of GPUs in various applications from gaming to modern data centers [1]
- Jensen Huang discusses the risks and rewards associated with AI, the global AI race, and the significance of energy and manufacturing for future innovations [1]

Group 1: AI and Technological Competition
- Technological competition has been a constant since the Industrial Revolution, with the current AI race being one of the most critical [10][11]
- Huang emphasizes that technological leadership is essential for national security and economic prosperity, linking energy growth to industrial growth and job creation [7][8]
- The conversation touches on the historical context of technological races, including the Manhattan Project and the Cold War, underscoring the continuous nature of these competitions [11]

Group 2: AI Development and Safety
- Huang expresses optimism about the gradual development of AI, suggesting that advancements will be incremental rather than sudden [13]
- The discussion addresses concerns about AI's potential risks, including the ethical implications of military applications and the need for robust cybersecurity measures [16][20]
- Huang believes that AI's capabilities will increasingly focus on safety and reliability, reducing the occurrence of errors or "hallucinations" in AI outputs [14]

Group 3: Future of Work and AI's Impact
- The conversation explores the potential for AI to create a future where traditional jobs may become obsolete, leading to a society where individuals receive universal basic income [37]
- Huang acknowledges the challenges of identity and purpose as AI takes over tasks traditionally performed by humans, emphasizing the need for society to adapt to these changes [38]
- The discussion highlights the importance of maintaining human engagement and problem-solving in a future dominated by AI technologies [38]

Group 4: Quantum Computing and Security
- Huang discusses the implications of quantum computing for encryption and cybersecurity, suggesting that while current encryption methods may become outdated, the industry is actively developing post-quantum encryption technologies [22][23]
- The conversation emphasizes the collaborative nature of cybersecurity efforts, where companies share information to enhance collective defenses against threats [20][21]
- Huang asserts that AI will play a crucial role in future cybersecurity measures, leveraging its capabilities to protect against evolving threats [21]
In Conversation with Ren Shaoqing: Behind the NeurIPS 2025 Test of Time Award, My Views on Academia and Industry
雷峰网· 2025-12-05 10:24
Group 1
- NeurIPS is recognized as the "Oscar of AI" and serves as a global annual barometer for the artificial intelligence field [1]
- The NeurIPS Test of Time Award honors foundational works that have significantly influenced the discipline over a decade [1]
- The award was given to the authors of "Faster R-CNN," which has been cited over 98,000 times, making it the most cited paper by a Chinese first author at this conference [2]

Group 2
- "Faster R-CNN," developed in 2015, improved object detection efficiency by over 10 times and introduced an end-to-end real-time detection model [2]
- The core ideas of this model have been deeply integrated into the foundational technologies of AI, impacting key sectors such as autonomous driving and medical imaging [2]
- The collaboration between the authors, including Ren Shaoqing and He Kaiming, has led to significant advancements in deep learning frameworks [2]

Group 3
- Ren Shaoqing joined NIO in August 2020, focusing on building a team and developing in-house chips for autonomous driving [13][14]
- NIO's first generation of vehicles used the Mobileye solution, while the second generation was the first in the world to put the NVIDIA Orin chip into mass production [14]
- The challenges faced during development included adapting to new architectures and ensuring the stability of the new chip [15]

Group 4
- NIO emphasized the importance of data collection and analysis, focusing on corner cases to improve the performance of its models [19][20]
- The company established a flexible system for cloud computing and data management, allowing for rapid iteration of models [21]
- NIO's approach to active safety has enabled it to reach a standard of 200,000 kilometers per false positive, significantly improving its testing efficiency [22]

Group 5
- The concept of end-to-end solutions in autonomous driving has evolved, with discussions on integrating various technologies to enhance performance [24][25]
- NIO is exploring the development of world models to improve long-term decision-making capabilities in autonomous systems [27][28]
- The world model approach aims to address the limitations of traditional methods by incorporating both spatial and temporal understanding [30][31]
Global Talent Recruitment: Ren Shaoqing of USTC, Co-author of Faster R-CNN and ResNet, Is Recruiting Professors, Scholars, and Students
机器之心· 2025-12-05 10:17
Core Viewpoint
- The article highlights the achievements and contributions of Professor Ren Shaoqing in the field of artificial intelligence, particularly in deep learning and computer vision, emphasizing his role in advancing key technologies that impact sectors such as autonomous driving and medical imaging [4][5][6].

Group 1: Academic Achievements
- Professor Ren has made foundational and pioneering contributions in deep learning, computer vision, and intelligent driving, with his research serving as a core engine for critical areas of the national economy and people's livelihood [5].
- His academic papers have been cited over 460,000 times, ranking him first among domestic scholars across all disciplines [5].
- He has received multiple prestigious awards, including the 2023 Future Science Prize in Mathematics and Computer Science and the NeurIPS 2025 Test of Time Award [5].

Group 2: Key Research Contributions
- The paper "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks," which received the NeurIPS 2025 Test of Time Award, is considered a milestone in computer vision, having been cited over 98,000 times since its publication in 2015 [6].
- Faster R-CNN introduced a fully learnable two-stage pipeline that replaced traditional methods, achieving high precision and near real-time detection and significantly influencing the development of visual models over the past decade [6].

Group 3: Research Institute and Talent Recruitment
- The General Artificial Intelligence Research Institute at the University of Science and Technology of China focuses on cutting-edge areas such as AI, world models, embodied intelligence, and autonomous driving, aiming for integrated innovation in research, talent cultivation, and industrial application [7].
- The institute is actively recruiting for various positions, including professors, researchers, postdoctoral fellows, engineers, and students at different academic levels, with a commitment to supporting high-level talent projects [9][10].
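Jensen Huang's 10,000-Word Interview: For 33 Years He Has Felt Every Day That the Company Is About to Fail; the AI Race Has No "Finish Line"; Technical Iteration Is What Matters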
华尔街见闻· 2025-12-05 09:39
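NVIDIA founder and CEO Jensen Huang recently sat for a two-hour in-depth interview on a podcast, laying out his views on the AI race, running the company, and personal growth. The head of one of the world's most valuable technology companies revealed, with rare candor, a surprising fact: although NVIDIA has become a core company of the AI era, he still wakes up every day feeling that the company is "30 days from going out of business."

On the AI race that now has the world's attention, Huang offered a view sharply different from the mainstream. He argued that the race has no clear "finish line" of the kind outsiders imagine, and that no side will suddenly gain an overwhelming advantage. Instead, technological progress will be incremental, and all participants will evolve together, standing on the shoulders of AI.

He believes real competitiveness lies in the ability to keep iterating, not in a one-time breakthrough. AI compute has grown 100,000-fold over the past 10 years, but that compute is used to let AI think more carefully and check its answers, not to do dangerous things. When NVIDIA launched CUDA in 2005, its share price plunged 80%, but its persistent investment ultimately built the infrastructure of today's AI revolution. Iteration is not repetition; it is continuous correction based on first principles.

Huang also recounted in detail NVIDIA's repeated brushes with bankruptcy as a startup, including the close call when it chose the wrong technical route in 1995 and survived only thanks to a US$5 million investment from Sega and the trust of TSMC's Morris Chang. These experiences shaped, with respect to risk, strategy, and leadership, his distinctive ...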
X @外汇交易员
外汇交易员· 2025-12-05 07:14
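Key points from NVIDIA CEO Jensen Huang's interview:
• The underlying logic of the macroeconomy and growth: "If we don't have energy growth, we can't have industrial growth. If we don't have industrial growth, we can't have job growth. It's that simple."
• The future of AI-generated content: "In the future, maybe within two or three years, 90% of the world's knowledge may be generated by AI."
• The deflationary effect on AI energy costs (a counterintuitive view): "Over 10 years, the energy most people need to use AI will be negligible, utterly negligible. ... because it doesn't consume that much energy."
• The unlimited potential of deep learning (TAM, market size): "Deep learning can solve any problem, all the interesting problems, as long as we have inputs and outputs."
• Focus, and winning through a niche market: "Rather than building 3D graphics chips for every 3D graphics application, we decided to build a 3D graphics chip for just one application. We bet everything on video games."
• Persisting with investment when the market does not understand (as with CUDA): "If you believe in that future and don't do it, you'll regret it for the rest of your life. ... If it's really very hard to do, it's worth doing."
• A company's sense of crisis and survival mindset: "I've used the phrase '30 days from going out of business' for 33 years. ... That sense of vulnerability, uncertainty, insecurity. It doesn't leave you."
• A leader's ability to correct course and pivot: "If you put yourself in a position of having superpowers, it becomes hard for us to pivot strategy, because we were supposed to have been right all along. ... If ...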