Deep Learning
NBD Commentary | Lessons from the Humanoid Robot Games: Frontier Technology Needs a Catalyst to Reach the Public
Mei Ri Jing Ji Xin Wen· 2025-08-19 15:07
Group 1: Core Insights
- The first humanoid robot sports competition was held in Beijing, featuring 280 teams and over 500 humanoid robots from 16 countries, marking a significant step for humanoid robots from laboratory to real-world applications [1][2]
- The event serves as a platform for technology exchange and validation, showcasing advancements in humanoid robotics, which integrates artificial intelligence and deep learning for autonomous decision-making and adaptive capabilities [1][2]
- The competition highlights the current early-stage development of humanoid robot technology, allowing companies and research institutions to demonstrate their progress and identify areas for improvement [1][2]

Group 2: Industry Impact
- The robot sports competition acts as a crucial indicator for the global humanoid robot industry, attracting 192 university teams and 88 corporate teams, fostering a high-density gathering of technology, talent, and capital [2]
- The event assesses not only individual robot performance but also the maturity of the entire ecosystem, including algorithm quality, hardware supply chain stability, and operational response speed, driving improvements across the industry [2]
- Investors gain direct insights into the practical applications and market potential of the technology, while government entities can better understand industry trends to formulate targeted policies that support development [2]

Group 3: Social Influence
- The competition builds a bridge between the public and cutting-edge technology, providing a unique platform for 16 countries to compete, which enhances public understanding of humanoid robots [3]
- Spectators can directly observe the advancements and potential of robot technology, which could lead to exponential growth in product orders once market expectations are met [3]
- The competitive atmosphere and recognition system encourage young talent to engage in robotics research, accelerating talent cultivation and accumulation in the industry [3]
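Turing Award Winner Yann LeCun Speaks Out After Months of Silence: AI Safety Is an Engineering Problem, No Need to Panic About "Loss of Control"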
3 6 Ke· 2025-08-19 02:50
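On July 1, 2025, Meta officially announced the creation of Meta Superintelligence Labs to accelerate the development of artificial general intelligence (AGI). During this reorganization, Mark Zuckerberg showed an intense appetite for top AI talent, and the series of high-profile poaching moves he led quickly shook the entire tech industry. In the process, however, Yann LeCun, long-time chief scientist of Meta's Fundamental AI Research lab (FAIR), was gradually "marginalized" and faded from public view.

Key points:
- Since the summer of 2023, the Llama model family has been downloaded about 800 million times, a staggering figure.
- Aligning AI behavior with human values is more of an engineering and design problem, much as engineering once made jet aircraft safe to fly.
- LeCun proposes an "objective-driven architecture", whose core idea is to give the system clear objectives and the necessary safety boundaries so that it carries out tasks within a set scope.
- AI could, like the printing press in the 15th century, spark a new Renaissance and amplify human intelligence.
- Young people should not be scared off by negative, sensational stories; they should recognize their own power and actively shape the future they want.

After months of silence, this scientist, hailed as one of the "godfathers of AI", sat down in Paris, France, for an interview with an AI expert (also the co-founder and chief technology officer of a Silicon Valley startup), 芭芭 ...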
Geometric Computation Joins Forces with Deep Learning to Improve Disease Diagnosis Accuracy
Ke Ji Ri Bao· 2025-08-19 01:22
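Science and Technology Daily (reporter Jin Feng): The reporter learned from Southeast University on August 17 that the team of Professor Gu Zhongze at the university, together with the team of Shing-Tung Yau, a foreign member of the Chinese Academy of Sciences, and others, has developed a multi-omics prediction technique based on geometric surface parameterization. The technique improves the accuracy of tissue typing and molecular biomarker prediction for solid tumors such as colorectal tumors, and is expected to support the application of artificial intelligence in pathology image analysis. The results were recently published in Engineering, the journal of the Chinese Academy of Engineering.

"Artificial intelligence already has many applications in pathological diagnosis, but existing algorithms are mostly designed for natural images, and their ability to handle irregular pathology images is limited," said Huang Kai, first author of the paper and a doctoral student at Southeast University's School of Biological Science and Medical Engineering. When, for example, pathological tissue is unevenly distributed, the accuracy of AI disease prediction drops sharply.

"The tissue in pathology slides is very irregular in shape, which leaves many blank regions in the image. These blanks are meaningless for diagnosis, so we extract the images of irregular tissue that contain only a small amount of blank space and then convert those images into squares," said Li Tiexiang, a co-corresponding author of the paper. Using geometric mapping, the team preserved the key features of the pathology images while introducing multi-scale and anisotropic information, increasing the amount of information about tumor regions in the slide images. This processing both reduces the storage and handling of useless information and strengthens the convolutional neural network's ability to learn tumor features.

Huang Kai said the team used the method on 1,802 slides from 573 colorectal cancer patients to ...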
NBD Commentary | Lessons from the Humanoid Robot Games: Frontier Technology Needs a Catalyst to Reach the Public
Mei Ri Jing Ji Xin Wen· 2025-08-18 07:40
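By NBD commentator Zhang Rui

Recently, the world's first humanoid robot games were held with great fanfare in Beijing. A total of 280 teams and more than 500 humanoid robots from 16 countries across five continents gathered at the National Speed Skating Oval to compete in 26 major events and 538 sub-events. This is not only a major occasion for the robotics field but also an important step for humanoid robots from the laboratory into real life, and a "catalyst" for industrial innovation. It is fair to say that China is using the most intuitive and most widely shareable means to translate the technical language of the lab into sector opportunities that industrial capital can "understand".

On the technical level, the humanoid robot games provide an excellent platform for technical exchange and validation.

Humanoid robots currently represent the leading edge of global technology. Unlike traditional robots, which rely on mechanical engineering and automatic control, humanoid robots have artificial intelligence and deep learning at their core, possess autonomous decision-making and adaptive capabilities, and integrate perception, action, and cognition. However, humanoid robot technology is still at an early stage of development. Through the games, companies and research institutions can showcase technical progress in the field in one place and test their results promptly.

By watching the events, ordinary spectators can see the progress and potential of robot technology for themselves. Once market expectations are met, orders for related products could grow exponentially. At the same time, the games' honors system and competitive atmosphere give young talent a stage on which to perform, encouraging them to go into robotics research and accelerating the cultivation and accumulation of talent in the industry.

In fact ...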
Liquor Maker Moutai Is Going to Study AI! Learning from Baidu Alongside Mercedes-Benz and McDonald's
量子位· 2025-08-17 03:43
Core Viewpoint
- The article discusses the launch of the ninth session of Baidu's Chief AI Architect Training Program (AICA), highlighting its significance in cultivating AI talent and the increasing interest from top executives across various industries in AI education [2][41].

Group 1: AICA Program Overview
- The AICA program aims to train composite AI architects who can engage in both technical development and project implementation, leveraging Baidu's self-developed deep learning platform, PaddlePaddle, and the Wenxin large model [5][41].
- This session has attracted 96 students selected from over 500 applicants, with 61% coming from state-owned enterprises, listed companies, and leading T1 application service providers [42][41].
- The curriculum includes new modules on Wenxin open-source, cutting-edge technologies, multimodal data, and practical case studies of Baidu's key technologies [44][41].

Group 2: Industry Trends and AI Development
- The focus of discussions during the opening ceremony was on large models, which accounted for 51% of the topics covered, emphasizing their role in driving industrial transformation [6][7].
- AI is seen as a pivotal technology for economic development, comparable to the steam engine and the internet, with a shift towards practical applications in various sectors such as manufacturing, healthcare, and finance [12][13].
- Current trends in AI development include a shift from technical competition to commercial applications and a consolidation of industry leadership among major companies [18][20].

Group 3: Challenges and Recommendations
- Despite the advancements, the effectiveness of AI products has not fully materialized, with issues of homogeneity and lack of innovation in new products [20].
- There is a need for AI to be closely integrated with core business operations to demonstrate its value and drive revenue [21][20].
- Recommendations include focusing on value creation through AI, enhancing talent development, and fostering collaboration between AI companies and user enterprises [21][20].

Group 4: Technical Insights and Future Directions
- The evolution of large models has led to significant improvements in multi-task generalization capabilities, with AI code generation adoption rates increasing from 5% and 15% in 2022 to 50% and 80% [28][25].
- The article outlines four key areas for AI architects to focus on: prompt engineering, model tuning, full-stack system design, and understanding industry-specific challenges [33][32].
- The future of large models will continue to rely on the Transformer architecture while emphasizing the importance of mixture-of-experts (MoE) structures and efficient inference deployment [36][40].
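Major Cell Paper: AI Tackles the Antibiotic Resistance Crisis by Designing Brand-New Antibiotics De Novo That Precisely Kill Resistant Bacteria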
生物世界· 2025-08-15 04:21
Core Viewpoint
- The article discusses the urgent need for novel antibiotics to combat antibiotic resistance, highlighting the potential of generative artificial intelligence (AI) in designing new antibiotic compounds [2][5][11].

Group 1: Antibiotic Resistance Crisis
- Antimicrobial resistance (AMR) was linked to 4.71 million deaths globally in 2021, with 1.14 million directly attributable to AMR [2].
- The CDC has classified Neisseria gonorrhoeae and Staphylococcus aureus as "urgent" and "serious" threats due to their widespread resistance to existing antibiotics [5].
- Between 1980 and 2003, only five new antibacterial drugs were developed by the top 15 pharmaceutical companies, indicating a critical need for innovative compounds [5].

Group 2: Generative AI in Antibiotic Development
- Generative AI can design antibiotic molecules from scratch, allowing for the exploration of vast chemical spaces beyond existing compound libraries [7][11].
- The research team developed a generative AI platform that successfully designed two novel antibiotic molecules targeting resistant bacteria, demonstrating safety in human cells and efficacy in reducing bacterial load in mouse models [3][10].

Group 3: Research Methodology
- The study utilized two methods for antibiotic design: a fragment-based approach (CReM) and an unconstrained de novo generation method (VAE), resulting in over 36 million novel compounds with predicted antibacterial activity [8][10].
- Out of 24 synthesized compounds, seven exhibited selective antibacterial activity, with two lead compounds (NG1 and DN1) showing significant efficacy against multi-drug resistant strains [10][11].

Group 4: Implications and Future Directions
- The generative AI framework developed in this research provides a platform for exploring unknown chemical spaces, potentially leading to the discovery of new antibiotics [11].
A Few Stories About NVIDIA's Entry into Autonomous Driving
自动驾驶之心· 2025-08-13 23:33
Core Viewpoint
- The article discusses the evolution of the partnership between Tesla and NVIDIA in the autonomous driving sector, highlighting the challenges and innovations that have shaped their collaboration.

Group 1: Tesla's Journey in Autonomous Driving
- In September 2013, Tesla officially entered the autonomous driving arena, emphasizing internal development rather than relying on external technologies [5]
- Initially, Tesla partnered with Mobileye due to the lack of suitable self-developed autonomous driving chips, enhancing Mobileye's technology with unique innovations like Fleet Learning [9][12]
- Tensions arose between Tesla and Mobileye as Tesla sought to develop its own algorithms, leading to Mobileye's demand for Tesla to halt its internal vision efforts [12][13]

Group 2: NVIDIA's Strategic Shift
- In 2012, NVIDIA's CEO Jensen Huang recognized the potential of autonomous driving in electric vehicles, leading to a focus on deep learning and computer vision [15]
- By November 2013, Huang highlighted the importance of digital computing in modern vehicles, indicating a shift towards automation in the automotive industry [17]
- In January 2015, NVIDIA launched the DRIVE brand, introducing the DRIVE PX platform, which provided significant computational power for autonomous driving applications [18]

Group 3: The Partnership Development
- Following a significant accident in May 2016, Mobileye ended its partnership with Tesla, prompting Tesla to choose NVIDIA as its new technology partner [19][20]
- In October 2016, Tesla announced that all its production models would feature hardware capable of full self-driving capabilities, utilizing NVIDIA's DRIVE PX 2 platform [20]
- By early 2017, Tesla publicly announced its plans to develop its own chips, indicating a shift in its strategy while NVIDIA continued to expand its automotive partnerships [25][26]

Group 4: Technological Advancements
- In 2018, NVIDIA introduced the DRIVE Xavier platform, which improved computational performance while reducing power consumption [28]
- Tesla's HW3, launched in April 2019, was described by Musk as the most advanced computer designed specifically for autonomous driving, marking the end of NVIDIA's direct involvement in Tesla's autonomous driving hardware [30][32]
OpenAI Co-Founder Greg Brockman: In Conversation with Jensen Huang, Predicting GPT-6, and Why We Are in an Era Where the Algorithm Bottleneck Has Returned
AI科技大本营· 2025-08-13 09:53
Core Insights
- The article emphasizes the importance of focusing on practical advancements in AI infrastructure rather than just the theoretical discussions surrounding AGI [1][3]
- It highlights the duality of the tech world, contrasting the "nomadic" mindset that embraces innovation and speed with the "agricultural" mindset that values order and reliability in large-scale systems [3][5]

Group 1: Greg Brockman's Journey
- Greg Brockman's journey from a young programmer to a leader in AI infrastructure showcases the evolution of computing over 70 years [3][5]
- His early experiences with programming were driven by a desire to create tangible solutions rather than abstract theories [9][10]
- The transition from academia to industry, particularly his decision to join Stripe, reflects a commitment to practical problem-solving and innovation [11][12]

Group 2: Engineering and Research
- The relationship between engineering and research is crucial for the success of AI projects, with both disciplines needing to collaborate effectively [27][29]
- OpenAI's approach emphasizes the equal importance of engineering and research, fostering a culture of collaboration [29][30]
- The challenges faced in integrating engineering and research highlight the need for humility and understanding in team dynamics [34][35]

Group 3: AI Infrastructure and Future Directions
- The future of AI infrastructure requires a balance between high-performance computing and low-latency responses to meet diverse workload demands [45][46]
- The development of specialized accelerators for different types of AI tasks is essential for optimizing performance [47][48]
- The concept of "mixture of experts" models illustrates the industry's shift towards more efficient resource utilization in AI systems [48]
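To make the "mixture of experts" remark concrete, here is a minimal sketch of top-k expert routing, the general idea behind the efficiency gain: only k of the available experts run for a given input. It is an illustration under stated assumptions, not OpenAI's or any particular library's implementation. The names `route_top_k` and `moe_forward`, the toy experts, and the hand-written gate logits are all hypothetical; in a real model the logits would come from a learned router.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_logits, k=2):
    """Return [(expert_index, weight), ...] for the k highest-scoring experts.

    Weights are a softmax over the selected logits only, so they sum to 1.
    """
    if not 1 <= k <= len(gate_logits):
        raise ValueError("k must be between 1 and the number of experts")
    order = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)
    top = order[:k]
    weights = softmax([gate_logits[i] for i in top])
    return list(zip(top, weights))


def moe_forward(x, experts, gate_logits, k=2):
    """Weighted sum of the outputs of the selected experts; the rest never run."""
    return sum(w * experts[i](x) for i, w in route_top_k(gate_logits, k))


# Hypothetical toy experts and gate scores (assumed values, for illustration only).
experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: x * x, lambda x: -x]
gate_logits = [0.1, 2.0, 2.0, -1.0]

selected = route_top_k(gate_logits, k=2)
output = moe_forward(3.0, experts, gate_logits, k=2)
```

With four experts and k = 2, only half of the expert computation is performed for this input, which is the resource-utilization point made above.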
Medical Image Segmentation with Ultra-Low Annotation Requirements: UCSD Proposes the Three-Stage Framework GenSeg
3 6 Ke· 2025-08-12 03:24
Core Insights
- GenSeg utilizes AI to generate high-quality medical images and corresponding segmentation labels, significantly reducing the manual labeling burden on medical professionals [1][20]
- The framework addresses the critical challenge of dependency on large amounts of high-quality annotated data in medical image semantic segmentation [1][20]

Summary by Sections

Technology Overview
- GenSeg is a three-stage framework that tightly couples data augmentation model optimization with semantic segmentation model training, ensuring that generated samples effectively enhance segmentation model performance [2][10]
- It can be applied to various segmentation models, such as UNet and DeepLab, improving their performance in both in-domain and out-of-domain scenarios [4][20]

Methodology
- The framework consists of two main components: a semantic segmentation model that predicts segmentation masks and a mask-to-image generation model that predicts corresponding images [9]
- The training process involves three phases: training the generation model with real image-mask pairs, augmenting real segmentation masks to create synthetic image-mask pairs, and evaluating the segmentation model on a validation set to update the generation model [9][10]

Experimental Results
- GenSeg demonstrates significant sample efficiency, achieving comparable or superior segmentation performance while drastically reducing the number of training samples required [11][20]
- In in-domain experiments, GenSeg-UNet requires only 50 images to achieve a Dice score of approximately 0.6, compared to 600 images for standard UNet, representing a 12-fold reduction in data [13]
- In out-of-domain tasks, GenSeg-DeepLab achieves a Jaccard index of 0.67 using only 40 images, while standard DeepLab fails to reach this level with 200 images [13]

Comparative Analysis
- The end-to-end data generation mechanism of GenSeg outperforms traditional separate training strategies, as evidenced by improved performance metrics in various segmentation tasks [15]
- Regardless of the type of generation model used, the end-to-end training strategy consistently outperforms the separate training strategy [17]

Generalization and Efficiency
- GenSeg exhibits strong generalization capabilities across 11 medical image segmentation tasks and 19 datasets, achieving absolute performance improvements of 10-20% while requiring only 1/8 to 1/20 of the training data compared to existing methods [20]
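The Dice score and Jaccard index quoted in the experimental results are standard overlap measures between a predicted mask and a ground-truth mask. The sketch below shows how they are typically computed, assuming binary masks flattened into lists of 0/1. The function names and the toy masks are illustrative and not taken from the GenSeg code; the last line simply re-derives the 12-fold figure from the 600 and 50 images mentioned above.

```python
def dice_score(pred, target):
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for binary masks given as 0/1 sequences."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def jaccard_index(pred, target):
    """Jaccard (IoU) = |A ∩ B| / |A ∪ B| for binary masks given as 0/1 sequences."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(p * t for p, t in zip(pred, target))
    union = sum(pred) + sum(target) - intersection
    # Same empty-mask convention as above.
    return 1.0 if union == 0 else intersection / union


# Toy masks (assumed values): three foreground pixels each, two of them shared.
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 1, 1, 0]

dice = dice_score(pred, target)
jaccard = jaccard_index(pred, target)

# Data reduction quoted in the summary: 600 images for UNet vs 50 for GenSeg-UNet.
reduction = 600 / 50
```

The two metrics are monotonically related by J = D / (2 - D), so a Dice score and a Jaccard index reported on different tasks are not directly comparable as raw numbers.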
Li Auto's VLA Is in Essence Reinforcement-Learning-Dominated Continuous Prediction of the Next Action Token
理想TOP2· 2025-08-11 09:35
Core Viewpoints
- The article presents four logical chains regarding the understanding of "predict the next token," which reflects different perceptions of the potential and essence of LLMs or AI [1]
- Those who believe that predicting the next token is more than just probability distributions are more likely to recognize the significant potential of LLMs and AI [1]
- A deeper consideration of AI and Li Auto can lead to an underestimation of the value of what Li Auto accomplishes [1]
- Li Auto's VLA is essentially reinforcement learning dominating the continuous prediction of the next action token, similar to OpenAI's o1 and o3, with assisted driving being more suitable for reinforcement learning than chatbots [1]

Summary by Sections

Introduction
- The article emphasizes the importance of Ilya's viewpoints, highlighting his significant contributions to the AI field over the past decade [2][3]
- Ilya's background includes pivotal roles in major AI advancements, such as the development of AlexNet, AlphaGo, and TensorFlow [3]

Q&A Insights
- Ilya challenges the notion that next token prediction cannot surpass human performance, suggesting that a sufficiently advanced neural network could extrapolate behaviors of an idealized person [4][5]
- He argues that predicting the next token well involves understanding the underlying reality that leads to the creation of that token, which goes beyond mere statistics [6][7]

Li Auto's VLA and Reinforcement Learning
- Li Auto's VLA operates by continuously predicting the next action token based on sensor information, indicating a real understanding of the physical world rather than just statistical probabilities [10]
- Ilya posits that the reasoning process in Li Auto's VLA can be seen as a form of consciousness, differing from human consciousness in significant ways [11]

Comparisons and Controversial Points
- The article asserts that assisted driving is more suited for reinforcement learning compared to chatbots due to clearer reward functions [12][13]
- It highlights the fundamental differences in the skills required for developing AI software versus hardware, emphasizing the unique challenges and innovations in AI software development [13]
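The "continuously predict the next action token from sensor information" loop, and the claim that driving offers a clearer reward signal, can be sketched in a few lines. Everything here is a hypothetical toy: the token names, the `obstacle_m` field, the hand-written `toy_policy` (a stand-in for a trained model), and the reward values are assumptions made for illustration and say nothing about how Li Auto's system is actually built.

```python
def greedy_token(probs):
    """Pick the most probable action token from a {token: probability} dict."""
    return max(probs, key=probs.get)


def rollout(policy, sensor_frames):
    """Autoregressive loop: emit one action token per incoming sensor frame.

    The policy sees the current frame and the tokens emitted so far.
    """
    history = []
    for frame in sensor_frames:
        probs = policy(frame, history)
        history.append(greedy_token(probs))
    return history


def toy_policy(frame, history):
    """Hypothetical stand-in for a trained model: brake when an obstacle is close."""
    if frame["obstacle_m"] < 10.0:
        return {"BRAKE": 0.9, "KEEP_LANE": 0.1}
    return {"BRAKE": 0.05, "KEEP_LANE": 0.95}


def reward(frame, token):
    """Toy reward: heavy penalty for not braking near an obstacle,
    small penalty for braking when the road is clear."""
    if frame["obstacle_m"] < 10.0:
        return 1.0 if token == "BRAKE" else -10.0
    return 1.0 if token == "KEEP_LANE" else -0.1


# Three assumed sensor frames: clear road, close obstacle, clear road.
frames = [{"obstacle_m": 50.0}, {"obstacle_m": 8.0}, {"obstacle_m": 30.0}]
tokens = rollout(toy_policy, frames)
total_reward = sum(reward(f, t) for f, t in zip(frames, tokens))
```

In a reinforcement learning setup, a scalar such as `total_reward` is what the policy would be trained to maximize. The point of the toy is only that such a score is comparatively easy to define for driving outcomes, whereas judging the quality of a chatbot reply is far less clear-cut.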