Deep Learning

Geometric Computing Joins Forces with Deep Learning to Improve Disease Diagnosis Accuracy
Ke Ji Ri Bao· 2025-08-19 01:22
Science and Technology Daily (reporter Jin Feng) — The reporter learned from Southeast University on August 17 that a team led by Professor Gu Zhongze of the university, together with the team of Shing-Tung Yau, foreign academician of the Chinese Academy of Sciences, and others, has developed a multi-omics prediction technology based on geometric surface parameterization. The technology improves the accuracy of predicting tissue subtypes and molecular markers for solid tumors such as colorectal cancer, and is expected to support the application of artificial intelligence in pathology image analysis. The results were recently published in Engineering, a journal of the Chinese Academy of Engineering.

"Artificial intelligence already has many applications in pathological diagnosis, but most existing algorithms were designed for natural images and handle irregular pathology images poorly," said first author Huang Kai, a PhD student at Southeast University's School of Biological Science and Medical Engineering. When pathological tissue is unevenly distributed, the accuracy of AI disease prediction drops sharply.

"Tissue shapes in pathology slides are highly irregular, leaving many blank regions in the images that carry no diagnostic meaning. We extract images of the irregular tissue containing only a small amount of blank space, then convert these images into squares," explained co-corresponding author Li Tiexiang. Through geometric mapping, the team preserved the key features of the pathology images while introducing multi-scale and anisotropic information, increasing the information content of tumor-related regions in the slide images. This processing both reduces the storage and computation spent on useless information and strengthens the convolutional neural network's ability to learn tumor features.

Huang Kai said the team applied the method to 1,802 slides from 573 colorectal cancer patients ...
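The preprocessing idea described here, pull the irregular tissue region out of the slide and turn it into a square image before feeding a CNN, can be illustrated with a crude stand-in: a bounding-box crop plus nearest-neighbor resampling. The team's actual geometric surface parameterization preserves far more structure than this; all names below are illustrative:

```python
import numpy as np

def tissue_to_square(image, mask, size=64):
    """Crop the irregular tissue region (given by a binary mask) and
    resample it onto a square grid. A crude stand-in for the geometric
    surface parameterization described in the article."""
    ys, xs = np.nonzero(mask)
    crop = image[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    # nearest-neighbor resampling onto a size x size square
    rows = np.linspace(0, crop.shape[0] - 1, size).round().astype(int)
    cols = np.linspace(0, crop.shape[1] - 1, size).round().astype(int)
    return crop[np.ix_(rows, cols)]

# toy example: a diagonal "tissue" stripe inside a mostly blank slide
img = np.zeros((100, 100))
msk = np.zeros((100, 100), dtype=bool)
for i in range(20, 80):
    msk[i, i - 5:i + 5] = True
    img[i, i - 5:i + 5] = 1.0
square = tissue_to_square(img, msk, size=32)
print(square.shape)  # (32, 32)
```

The blank border outside the bounding box is discarded before resampling, which is the "reduce storage of useless information" step; the square output is what a standard CNN expects.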
NBD Commentary | Lessons from the Humanoid Robot Games: Frontier Technology Needs a Catalyst to Reach the Public
Mei Ri Jing Ji Xin Wen· 2025-08-18 07:40
NBD commentator Zhang Rui — The world's first humanoid robot games was recently held in Beijing. 280 teams and more than 500 humanoid robots from 16 countries across five continents gathered at the National Speed Skating Oval to compete in 26 events comprising 538 sub-events. This was not only a grand occasion for the robotics field but an important step for humanoid robots moving from the laboratory into real life, a "catalyst" for industrial innovation. One could say China is using the most intuitive, most broadcastable means to translate the technical language of the laboratory into track opportunities that industrial capital can understand.

From a technical standpoint, the games provided an excellent platform for technical exchange and validation.

Humanoid robots represent the frontier of global technology today. Unlike traditional robots, which rely on mechanical engineering and automated control, humanoid robots are built around artificial intelligence and deep learning, possess autonomous decision-making and adaptive capabilities, and fuse perception, action, and cognition. Yet the technology is still in its infancy. Through a platform like the games, companies and research institutions can showcase their progress in the field and promptly test their results.

Ordinary spectators, watching the events, can directly perceive the progress and potential of robotics. Once products meet market expectations, orders may scale exponentially. At the same time, the honor system and competitive atmosphere of the games give young talent a stage to perform, encouraging them to devote themselves to robotics research and accelerating the cultivation and accumulation of industry talent.

In fact ...
Liquor Maker Moutai Is Learning AI — Joining Mercedes-Benz and McDonald's as Students of Baidu
量子位· 2025-08-17 03:43
Core Viewpoint
- The article discusses the launch of the ninth session of Baidu's Chief AI Architect Training Program (AICA), highlighting its significance in cultivating AI talent and the growing interest from top executives across industries in AI education [2][41].

Group 1: AICA Program Overview
- The AICA program aims to train versatile AI architects who can handle both technical development and project implementation, leveraging Baidu's self-developed deep learning platform, PaddlePaddle, and the Wenxin large model [5][41].
- This session attracted 96 students selected from over 500 applicants, with 61% coming from state-owned enterprises, listed companies, and leading T1 application service providers [42][41].
- The curriculum adds new modules on Wenxin open source, cutting-edge technologies, multimodal data, and practical case studies of Baidu's key technologies [44][41].

Group 2: Industry Trends and AI Development
- Large models dominated the opening-ceremony discussions, accounting for 51% of the topics covered, underscoring their role in driving industrial transformation [6][7].
- AI is seen as a pivotal technology for economic development, comparable to the steam engine and the internet, with a shift toward practical applications in sectors such as manufacturing, healthcare, and finance [12][13].
- Current trends include a shift from technical competition to commercial application and a consolidation of industry leadership among major companies [18][20].

Group 3: Challenges and Recommendations
- Despite these advances, the effectiveness of AI products has not fully materialized, with homogeneous, unoriginal new products a recurring problem [20].
- AI needs to be closely integrated with core business operations to demonstrate its value and drive revenue [21][20].
- Recommendations include focusing on value creation through AI, strengthening talent development, and fostering collaboration between AI companies and user enterprises [21][20].

Group 4: Technical Insights and Future Directions
- The evolution of large models has brought significant gains in multi-task generalization; AI code-generation adoption rates rose from 5% and 15% in 2022 to 50% and 80% [28][25].
- The article outlines four areas for AI architects to focus on: prompt engineering, model tuning, full-stack system design, and understanding industry-specific challenges [33][32].
- Future large models will continue to rely on the Transformer architecture while emphasizing expert MoE structures and efficient inference deployment [36][40].
Cell Blockbuster: AI Breaks the Antibiotic Resistance Crisis, Designing Entirely New Antibiotics from Scratch to Precisely Kill Resistant Bacteria
生物世界· 2025-08-15 04:21
Core Viewpoint
- The article discusses the urgent need for novel antibiotics to combat antibiotic resistance, highlighting the potential of generative artificial intelligence (AI) in designing new antibiotic compounds [2][5][11].

Group 1: Antibiotic Resistance Crisis
- Antimicrobial resistance (AMR) was associated with 4.71 million deaths globally in 2021, with 1.14 million directly attributable to AMR [2].
- The CDC has classified Neisseria gonorrhoeae and Staphylococcus aureus as "urgent" and "serious" threats because of their widespread resistance to existing antibiotics [5].
- Between 1980 and 2003, the top 15 pharmaceutical companies developed only five new antibacterial drugs, underscoring the critical need for innovative compounds [5].

Group 2: Generative AI in Antibiotic Development
- Generative AI can design antibiotic molecules from scratch, exploring vast chemical spaces beyond existing compound libraries [7][11].
- The research team built a generative AI platform that designed two novel antibiotic molecules targeting resistant bacteria, which proved safe in human cells and reduced bacterial load in mouse models [3][10].

Group 3: Research Methodology
- The study used two design methods, a fragment-based approach (CReM) and an unconstrained de novo generation method (VAE), yielding over 36 million novel compounds with predicted antibacterial activity [8][10].
- Of 24 synthesized compounds, seven showed selective antibacterial activity, with two lead compounds (NG1 and DN1) demonstrating significant efficacy against multi-drug-resistant strains [10][11].

Group 4: Implications and Future Directions
- The generative AI framework developed in this research provides a platform for exploring unknown chemical space, potentially leading to the discovery of new antibiotics [11].
A Few Stories of NVIDIA's Entry into Autonomous Driving
自动驾驶之心· 2025-08-13 23:33
Core Viewpoint
- The article traces the evolution of the partnership between Tesla and NVIDIA in autonomous driving, highlighting the challenges and innovations that shaped their collaboration.

Group 1: Tesla's Journey in Autonomous Driving
- In September 2013, Tesla officially entered the autonomous driving arena, emphasizing internal development over reliance on external technologies [5].
- Lacking a suitable self-developed autonomous driving chip, Tesla initially partnered with Mobileye, layering unique innovations such as Fleet Learning on top of Mobileye's technology [9][12].
- Tensions arose as Tesla pursued its own algorithms, culminating in Mobileye's demand that Tesla halt its internal vision efforts [12][13].

Group 2: NVIDIA's Strategic Shift
- In 2012, NVIDIA CEO Jensen Huang recognized the potential of autonomous driving in electric vehicles, steering the company toward deep learning and computer vision [15].
- By November 2013, Huang was highlighting the importance of digital computing in modern vehicles, signaling a shift toward automation in the automotive industry [17].
- In January 2015, NVIDIA launched the DRIVE brand and introduced the DRIVE PX platform, which delivered substantial computational power for autonomous driving applications [18].

Group 3: The Partnership Development
- After a major accident in May 2016, Mobileye ended its partnership with Tesla, prompting Tesla to choose NVIDIA as its new technology partner [19][20].
- In October 2016, Tesla announced that all its production models would ship with hardware capable of full self-driving, built on NVIDIA's DRIVE PX 2 platform [20].
- By early 2017, Tesla had publicly announced plans to develop its own chips, signaling a strategic shift, while NVIDIA continued expanding its automotive partnerships [25][26].

Group 4: Technological Advancements
- In 2018, NVIDIA introduced the DRIVE Xavier platform, improving computational performance while reducing power consumption [28].
- Tesla's HW3, launched in April 2019 and described by Musk as the most advanced computer designed specifically for autonomous driving, marked the end of NVIDIA's direct involvement in Tesla's autonomous driving hardware [30][32].
OpenAI Co-founder Greg Brockman: In Conversation with Jensen Huang, Predicting GPT-6, and Why We Are in an Era Where Algorithmic Bottlenecks Return
AI科技大本营· 2025-08-13 09:53
Core Insights
- The article emphasizes focusing on practical advances in AI infrastructure rather than purely theoretical discussion of AGI [1][3].
- It highlights a duality in the tech world: a "nomadic" mindset that embraces innovation and speed versus an "agricultural" mindset that values order and reliability in large-scale systems [3][5].

Group 1: Greg Brockman's Journey
- Brockman's path from young programmer to leader in AI infrastructure mirrors the evolution of computing over 70 years [3][5].
- His early programming was driven by a desire to build tangible solutions rather than abstract theories [9][10].
- His move from academia to industry, particularly the decision to join Stripe, reflects a commitment to practical problem-solving and innovation [11][12].

Group 2: Engineering and Research
- Effective collaboration between engineering and research is crucial to the success of AI projects [27][29].
- OpenAI treats engineering and research as equally important, fostering a culture of collaboration [29][30].
- The difficulties of integrating the two disciplines highlight the need for humility and mutual understanding in team dynamics [34][35].

Group 3: AI Infrastructure and Future Directions
- Future AI infrastructure must balance high-performance computing with low-latency responses to serve diverse workload demands [45][46].
- Developing specialized accelerators for different classes of AI task is essential for optimizing performance [47][48].
- "Mixture of experts" models illustrate the industry's shift toward more efficient use of compute in AI systems [48].
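The "mixture of experts" pattern mentioned in Group 3 can be sketched in a few lines: a gate scores all experts, only the top-k are actually evaluated, and their outputs are mixed by softmax-normalized gate scores. A minimal illustrative sketch (shapes and names are hypothetical, not any specific system's API):

```python
import numpy as np

def moe_forward(x, gate_w, expert_ws, k=2):
    """Sparse mixture-of-experts forward pass for one input vector.

    x         : (d,) input
    gate_w    : (d, n_experts) gating weights
    expert_ws : list of (d, d) per-expert weight matrices
    Only the top-k experts by gate score are evaluated.
    """
    scores = x @ gate_w                          # one score per expert
    top = np.argsort(scores)[-k:]                # indices of the k best experts
    w = np.exp(scores[top] - scores[top].max())  # softmax over the top-k only
    w /= w.sum()
    outs = np.stack([x @ expert_ws[i] for i in top])
    return w @ outs                              # weighted mix of expert outputs

rng = np.random.default_rng(0)
d, n_experts = 8, 4
x = rng.normal(size=d)
gate_w = rng.normal(size=(d, n_experts))
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]
y = moe_forward(x, gate_w, experts, k=2)
print(y.shape)  # (8,)
```

The efficiency argument is visible in the loop: only k of the n_experts matrices are ever multiplied, so parameter count can grow without a proportional growth in per-token compute.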
Medical Image Segmentation with Ultra-Low Annotation Needs: UCSD Proposes the Three-Stage Framework GenSeg
36Ke· 2025-08-12 03:24
Core Insights
- GenSeg uses AI to generate high-quality medical images with corresponding segmentation labels, substantially reducing the manual labeling burden on medical professionals [1][20].
- The framework addresses the critical dependency of medical image semantic segmentation on large amounts of high-quality annotated data [1][20].

Summary by Sections

Technology Overview
- GenSeg is a three-stage framework that tightly couples data-augmentation model optimization with segmentation model training, ensuring that generated samples actually improve segmentation performance [2][10].
- It can be applied to various segmentation models, such as UNet and DeepLab, improving their performance in both in-domain and out-of-domain scenarios [4][20].

Methodology
- The framework has two main components: a semantic segmentation model that predicts segmentation masks and a mask-to-image generation model that predicts the corresponding images [9].
- Training proceeds in three phases: train the generation model on real image-mask pairs; augment real segmentation masks to create synthetic image-mask pairs; and evaluate the segmentation model on a validation set to update the generation model [9][10].

Experimental Results
- GenSeg is highly sample-efficient, matching or exceeding baseline segmentation performance while drastically reducing the number of training samples required [11][20].
- In in-domain experiments, GenSeg-UNet needs only 50 images to reach a Dice score of roughly 0.6, versus 600 images for standard UNet, a 12-fold reduction in data [13].
- In out-of-domain tasks, GenSeg-DeepLab reaches a Jaccard index of 0.67 with only 40 images, a level standard DeepLab fails to reach even with 200 images [13].

Comparative Analysis
- GenSeg's end-to-end data-generation mechanism outperforms traditional separate training strategies, as shown by improved metrics across segmentation tasks [15].
- Regardless of the generation model used, the end-to-end training strategy consistently beats separate training [17].

Generalization and Efficiency
- GenSeg generalizes strongly across 11 medical image segmentation tasks and 19 datasets, delivering absolute performance gains of 10-20% while needing only 1/8 to 1/20 of the training data required by existing methods [20].
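The three training phases can be sketched as a control-flow skeleton. Everything below is a deliberately tiny stand-in (one-dimensional "images", a threshold "segmenter", a mean-intensity "generator") meant only to show how the phases interlock, not the paper's actual models; in GenSeg itself, the phase-3 validation signal flows back end-to-end to update the generator:

```python
def augment(mask):
    """Toy mask augmentation: horizontal flip."""
    return mask[::-1]

def dice(pred, truth):
    """Dice overlap between two boolean pixel lists."""
    inter = sum(p and t for p, t in zip(pred, truth))
    denom = sum(pred) + sum(truth)
    return 2 * inter / denom if denom else 1.0

class ToyGenerator:
    """Stand-in mask-to-image model: foreground pixels get a learned intensity."""
    def __init__(self):
        self.level = 0.5
    def fit(self, pairs):
        fg = [p for img, msk in pairs for p, m in zip(img, msk) if m]
        if fg:
            self.level = sum(fg) / len(fg)
    def generate(self, mask):
        return [self.level if m else 0.0 for m in mask]

class ToySegmenter:
    """Stand-in segmentation model: a learned intensity threshold."""
    def __init__(self):
        self.thresh = 0.0
    def fit(self, pairs):
        fg = [p for img, msk in pairs for p, m in zip(img, msk) if m]
        bg = [p for img, msk in pairs for p, m in zip(img, msk) if not m]
        self.thresh = (sum(fg) / len(fg) + sum(bg) / len(bg)) / 2
    def predict(self, image):
        return [p > self.thresh for p in image]

def genseg_round(train_pairs, val_pairs, gen, seg):
    # Phase 1: train the mask-to-image generator on real pairs.
    gen.fit(train_pairs)
    # Phase 2: augment real masks, synthesize matching images,
    # and train segmentation on real + synthetic pairs.
    synthetic = [(gen.generate(augment(m)), augment(m)) for _, m in train_pairs]
    seg.fit(train_pairs + synthetic)
    # Phase 3: validation score; in GenSeg this signal is backpropagated
    # to update the generator end-to-end, which we only report here.
    return sum(dice(seg.predict(img), msk) for img, msk in val_pairs) / len(val_pairs)

train = [([0.9, 0.8, 0.1, 0.2], [True, True, False, False])]
val = [([0.1, 0.9, 0.9, 0.2], [False, True, True, False])]
score = genseg_round(train, val, ToyGenerator(), ToySegmenter())
print(round(score, 2))  # 1.0
```

The coupling that distinguishes GenSeg from "generate first, train later" pipelines lives in phase 3: the generator is judged by how much its synthetic pairs help validation segmentation, not by image realism alone.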
Li Auto's VLA Is Essentially Reinforcement-Learning-Dominated Continuous Prediction of the Next Action Token
理想TOP2· 2025-08-11 09:35
Core Viewpoints
- The article presents four logical chains for understanding "predict the next token," reflecting different perceptions of the potential and essence of LLMs and AI [1].
- Those who believe that predicting the next token is more than fitting probability distributions are more likely to recognize the significant potential of LLMs and AI [1].
- The deeper one has thought about AI, the easier it is to underestimate the value of what Li Auto is accomplishing [1].
- Li Auto's VLA is essentially reinforcement-learning-dominated continuous prediction of the next action token, similar in spirit to OpenAI's o1/o3, and assisted driving is better suited to reinforcement learning than chatbots are [1].

Summary by Sections

Introduction
- The article stresses the weight of Ilya Sutskever's views, given his outsized contributions to the AI field over the past decade [2][3].
- Ilya's background includes pivotal roles in major AI advances such as AlexNet, AlphaGo, and TensorFlow [3].

Q&A Insights
- Ilya challenges the claim that next-token prediction cannot surpass human performance, arguing that a sufficiently advanced neural network could extrapolate the behavior of an idealized person [4][5].
- He argues that predicting the next token well requires understanding the underlying reality that produced the token, which goes beyond mere statistics [6][7].

Li Auto's VLA and Reinforcement Learning
- Li Auto's VLA continuously predicts the next action token from sensor information, suggesting real understanding of the physical world rather than mere statistical association [10].
- Ilya posits that this kind of reasoning process can be seen as a form of consciousness, though one differing from human consciousness in significant ways [11].

Comparisons and Controversial Points
- The article asserts that assisted driving is better suited to reinforcement learning than chatbots are, because its reward functions are clearer [12][13].
- It highlights the fundamental differences between the skills needed to develop AI software versus hardware, emphasizing the unique challenges and innovations in AI software development [13].
High-Frequency Stock-Selection Factor Weekly: High-Frequency Factors Diverged Last Week While Deep Learning Factors Stayed Strong; AI-Enhanced Portfolios All Recorded Positive Excess Returns - 20250810
GUOTAI HAITONG SECURITIES· 2025-08-10 07:58
Quantitative Factors and Models Summary

Quantitative Factors and Construction Process

- **Factor Name**: Intraday Skewness Factor
  - **Construction Idea**: Captures the skewness of intraday stock returns, reflecting the asymmetry of the return distribution [13][16][18]
  - **Construction Process**: The third moment of the intraday return distribution, normalized by the cube of the standard deviation. Detailed methodology: "Stock Selection Factor Series Research (19) - High-Frequency Factors on Stock Return Distribution Characteristics" [13][16][18]
- **Factor Name**: Downside Volatility Proportion Factor
  - **Construction Idea**: Measures the proportion of downside volatility in a stock's total realized volatility [18][19][20]
  - **Construction Process**: Realized volatility is decomposed into upside and downside components. Detailed methodology: "Stock Selection Factor Series Research (25) - High-Frequency Factors on Realized Volatility Decomposition" [18][19][20]
- **Factor Name**: Post-Open Buying Intention Proportion Factor
  - **Construction Idea**: Quantifies the proportion of buying intention in the early trading period after market open [22][23][24]
  - **Construction Process**: High-frequency data are used to identify and aggregate buying signals in the post-open period. Detailed methodology: "Stock Selection Factor Series Research (64) - Low-Frequency Applications of High-Frequency Data Based on Intuitive Logic and Machine Learning" [22][23][24]
- **Factor Name**: Post-Open Buying Intensity Factor
  - **Construction Idea**: Measures the intensity of buying activity in the early trading period after market open [27][28][29]
  - **Construction Process**: Like the proportion factor, aggregates the magnitude of buying signals during the post-open period, normalized by trading volume [27][28][29]
- **Factor Name**: Post-Open Large Order Net Buying Proportion Factor
  - **Construction Idea**: Captures the proportion of large-order net buying in the early trading period after market open [32][34][35]
  - **Construction Process**: The net buying of large orders during the post-open period, summed and divided by total trading volume [32][34][35]
- **Factor Name**: Post-Open Large Order Net Buying Intensity Factor
  - **Construction Idea**: Measures the intensity of large-order net buying in the early trading period after market open [37][39][40]
  - **Construction Process**: Aggregates the net buying of large orders during the post-open period, normalized by the total number of large orders [37][39][40]
- **Factor Name**: Improved Reversal Factor
  - **Construction Idea**: Captures the reversal effect in stock returns, adjusted for high-frequency data characteristics [40][43][44]
  - **Construction Process**: Stocks with extreme short-term returns are identified and their subsequent reversal performance measured [40][43][44]
- **Factor Name**: Deep Learning Factor (Improved GRU(50,2)+NN(10))
  - **Construction Idea**: A deep learning model combining a GRU with neural network layers to predict stock returns [63][65][66]
  - **Construction Process**: The model uses 50 GRU units and 10 neural network layers, trained on historical high-frequency data to predict short-term stock returns [63][65][66]
- **Factor Name**: Deep Learning Factor (Residual Attention LSTM(48,2)+NN(10))
  - **Construction Idea**: An LSTM model with residual attention mechanisms to improve prediction accuracy [65][66][68]
  - **Construction Process**: The model uses 48 LSTM units and 10 neural network layers, with residual connections to capture long-term dependencies in high-frequency data [65][66][68]
- **Factor Name**: Multi-Granularity Model Factor (5-Day Label)
  - **Construction Idea**: Predicts stock returns over a 5-day horizon with a multi-granularity deep learning model [68][69][70]
  - **Construction Process**: Trained with a bidirectional AGRU (Attention-Gated Recurrent Unit) to capture multi-scale temporal patterns in stock data [68][69][70]
- **Factor Name**: Multi-Granularity Model Factor (10-Day Label)
  - **Construction Idea**: Like the 5-day-label factor, but predicts stock returns over a 10-day horizon [69][70][71]
  - **Construction Process**: The same AGRU architecture as the 5-day-label factor, trained with a 10-day prediction horizon [69][70][71]

Factor Backtesting Results

| Factor | IC (2025 / hist.) | e^(-RankMAE) (2025 / hist.) | Long-Short Return (2025 YTD / last wk) | Long-Only Excess (2025 YTD / last wk) | Refs |
| --- | --- | --- | --- | --- | --- |
| Intraday Skewness | 0.024 / 0.019 | 0.327 / 0.324 | 16.90% / -0.66% | 1.84% / -0.79% | [9][10][13] |
| Downside Volatility Proportion | 0.020 / 0.016 | 0.325 / 0.323 | 12.93% / -1.19% | -0.12% / -1.07% | [9][10][18] |
| Post-Open Buying Intention Proportion | 0.026 / 0.026 | 0.322 / 0.321 | 13.98% / 0.27% | 7.20% / 0.28% | [9][10][22] |
| Post-Open Buying Intensity | 0.029 / 0.030 | 0.327 / 0.326 | 18.53% / 0.05% | 7.09% / 0.43% | [9][10][27] |
| Post-Open Large Order Net Buying Proportion | 0.027 / 0.036 | 0.319 / 0.322 | 18.25% / 0.31% | 9.48% / 0.43% | [9][10][32] |
| Post-Open Large Order Net Buying Intensity | 0.019 / 0.025 | 0.318 / 0.321 | 10.50% / 0.31% | 7.08% / 0.24% | [9][10][37] |
| Improved Reversal | 0.025 / 0.031 | 0.331 / 0.330 | 17.44% / 0.12% | 6.14% / 0.33% | [9][10][40] |
| Deep Learning (Improved GRU(50,2)+NN(10)) | 0.045 / 0.066 | 0.335 / 0.336 | 28.86% / 1.36% | 2.19% / 0.06% | [9][10][63] |

- **Deep Learning Factor (Residual
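Two of the simpler constructions above can be written down directly from their stated definitions: intraday skewness as the third central moment of intraday returns over the cubed standard deviation, and downside volatility proportion as the negative-return share of realized variance. The report's exact windowing and data frequency may differ; this is an illustrative sketch:

```python
import numpy as np

def intraday_skewness(prices):
    """Third central moment of intraday log returns divided by the
    cube of their standard deviation (sample skewness)."""
    r = np.diff(np.log(prices))
    dev = r - r.mean()
    return (dev ** 3).mean() / dev.std() ** 3

def downside_volatility_proportion(prices):
    """Share of realized variance contributed by negative returns."""
    r = np.diff(np.log(prices))
    return (r[r < 0] ** 2).sum() / (r ** 2).sum()

# a perfectly alternating +1%/-1% return series: zero skew, half downside
r = np.tile([0.01, -0.01], 10)
prices = 100 * np.exp(np.concatenate([[0.0], np.cumsum(r)]))
print(abs(intraday_skewness(prices)) < 1e-6)             # True
print(round(downside_volatility_proportion(prices), 6))  # 0.5
```

A right-skewed intraday path (occasional large up-moves) pushes the first factor positive, while a crash-prone path pushes the second toward 1, which is the asymmetry each factor is built to rank stocks by.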
From Gaokao Top Scorer to Top AI Scientist: The "Cheat-Code" Life of Kaiming He
21 Shi Ji Jing Ji Bao Dao· 2025-08-09 03:27
First in a video series on Chinese AI scientists.

Mark Zuckerberg has lately been on a hiring spree for AI scientists, Chinese scientists in particular, offering compensation packages of $100 million or even $200 million at a stroke.

One AI giant seems to have been overlooked.

In March this year, Facebook chief AI scientist Yann LeCun mentioned in an interview "a little-known fact": the most-cited paper in (AI) science, a deep learning paper from ten years ago, in 2015, originated in Beijing.

The paper's lead author is Kaiming He.

Nature recently published a top-25 list of the most-cited papers of the 21st century; first place went to "Deep Residual Learning for Image Recognition," the ResNet paper, whose authors include Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun.

Who is Kaiming He?

He was born in Guangzhou in 1984. While at Zhixin High School, his first prize in the national physics competition earned him guaranteed admission to Tsinghua University, but he sat the gaokao anyway to prove himself, scoring a full 900 on the standardized scale and becoming one of Guangdong's nine perfect-score top scorers that year.

In 2007 he began graduate study at the Chinese University of Hong Kong under Xiaoou Tang. Everyone at CUHK who knew him describes a relentless worker, out the door after six in the morning and back to his dorm at midnight; with a genius working that hard, "ordinary people don't stand a chance." After earning his PhD in 2011, he joined ...
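The core idea of the "Deep Residual Learning" paper named above is that a block computes y = F(x) + x: the layers learn only a residual correction on top of an identity shortcut, so extra depth can default to doing nothing. A minimal sketch with toy dense matrices standing in for the paper's convolutions and batch normalization:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """y = relu(F(x) + x), where F(x) = relu(x @ w1) @ w2 is the learned
    residual branch and x rides the identity shortcut unchanged."""
    return relu(relu(x @ w1) @ w2 + x)

rng = np.random.default_rng(0)
x = rng.random(4)          # non-negative toy activations
zero = np.zeros((4, 4))
# with all-zero residual weights the block reduces to the identity,
# which is what lets very deep stacks of such blocks stay trainable
print(np.allclose(residual_block(x, zero, zero), x))  # True
```

This is why residual networks could go from tens to hundreds of layers: a redundant block only has to drive its residual branch toward zero, rather than learn an identity mapping from scratch.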