A Paper Whose Theory Was "Proven Wrong" Wins the ICML 2025 Test of Time Award
量子位· 2025-07-15 08:31
Core Insights
- The Batch Normalization paper, published in 2015, has received the Test of Time Award at ICML 2025, highlighting its significant impact on deep learning [1]
- With over 60,000 citations, the work is considered a milestone in the development of deep learning, enabling the practical training and application of deep neural networks [2][4]
- Batch Normalization was a key technology in deep learning's transition from small-scale experiments to large-scale practical applications [3]

Group 1
- In 2015, training deep neural networks was a major challenge: training was often unstable and highly sensitive to parameter initialization [5][6][7]
- Researchers Sergey Ioffe and Christian Szegedy identified the problem of Internal Covariate Shift: the distribution of each layer's inputs changes during training, complicating optimization [8][11]
- Their solution normalized the data at every layer, analogous to input-layer normalization, which significantly improved training speed and stability [12]

Group 2
- The original paper demonstrated that Batch Normalization allowed advanced image classification models to reach the same accuracy in only 1/14 of the training steps [13]
- Beyond accelerating training, Batch Normalization also introduced a regularization effect, improving the model's generalization ability [14][15]
- Following its introduction, Batch Normalization became foundational for mainstream convolutional neural networks such as ResNet and DenseNet [18]

Group 3
- In 2018, a paper from MIT challenged the core theory behind Batch Normalization, showing that even with deliberately injected noise, models with Batch Normalization still trained faster than those without it [21][23]
- That research argued that Batch Normalization smooths the optimization landscape, making gradient behavior more predictable and stable [24]
- It also suggested that Batch Normalization acts as an unsupervised learning technique, letting networks adapt to the data's inherent structure early in training [25]

Group 4
- Recent studies have offered deeper, geometry-based insights into Batch Normalization [29]
- Both authors have continued their careers in AI: Szegedy joined xAI, and Ioffe followed suit [30][32]
- Szegedy has since moved to a new role at Morph Labs, focusing on achieving "verifiable superintelligence" [34]
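The per-layer normalization the paper proposes can be sketched in a few lines (a minimal NumPy illustration of the training-time forward pass, not the authors' implementation; production versions also track running statistics for use at inference):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Training-time Batch Normalization over a mini-batch x of shape (batch, features)."""
    mean = x.mean(axis=0)                    # per-feature mean over the batch
    var = x.var(axis=0)                      # per-feature variance over the batch
    x_hat = (x - mean) / np.sqrt(var + eps)  # normalize to ~zero mean, unit variance
    return gamma * x_hat + beta              # learnable scale and shift restore expressiveness

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(64, 10))  # activations with a shifted distribution
y = batch_norm(x, gamma=np.ones(10), beta=np.zeros(10))
# Each feature of y now has roughly zero mean and unit variance,
# regardless of how the incoming distribution drifted.
```

With gamma and beta learned per layer, the network keeps its expressive power while each layer sees inputs with a stable distribution.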
Cell: Prime Editing + AI Comprehensively Decode the Function of Every Point Mutation in the ATM Gene
生物世界· 2025-07-15 03:31
Core Viewpoint
- The article discusses the challenges and advancements in assessing Variants of Uncertain Significance (VUS) in the ATM gene, which is crucial for DNA damage response and cancer susceptibility [2][5][6]

Group 1: ATM Gene and Its Importance
- The ATM gene plays a key role in the DNA damage response and is associated with Ataxia Telangiectasia when mutated [2][5]
- Mutations in the ATM gene can increase the risk of various cancers, including breast, colorectal, pancreatic, and prostate cancers [5]
- Comprehensive functional assessment of all possible single-nucleotide variants (SNVs) in the ATM gene is essential for predicting cancer risk and patient prognosis [5][10]

Group 2: Recent Research Findings
- Researchers from Yonsei University published a study in Cell that functionally assessed all 27,513 possible ATM SNVs using prime editing and deep learning [3][10]
- The study identified critical amino acid residues in the kinase domain that cannot tolerate missense mutations [10]
- A deep learning model named DeepATM was developed to predict the functional effects of the remaining 4,421 SNVs with unprecedented accuracy [9][10]

Group 3: Implications for Precision Medicine
- The comprehensive evaluation of ATM gene mutations aids precision medicine and provides a framework for addressing VUS in other genes [12]
- The findings contribute to cancer risk assessment and prognosis, enhancing the understanding of ATM's role in cancer [9][10]
A Complete Analysis of Public and Private Quantitative Funds
CMS· 2025-07-13 14:35
1. Report Industry Investment Rating
- No relevant content provided.

2. Core Views of the Report
- The report comprehensively analyzes public and private quantitative funds, covering the basic characteristics of quantitative strategies, the development history of domestic quantitative investment, the industry's current state, the operational characteristics and performance of quantitative funds, the differences in investment operations between public and private quantitative funds, and how to select quantitative products [1][2][3]
- Quantitative strategies use historical data, data mining, and mathematical modeling to discover investment opportunities, with strong systematization and discipline. They pursue breadth of research to win on probability, unlike subjective strategies that rely on depth of research [10][11][12]
- Public and private quantitative funds have followed different development paths. Public quantitative funds have moved through growth, slowdown, and strategy diversification, while private quantitative funds have gone through explosive growth, stable development, and recent challenges [5][16][19]
- Significant differences in regulatory requirements, management behavior, investment strategies, and fee terms between public and private quantitative funds lead to different risk-return characteristics [6]
- When selecting quantitative products, investors should apply a four-dimensional evaluation system of "strategy deconstruction - positioning matching - indicator verification - ability evaluation", weighing strategy-environment adaptability, persistence of risk-return characteristics, and the depth of the management team's moat [6][90]

3. Summary According to the Directory
3.1 Quantitative Strategy Basic Characteristics
- Quantitative strategies use historical data to discover price-change patterns and formulate investment rules. The most widely used quantitative stock-selection model is the multi-factor model, combining price-volume, fundamental, and alternative factors; some funds also introduce machine-learning factors [10]
- Quantitative strategies are highly disciplined, systematically mining investment opportunities while avoiding the influence of subjective emotion. Risk-control systems are embedded in the strategies, with different constraints for different product types [11]
- Compared with subjective investing, quantitative investing emphasizes research breadth and probability-based wins, with lower marginal costs and a wider range of tracked investment opportunities [12]

3.2 Domestic Quantitative Investment Development History
3.2.1 Public Fund Quantitative Investment Development History
- Germination period (2004-2014): from "subjective + quantitative" exploration to initial application of the multi-factor model. The first index-enhanced fund and active quantitative stock-selection fund were launched, and with the return of talent, the multi-factor stock-selection model was gradually put into practice [12][13][15]
- Accelerated growth period (2015-2021): the multi-factor model became popular and the scale of quantitative funds expanded rapidly. Index-enhanced strategies grew significantly, while hedge strategies grew quickly from 2020 and then declined [16]
- Steady development period (2022-present): overall growth of public quantitative funds has slowed, but strategies have diversified. Product lines complement one another, and some managers use AI algorithms to iterate strategies [19]

3.2.2 Private Fund Quantitative Investment Development History
- Private quantitative funds have seen three rounds of growth. From 2019 to 2021 came explosive growth, with scale reaching 1.08 trillion yuan at the end of 2021, 17.1% of all private securities investment funds. From 2021 to 2023 came steady development; in 2024 the industry faced market volatility and tighter regulation; in 2025 private fund filings recovered [5][22][25]

3.3 Public and Private Quantitative Fund Industry Development Status
3.3.1 Public Fund Quantitative Strategy and Pattern Distribution
- Strategy classification: public quantitative strategies mainly comprise active quantitative, index-enhanced, and quantitative hedge strategies; the equity sleeves of some "fixed income +" funds are also managed quantitatively [31]
- Scale distribution: as of 2025Q1 there were 654 public quantitative equity funds with roughly 302.6 billion yuan (3,025.88 亿元) under management. Index-enhanced products were the largest segment, and management scale was relatively concentrated among the top ten managers [32][37]

3.3.2 Private Fund Quantitative Strategy and Manager Situation
- Strategy classification: private quantitative strategies are more diverse, including quantitative long-only, stock market-neutral, convertible bond, CTA, other derivative, arbitrage, and composite strategies [38]
- Large managers: as of the end of June 2025, there were 39 private quantitative managers with over 10 billion yuan (百亿) under management, nearly half of all private fund managers at that scale [5]

3.4 Operational Characteristics and Performance of Public and Private Stock Quantitative Funds
3.4.1 Operational Characteristics
- High turnover: quantitative funds trade frequently, which helps capture short-term opportunities. Public quantitative funds' annual bilateral turnover is mostly 2-20x, while private quantitative funds generally exceed 30x [47][48]
- Large number of holdings: quantitative funds typically hold many stocks, highly diversified across names and industries. Public quantitative funds mostly hold 50-600 stocks, some over 2,000, which reduces non-systematic risk [53][54]

3.4.2 Performance
- Index-enhanced products: absolute and excess returns vary by year, with overall excess-return capability ranking CSI 1000 index-enhanced > CSI 500 index-enhanced > SSE 500 index-enhanced. Private index-enhanced funds generally deliver better excess returns than public ones, but with greater dispersion [57][58]
- Active quantitative funds: relative performance varies by year. Public active quantitative funds did better in 2019-2020, while private ones did better in 2018 and 2021-2023, with greater dispersion in private funds' returns and drawdowns [66]
- Quantitative hedge funds: private quantitative hedge funds generally outperform public ones on annual returns, again with greater dispersion in returns and drawdowns [70]

3.5 Differences in Investment Operations between Public and Private Quantitative Funds
- Regulatory requirements and contracts: public quantitative funds are governed by the "Securities Investment Fund Law", with tight regulation and high information transparency; private quantitative funds are governed by the "Regulations on the Supervision and Administration of Private Investment Funds", with more customized contracts and higher risk levels [79]
- Management behavior: public quantitative managers rely on institutionalized teams and standardized IT infrastructure, emphasizing systematic risk control and compliance; private managers use elite, lean organizational structures, invest more in hardware and employee incentives, and may run more differentiated product strategies [81]
- Investment strategies and restrictions: public quantitative funds face stricter constraints on investment scope, position limits, and tracking error, with lower turnover; private quantitative funds have more flexible mechanisms, higher turnover, and more elastic excess returns [6][84]
- Fee terms: private quantitative products have more complex fee terms, usually management fees plus performance rewards, while public products mainly charge fixed management and custody fees [6][87]

3.6 How to Select Quantitative Products
- Investors should apply the four-dimensional evaluation system of "strategy deconstruction - positioning matching - indicator verification - ability evaluation", weighing strategy-environment adaptability, persistence of risk-return characteristics, and the depth of the management team's moat [6][90]
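The multi-factor stock-selection model described in section 3.1 can be illustrated with a minimal sketch (the factor values and weights below are hypothetical, purely for illustration; real models combine dozens to hundreds of factors under the risk constraints described above):

```python
import numpy as np

def zscore(x):
    # Standardize a factor cross-sectionally so different factors are comparable.
    return (x - x.mean()) / x.std()

# Hypothetical factor exposures for five stocks (illustration only).
value = np.array([0.8, -0.2, 1.5, 0.1, -1.0])     # e.g. a book-to-price style factor
momentum = np.array([-0.5, 1.2, 0.3, -0.8, 0.9])  # e.g. a trailing-return style factor
weights = {"value": 0.6, "momentum": 0.4}          # assumed factor weights

# Composite score: weighted sum of standardized factor exposures.
composite = weights["value"] * zscore(value) + weights["momentum"] * zscore(momentum)
ranking = np.argsort(-composite)  # stocks ordered from highest to lowest score
```

A fund would then overweight the top-ranked names, subject to the position, industry, and tracking-error constraints its mandate imposes.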
DLC Deeper Learning China Annual Conference Officially Announced: AI and PBL Lead Classroom Innovation
Southern Metropolis Daily (南方都市报)· 2025-07-12 05:41
Core Insights
- The conference focused on the theme "AI and PBL Leading Classroom Innovation," discussing how artificial intelligence (AI) and project-based learning (PBL) can drive educational transformation and cultivate the innovative talent future society needs [1][3]

Group 1: AI's Role in Education
- AI technology is reshaping the educational ecosystem, fundamentally changing the structure of talent demand in future society [3]
- AI is viewed as an assistant rather than a competitor to teachers, with the core educational task shifting from merely imparting knowledge to helping students become more complete individuals [3]

Group 2: Project-Based Learning (PBL)
- PBL is identified as a key method for cultivating future talent, characterized by three main features:
  1. Driven by real-world problems that engage students' intrinsic motivation [3]
  2. Focused on collaboration and inquiry, enhancing learning through investigation, team communication, and continuous feedback [3]
  3. Aimed at public presentation of results to real audiences, rather than being limited to assignments or exams [3]

Group 3: Practical Applications and Initiatives
- Guangzhou Youlian School has systematically integrated PBL into its curriculum, including mandatory courses for grades 9-10 and the establishment of the CTB Global Youth Innovation Project Club [5]
- Students tackle real issues, such as designing environmental solutions and analyzing community economic problems, fostering interdisciplinary skills [5]
- The conference's "Deep Dive Workshops" gave educators immersive learning experiences, covering AI-assisted design and interdisciplinary project practice [5]

Group 4: Background of the Organizing Body
- Deeper Learning China (DLC) was established in 2019 as a public-welfare educational innovation platform aimed at promoting deep learning educational concepts within China's local education system [6]
- "Deep Learning" was initially proposed by the U.S. Education Foundation and has been developed in innovative school systems such as High Tech High (HTH), which focus on student-centered curricula integrating critical thinking and real tasks [6]
Meta Spent $200 Million on Him: SJTU Alumnus Ruoming Pang Shares His Latest Paper from Apple
机器之心· 2025-07-10 10:49
Core Viewpoint
- The article covers Ruoming Pang's transition from Apple to Meta, highlighting his contributions to Apple's foundation models and the development of AXLearn, a modular large-model training system designed for heterogeneous infrastructure.

Group 1: Ruoming Pang's Transition
- Ruoming Pang, head of Apple's foundation model team, is moving to Meta's newly established superintelligence team, with a reported offer of $200 million [2][3]
- Despite the transition, Pang continues to contribute to Apple by promoting his research on AXLearn [3][4]

Group 2: AXLearn Overview
- AXLearn is a production-grade system designed for large-scale deep learning model training, emphasizing scalability and high performance [6]
- Its modular design and comprehensive support for heterogeneous hardware allow functionalities such as Rotary Position Embeddings (RoPE) to be integrated with minimal code [6][8]
- A new measure of modularity based on lines of code (LoC-complexity) shows that AXLearn keeps complexity constant as the system expands, whereas other systems exhibit linear or quadratic growth [7][23]

Group 3: Performance Evaluation
- AXLearn's training performance is compared with systems such as PyTorch FSDP, Megatron-LM, and MaxText across various hardware platforms, demonstrating competitive iteration times and throughput [26][29]
- The system shows near-linear scalability in weak-scaling experiments, indicating its robustness under increased workloads [30]

Group 4: Production Use and Impact
- AXLearn has evolved from a tool for a few developers into a platform supporting hundreds of developers training models with billions to trillions of parameters [35]
- It can concurrently support over 10,000 experiments, is deployed across heterogeneous hardware clusters, and contributes to features used by billions of users [36][37]
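For a sense of what "integrating RoPE with minimal code" involves, here is a minimal NumPy sketch of Rotary Position Embeddings themselves (an independent illustration of the technique, not AXLearn's actual code):

```python
import numpy as np

def rope(x, base=10000.0):
    """Apply rotary position embeddings to x of shape (seq_len, dim), dim even.

    Pairs dimension i with dimension i + dim/2 and rotates each pair by a
    position-dependent angle, encoding position directly in the activations.
    """
    seq_len, dim = x.shape
    half = dim // 2
    freqs = base ** (-np.arange(half) / half)     # one rotation frequency per pair
    angles = np.outer(np.arange(seq_len), freqs)  # (seq_len, half) angle table
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    # 2D rotation applied to each (x1, x2) pair.
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=1)

rng = np.random.default_rng(0)
q = rng.normal(size=(6, 8))  # toy query vectors for a 6-token sequence
q_rot = rope(q)
# Position 0 is left unrotated, and rotation preserves each vector's norm.
```

Because the operation is a pure function of the activations, a modular system can attach it to attention layers without touching the rest of the model, which is the kind of low-LoC integration the paper measures.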
Zhejiang University's Latest Cell Paper: AI Genome Model NvwaCE Deciphers the Regulatory Language of Vertebrate Genomes
生物世界· 2025-07-09 00:09
Core Viewpoint
- The research presents a high-throughput, ultra-sensitive single-nucleus ATAC sequencing technology (UUATAC-seq) and a deep learning model (NvwaCE) for predicting regulatory sequences in vertebrates, providing valuable resources for understanding the regulatory language of vertebrate genomes [5][15]

Group 1: Technology Development
- UUATAC-seq enables the efficient construction of chromatin accessibility maps for a given species within a single day [8]
- NvwaCE is designed to interpret the "grammar" of candidate cis-regulatory elements (cCREs) and can predict cCRE landscapes directly from genomic sequences with high accuracy [11]

Group 2: Research Findings
- The study found that the conservation of regulatory grammar is significantly stronger than that of nucleotide sequences, revealing the sequence basis for cell-type-specific gene expression [6]
- The analysis indicated that differences in genome size among species affect the number of cCREs but not their size [10]

Group 3: Practical Applications
- NvwaCE accurately predicts the impact of synthetic mutations on lineage-specific cCRE function, consistent with quantitative trait loci (QTL) and genome-editing results [13]
- A specific gene mutation site (HBG1-68:A>G) was predicted to have curative potential for sickle cell disease, marking the first instance of an AI-designed functional site being validated in human cells [14]
LeCun's Team Reveals the Essence of LLM Semantic Compression: Extreme Statistical Compression Sacrifices Detail
量子位· 2025-07-04 01:42
Core Viewpoint
- The article discusses the differences in semantic compression strategies between large language models (LLMs) and human cognition: LLMs favor statistical compression while humans prioritize detail and context [4][17]

Group 1: Semantic Compression
- Semantic compression allows efficient organization of knowledge and quick categorization of the world [3]
- A new information-theoretic framework was proposed to compare human and LLM semantic compression strategies [4]
- The study reveals fundamental differences in compression efficiency and semantic fidelity between LLMs and humans, with LLMs leaning toward extreme statistical compression [5][17]

Group 2: Research Methodology
- The research team built a robust human concept-classification benchmark from classic cognitive science studies, covering 1,049 items across 34 semantic categories [5][6]
- The dataset provides category membership and human "typicality" ratings, reflecting deep structures in human cognition [6][7]
- Over 30 LLMs were evaluated, with parameter counts ranging from 300 million to 72 billion, ensuring a fair comparison with the human cognitive benchmark [8]

Group 3: Findings and Implications
- LLMs' concept classifications align with human semantic categories significantly better than chance, validating LLMs' basic capabilities in semantic organization [10][11]
- However, LLMs struggle with fine-grained semantic differences, indicating a mismatch between their internal concept structures and human intuitive category assignments [14][16]
- LLMs prioritize reducing redundant information, while humans emphasize adaptability and richness, maintaining contextual integrity [17]

Group 4: Research Contributors
- The research was conducted jointly by Stanford University and New York University, with Chen Shani as lead author [19][20]
- Yann LeCun, a prominent figure in AI and co-author of the study, has significantly influenced the evolution of AI technologies [24][25][29]
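The compression-versus-fidelity tension described above can be made concrete with a toy example (the quantities below are illustrative stand-ins, not the paper's actual framework): grouping items into fewer categories lowers the entropy of the category labels (stronger compression) but raises the distortion within each category (lost detail).

```python
import numpy as np

def label_entropy(labels):
    # Entropy of the category assignment: fewer, larger groups -> lower entropy.
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def distortion(points, labels):
    # Mean squared distance to each group's centroid: the detail lost by grouping.
    total = 0.0
    for c in np.unique(labels):
        members = points[labels == c]
        total += np.sum((members - members.mean(axis=0)) ** 2)
    return total / len(points)

rng = np.random.default_rng(0)
fine = np.repeat(np.arange(20), 10)                  # 20 true "concepts", 10 items each
centers = rng.normal(scale=3.0, size=(20, 8))
points = centers[fine] + rng.normal(scale=0.3, size=(200, 8))  # toy item embeddings
coarse = fine % 3                                    # merge the concepts into 3 broad groups
# The coarse grouping compresses harder (lower label entropy) but distorts more --
# the statistical-compression-versus-detail tradeoff in miniature.
```

In the paper's terms, LLMs sit nearer the low-entropy end of this tradeoff, while human categories tolerate higher entropy to preserve typicality structure and context.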
Have You Ever Been Stuck for Over a Week on a Bug You Later Learned Was Fatal?
自动驾驶之心· 2025-07-03 12:41
Core Insights
- The article discusses the challenges and experiences of training AI models with reinforcement learning, highlighting the importance of reward design and the pitfalls that can arise along the way [1][2]

Group 1: Reinforcement Learning Challenges
- The author recounts a project training a robot to run, in which different reward structures produced unexpected behaviors, such as jumping too far and falling [1]
- The design of learning objectives is crucial: poorly defined goals can yield models that do not behave as intended, such as generating repetitive outputs or failing to learn effectively [2]

Group 2: AI Model Training Insights
- The robustness of neural networks lets them keep iterating despite bugs in the code, which can lead to unexpected improvements once the bugs are eventually removed [2]
- The article emphasizes the collaborative nature of deep learning projects, where encountering bugs can inspire creative solutions from team members [2]

Group 3: Community and Learning Resources
- The article mentions a community of nearly 4,000 members, including over 300 companies and research institutions in the autonomous driving sector, providing a platform for learning and knowledge sharing [3]
- Various technical areas related to autonomous driving are covered, including perception, mapping, and control, indicating a comprehensive approach to education in this field [3]
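The reward-design pitfall described above (a running robot that learns to take one huge, crash-landing leap) can be sketched with toy reward functions (all reward terms and weights here are hypothetical, for illustration only):

```python
def naive_reward(forward_distance):
    # Rewarding distance alone: one long leap that ends in a fall scores best.
    return forward_distance

def shaped_reward(forward_distance, fell, energy_used):
    # Penalizing falls and wasted energy makes steady running dominate.
    reward = forward_distance
    if fell:
        reward -= 10.0           # assumed fall penalty
    reward -= 0.1 * energy_used  # assumed energy cost weight
    return reward

# A crash-landing leap vs. a shorter, stable stride:
leap = shaped_reward(forward_distance=3.0, fell=True, energy_used=5.0)
stride = shaped_reward(forward_distance=1.5, fell=False, energy_used=2.0)
# Under the naive reward the leap wins; under the shaped reward the stride wins.
```

The point of the anecdote is exactly this gap: the optimizer maximizes the reward you wrote, not the behavior you had in mind.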
How Big Is the China-US AI Gap, and Where Is the Focus of AI Competition? The "Global AI Research Landscape Report" Debuts Worldwide
TMTPost (钛媒体APP)· 2025-07-03 10:36
Core Insights
- The report titled "Global AI Research Landscape Report (2015-2024)" analyzes the evolution of AI research over the past decade, highlighting the competitive landscape between China and the United States in AI talent and publication output [2][7]

Group 1: AI Research Trends
- The report identifies four distinct phases in AI research: an initial phase (2015-2016), rapid development (2017-2019), a maturity peak (2020-2023), and adjustment (2024) [4][5]
- Global AI paper output rose sharply, peaking at 17,074 papers in 2023, nearly a fourfold increase over 2015 [5][6]
- Publication volume is expected to decline to 14,786 papers in 2024, indicating a shift toward more specialized, application-oriented research [6]

Group 2: Talent Distribution
- China has emerged as the second-largest hub for AI talent, with 52,000 researchers by 2024, growing at a compound annual growth rate of 28.7% since 2015 [8]
- The United States leads with over 63,000 AI researchers, with significant contributions from institutions such as Stanford and MIT and tech giants such as Google and Microsoft [8][9]
- Chinese institutions such as the Chinese Academy of Sciences, Tsinghua University, and Peking University lead domestically in publication output and talent concentration [7][9]

Group 3: Institutional and Corporate Performance
- The Chinese Academy of Sciences published 4,639 top-tier papers, with Tsinghua University and Peking University close behind, showcasing China's institutional strength in AI research [7][9]
- U.S. companies such as Google, Microsoft, and Meta average far higher publication output than their Chinese counterparts, reflecting a disparity in research investment and output capability [9][10]
- The top three U.S. companies published 5,896 papers, 1.8 times the output of the top three Chinese companies [9][10]

Group 4: Gender Disparity in AI Talent
- The report highlights a significant gender imbalance in AI research: women make up only 9.3% of AI talent in China versus 20.1% in the U.S. [12][13]
- Female representation at Tsinghua University and Peking University is low, at 7.88% and 9.18% respectively, compared with 25%-30% at top U.S. institutions [12][13]

Group 5: Future Trends in AI Research
- "Deep learning" has dominated AI research over the past decade, but its growth rate is expected to slow, suggesting a need for new approaches [14][15]
- Emerging technologies such as Transformers are gaining traction, particularly in natural language processing and multimodal AI, indicating a shift in research focus [15]
- Traditional AI fields are increasingly integrating deep learning techniques, reflecting a trend toward collaborative, interdisciplinary research [15]
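The compound growth figure for China's AI talent pool can be sanity-checked with the standard CAGR relation (the implied 2015 headcount below is an inference from the stated numbers, not a figure from the report):

```python
# CAGR relation: end = start * (1 + rate) ** years
end, rate, years = 52_000, 0.287, 9  # China's AI researchers, 2015 -> 2024
implied_start = end / (1 + rate) ** years
# Roughly five to six thousand researchers in 2015 would grow to 52,000
# at 28.7% per year over nine years.
```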
Fei-Fei Li's Latest Live Interview at YC: From ImageNet to Spatial Intelligence, Chasing AI's North Star
创业邦· 2025-07-02 09:49
Core Viewpoint
- The article traces the evolution of artificial intelligence through the lens of renowned AI scientist Fei-Fei Li, covering her career, the creation of ImageNet, and her current work on spatial intelligence at World Labs. It emphasizes that understanding and interacting with the three-dimensional world is a crucial step toward Artificial General Intelligence (AGI) [2][9][25]

Group 1: ImageNet and Deep Learning
- ImageNet embodied a data-driven paradigm shift, providing the large-scale, high-quality labeled dataset that laid the foundation for the success of deep learning and neural networks [9][10]
- The project has over 80,000 citations and is considered a cornerstone in addressing AI's data problem [8][9]
- AI capabilities have evolved from object recognition to scene narrative: from identifying objects to understanding and describing complex scenes [17][18]

Group 2: Spatial Intelligence and World Labs
- Spatial intelligence is identified as the next frontier in AI: understanding, interacting with, and generating three-dimensional worlds, a fundamental challenge on the path to AGI [9][25]
- World Labs, founded by Fei-Fei Li, aims to tackle the complexities of spatial intelligence, moving beyond flat pixel representations and language models to capture the three-dimensional structure of the world [22][25][31]
- Modeling the real world remains challenging, requiring high-quality data and new ways of understanding and interacting with three-dimensional environments [28][29]

Group 3: Entrepreneurial Spirit and Personal Journey
- Fei-Fei Li's journey from immigrant to leading AI researcher and entrepreneur showcases her entrepreneurial spirit and the importance of embracing difficult challenges [36][34]
- The article emphasizes "intellectual fearlessness" as a core trait for success in both academic research and entrepreneurship, encouraging individuals to focus on building and innovating without being hindered by past achievements or external opinions [9][36][37]
- Her teenage experience running a laundromat shaped her entrepreneurial skills and resilience [34][36]