Genomic Large Models
The DeepSeek-Style "External Brain" Enters Life Sciences! Chinese Team Releases Gengram to Decode the Cryptic Book of DNA
生物世界 (BioWorld) · 2026-01-31 06:00
Core Viewpoint
- The article discusses "Gengram", a module introduced by the Genos team that augments genomic models with an external memory mechanism to improve efficiency and performance on genomic tasks [2][10].

Group 1: Gengram Module Introduction
- Gengram targets a limitation of existing genomic models: they process DNA one base at a time, which is an inefficient way to capture biological function [8][10].
- Using a pre-built hash dictionary of common short sequences, Gengram lets a model retrieve biological knowledge directly by lookup, reducing the need for extensive computation [10][12].

Group 2: Performance Improvements
- Models equipped with Gengram achieved state-of-the-art (SOTA) results, including a 16.1% increase in AUC on splice site recognition [6][18].
- Gengram is a lightweight plugin of only about 20 million parameters, and it significantly improves model capability without requiring extensive training data [18][21].

Group 3: Biological Insights
- Gengram's design lets the model take the three-dimensional structure of DNA into account while processing a one-dimensional sequence, improving its understanding of biological interactions [14][15].
- Performance was best at a window size of 21 base pairs, which corresponds to the spatial arrangement of DNA: 21 bp is roughly two turns of the B-form double helix (about 10.5 bp per turn) [13][14].

Group 4: Team and Collaboration
- The Genos team combines BGI's life sciences research expertise with Zhejiang Lab's computational capabilities, a strategic collaboration in the AI for Science domain [20][21].
- Gengram's success highlights the potential of aligning AI with biological logic to advance the understanding of genomic data [21].
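
The lookup idea described under Group 1 can be illustrated with a minimal Python sketch. It is not Gengram's implementation: the function names `build_kmer_table` and `lookup_window`, the k-mer length `k=6`, and the `top_n` cutoff are all hypothetical choices made for illustration. A plain Python dict stands in for the pre-built hash dictionary, and integer slot ids stand in for whatever learned representations the real module retrieves. Only the 21 bp window size comes from the article.

```python
from collections import Counter


def build_kmer_table(sequences, k=6, top_n=1000):
    """Map the most common k-mers in `sequences` to integer slot ids.

    Assumption: k and top_n are illustrative values, not Gengram's settings.
    """
    counts = Counter()
    for seq in sequences:
        # Slide a length-k window over the sequence and count every k-mer.
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    # Keep only the top_n most frequent k-mers; each gets one slot id.
    return {kmer: slot for slot, (kmer, _) in enumerate(counts.most_common(top_n))}


def lookup_window(seq, center, table, k=6, window=21):
    """Return (position, slot id) pairs for known k-mers near `center`.

    The 21 bp default window is the size the article reports as optimal;
    everything else here is a hypothetical stand-in.
    """
    half = window // 2
    start = max(0, center - half)
    end = min(len(seq), center + half + 1)
    hits = []
    for i in range(start, end - k + 1):
        slot = table.get(seq[i:i + k])  # dictionary lookup, no computation
        if slot is not None:
            hits.append((i, slot))
    return hits
```

A call such as `lookup_window(seq, center=100, table=build_kmer_table(corpus))` returns the positions and slot ids of dictionary entries within 10 bases on either side of position 100. In a model, those ids would typically index a table of stored vectors, which is where the retrieval-in-place-of-computation saving would come from.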