16 Days After the DeepSeek Paper Was Published, a Chinese Team Has Already Written a "Biological Dictionary" for Models
机器之心·2026-01-31 04:10

Core Insights
- The article discusses the introduction of Gengram, a genomic module inspired by the Engram technology, which enhances the efficiency of genomic models by utilizing a memory lookup system instead of traditional methods [1][4].

Group 1: Gengram Technology Overview
- Gengram employs a hash table to store common DNA sequences (k-mers) and allows models to reference this external memory, significantly reducing computational load [3][11].
- The module is lightweight, with approximately 20 million parameters, and integrates seamlessly into larger models, enhancing their performance without substantial additional computational costs [15][19].

Group 2: Performance Improvements
- Models utilizing Gengram showed significant performance improvements in various tasks, including a 16.1% increase in AUC for splice site prediction and a 22.6% increase for epigenetic prediction tasks [17].
- Gengram's implementation allows models to achieve high performance with minimal training data, outperforming models that have been trained on significantly larger datasets [18].

Group 3: Mechanisms and Adaptability
- Gengram features a dynamic gating mechanism that enables the model to decide when to reference the memory based on the context, optimizing resource usage [12][13].
- The module demonstrates excellent adaptability across different model architectures, improving training efficiency and balancing expert loads in mixture of experts (MoE) configurations [19][21].

Group 4: Scientific Insights and Innovations
- Gengram's design allows it to infer biological principles, such as the physical structure of DNA, without prior knowledge, showcasing its potential for scientific discovery [22][25].
- The choice of a 21 base pair window size for local aggregation aligns with the physical properties of DNA, indicating a sophisticated understanding of biological structures [23][24].

Group 5: Team Background and Capabilities
- The Genos Team, responsible for Gengram, is a collaboration between Zhejiang Lab and BGI-HangzhouAI, combining expertise in AI and life sciences [33][34].
- The Genos model, which serves as the foundation for Gengram, reportedly surpasses leading industry benchmarks, indicating a strong competitive position in genomic modeling [35].
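
As a rough illustration of the lookup-plus-gate idea summarized in Groups 1 and 3, the Python sketch below hashes each k-mer into a fixed-size table and blends the retrieved vector into a hidden state through a sigmoid gate. Everything in it is an assumption made for illustration, not the Genos Team's implementation: the names `kmer_slot`, `make_table`, and `gated_lookup`, the base-4 hash, the randomly initialized table (a real module would learn its entries), the dot-product gate, and the toy sizes (k = 3, 64 slots, 4 dimensions).

```python
import math
import random

BASE_TO_INT = {"A": 0, "C": 1, "G": 2, "T": 3}


def kmer_slot(kmer, num_slots):
    """Map a k-mer to a table slot via its base-4 value (hypothetical hash)."""
    value = 0
    for base in kmer:
        value = value * 4 + BASE_TO_INT[base]
    return value % num_slots


def make_table(num_slots, dim, seed=0):
    """Build a toy memory table; real entries would be learned, not random."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(num_slots)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_lookup(seq, hidden, table, k=3):
    """Add gated memory vectors to per-position hidden states.

    seq    : DNA string over A/C/G/T
    hidden : one vector per base, same length as seq
    table  : list of memory vectors, indexed by kmer_slot
    """
    out = []
    for i, h in enumerate(hidden):
        if i + 1 < k:
            # Not enough left context to form a k-mer: pass through unchanged.
            out.append(list(h))
            continue
        kmer = seq[i + 1 - k : i + 1]  # the k bases ending at position i
        mem = table[kmer_slot(kmer, len(table))]
        # Scalar gate in (0, 1): how strongly this position uses the memory.
        gate = sigmoid(sum(a * b for a, b in zip(h, mem)))
        out.append([a + gate * b for a, b in zip(h, mem)])
    return out


# Toy usage: 8 bases, identical 4-dimensional hidden states at every position.
table = make_table(num_slots=64, dim=4)
seq = "ACGTACGT"
hidden = [[0.1] * 4 for _ in seq]
mixed = gated_lookup(seq, hidden, table, k=3)
```

In this toy run, positions 2 and 6 both end the k-mer ACG and start from identical hidden states, so they receive identical outputs, while the first two positions have no complete k-mer and pass through unchanged. With k = 3 and 64 slots every k-mer gets its own slot; for longer k-mers the modulo would make distinct k-mers share slots, which is the usual trade-off of a hashed table.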