Genomics
Contributing Scientific Research to Human Health and Sustainable Development
Ren Min Wang· 2025-10-28 01:12
Core Insights
- The 20th International Conference on Genomics (ICG) was held in Hangzhou, Zhejiang Province, under the theme "The Future of Omics and Artificial Intelligence (AI)", with over 100 experts from 19 countries participating [1][2]
- This year's conference marks the 25th anniversary of the completion of the Human Genome Project, highlighting the transformative impact of genomics technology on fields such as medicine and agriculture [1]
- The ICG aims to promote global collaboration in omics research and the bio-industry, serving as a bridge for advancing the life sciences [1]

Industry Developments
- Scholars presented on a range of cutting-edge topics. Yang Huanming, George Church, and Wang Jian, among others, spoke on integrating AI with omics and on disease prevention strategies [2]
- Discussions covered the construction of the "Healthy Zhejiang" cohort by Zhejiang University and calls for stronger international cooperation in genomics [2]
- A special "Science Carnival" was organized for young people, aimed at making advanced research accessible and inspiring a passion for scientific exploration [2]
10-Billion-Parameter Human Genome Foundation Model Released
Ren Min Ri Bao· 2025-10-26 23:28
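(Reported by Hua Xuan) At the recent 20th International Conference on Genomics, BGI-Research (华大生命科学研究院) and Zhejiang Lab (之江实验室) jointly released Genos, a general-purpose human genome foundation model with 10 billion parameters. Deeply optimized for the human genome, the model supports ultra-long context analysis of up to one million base pairs and achieves precise identification at single-base resolution.

To validate the model's performance comprehensively, the development team ran a series of tests. On classic benchmark tasks such as genomic element identification, long-range regulation prediction, and variant pathogenicity prediction, Genos outperformed all existing models on more than half of the tasks. On long-sequence tasks such as mutation hotspot identification and population classification, it far surpassed comparable models, demonstrating strong context-analysis capability and effectively deciphering the "dark matter" of the genome.

On pathogenic variant interpretation, a task aimed directly at clinical application, Genos achieved high accuracy, and accuracy rose further when it was combined with the 021 scientific foundation model, offering a new and efficient tool for clinical diagnosis. Taken together, the evaluation results show Genos performing strongly across its core tasks, evidence of its well-rounded capability.

"Co-owned, co-built, shared": putting frontier technology within reach

However powerful a model is, its value is greatly diminished if it cannot be deployed and used conveniently. Genos is a "pioneer in practice" that can reach the clinic, individuals, and every laboratory, providing a solid foundation for innovation in downstream applications. Genos ...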
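Genos, the World's First 10-Billion-Parameter Human Genome Foundation Model, Is Released! Opening a New Era of Intelligent Genomic Analysis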
生物世界· 2025-10-23 08:00
Core Insights
- The article discusses the launch of Genos, the world's first human genome foundation model with 10 billion parameters, which aims to deepen the understanding of human genetics and its implications for clinical diagnosis and scientific research [2][4].

Group 1: Model Features and Capabilities
- Genos supports ultra-long context analysis of up to one million base pairs and achieves single-base resolution for precise identification [3].
- The training data draws on multiple authoritative resources, including the Human Pangenome Reference Consortium and the Human Genome Structural Variation Consortium. It comprises 636 high-quality human genomes, chosen to reduce data bias and represent human genetic diversity broadly [8].
- Genos uses a Mixture-of-Experts (MoE) architecture that activates only the most relevant experts for a given input, which keeps compute cost down while the model retains a large overall knowledge base [9].

Group 2: Performance Metrics
- Across a range of genomic tasks, Genos outperformed existing models in more than half of the assessments, and it did especially well on long-sequence tasks such as mutation hotspot identification and population classification [11].
- On pathogenic mutation interpretation for clinical use, the model reached 92% accuracy, rising to 98.3% when combined with the 021 scientific foundation model [13][18].

Group 3: Accessibility and Applications
- Genos is released as open source, with both 1.2-billion- and 10-billion-parameter versions available to developers and researchers for deployment in downstream applications [21].
- The model is integrated into the DCS Cloud platform, where users can run rapid RNA expression predictions from DNA sequence alone, significantly speeding up biological data analysis [21].
- In clinical settings, Genos can provide expert-level multi-modal interpretations for genetic disease diagnosis, and it is also integrated into personal health platforms for personalized genomic reporting [22].

Group 4: Future Initiatives
- The launch of Genos marks the beginning of a new era in genomic analysis, with ongoing initiatives such as the Long100K Genomes Consortium and the 10BC project aimed at generating high-quality training data for future model iterations [23].
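
For readers unfamiliar with the Mixture-of-Experts idea mentioned under Group 1, the sketch below shows generic top-k expert routing in plain Python. It is illustrative only: the summaries above do not say how many experts Genos has, how many it activates per token, or how its router is built, so the expert count, `k`, the gate weights, and every function name here (`route_top_k`, `moe_forward`, `make_scaling_expert`) are assumptions made up for the example.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def route_top_k(token_vec, gate_weights, k):
    """Score every expert, keep the k best, renormalize their weights.

    gate_weights holds one (hypothetical) gating vector per expert.
    Returns a list of (expert_index, weight) pairs whose weights sum to 1.
    """
    scores = [dot(token_vec, w) for w in gate_weights]
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    probs = softmax([scores[i] for i in top])
    return list(zip(top, probs))


def moe_forward(token_vec, gate_weights, experts, k=2):
    """Run only the selected experts and mix their outputs by gate weight."""
    routed = route_top_k(token_vec, gate_weights, k)
    out = [0.0] * len(token_vec)
    for idx, weight in routed:
        expert_out = experts[idx](token_vec)  # unselected experts never run
        out = [o + weight * e for o, e in zip(out, expert_out)]
    return out, [idx for idx, _ in routed]


def make_scaling_expert(scale):
    """Toy stand-in for an expert network: multiplies the input by a constant."""
    return lambda vec: [scale * x for x in vec]


if __name__ == "__main__":
    rng = random.Random(0)
    dim, num_experts = 4, 8  # assumed sizes, not Genos's real configuration
    gate_weights = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_experts)]
    experts = [make_scaling_expert(i + 1) for i in range(num_experts)]
    token = [0.5, -1.0, 0.25, 2.0]
    output, used = moe_forward(token, gate_weights, experts, k=2)
    print("experts used:", used)
    print("output:", output)
```

The point of the design is that only `k` of the experts do any work for a given token, so per-token compute scales with `k` rather than with the total number of experts, even though all experts contribute to the model's parameter count.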