DNA Search Engine MetaGraph Successfully Developed

Core Insights
- Scientists at ETH Zurich have developed MetaGraph, a DNA search engine that efficiently retrieves information from very large public biological databases, addressing the difficulty of locating and using extensive genomic data [1][2]

Group 1: Development and Functionality
- MetaGraph was created in response to the rapid growth of biological databases and the difficulty researchers face in extracting useful information from fragmented, noisy sequencing data [1]
- Its core innovation is the use of a mathematical "graph structure" that links overlapping DNA fragments, much as a book index links sentences that share keywords [1]
- The research team integrated seven publicly funded databases into a single index spanning all forms of life, covering 18.8 million unique DNA and RNA sequence sets and 210 billion amino acid residues [1]

Group 2: Practical Application and Impact
- The search engine retrieves matches directly from the original data files in response to text prompts, a new way of interacting with biological data. The data are stored in highly compressed form yet remain readily accessible [2]
- MetaGraph lets researchers pose biological questions to repositories such as the Sequence Read Archive (SRA), which contains more than 100 million billion DNA letters [2]
- To validate its utility, the team scanned more than 240,000 human gut microbiome samples for antibiotic resistance markers, obtaining results in about one hour on a single high-performance computer [2]
- Rayan Chikhi, a bioinformatician at the Institut Pasteur, described the work as a "major breakthrough" that sets a new standard for analyzing raw biological data such as DNA, RNA, and protein sequences [2]
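
The graph idea under Group 1 and the resistance-marker scan under Group 2 can be illustrated with a toy example. The sketch below is not MetaGraph's code or API: it assumes a de Bruijn-style graph in which each node is a short substring of length k (a k-mer) and two nodes are linked when they overlap by k-1 letters, and it labels each node with the samples that contain it. The summary does not say which graph MetaGraph uses, and all function names, sample names, and sequences here are hypothetical. It also skips the compression that the real system relies on.

```python
from collections import defaultdict


def kmers(seq, k):
    """Yield every overlapping substring of length k in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_index(samples, k):
    """Map each k-mer (a graph node) to the set of sample labels containing it."""
    index = defaultdict(set)
    for label, reads in samples.items():
        for read in reads:
            for kmer in kmers(read, k):
                index[kmer].add(label)
    return index


def successors(index, kmer):
    """Follow graph edges: indexed k-mers that overlap this one by k-1 letters."""
    return [kmer[1:] + base for base in "ACGT" if kmer[1:] + base in index]


def search(index, query, k, min_fraction=0.8):
    """Return {label: fraction of query k-mers found} for labels above the threshold."""
    query_kmers = list(kmers(query, k))
    hits = defaultdict(int)
    for kmer in query_kmers:
        for label in index.get(kmer, ()):
            hits[label] += 1
    total = len(query_kmers)
    return {label: n / total for label, n in hits.items() if n / total >= min_fraction}


# Hypothetical toy data: two "samples", each a list of short sequencing reads.
SAMPLES = {
    "sample_A": ["ACGTAC", "GTACGG"],  # two fragments that overlap on "GTAC"
    "sample_B": ["TTGACC"],
}

index = build_index(SAMPLES, k=4)

# The query "CGTACG" appears in neither read of sample_A on its own,
# but its k-mers are all present once the overlapping reads share one graph.
print(search(index, "CGTACG", k=4))  # {'sample_A': 1.0}
print(successors(index, "CGTA"))     # ['GTAC']
```

In this toy setting a marker sequence is found even though it straddles two separate reads, which is the benefit of joining overlapping fragments before searching. Scanning many samples for a marker then reduces to looking up the marker's k-mers once and reading off the attached sample labels.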