Nature Methods Paper Leverages PacBio Sequencing Technology to Develop the Platinum Pedigree Benchmark, a New Standard for Accurate Characterization of Variation in the Human Genome that Improves Training for AI Models
GlobeNewswire · 2025-08-04 13:05
Core Insights
- PacBio has developed a comprehensive genomic variant dataset called the Platinum Pedigree, which significantly enhances variant classification using AI tools, particularly Google's DeepVariant, achieving a 34% reduction in erroneous variant calls [1][5][6]

Group 1: Dataset Development
- The Platinum Pedigree dataset is the most extensive family-based variant dataset, characterizing both simple and complex genetic variations [1][2]
- It was created through deep sequencing of a 28-member multi-generational family, cataloging over 37 Mb of genetic variation, including single nucleotide and large structural variants [3][4]
- The dataset includes the first large pedigree-validated tandem repeat and structural variant truth sets, extending benchmark regions to 2.77 Gb [4]

Group 2: Impact on AI and Genomics
- The improved benchmarks allow for better evaluation of variant calling pipelines and accelerate the development of methods that address complex genomic regions important for human health [5][6]
- The Platinum Pedigree benchmark is already being utilized by scientists to develop new sequence analysis tools and validate clinical sequencing workflows [6]

Group 3: Publication and Collaboration
- The study detailing the Platinum Pedigree was published in Nature Methods on August 4, 2025, and involved collaboration between PacBio, the University of Washington, and the University of Utah, with support from NIH and Howard Hughes Medical Institute [8]