

In a paper published today in Nature Genetics, scientists at deCODE genetics, a subsidiary of the pharmaceutical company Amgen, have shown that long-read DNA sequencing can be applied at population scale to unravel large structural variants that associate with human disease and other traits. Up until now, DNA sequence analysis has been performed using short-read sequencing, where the sequence examined is broken up into fragments that are no more than 151 base pairs long. Using short-read sequencing, scientists have been able to discern most small variations in the genome, and population studies have allowed them to determine how these variations associate with diseases and other traits. However, of the 133,886 reliably genotyped structural variants detected with long-read sequencing, only 60% can be detected with short reads. Using PromethION sequencers from Oxford Nanopore Technologies, researchers at deCODE genetics whole-genome sequenced 3,622 Icelanders. DNA base pairs in the genome were sequenced on average at least 10 times, allowing for accurate characterization of all genomic variation within the individual. These variants were then imputed into a larger set of participants in various disease studies at deCODE genetics and associated with phenotypes. This has led to the discovery of several hitherto unknown associations of structural variants with diseases and other traits.

"This technology and algorithms we developed enable us to characterize almost all structural variants reliably and consistently on a population scale," says Bjarni V. Halldórsson, head of Sequence analysis, deCODE genetics. The problem with short-read sequencing is that larger structural variants are difficult to discern directly. This is a major stumbling block in the attempt to fully understand the relationship between variation in the sequence of the human genome and human diversity. Due to their size, these large structural variants usually have greater impact than the smaller variants most commonly considered.
