Low‐, high‐coverage, and two‐stage DNA sequencing in the design of the genetic association study

ABSTRACT The next‐generation sequencing‐based genetic association study (GAS) is a powerful tool for identifying candidate disease variants and genomic regions. Low‐coverage sequencing is inexpensive but inadequate for calling rare variants, whereas high‐coverage sequencing can detect essentially every variant but at a high cost. Two‐stage sequencing may be an economical way to conduct GAS without losing power. In two‐stage sequencing, an affordable number of samples are sequenced at high coverage to serve as the reference panel, which is then used for imputation in a larger sample sequenced at low coverage. As unit sequencing costs continue to decrease, investigators can now conduct GAS with more flexible sequencing depths. Here, we systematically evaluate the effect of read depth and sample size on variant discovery power and association power for study designs using low‐coverage, high‐coverage, and two‐stage sequencing. We consider 12 low‐coverage, 12 high‐coverage, and 51 two‐stage design scenarios, with the read depth varying from 0.5× to 80×. With state‐of‐the‐art simulation and analysis packages and in‐house scripts, we simulate the complete study process from DNA sequencing to SNP (single nucleotide polymorphism) calling and association testing. Our results show that, with appropriate allocation of sequencing effort, two‐stage sequencing is an effective approach for conducting GAS. We provide practical guidelines for investigators to plan the optimum sequencing‐...
Source: Genetic Epidemiology - Category: Epidemiology - Tags: Research Article - Source Type: research