Incorporating Prior Knowledge to Increase the Power of Genome-Wide Association Studies
Typical methods of analyzing genome-wide single nucleotide variant (SNV) data in cases and controls involve testing each variant’s genotypes separately for phenotype association, and then using a substantial multiple-testing penalty to minimize the rate of false positives. This approach, however, can result in low power for modestly associated SNVs. Furthermore, simply looking at the most associated SNVs may not directly yield biological insights about disease etiology. SNVset methods attempt to address both limitations of the traditional approach by testing biologically meaningful sets of SNVs (e.g., genes or pathwa...
Source: Springer protocols feed by Bioinformatics - January 1, 2013 Category: Bioinformatics Source Type: news

Higher Order Interactions: Detection of Epistasis Using Machine Learning and Evolutionary Computation
Higher order interactions are known to affect many different phenotypic traits. The advent of large-scale genotyping has, however, shown that finding interactions is not a trivial task. Classical genome-wide association studies (GWAS) are a useful starting point for unraveling the genetic architecture of a phenotypic trait. However, to move beyond the additive model we need new analysis tools specifically developed to deal with high-dimensional genotypic data. Here we show that evolutionary algorithms are a useful tool in high-dimensional analyses designed to identify gene–gene interactions in current large-scale gen...
Source: Springer protocols feed by Bioinformatics - January 1, 2013 Category: Bioinformatics Source Type: news

Applications of Multifactor Dimensionality Reduction to Genome-Wide Data Using the R Package ‘MDR’
This chapter describes how to use the R package ‘MDR’ to search and identify gene–gene interactions in high-dimensional data and illustrates applications for exploratory analysis of multi-locus models by providing specific examples.
Source: Springer protocols feed by Bioinformatics - January 1, 2013 Category: Bioinformatics Source Type: news

Epistasis, Complexity, and Multifactor Dimensionality Reduction
Genome-wide association studies (GWASs) and other high-throughput initiatives have led to an information explosion in human genetics and genetic epidemiology. Conversion of this wealth of new information about genomic variation to knowledge about public health and human biology will depend critically on the complexity of the genotype to phenotype mapping relationship. We review here computational approaches to genetic analysis that embrace, rather than ignore, the complexity of human health. We focus on multifactor dimensionality reduction (MDR) as an approach for modeling one of these complexities: epistasis or gene–...
Source: Springer protocols feed by Bioinformatics - January 1, 2013 Category: Bioinformatics Source Type: news
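
The core pooling step of MDR can be sketched in a few lines of Python. This is a minimal illustration and not the implementation of any MDR package: the function name `mdr_high_risk_cells`, the 0/1 genotype coding, and the toy data are all assumptions made for the example. Each two-locus genotype combination is labeled high-risk when its case:control ratio exceeds the ratio in the whole sample, which collapses the multi-locus table into a single high/low-risk attribute.

```python
from collections import Counter

def mdr_high_risk_cells(geno_a, geno_b, is_case):
    """Return the two-locus genotype cells whose case:control ratio
    exceeds the overall case:control ratio (hypothetical helper)."""
    cases, controls = Counter(), Counter()
    for a, b, c in zip(geno_a, geno_b, is_case):
        (cases if c else controls)[(a, b)] += 1
    n_cases = sum(cases.values())
    n_controls = sum(controls.values())
    high_risk = set()
    for cell in set(cases) | set(controls):
        # Cross-multiplied comparison avoids dividing by zero controls.
        if cases[cell] * n_controls > controls[cell] * n_cases:
            high_risk.add(cell)
    return high_risk

# Toy data with a purely interactive (XOR-like) pattern: neither locus
# is informative alone, but the combination separates cases from controls.
geno_a = [0, 0, 1, 1, 0, 1, 0, 1]
geno_b = [0, 1, 0, 1, 0, 1, 1, 0]
is_case = [0, 1, 1, 0, 0, 0, 1, 1]
high_risk = mdr_high_risk_cells(geno_a, geno_b, is_case)
```

A full MDR analysis would repeat this over many locus combinations and assess the resulting classifiers by cross-validation; that part is omitted here.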

Mixed Effects Structural Equation Models and Phenotypic Causal Networks
Complex networks with causal relationships among variables are pervasive in biology. Their study, however, requires special modeling approaches. Structural equation models (SEM) allow the representation of causal mechanisms among phenotypic traits and inferring the magnitude of causal relationships. This information is important not only in understanding how variables relate to each other in a biological system, but also to predict how this system reacts under external interventions which are common in fields related to health and food production. Nevertheless, fitting a SEM requires defining a priori the causal structure ...
Source: Springer protocols feed by Bioinformatics - January 1, 2013 Category: Bioinformatics Source Type: news

Association Weight Matrix: A Network-Based Approach Towards Functional Genome-Wide Association Studies
In this chapter we describe the Association Weight Matrix (AWM), a novel procedure to exploit the results from genome-wide association studies (GWAS) and, in combination with network inference algorithms, generate gene networks with regulatory and functional significance. In simple terms, the AWM is a matrix with rows represented by genes and columns represented by phenotypes. Individual {i, j}th elements in the AWM correspond to the association of the SNP in the ith gene to the jth phenotype. While our main objective is to provide a recipe-like tutorial on how to build and use AWM, we also take the opportunity to briefly ...
Source: Springer protocols feed by Bioinformatics - January 1, 2013 Category: Bioinformatics Source Type: news
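
The matrix layout described in this abstract (genes as rows, phenotypes as columns, each entry an association value) can be sketched as follows. This is only an illustration under stated assumptions: the helper `build_awm`, the gene and phenotype names, and the association values are hypothetical, and the chapter's actual rules for selecting SNPs and standardizing effects are not reproduced here.

```python
import numpy as np

def build_awm(records, genes, phenotypes):
    """Fill a genes x phenotypes matrix from (gene, phenotype, association)
    tuples. Entries with no record stay NaN. Hypothetical helper."""
    gene_idx = {g: i for i, g in enumerate(genes)}
    phen_idx = {p: j for j, p in enumerate(phenotypes)}
    awm = np.full((len(genes), len(phenotypes)), np.nan)
    for gene, phen, assoc in records:
        awm[gene_idx[gene], phen_idx[phen]] = assoc
    return awm

# Made-up names and values, for illustration only.
genes = ["geneA", "geneB"]
phenotypes = ["weight", "height", "fertility"]
records = [
    ("geneA", "weight", 2.1), ("geneA", "height", -0.4), ("geneA", "fertility", 0.9),
    ("geneB", "weight", 1.8), ("geneB", "height", -0.2), ("geneB", "fertility", 1.1),
]
awm = build_awm(records, genes, phenotypes)
```

Rows of such a matrix can then be compared across phenotypes by a network inference step, which is outside the scope of this sketch.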

R for Genome-Wide Association Studies
In recent years R has become the de facto statistical programming language of choice for statisticians, and it is also arguably the most widely used generic environment for analysis of high-throughput genomic data. In this chapter we discuss some approaches to improve performance of R when working with large SNP datasets.
Source: Springer protocols feed by Bioinformatics - January 1, 2013 Category: Bioinformatics Source Type: news

Detection of Signatures of Selection Using FST
Natural selection has molded the evolution of species across all taxa. Much more recently, on an evolutionary scale, human-oriented selection started to play an important role in shaping organisms, markedly so after the domestication of animals and plants. These selection processes have left traceable marks in the genome. Following from the recent advances in molecular genetics technologies, a number of methods have been developed to detect such signals, termed genomic signatures of selection. In this chapter we discuss a straightforward protocol based on the FST statistic to identify genomic regions that ex...
Source: Springer protocols feed by Bioinformatics - January 1, 2013 Category: Bioinformatics Source Type: news
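
As a rough illustration of an FST-based scan, the sketch below computes a per-SNP value from allele frequencies in two populations using the basic heterozygosity form, FST = (H_T - H_S) / H_T. The abstract is truncated and does not say which estimator the chapter uses, so the equal population weighting, the function name `fst_per_snp`, and the example frequencies are assumptions.

```python
import numpy as np

def fst_per_snp(p1, p2):
    """Per-SNP FST for two populations from allele frequencies,
    using (H_T - H_S) / H_T with equal population weights."""
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)
    p_bar = (p1 + p2) / 2.0
    h_t = 2.0 * p_bar * (1.0 - p_bar)                            # total expected heterozygosity
    h_s = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0  # mean within-population
    with np.errstate(divide="ignore", invalid="ignore"):
        # SNPs that are monomorphic overall (H_T == 0) are assigned 0.
        return np.where(h_t > 0, (h_t - h_s) / h_t, 0.0)

# Illustrative frequencies: no differentiation, moderate, fixed difference, monomorphic.
fst = fst_per_snp([0.5, 0.2, 1.0, 0.0], [0.5, 0.8, 0.0, 0.0])
```

In a genome scan, values like these would typically be smoothed or averaged over windows before looking for regions of unusually high differentiation.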

Validation of Genome-Wide Association Studies (GWAS) Results
Validation of the results of genome-wide association studies or genomic selection studies is an essential component of the experimental program. Validation allows users to quantify the benefit of applying gene tests or genomic prediction, relative to the costs of implementing the program. Further, if implemented, an appropriate weight in a selection index can only be derived if estimates of the accuracy of genomic predictions are available. In this chapter the reasons for validation are explored, and a range of commonly encountered scenarios described. General principles are stated, and options for performing validation di...
Source: Springer protocols feed by Bioinformatics - January 1, 2013 Category: Bioinformatics Source Type: news

Genotype Imputation to Increase Sample Size in Pedigreed Populations
Genotype imputation is a cost-effective way to increase the power of genomic selection or genome-wide association studies. While several genotype imputation algorithms are available, this chapter focuses on a heuristic algorithm, as implemented in the AlphaImpute software. This algorithm combines long-range phasing, haplotype library imputation, and segregation analysis and it is specifically designed to work with pedigreed populations.
Source: Springer protocols feed by Bioinformatics - January 1, 2013 Category: Bioinformatics Source Type: news

Genotype Phasing in Populations of Closely Related Individuals
Knowledge of phase has many potential applications for empowering genomic information. For example, phase can facilitate the identification of identity-by-descent sharing between pairs of individuals as part of the process of genotype imputation, or facilitate parent-of-origin allele modeling in order to quantify the effect of parental imprinting.
Source: Springer protocols feed by Bioinformatics - January 1, 2013 Category: Bioinformatics Source Type: news

Use of Ancestral Haplotypes in Genome-Wide Association Studies
We herein present a haplotype-based method to perform genome-wide association studies. The method relies on hidden Markov models to describe haplotypes from a population as a mosaic of a set of ancestral haplotypes. For a given position in the genome, haplotypes deriving from the same ancestral haplotype are also likely to carry the same risk alleles. Therefore, the model can be used in several applications such as haplotype reconstruction, imputation, association studies or genomic predictions. We then illustrate the model with two applications: the fine-mapping of a QTL affecting live weight in cattle and association stu...
Source: Springer protocols feed by Bioinformatics - January 1, 2013 Category: Bioinformatics Source Type: news

Detecting Regions of Homozygosity to Map the Cause of Recessively Inherited Disease
Homozygosity is a component of genetic patterning that can be used to search for the cause of genetic disease. In this chapter, methods are presented to analyze SNP data for the presence of homozygosity. Two exercises demonstrate methods to define runs of homozygosity, to identify shared homozygosity between individuals, and to evaluate the results in light of the expectations of a recessively inherited genetic disorder. An example dataset is used to aid in data interpretation.
Source: Springer protocols feed by Bioinformatics - January 1, 2013 Category: Bioinformatics Source Type: news
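
Defining a run of homozygosity can be sketched as a scan for consecutive homozygous genotypes. The version below is deliberately minimal and is not the chapter's procedure: it assumes genotypes coded 0/1/2 (1 = heterozygous) in map order, measures run length in SNPs rather than physical distance, and tolerates no heterozygous or missing calls. The function name and the `min_snps` threshold are hypothetical.

```python
def runs_of_homozygosity(genotypes, min_snps=3):
    """Return (start, end) index pairs, inclusive, of stretches with at
    least `min_snps` consecutive homozygous genotypes (coded 0 or 2)."""
    runs = []
    start = None
    for i, g in enumerate(genotypes):
        if g in (0, 2):
            if start is None:
                start = i          # a new homozygous stretch begins
        else:
            if start is not None and i - start >= min_snps:
                runs.append((start, i - 1))
            start = None           # any other call ends the stretch
    # Close a stretch that extends to the last SNP.
    if start is not None and len(genotypes) - start >= min_snps:
        runs.append((start, len(genotypes) - 1))
    return runs

# One individual's genotypes along a chromosome (made-up data).
runs = runs_of_homozygosity([0, 2, 2, 1, 0, 0, 0, 0, 1, 2], min_snps=3)
```

Shared homozygosity between affected individuals could then be examined by intersecting the intervals returned for each of them.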

Genomic Best Linear Unbiased Prediction (gBLUP) for the Estimation of Genomic Breeding Values
Genomic best linear unbiased prediction (gBLUP) is a method that utilizes genomic relationships to estimate the genetic merit of an individual. For this purpose, a genomic relationship matrix is used, estimated from DNA marker information. The matrix defines the covariance between individuals based on observed similarity at the genomic level, rather than on expected similarity based on pedigree, so that more accurate predictions of merit can be made. gBLUP has been used for the prediction of merit in livestock breeding, may also have some applications to the prediction of disease risk, and is also useful in the estimation ...
Source: Springer protocols feed by Bioinformatics - January 1, 2013 Category: Bioinformatics Source Type: news
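
One common way to estimate a genomic relationship matrix from marker data is to center the genotype matrix by twice the allele frequencies and scale the cross-product, G = ZZ' / (2 Σ p(1 - p)). The abstract does not say which estimator the chapter uses, so this choice, the function name, and the toy genotypes are assumptions; the sketch also assumes 0/1/2 genotype coding with no missing values and at least one polymorphic marker.

```python
import numpy as np

def genomic_relationship_matrix(genotypes):
    """Genomic relationship matrix from an individuals x markers array
    of 0/1/2 allele counts, using G = Z Z' / (2 * sum(p * (1 - p)))."""
    m = np.asarray(genotypes, dtype=float)
    p = m.mean(axis=0) / 2.0           # allele frequency per marker
    z = m - 2.0 * p                    # center each marker column
    denom = 2.0 * np.sum(p * (1.0 - p))
    return z @ z.T / denom

# Three individuals, three markers (made-up genotypes).
G = genomic_relationship_matrix([[0, 1, 2],
                                 [1, 1, 0],
                                 [2, 1, 1]])
```

In gBLUP this matrix takes the place of the pedigree-based relationship matrix as the covariance structure among individuals in the mixed model.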

Genome-Enabled Prediction Using the BLR (Bayesian Linear Regression) R-Package
The BLR (Bayesian linear regression) package of R implements several Bayesian regression models for continuous traits. The package was originally developed for implementing the Bayesian LASSO (BL) of Park and Casella (J Am Stat Assoc 103(482):681–686, 2008), extended to accommodate fixed effects and regressions on pedigree using methods described by de los Campos et al. (Genetics 182(1):375–385, 2009). In 2010 we further developed the code into an R-package, reprogrammed some internal aspects of the algorithm in the C language to increase computational speed, and further documented the package (Plant Genome J 3...
Source: Springer protocols feed by Bioinformatics - January 1, 2013 Category: Bioinformatics Source Type: news