Detecting Epistatic Selection with Partially Observed Genotype Data Using Copula Graphical Models

10/02/2017
by   P. Behrouzi, et al.

Recombinant inbred lines (RILs) derived from divergent parental lines can display extensive segregation distortion and long-range linkage disequilibrium (LD) between distant loci. These genomic signatures are consistent with epistatic selection during inbreeding. Epistatic interactions affect growth and fertility traits and can even cause complete lethality. Detecting epistasis is challenging because multiple-testing approaches are underpowered and true long-range LD is difficult to distinguish from drift. Here we develop a method for reconstructing an underlying network of genomic signatures of high-dimensional epistatic selection from multi-locus genotype data. The network captures the conditionally dependent short- and long-range LD structure and thus reveals "aberrant" marker-marker associations that are due to epistatic selection rather than gametic linkage. The network estimation relies on penalized Gaussian copula graphical models, which can handle a large number of markers p and a small number of individuals n. A multi-core implementation of our algorithm makes it feasible to estimate the graph in high dimensions, even when a substantial portion of the data is missing. We demonstrate the efficiency of the proposed method on simulated datasets as well as on genotyping data from A. thaliana and maize. The method is implemented in the R package epistasis, which is freely available at https://CRAN.R-project.org/package=epistasis.
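
To make the idea of "conditional dependence between markers" concrete, the following is a minimal Python sketch, not the authors' algorithm and not the API of the epistasis package. It assumes genotypes are coded 0/1/2 with missing calls given as None, maps each marker to Gaussian scores through its mid-ranks (a simple rank-based stand-in for a Gaussian copula), and then reads partial correlations off the inverse of a ridge-shrunk correlation matrix. The ridge shrinkage replaces the penalized estimator described above, and setting missing scores to 0 is a crude placeholder for proper handling of missing data. All function names and the `ridge` parameter are hypothetical.

```python
from statistics import NormalDist


def normal_scores(column):
    """Map one marker's genotype codes to Gaussian scores via mid-ranks.

    Missing calls (None) are set to 0.0, a crude stand-in (assumption).
    """
    observed = [v for v in column if v is not None]
    n = len(observed)
    scores = []
    for v in column:
        if v is None:
            scores.append(0.0)
            continue
        less = sum(1 for w in observed if w < v)
        equal = sum(1 for w in observed if w == v)
        u = (less + (equal + 1) / 2.0) / (n + 1)  # mid-rank scaled into (0, 1)
        scores.append(NormalDist().inv_cdf(u))
    return scores


def correlation(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting."""
    p = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(p)]
           for i, row in enumerate(matrix)]
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        scale = aug[c][c]
        aug[c] = [v / scale for v in aug[c]]
        for r in range(p):
            if r != c:
                f = aug[r][c]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[c])]
    return [row[p:] for row in aug]


def partial_correlations(genotypes, ridge=0.1):
    """genotypes: list of individuals, each a list of marker codes or None.

    Returns a p x p matrix of partial correlations on the latent scale.
    """
    p = len(genotypes[0])
    z = [normal_scores([row[j] for row in genotypes]) for j in range(p)]
    corr = [[correlation(z[i], z[j]) for j in range(p)] for i in range(p)]
    # Ridge shrinkage keeps the matrix invertible when p is large relative to n.
    shrunk = [[corr[i][j] + (ridge if i == j else 0.0) for j in range(p)]
              for i in range(p)]
    prec = invert(shrunk)
    return [[1.0 if i == j else -prec[i][j] / (prec[i][i] * prec[j][j]) ** 0.5
             for j in range(p)] for i in range(p)]
```

In this sketch, a partial correlation that stays clearly away from zero between two markers on different chromosomes is the kind of long-range conditional dependence the abstract attributes to epistatic selection. A ridge-shrunk inverse never yields exact zeros, so a sparse graph would need a sparsity-inducing penalty in place of this shortcut.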

