Article

A Note on the Relations Between Spatio-Genetic Models

Journal of Computational Biology (2015)
DOI: 10.1089/cmb.2015.0080

Abstract

Modeling human genetic variation across continuous geographic space is a new research direction that has attracted growing interest in the community over the past few years. Several recent works have proposed different probabilistic models for the relation between geography and genetic sequence, and applied them to geographic localization, detection of selection, and correction of confounding in Genome-Wide Association Studies (GWAS). Prior to these developments, continuous representations of genetic structure were produced almost exclusively using dimensionality reduction techniques, mostly principal component analysis (PCA). Although fast and effective in some tasks, PCA suffers from multiple disadvantages, primarily stemming from the lack of an explicit underlying genetic model.

We begin this note by explaining the implicit spatio-genetic model that underlies PCA. Our presentation provides insights into some of the recently proposed spatial models; in particular, we show that two of these models can be formulated as modifications of PCA, each removing one of PCA's limitations in the context of genetic analysis. We build on one of the models to derive an unsupervised procedure for the inference of spatial structure, and empirically demonstrate that it outperforms PCA in spatial inference. We then review a few additional recent works from this unifying perspective.
