Adaptive gPCA: A method for structured dimensionality reduction
Abstract
When working with large biological data sets, exploratory analysis is an important first step for understanding the latent structure and for generating hypotheses to be tested in subsequent analyses. However, when the number of variables is large compared to the number of samples, standard methods such as principal components analysis give results that are unstable and difficult to interpret. To mitigate these problems, we have developed a method that allows the analyst to incorporate side information about the relationships between the variables in a way that encourages similar variables to have similar loadings on the principal axes. This leads to a low-dimensional representation of the samples that both describes the latent structure and has axes that are interpretable in terms of groups of closely related variables. The method is derived by placing a prior on the data that encodes the relationships between the variables and carrying the analysis through on the posterior distributions of the samples. We show that our method does well at reconstructing the true latent structure in simulated data, and we also demonstrate it on a data set investigating the effects of antibiotics on the composition of bacteria in the human gut.
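The idea of a prior over variables followed by an analysis of posterior quantities can be illustrated with a small sketch. This is not the paper's algorithm, only one simple instance of the general idea under stated assumptions: each centered sample is modeled as a Gaussian mean plus isotropic noise, the mean has a zero-mean Gaussian prior whose covariance is proportional to a variable-similarity matrix Q, and ordinary PCA is then run on the posterior means. The function name `posterior_mean_pca`, the similarity matrix `Q`, and the noise-to-prior variance ratio `r` are hypothetical names introduced here for illustration.

```python
import numpy as np

def posterior_mean_pca(X, Q, r, k=2):
    """Illustrative sketch: PCA on posterior means under a Gaussian prior.

    Assumed model (an assumption of this sketch, not taken from the paper):
        x_i = mu_i + noise,  mu_i ~ N(0, s1 * Q),  noise ~ N(0, s2 * I),
    with r = s2 / s1. The posterior mean of mu_i is Q (Q + r I)^{-1} x_i.
    """
    X = np.asarray(X, dtype=float)
    Q = np.asarray(Q, dtype=float)
    Xc = X - X.mean(axis=0)                      # center each variable
    p = Xc.shape[1]
    # Shrinkage operator S = (Q + r I)^{-1} Q, which equals Q (Q + r I)^{-1}
    # because a symmetric Q commutes with (Q + r I)^{-1}.
    S = np.linalg.solve(Q + r * np.eye(p), Q)
    M = Xc @ S                                   # posterior means, one row per sample
    U, d, Vt = np.linalg.svd(M, full_matrices=False)
    scores = U[:, :k] * d[:k]                    # sample coordinates
    axes = Vt[:k].T                              # loadings on the principal axes
    return scores, axes

# Usage with made-up data: two blocks of three closely related variables.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(10, 6))
block = np.full((3, 3), 0.9) + 0.1 * np.eye(3)   # within-block similarity
Q_demo = np.block([[block, np.zeros((3, 3))],
                   [np.zeros((3, 3)), block]])
scores_demo, axes_demo = posterior_mean_pca(X_demo, Q_demo, r=1.0, k=2)
```

In this sketch, a smaller `r` puts more weight on the data, while a larger `r` pulls the posterior means toward the directions favored by `Q`, so that variables marked as similar tend to receive similar loadings. When `Q` is the identity matrix, the axes coincide with those of ordinary PCA and the scores are only rescaled.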
Citation
Fukuyama, Julia A. "Adaptive gPCA: A method for structured dimensionality reduction." Annals of Applied Statistics, 2017-02-01.
Journal
Annals of Applied Statistics