Clustering microbiome data using mixtures of logistic normal multinomial models

DOI: 10.48550/ARXIV.2011.06682; arXiv: 2011.06682; MaRDI QID: Q101517

Author name not available

Publication date: 12 November 2020

Abstract: Discrete data such as counts of microbiome taxa resulting from next-generation sequencing are routinely encountered in bioinformatics. Taxa count data in microbiome studies are typically high-dimensional and over-dispersed, and they reveal only relative abundance, so they are treated as compositional. Analyzing compositional data presents many challenges because the data are restricted to a simplex. In a logistic normal multinomial model, the relative abundance is mapped from the simplex to a latent variable in real Euclidean space using the additive log-ratio transformation. While a logistic normal multinomial approach adds flexibility for modeling the data, it comes with a heavy computational cost, as parameter estimation typically relies on Bayesian techniques. In this paper, we develop a novel mixture of logistic normal multinomial models for clustering microbiome data. Additionally, we utilize an efficient framework for parameter estimation using variational Gaussian approximations (VGA). Adopting a variational Gaussian approximation for the posterior of the latent variable reduces the computational overhead substantially. The proposed method is illustrated on simulated and real datasets.
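The mapping described in the abstract can be illustrated with a minimal Python sketch of the additive log-ratio (ALR) transformation and of sampling counts from a single logistic normal multinomial component. It covers only the generative model, not the paper's mixture clustering or its variational Gaussian approximation. The function names, the choice of the last taxon as the ALR reference, and the example values of `mu`, `sigma`, and `depth` are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def alr(p):
    """Additive log-ratio: map a composition with K+1 parts to R^K.
    The last part is used as the reference (an assumption of this sketch)."""
    p = np.asarray(p, dtype=float)
    return np.log(p[..., :-1] / p[..., -1:])

def alr_inverse(y):
    """Map a latent vector in R^K back to a composition with K+1 parts."""
    y = np.asarray(y, dtype=float)
    # Append a zero for the reference part, then apply a stable softmax.
    z = np.concatenate([y, np.zeros(y.shape[:-1] + (1,))], axis=-1)
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def sample_lnm(n_samples, depth, mu, sigma, rng):
    """Latent Gaussian draw -> composition via inverse ALR -> multinomial counts."""
    y = rng.multivariate_normal(mu, sigma, size=n_samples)
    p = alr_inverse(y)
    counts = np.stack([rng.multinomial(depth, row) for row in p])
    return counts, p

# Hypothetical parameters for three taxa (two latent dimensions).
rng = np.random.default_rng(0)
mu = np.array([0.5, -0.3])
sigma = np.array([[0.4, 0.1], [0.1, 0.3]])
counts, p = sample_lnm(5, 1000, mu, sigma, rng)
```

Each row of `counts` sums to the sequencing depth, so only the relative abundances in `p` carry information, which is why such data are treated as compositional.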

