Mixture models for spherical data with applications to protein bioinformatics

04/27/2021
by Kanti V. Mardia, et al.

Finite mixture models are fitted to spherical data. Kent distributions are used for the components of the mixture because they allow considerable flexibility. Previous work on such mixtures has used an approximate maximum likelihood estimator for the parameters of a single component. However, the approximation causes problems when the EM algorithm is used to estimate the parameters of a mixture model. Hence the exact maximum likelihood estimator is used here for the individual components. This paper is motivated by a challenging prize problem in structural bioinformatics: how proteins fold. It is known that hydrogen bonds play a key role in the folding of a protein. We explore this hydrogen bond geometry using a data set describing bonds between two amino acids in proteins. An appropriate coordinate system to represent the hydrogen bond geometry is proposed, with each bond represented as a point on a sphere. We fit mixtures of Kent distributions to different subsets of the hydrogen bond data to gain insight into how the secondary structure elements bond together, since the distribution of hydrogen bonds depends on which secondary structure elements are involved.
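
As a rough illustration of the component family named in the abstract, the sketch below evaluates the unnormalized log-density of a Kent (Fisher-Bingham FB5) distribution at a point on the unit sphere, using the standard form kappa * (gamma1 . x) + beta * ((gamma2 . x)^2 - (gamma3 . x)^2). The function names (`sph_to_unit`, `kent_log_kernel`) and the colatitude/longitude convention are assumptions made for this example; they are not taken from the paper, and the paper's own coordinate system for hydrogen bonds is not reproduced here. The normalizing constant, which a full EM fit of a mixture would require, is deliberately omitted.

```python
import math


def sph_to_unit(colatitude, longitude):
    """Map spherical angles in radians to a unit vector (x, y, z).

    Assumed convention: colatitude is measured from the +z axis.
    """
    s = math.sin(colatitude)
    return (s * math.cos(longitude), s * math.sin(longitude), math.cos(colatitude))


def dot(a, b):
    """Euclidean inner product of two 3-vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def kent_log_kernel(x, gamma1, gamma2, gamma3, kappa, beta):
    """Unnormalized Kent (FB5) log-density at unit vector x.

    gamma1 is the mean direction; gamma2 and gamma3 are the major and
    minor axes. The three are assumed orthonormal (not checked here).
    kappa > 0 is the concentration and beta the ovalness, with the
    usual constraint 0 <= 2 * beta < kappa.
    The log normalizing constant is NOT subtracted.
    """
    if not (kappa > 0 and 0 <= 2 * beta < kappa):
        raise ValueError("require kappa > 0 and 0 <= 2*beta < kappa")
    return kappa * dot(gamma1, x) + beta * (dot(gamma2, x) ** 2 - dot(gamma3, x) ** 2)


# Hypothetical component aligned with the coordinate axes.
g1, g2, g3 = (0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
pole = sph_to_unit(0.0, 0.0)  # the mean direction itself
value_at_mode = kent_log_kernel(pole, g1, g2, g3, kappa=10.0, beta=2.0)
```

With `beta = 0` the kernel reduces to that of a von Mises-Fisher distribution, which is one way to see the extra flexibility the Kent family offers: `beta` lets the contours on the sphere be elliptical, not just circular.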


research
08/19/2019

Quantum Expectation-Maximization for Gaussian Mixture Models

The Expectation-Maximization (EM) algorithm is a fundamental tool in uns...
research
09/22/2019

Probabilistic Fitting of Topological Structure to Data

We define a class of probability distributions that we call simplicial m...
research
05/11/2022

Existence and Consistency of the Maximum Pseudo β-Likelihood Estimators for Multivariate Normal Mixture Models

Robust estimation under multivariate normal (MVN) mixture model is alway...
research
07/05/2016

Mixtures of Bivariate von Mises Distributions with Applications to Modelling of Protein Dihedral Angles

The modelling of empirically observed data is commonly done using mixtur...
research
03/23/2017

Training Mixture Models at Scale via Coresets

How can we train a statistical mixture model on a massive data set? In t...
research
10/20/2020

Estimating a mixing distribution on the sphere using predictive recursion

Mixture models are commonly used when data show signs of heterogeneity a...
research
06/26/2015

Modelling of directional data using Kent distributions

The modelling of data on a spherical surface requires the consideration ...
