A Dirichlet Process Mixture Model for Directional-Linear Data
Directional data require specialized probability models because their domain is non-Euclidean and periodic. When a directional variable is observed jointly with linear variables, modeling their dependence adds a further layer of complexity. This paper introduces a novel Bayesian nonparametric approach for directional-linear data based on the Dirichlet process. We first extend the projected normal distribution to model the joint distribution of linear variables and a directional variable of arbitrary dimension as a projection of a higher-dimensional augmented multivariate normal distribution (MVN). We call the new distribution the semi-projected normal distribution (SPN); it possesses properties similar to those of the MVN. The SPN is then used as the mixture distribution in a Dirichlet process model to obtain a more flexible class of models for directional-linear data. We propose a normal conditional inverse-Wishart distribution as part of the prior distribution; it addresses an identifiability issue inherited from the projected normal and preserves conjugacy with the SPN distribution. A Gibbs sampling algorithm is provided for posterior inference. Experiments on synthetic data and the Berkeley image database show that the Dirichlet process SPN mixture model (DPSPN) outperforms other directional-linear models in clustering. We also build a hierarchical Dirichlet process model with the SPN to develop a likelihood ratio approach to bloodstain pattern analysis, in which the DPSPN model serves as a density estimator for the likelihood of a given pattern, learned from a set of training data.
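To make the projection idea concrete, the following is a minimal sketch of how a draw from an SPN-like distribution could be simulated. It assumes, by analogy with the projected normal, that the directional component is the first block of an augmented MVN draw normalized to unit length, while the remaining coordinates are kept as the linear component. The function name `sample_spn`, its parameters, and the example mean and covariance are hypothetical and are not taken from the paper.

```python
import numpy as np

def sample_spn(mean, cov, n_dir, size, rng=None):
    """Simulate from an assumed semi-projected normal.

    The first n_dir coordinates of an MVN draw are projected onto the
    unit sphere (directional part); the rest are kept unchanged (linear part).
    """
    rng = np.random.default_rng(rng)
    z = rng.multivariate_normal(mean, cov, size=size)  # augmented MVN draws
    x, y = z[:, :n_dir], z[:, n_dir:]                  # split latent blocks
    u = x / np.linalg.norm(x, axis=1, keepdims=True)   # project onto the sphere
    return u, y

# Illustrative parameters: one circular variable (n_dir=2) and one linear variable.
mean = np.array([1.0, 0.5, 2.0])
cov = np.array([[1.0, 0.2, 0.3],
                [0.2, 1.0, 0.1],
                [0.3, 0.1, 0.5]])

u, y = sample_spn(mean, cov, n_dir=2, size=1000, rng=0)
theta = np.arctan2(u[:, 1], u[:, 0])  # angle representation of the direction
```

Because `u` is unchanged when the latent block `x` is multiplied by any positive constant, the scale of that block cannot be recovered from `(u, y)` alone. This is the usual form of the identifiability issue for the projected normal, and it is presumably the issue the proposed prior is designed to address.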