Probabilistic Canonical Correlation Analysis: A Whitening Approach
Canonical correlation analysis (CCA) is a classic and widely used statistical algorithm in multivariate data analysis. Despite the importance and pervasiveness of CCA, a full understanding of the method has emerged only recently, as the probabilistic view of CCA has become prevalent; this view aids interpretation and enables application to large-scale data. Here, we present a new perspective on CCA by investigating a two-layer latent variable probabilistic generative model of CCA rooted in the framework of whitening and multivariate regression. The advantages of this variant of probabilistic CCA include non-ambiguity of the latent variables, flexibility to allow non-normal generative variables, the possibility of negative canonical correlations, and simplicity of interpretation on all levels of the model. Furthermore, we show that this approach is amenable to computationally efficient estimation in high-dimensional settings using regularized inference.
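To make the link between CCA and whitening concrete, the following is a minimal sketch of classical (non-probabilistic) CCA computed by whitening each data block and taking a singular value decomposition of the whitened cross-covariance. It is not the two-layer generative model described above, and the function names `inv_sqrt` and `cca_whitening` are hypothetical, chosen for illustration only. The sketch assumes more samples than variables, so that both sample covariance matrices are positive definite; a high-dimensional setting would need a regularized covariance estimate instead. Note also that singular values are non-negative by construction, so this sketch does not exhibit the negative canonical correlations that the proposed variant permits.

```python
import numpy as np


def inv_sqrt(S):
    """Symmetric inverse square root of a positive definite matrix S."""
    w, V = np.linalg.eigh(S)
    # V diag(w^{-1/2}) V^T; dividing V by sqrt(w) scales its columns.
    return (V / np.sqrt(w)) @ V.T


def cca_whitening(X, Y):
    """Classical CCA via whitening (illustrative sketch).

    X: (n, p) array, Y: (n, q) array, rows are paired observations.
    Returns the canonical correlations s and the direction matrices A, B,
    so that the columns of (X - mean) @ A and (Y - mean) @ B are the
    canonical variates.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)

    # Sample covariance and cross-covariance matrices.
    Sxx = Xc.T @ Xc / (n - 1)
    Syy = Yc.T @ Yc / (n - 1)
    Sxy = Xc.T @ Yc / (n - 1)

    # Whitening matrices: Wx @ Sxx @ Wx = I and Wy @ Syy @ Wy = I.
    Wx = inv_sqrt(Sxx)
    Wy = inv_sqrt(Syy)

    # Cross-covariance of the whitened variables; its singular values
    # are the canonical correlations.
    K = Wx @ Sxy @ Wy
    U, s, Vt = np.linalg.svd(K, full_matrices=False)

    A = Wx @ U      # canonical directions for X
    B = Wy @ Vt.T   # canonical directions for Y
    return s, A, B
```

With this construction the canonical variates within each block have identity sample covariance, and the cross-covariance between the two sets of variates is diagonal with the canonical correlations on the diagonal.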