Covariance matrix preparation for quantum principal component analysis

04/07/2022
by Max Hunter Gordon, et al.

Principal component analysis (PCA) is a dimensionality reduction method in data analysis that involves diagonalizing the covariance matrix of the dataset. Recently, quantum algorithms have been formulated for PCA based on diagonalizing a density matrix. These algorithms assume that the covariance matrix can be encoded in a density matrix, but a concrete protocol for this encoding has been lacking. Our work aims to address this gap. Assuming amplitude encoding of the data, with the data given by the ensemble {p_i, |ψ_i⟩}, one can easily prepare the ensemble average density matrix ρ = ∑_i p_i |ψ_i⟩⟨ψ_i|. We first show that ρ is precisely the covariance matrix whenever the dataset is centered. For quantum datasets, we exploit global phase symmetry to argue that there always exists a centered dataset consistent with ρ, and hence ρ can always be interpreted as a covariance matrix. This provides a simple means for preparing the covariance matrix for arbitrary quantum datasets or centered classical datasets. For uncentered classical datasets, our method amounts to so-called "PCA without centering", which we interpret as PCA on a symmetrized dataset. We argue that this closely corresponds to standard PCA, and we derive equations and inequalities that bound the deviation of the spectrum obtained with our method from that of standard PCA. We numerically illustrate our method on the MNIST handwritten digit dataset. We also argue that PCA on quantum datasets is natural and meaningful, and we numerically implement our method for molecular ground-state datasets.
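
The relation between ρ and the covariance matrix can be checked numerically on a small classical example. The following is a minimal NumPy sketch, not the authors' implementation. It assumes that "symmetrized dataset" means the original points together with their negations, each at half weight; the function names and the random toy data are hypothetical and chosen only for illustration.

```python
import numpy as np

def ensemble_density_matrix(states, probs):
    """Return rho = sum_i p_i |psi_i><psi_i|, with one state per row of `states`."""
    return np.einsum("i,ij,ik->jk", probs, states, states.conj())

def covariance(states, probs):
    """Return the weighted covariance sum_i p_i (x_i - mu)(x_i - mu)^dagger."""
    mean = probs @ states
    centered = states - mean
    return np.einsum("i,ij,ik->jk", probs, centered, centered.conj())

# Toy uncentered dataset (illustrative only): 6 unit-norm vectors in 4 dimensions,
# normalized as amplitude encoding requires.
rng = np.random.default_rng(0)
raw = rng.normal(size=(6, 4))
states = raw / np.linalg.norm(raw, axis=1, keepdims=True)
probs = np.full(6, 1 / 6)

rho = ensemble_density_matrix(states, probs)

# Assumed symmetrization: add -x_i for every x_i and halve the weights,
# which makes the mean vanish.
sym_states = np.vstack([states, -states])
sym_probs = np.concatenate([probs, probs]) / 2
cov_sym = covariance(sym_states, sym_probs)

# Covariance of the original (uncentered) data, as used by standard PCA.
cov_std = covariance(states, probs)
mean = probs @ states

print("rho equals covariance of symmetrized data:", np.allclose(rho, cov_sym))
print("rho - cov_std equals |mu><mu|:", np.allclose(rho - cov_std, np.outer(mean, mean.conj())))
```

In this sketch ρ differs from the standard covariance matrix only by the rank-one term |μ⟩⟨μ| built from the mean vector μ, which is why the spectrum of ρ can be expected to stay close to that of standard PCA when the mean is small.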


research
02/16/2016

A Sparse PCA Approach to Clustering

We discuss a clustering method for Gaussian mixture model based on the s...
research
05/31/2022

coVariance Neural Networks

Graph neural networks (GNN) are an effective framework that exploit inte...
research
07/10/2014

An eigenanalysis of data centering in machine learning

Many pattern recognition methods rely on statistical information from ce...
research
05/27/2023

Improved Privacy-Preserving PCA Using Space-optimized Homomorphic Matrix Multiplication

Principal Component Analysis (PCA) is a pivotal technique in the fields ...
research
02/11/2018

Multi-set Canonical Correlation Analysis simply explained

There are a multitude of methods to perform multi-set correlated compone...
research
04/06/2010

Extended Two-Dimensional PCA for Efficient Face Representation and Recognition

In this paper a novel method called Extended Two-Dimensional PCA (E2DPCA...
research
02/17/2017

Maximally Correlated Principal Component Analysis

In the era of big data, reducing data dimensionality is critical in many...
