Quantification of intrinsic quality of a principal dimension in correspondence analysis and taxicab correspondence analysis

08/24/2021
by Vartan Choulakian, et al.

Collins (2002, 2011) raised a number of issues with regard to correspondence analysis (CA), such as: the qualitative information in a CA map versus the quantitative information in the underlying contingency table; and the difficulty of interpreting a CA map and of relating it to the % of inertia (variance) explained. We tackle these issues by considering CA and taxicab CA (TCA) as a stepwise Hotelling/Tucker decomposition of the cross-covariance matrix of the row and column categories into four quadrants. The contents of this essay are as follows. First, we review the notion of quality/quantity in multidimensional data analysis as discussed by Benzécri, who based his reflections on Aristotle. Second, we show the importance of unravelling the interrelated concepts of dependence/heterogeneity structure in a contingency table; two maps are needed to picture them. Third, we distinguish between the intrinsic and extrinsic quality of a principal dimension; the intrinsic quality is based on the signs of the residuals in the four quadrants, and hence is tied to interpretability. Furthermore, we provide quantifications of the intrinsic quality and use them to uncover structure, in particular in sparse contingency tables. Finally, we emphasize the importance of looking at the residual cross-covariance values at each iteration.
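To make the four-quadrant idea concrete, here is a minimal sketch for ordinary CA only; it is not the authors' procedure and does not compute their intrinsic-quality indices, which the abstract does not define. It assumes a small made-up table with nonzero margins, forms the residual cross-covariance matrix (observed proportions minus the product of the margins), takes the first principal dimension from the SVD of the standardized residuals, and splits the residuals into four blocks by the signs of the row and column scores. The function name `first_dimension_quadrants` and the example counts are hypothetical.

```python
import numpy as np

def first_dimension_quadrants(table):
    """Split CA residuals into four quadrants by the signs of the first-axis scores.

    Illustrative sketch only: assumes every row and column total is nonzero.
    """
    N = np.asarray(table, dtype=float)
    P = N / N.sum()                      # correspondence matrix (proportions)
    r = P.sum(axis=1)                    # row margins
    c = P.sum(axis=0)                    # column margins
    R = P - np.outer(r, c)               # residual cross-covariance matrix

    # Ordinary CA: SVD of the standardized residuals.
    S = R / np.sqrt(np.outer(r, c))
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)

    # Only the signs of the first-axis row and column scores are needed here.
    row_pos = U[:, 0] >= 0
    col_pos = Vt[0, :] >= 0

    # Sum of the residuals inside each sign-defined block.
    quadrants = {
        "++": R[np.ix_(row_pos, col_pos)].sum(),
        "+-": R[np.ix_(row_pos, ~col_pos)].sum(),
        "-+": R[np.ix_(~row_pos, col_pos)].sum(),
        "--": R[np.ix_(~row_pos, ~col_pos)].sum(),
    }

    # Share of total inertia carried by the first dimension.
    inertia_share = sv[0] ** 2 / (sv ** 2).sum()
    return R, quadrants, inertia_share

# Hypothetical 3x3 contingency table, used purely for illustration.
table = [[20, 5, 2],
         [6, 15, 4],
         [1, 7, 18]]
R, quadrants, share = first_dimension_quadrants(table)
print(quadrants)
print(share)
```

Because every row and every column of the residual matrix sums to zero, the four quadrant totals always have the same absolute value, with the two diagonal blocks sharing one sign and the two off-diagonal blocks the opposite sign. The % of inertia is a separate, size-based summary of the same dimension.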


