IlocA: An algorithm to Cluster Cells and form Imputation Groups from a pair of Classification Variables

02/23/2023
by Geraard Keogh, et al.

We set out a novel bottom-up procedure to aggregate, or cluster, cells with small frequency counts in a two-way classification while maintaining dependence in the table. The procedure is model free. It combines cells in a table into clusters based on independent log odds ratios. We use this procedure to build a set of statistically efficient and robust imputation cells for imputing missing values of a continuous variable using a pair of classification variables. A useful feature of the procedure is that it forms aggregation groups that are homogeneous with respect to the cell response mean. Using a series of simulation studies, we show that IlocA only groups together independent cells and does so in a consistent and credible way. When imputing missing data, we show that IlocA generates close to an optimal number of imputation cells. For ignorable non-response, the resulting imputed means are accurate in general. With non-ignorable missingness, results are consistent with those obtained elsewhere. We close with a case study applying our method to imputing missing building energy performance data.
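The abstract does not spell out how IlocA decides which cells to merge, so the following is only a minimal sketch of the underlying quantity it names: the log odds ratio of a 2x2 sub-table of counts and a standard Wald check of whether it is compatible with independence. The function names, the 0.5 continuity correction, and the 1.96 threshold are assumptions made for illustration and are not taken from the paper.

```python
import math


def log_odds_ratio(n11, n12, n21, n22, correction=0.5):
    """Log odds ratio and Wald standard error for a 2x2 table of counts.

    The continuity correction (assumed here, not from the paper) is added
    to every cell so that zero counts do not make the ratio undefined.
    """
    a, b, c, d = (x + correction for x in (n11, n12, n21, n22))
    lor = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return lor, se


def looks_independent(n11, n12, n21, n22, z_crit=1.96):
    """True if the log odds ratio lies within z_crit standard errors of 0.

    This is a generic Wald test, used as a stand-in for whatever
    criterion IlocA actually applies.
    """
    lor, se = log_odds_ratio(n11, n12, n21, n22)
    return abs(lor) <= z_crit * se


if __name__ == "__main__":
    # Hypothetical counts: a balanced table and a strongly associated one.
    print(looks_independent(10, 10, 10, 10))
    print(looks_independent(30, 5, 5, 30))
```

Under a rule of this kind, cells whose 2x2 sub-tables show no evidence of association would be candidates for merging into one imputation group, while cells that carry the dependence in the table would be kept apart.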
