Hot Deck Imputation for Mixed Typed Datasets using Model Based Clustering

Document Type

Presentation

Presentation Date

3-13-2017

Abstract or Description

Multiple imputation is a commonly used method for addressing missing values. Hot deck imputation differs from other approaches in that it replaces unobserved values with observed values from similar units, or cells, which helps keep the estimated variance of the regression coefficients close to the true variance. These cells are determined by the closeness of observations under various distance measures. However, most distance measures apply only to continuous variables, which poses a distinct problem when the dataset contains categorical covariates. We propose a model-based clustering procedure that uses a parsimonious covariance structure for a latent variable following a mixture of Gaussian distributions to generate the imputation cells for mixed-type datasets (i.e., datasets with both continuous and categorical variables). Results on simulated data showed lower variance in the estimated regression coefficients compared with complete-case analysis.
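The donor-replacement step described in the abstract can be illustrated with a minimal sketch. It assumes the imputation cells have already been produced by some clustering procedure and are supplied as labels; the model-based clustering itself is not implemented here. The function name `hot_deck_impute`, the random donor selection, and the toy data are hypothetical choices made for illustration and are not taken from the presentation.

```python
import random


def hot_deck_impute(values, cells, seed=0):
    """Fill None entries with a randomly drawn observed value from the same cell.

    `cells` holds one imputation-cell label per observation. The labels are
    assumed to come from a prior clustering step that is not shown here.
    """
    rng = random.Random(seed)

    # Collect the observed (donor) values available in each cell.
    donors = {}
    for value, cell in zip(values, cells):
        if value is not None:
            donors.setdefault(cell, []).append(value)

    completed = []
    for value, cell in zip(values, cells):
        if value is None:
            pool = donors.get(cell)
            if not pool:
                raise ValueError(f"cell {cell!r} has no observed donors")
            # Replace the missing value with a donor from the same cell.
            completed.append(rng.choice(pool))
        else:
            completed.append(value)
    return completed


# Toy data: two cells, each with one missing value.
values = [1.2, None, 0.9, 5.1, None, 4.8]
cells = ["a", "a", "a", "b", "b", "b"]
completed = hot_deck_impute(values, cells)
```

Each missing entry is filled only from donors in its own cell, so the quality of the imputation depends directly on how well the cells group similar units. Repeating the draw with different seeds would give several completed datasets, in the spirit of multiple imputation.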

Sponsorship/Conference/Institution

Eastern North American Region International Biometric Society Spring Meeting (ENAR)

Location

Washington, DC
