The Efficiency of Ranking Count Data with Excess Zeros

Data from public health studies often include count endpoints that exhibit excess zeros; depending on the study objectives, hurdle or zero-inflated models are used to model such data. In this study, we propose a sampling scheme based on ranking, which significantly reduces the required sample size, and thus the study cost, for count data with excess zeros. The appeal of ranked set sampling (RSS) is its ability to give more precise estimates than simple random sampling (SRS), because ranked set samples are more likely to span the full range of the population. Extensive simulations compare the proposed RSS-based method with SRS in terms of mean squared error (MSE), bias, variance, and power under various data-generating scenarios. We also illustrate the merits of RSS on a real data set with excess zeros, using data from the National Medical Expenditure Survey on demand for medical care. Results from the data analysis and the simulation study agree: RSS outperforms SRS in all cases, with smaller variances and MSE.
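
The comparison described above can be illustrated with a minimal simulation sketch. It is not the study's own code: it assumes a zero-inflated Poisson population, perfect ranking within each set, and a balanced RSS design, and the function names and parameter values (`pi_zero`, `lam`, `set_size`, `cycles`) are hypothetical choices made only for illustration.

```python
import numpy as np

# Illustrative (assumed) population parameters, not taken from the study.
PI_ZERO = 0.4   # probability of a structural zero
LAM = 3.0       # Poisson mean of the count component


def zip_sample(rng, size):
    """Draw zero-inflated Poisson counts of the given shape."""
    counts = rng.poisson(LAM, size=size)
    structural_zero = rng.random(size) < PI_ZERO
    return np.where(structural_zero, 0, counts)


def srs_mean(rng, n):
    """Sample mean from a simple random sample of n units."""
    return zip_sample(rng, n).mean()


def rss_mean(rng, set_size, cycles):
    """Sample mean from a balanced ranked set sample.

    In each cycle, set_size sets of set_size units are drawn; the r-th
    smallest unit of the r-th set is measured. Ranking is assumed perfect
    (done here by sorting the true values; ties are broken arbitrarily).
    """
    sets = zip_sample(rng, (cycles, set_size, set_size))
    sets.sort(axis=2)                      # rank the units within each set
    r = np.arange(set_size)
    measured = sets[:, r, r]               # r-th order statistic of r-th set
    return measured.mean()


def compare(reps=5000, set_size=3, cycles=10, seed=0):
    """Monte Carlo bias, variance, and MSE of the mean under SRS and RSS."""
    rng = np.random.default_rng(seed)
    n = set_size * cycles                  # same number of measured units
    true_mean = (1.0 - PI_ZERO) * LAM
    srs = np.array([srs_mean(rng, n) for _ in range(reps)])
    rss = np.array([rss_mean(rng, set_size, cycles) for _ in range(reps)])
    out = {}
    for name, est in (("srs", srs), ("rss", rss)):
        out[name] = {
            "bias": est.mean() - true_mean,
            "var": est.var(),
            "mse": np.mean((est - true_mean) ** 2),
        }
    return out


if __name__ == "__main__":
    for name, stats in compare().items():
        print(name, stats)
```

Both designs measure the same number of units (`set_size * cycles`), so any difference in variance or MSE reflects the sampling scheme rather than the sample size. Under imperfect ranking the gain of RSS would be expected to shrink, which this sketch does not model.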


Eastern North American Region International Biometric Society Spring Meeting (ENAR)


Atlanta, GA