Modeling count data in the addiction field: Some simple recommendations

Abstract: Count data are frequently analyzed in addiction studies, but the analysis can be cumbersome and time‐consuming, and can lead to misleading inference if models are not correctly specified. We compared different statistical models in a simulation study to provide simple, yet valid, recommendations for analyzing count data. We used 2 simulation studies to test the performance of 7 statistical models (classical or quasi‐Poisson regression, classical or zero‐inflated negative binomial regression, classical or heteroskedasticity‐consistent linear regression, and Mann‐Whitney test) for predicting the differences between population means for 9 different population distributions (Poisson, negative binomial, zero‐ and one‐inflated Poisson and negative binomial, uniform, left‐skewed, and bimodal). We considered a large number of scenarios likely to occur in addiction research: presence of outliers, unbalanced design, and presence of confounding factors. In unadjusted models, the Mann‐Whitney test was the best model, followed closely by the heteroskedasticity‐consistent linear regression and quasi‐Poisson regression. Poisson regression was by far the worst model. In adjusted models, quasi‐Poisson regression was the best model. If the goal is to compare 2 groups with respect to count data, a simple recommendation would be to use quasi‐Poisson regression, which was the most generally valid model in our extensive simulations.
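To make the recommendation concrete, here is a minimal sketch of an unadjusted quasi‐Poisson comparison of 2 groups. It is not the authors' code. With a log link and a single binary group indicator, the Poisson fit has a closed form (the fitted means are the group means), and the quasi‐Poisson step only rescales the Poisson standard error by the square root of the Pearson dispersion estimate. The function name `quasi_poisson_two_groups`, the simulated negative binomial data, the sample sizes, and the normal‐approximation 95% interval are all assumptions chosen for illustration; some software uses a t reference distribution for quasi families instead.

```python
import math
import random


def quasi_poisson_two_groups(y0, y1):
    """Unadjusted quasi-Poisson comparison of two groups of counts.

    Hypothetical helper for illustration: closed-form fit of a log-link
    model with an intercept and one binary group indicator.
    """
    n0, n1 = len(y0), len(y1)
    s0, s1 = sum(y0), sum(y1)
    if s0 == 0 or s1 == 0:
        raise ValueError("each group needs at least one non-zero count")

    # Fitted means equal the observed group means in this two-group model.
    m0, m1 = s0 / n0, s1 / n1
    log_rr = math.log(m1 / m0)

    # Model-based Poisson standard error of the log rate ratio.
    se_poisson = math.sqrt(1.0 / s0 + 1.0 / s1)

    # Pearson dispersion: sum of (y - mu)^2 / mu over residual df (n - 2).
    pearson = sum((y - m0) ** 2 / m0 for y in y0)
    pearson += sum((y - m1) ** 2 / m1 for y in y1)
    dispersion = pearson / (n0 + n1 - 2)

    # Quasi-Poisson keeps the point estimate and rescales the standard error.
    se_quasi = math.sqrt(dispersion) * se_poisson

    z = 1.96  # normal approximation for a 95% interval (an assumption)
    return {
        "rate_ratio": math.exp(log_rr),
        "dispersion": dispersion,
        "se_poisson": se_poisson,
        "se_quasi": se_quasi,
        "ci_low": math.exp(log_rr - z * se_quasi),
        "ci_high": math.exp(log_rr + z * se_quasi),
    }


def sample_poisson(rng, lam):
    """Draw one Poisson variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def sample_negative_binomial(rng, mean, shape):
    """Overdispersed count drawn as a gamma-Poisson mixture."""
    return sample_poisson(rng, rng.gammavariate(shape, mean / shape))


if __name__ == "__main__":
    rng = random.Random(1)
    # Illustrative unbalanced design with overdispersed counts.
    control = [sample_negative_binomial(rng, 3.0, 0.8) for _ in range(60)]
    treated = [sample_negative_binomial(rng, 4.5, 0.8) for _ in range(40)]
    for name, value in quasi_poisson_two_groups(control, treated).items():
        print(f"{name}: {value:.3f}")
```

When the counts are overdispersed, the estimated dispersion exceeds 1 and `se_quasi` is wider than `se_poisson`, which is one way to see why classical Poisson regression can yield overconfident inference on such data. Adjusting for confounding factors requires an iterative generalized linear model fit and is outside the scope of this sketch.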
Source: International Journal of Methods in Psychiatric Research - Category: Psychiatry - Tags: ORIGINAL ARTICLE - Source Type: research