Predictors of Diabetes Among Adult Pima Indian Women: A Multivariable and Stepwise Logistic Regression Analysis

Faculty Mentor

Dr Aliu Ibrahim

Location

Russell Union Ballroom

Type of Research

Completed

Session Format

Poster Presentation

College

Jiann-Ping Hsu College of Public Health

Department

Epidemiology and Biostatistics

Abstract

Introduction: Globally, diabetes affects an estimated 589 million adults and causes approximately 3.4 million deaths each year, roughly one death every nine seconds, with type 2 diabetes accounting for most cases. The Pima Indians of Arizona have one of the highest documented prevalence rates of type 2 diabetes in the world, making this population central to diabetes research. Although the Pima Diabetes Dataset has been widely used in predictive and machine-learning studies, few investigations have applied rigorous multivariable epidemiologic methods to quantify the independent contributions of key demographic and metabolic risk factors. This study aimed to characterize diabetes risk factors and quantify independent predictors among Pima Indian women using multivariable and stepwise logistic regression.

Methods: A cross-sectional analysis was conducted using data on 768 adult Pima Indian women from the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) Pima Diabetes Dataset. After data cleaning, descriptive statistics, correlation analyses, and multicollinearity diagnostics were performed. A full multivariable logistic regression model including eight predictors was fitted. Stepwise logistic regression, with a significance level of 0.15 for both entry (SLENTRY) and retention (SLSTAY), was applied to derive a parsimonious model. Model performance was evaluated using the Akaike Information Criterion (AIC), the Hosmer–Lemeshow goodness-of-fit test, and the area under the receiver operating characteristic curve (AUC).

Results: Five variables were retained in the final stepwise model: glucose, BMI, diabetes pedigree function, age, and pregnancies. Glucose, BMI, and diabetes pedigree function were independently associated with diabetes, with glucose showing the strongest association (adjusted odds ratio [aOR] = 1.037 per one-unit increase; 95% CI: 1.027–1.047). The reduced model showed excellent discrimination (AUC = 0.863), comparable to that of the full multivariable model (AUC = 0.862), with improved parsimony (AIC = 356.9).

Conclusion: Glucose, BMI, and family history of diabetes (captured by the diabetes pedigree function) were the strongest predictors of diabetes, and a reduced five-variable model can effectively support targeted screening in high-risk populations.
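
Illustrative note: the quantities reported in the abstract can be read with a few lines of arithmetic. The sketch below is not the authors' analysis code (the SLENTRY and SLSTAY option names suggest it was run in SAS); it is a minimal Python illustration using only the standard library. The helper names `rescale_odds_ratio`, `aic`, and `auc` are hypothetical, and the toy labels, scores, and log-likelihoods are invented for demonstration. Only the glucose aOR and its confidence limits come from the abstract.

```python
import math


def rescale_odds_ratio(or_per_unit, units):
    """Odds ratio for a `units`-sized increase, given the per-unit odds ratio.

    Logistic regression is linear on the log-odds scale, so the log odds
    ratio is multiplied by the size of the increase. The same rescaling
    applies to each confidence limit.
    """
    return math.exp(units * math.log(or_per_unit))


def aic(log_likelihood, n_parameters):
    """Akaike Information Criterion, 2k - 2 ln(L); lower values are preferred."""
    return 2 * n_parameters - 2 * log_likelihood


def auc(labels, scores):
    """AUC as the share of (case, non-case) pairs in which the case has the
    higher predicted score; tied pairs count as one half."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


# Glucose estimate reported in the abstract: aOR = 1.037 (95% CI 1.027-1.047).
glucose_or_10 = rescale_odds_ratio(1.037, 10)
glucose_ci_10 = (rescale_odds_ratio(1.027, 10), rescale_odds_ratio(1.047, 10))
print(f"aOR per 10-unit glucose increase: {glucose_or_10:.2f} "
      f"(95% CI {glucose_ci_10[0]:.2f}-{glucose_ci_10[1]:.2f})")

# Hypothetical log-likelihoods (not from the study): a reduced model with an
# intercept plus five predictors versus a full model with eight predictors.
reduced_aic = aic(log_likelihood=-100.0, n_parameters=6)
full_aic = aic(log_likelihood=-99.5, n_parameters=9)
print(f"reduced AIC = {reduced_aic:.1f}, full AIC = {full_aic:.1f}")

# Hypothetical toy predictions (not from the study).
toy_labels = [0, 0, 1, 0, 1, 1]
toy_scores = [0.10, 0.35, 0.40, 0.45, 0.70, 0.90]
toy_auc = auc(toy_labels, toy_scores)
print(f"toy AUC: {toy_auc:.3f}")
```

Read this way, the per-unit glucose aOR of 1.037 corresponds to an aOR of about 1.44 for a 10-unit increase, which is one reason a per-unit odds ratio close to 1 can still describe a strong predictor. The toy AIC comparison shows how a reduced model can be preferred when dropping predictors costs very little log-likelihood.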

Start Date

4-23-2026 10:00 AM

End Date

4-23-2026 12:00 PM
