Predicting Breast Cancer Diagnosis Using Logistic Regression
Faculty Mentor
Dr. Ibrahim Alliu
Location
Russell Union Ballroom
Type of Research
On-going
Session Format
Poster Presentation
College
Jiann-Ping Hsu College of Public Health
Department
Epidemiology Biostatistics and Environmental Sciences
Abstract
Early detection of breast cancer remains a cornerstone of improving patient survival, yet accurately distinguishing malignant from benign tumors continues to challenge clinicians and data scientists alike. This study develops and evaluates a logistic regression model that predicts breast cancer status from key tumor characteristics, including mean radius, texture, perimeter, area, and smoothness, drawn from a well-established clinical dataset. Using stepwise model selection, goodness-of-fit diagnostics, variance inflation factor (VIF) assessment, and effect visualization, the final model identifies the most influential predictors while addressing multicollinearity among them.
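The multicollinearity concern can be illustrated with a minimal sketch. It assumes only two predictors at a time, in which case the VIF reduces to 1 / (1 - r^2), where r is their Pearson correlation. All values below are made up for illustration and are not taken from the study's dataset; the function names are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def vif_two_predictors(x, y):
    """VIF for the two-predictor case only: 1 / (1 - r^2)."""
    r = pearson_r(x, y)
    return 1.0 / (1.0 - r * r)

# Toy values (assumed, not study data): perimeter is close to 2*pi*radius,
# so the two carry almost the same information.
radius = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
noise = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0]
perimeter = [2 * math.pi * r + e for r, e in zip(radius, noise)]
texture = [19.0, 14.0, 22.0, 16.0, 25.0, 18.0]

vif_radius_perimeter = vif_two_predictors(radius, perimeter)
vif_radius_texture = vif_two_predictors(radius, texture)
print(vif_radius_perimeter, vif_radius_texture)
```

In this toy example the radius and perimeter pair produces a very large VIF, while radius and texture stay close to 1. A common rule of thumb treats values above roughly 5 to 10 as a sign that one of the correlated predictors may be redundant; the abstract does not state which threshold the study applied.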
The model demonstrates strong explanatory power, with a McFadden’s pseudo R^2 of 0.76, and strong predictive performance, with accuracy exceeding 93% on both the training and testing datasets. ROC curve analysis yields an AUC of 0.99, indicating near-perfect discrimination between malignant and benign cases. Confusion matrix assessments show balanced sensitivity and specificity, underscoring the model’s clinical relevance for early detection.
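For readers unfamiliar with these metrics, the following is a minimal sketch of how each can be computed from true labels and predicted probabilities. The labels and probabilities are made up for illustration and do not reproduce the study's results; the function names are hypothetical. McFadden's pseudo R^2 is computed as 1 - LL_model / LL_null, where the null model predicts the overall malignant rate for every case, and AUC is computed as the share of malignant/benign pairs that the model ranks correctly.

```python
import math

def log_likelihood(y_true, p_hat):
    """Bernoulli log-likelihood of the labels under predicted probabilities."""
    return sum(math.log(p) if y == 1 else math.log(1.0 - p)
               for y, p in zip(y_true, p_hat))

def mcfadden_r2(y_true, p_hat):
    """McFadden's pseudo R^2: 1 - LL_model / LL_null (intercept-only null)."""
    base_rate = sum(y_true) / len(y_true)
    ll_null = log_likelihood(y_true, [base_rate] * len(y_true))
    ll_model = log_likelihood(y_true, p_hat)
    return 1.0 - ll_model / ll_null

def auc(y_true, p_hat):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [p for y, p in zip(y_true, p_hat) if y == 1]
    neg = [p for y, p in zip(y_true, p_hat) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def sensitivity_specificity(y_true, p_hat, threshold=0.5):
    """Sensitivity and specificity from the confusion matrix at a threshold."""
    pred = [1 if p >= threshold else 0 for p in p_hat]
    tp = sum(1 for y, q in zip(y_true, pred) if y == 1 and q == 1)
    fn = sum(1 for y, q in zip(y_true, pred) if y == 1 and q == 0)
    tn = sum(1 for y, q in zip(y_true, pred) if y == 0 and q == 0)
    fp = sum(1 for y, q in zip(y_true, pred) if y == 0 and q == 1)
    return tp / (tp + fn), tn / (tn + fp)

# Toy example (assumed values): 1 = malignant, 0 = benign.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
p_hat = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

r2 = mcfadden_r2(y_true, p_hat)
area = auc(y_true, p_hat)
sens, spec = sensitivity_specificity(y_true, p_hat)
print(r2, area, sens, spec)
```

The 0.5 classification threshold is an assumption; the abstract does not state the cutoff used. Unlike accuracy, sensitivity, and specificity, AUC does not depend on that choice.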
By combining statistical rigor with an interpretable modeling approach, this research provides a transparent and highly accurate framework for breast cancer prediction. The findings reinforce the potential of classical statistical models, when carefully optimized, to support precision diagnostics and strengthen decision-making in oncological practice.
Start Date
4-23-2026 2:00 PM
End Date
4-23-2026 4:00 PM
Recommended Citation
Gomez, Ousainou, "Predicting Breast Cancer Diagnosis Using Logistic Regression" (2026). GS4 Student Scholars Symposium. 211.
https://digitalcommons.georgiasouthern.edu/research_symposium/2026/2026/211