College of Graduate Studies: Theses & Dissertations
Term of Award
Spring 2026
Degree Name
Master of Science, Computer Science (M.S.C.S.)
Document Type and Release Option
Thesis (open access)
Copyright Statement / License for Reuse

This work is licensed under a Creative Commons Attribution 4.0 License.
Department
Department of Computer Science
Committee Chair
Yao Xu
Committee Member 1
Lixin Li
Committee Member 2
Hong Zhang
Abstract
Flight delays pose persistent challenges to the efficiency and reliability of air transportation systems, affecting airlines, airports, regulators, and passengers alike. As traffic demand grows and operational environments become increasingly interconnected, accurately predicting both departure and arrival delays has become crucial for effective planning and mitigation. This study presents a network-aware, airline-specific framework for predicting flight delays in U.S. domestic air transportation systems using tree-based ensemble machine learning models. A large-scale dataset of 1.98 million flights, enriched with weather information, is used to develop predictive models for both departure and arrival delays. To capture the structural and operational complexity of the aviation network, the study integrates temporal features, historical delay patterns, and network centrality measures derived from directed origin–destination graphs. Models are trained on both the full dataset and airline-specific subsets representing the five largest U.S. carriers, enabling a direct comparison between system-wide and airline-level modeling approaches. A novel structured feature selection framework based on mutual information and correlation is applied to reduce redundancy and improve model robustness. Experimental results show that tree-based ensemble methods, particularly Random Forest and Extra Trees, achieve the strongest performance across all datasets. Airline-specific models consistently outperform system-wide models, demonstrating improved accuracy, recall, and overall predictive stability. Feature importance analysis reveals that delay outcomes are primarily driven by seasonal patterns, schedule timing, and historical delay propagation, while the influence of network connectivity and weather conditions varies systematically across airlines. Overall, the findings highlight the importance of tailored, interpretable machine learning frameworks for flight delay prediction. 
By combining predictive accuracy with operational insight, this study contributes to a more nuanced understanding of delay dynamics and offers practical implications for airline operations and air traffic management.
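As an illustration of the feature selection idea mentioned in the abstract, the sketch below ranks features by mutual information with the delay label and then skips any feature that is highly correlated with one already kept. The abstract does not give the actual procedure, so the function names, the thresholds (`min_mi`, `max_corr`), the greedy ordering, and the toy data are all assumptions made for this example and are not taken from the thesis. The mutual information estimate here only handles discrete values; continuous features would need binning first.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def select_features(features, target, min_mi=0.01, max_corr=0.9):
    """Keep informative features, skipping near-duplicates of kept ones.

    The thresholds are hypothetical defaults chosen for this sketch.
    """
    scores = {name: mutual_information(col, target)
              for name, col in features.items()}
    # Most informative first; ties keep their original order.
    ranked = sorted(scores, key=scores.get, reverse=True)
    kept = []
    for name in ranked:
        if scores[name] < min_mi:
            continue  # carries almost no information about the target
        if any(abs(pearson(features[name], features[k])) > max_corr
               for k in kept):
            continue  # redundant with a feature that is already kept
        kept.append(name)
    return kept


# Toy, made-up data: eight flights with a binary "delayed" label.
features = {
    "dep_hour_bucket":  [0, 0, 1, 1, 2, 2, 3, 3],
    "dep_hour_copy":    [0, 0, 1, 1, 2, 2, 3, 3],  # exact duplicate
    "prev_leg_delayed": [0, 1, 0, 1, 1, 1, 0, 1],
    "constant_flag":    [1, 1, 1, 1, 1, 1, 1, 1],  # uninformative
}
delayed = [0, 0, 0, 1, 1, 1, 1, 1]

selected = select_features(features, delayed)
print(selected)
```

On this toy input the duplicate column is dropped by the correlation check and the constant column by the mutual information threshold, leaving `dep_hour_bucket` and `prev_leg_delayed`.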
Recommended Citation
Afrane, Mary Dufie, "Network-Aware Airline-Specific Flight Delay Prediction Using Tree-Based Ensemble Models" (2026). College of Graduate Studies: Theses & Dissertations. 3106.
https://digitalcommons.georgiasouthern.edu/etd/3106
Research Data and Supplementary Material
No
Included in
Artificial Intelligence and Robotics Commons, Data Science Commons, Management and Operations Commons