The Tree-Ordered Harmonic Fowlkes–Mallows Index (THFM): A Decision-Theoretic Metric for Sequential Medical Diagnostics
Faculty Mentor
Dr. Hani Samawi
Location
Russell Union 2047
Type of Research
On-going
Session Format
Oral Presentation
College
Jiann-Ping Hsu College of Public Health
Department
Biostatistics
Abstract
Evaluating modern diagnostic protocols, which increasingly focus on disease staging and molecular subtype identification rather than simple binary classification, demands novel statistical tools. Traditional metrics are ill-equipped for these sequential, hierarchical structures; the F1-score, for instance, critically ignores true negatives, limiting its utility for clinical exclusion, while the Youden Index assumes uniform misclassification costs. This work introduces and validates the Tree-Ordered Harmonic Fowlkes–Mallows (THFM) Index, a metric specifically designed to provide a single, holistic performance score for an entire multi-stage diagnostic cascade while considering prevalence. The THFM index is formally defined as the harmonic mean of the tree-ordered Fowlkes–Mallows (FM) measures for the positive (confirmatory) and negative (exclusion) classification paths. This formulation provides a balanced assessment of a protocol's unified ability to both confirm and rule out complex, multi-stage disease states. Theoretically, the index generalizes the single-stage Harmonic FM (HFM) index to the tree-structured setting and serves as a direct counterpart to the correlation-based Tree-Ordered MCC (TMCC). Bounded between 0 and 1, its optimality criterion offers a clear decision-theoretic basis for threshold selection across the cascade. Through extensive simulations and a primary application to a real-world multi-subtype lung cancer dataset, the THFM index demonstrates superior performance in selecting optimal cut-offs that yield more balanced classification rates compared to existing measures, such as the Youden index and F score for tree orderings. THFM provides a robust and mathematically rigorous tool for developing and validating the advanced diagnostic pathways essential for modern, subtype-aware precision medicine.
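As a rough illustration of the single-stage Harmonic FM (HFM) idea that the abstract says THFM generalizes, the sketch below computes a harmonic mean of a positive-path and a negative-path Fowlkes–Mallows measure from one 2x2 confusion table. It assumes the positive FM is the geometric mean of sensitivity and positive predictive value, and the negative FM is the geometric mean of specificity and negative predictive value. The abstract gives no formulas, so these definitions and the function names `fowlkes_mallows` and `harmonic_fm` are illustrative assumptions, not the authors' implementation. The tree-ordered extension is not specified in the abstract and is not attempted here.

```python
import math


def fowlkes_mallows(hits, false_alarms, misses):
    """Geometric mean of precision and recall for one class.

    Assumed form: hits / sqrt((hits + false_alarms) * (hits + misses)).
    Returns 0.0 when the measure is undefined (zero denominator).
    """
    denom = math.sqrt((hits + false_alarms) * (hits + misses))
    return hits / denom if denom > 0 else 0.0


def harmonic_fm(tp, fp, fn, tn):
    """Hypothetical single-stage HFM: harmonic mean of the positive and
    negative FM measures. This is an assumption about the definition,
    not a formula taken from the abstract."""
    fm_pos = fowlkes_mallows(tp, fp, fn)  # confirmatory path: PPV and sensitivity
    fm_neg = fowlkes_mallows(tn, fn, fp)  # exclusion path: NPV and specificity
    if fm_pos + fm_neg == 0:
        return 0.0
    return 2 * fm_pos * fm_neg / (fm_pos + fm_neg)
```

Under these assumptions the score lies between 0 and 1, and unlike the F1-score it changes when the true-negative count changes, which is the property the abstract highlights for clinical exclusion.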
Start Date
4-23-2026 10:30 AM
End Date
4-23-2026 10:45 AM
Recommended Citation
Rabari, Parthkumar Rameshbhai; Samawi, Dr. Hani; Kersey, Dr. Jing X.; and Biswas, Purbasha, "The Tree-Ordered Harmonic Fowlkes–Mallows Index (THFM): A Decision-Theoretic Metric for Sequential Medical Diagnostics" (2026). GS4 Student Scholars Symposium. 108.
https://digitalcommons.georgiasouthern.edu/research_symposium/2026/2026/108