Analysis of the Contribution of Air Pollution and Noise Levels to Predicting Autonomic Nervous System Disorders: An Interpretable Machine Learning Approach
Author(s)
Hyeonjung Park, B.Pharm (Oriental)1, Hae Sun Suh, MA, MS, PhD2
1Department of Regulatory Science, Graduate School, Kyung Hee University, Seoul, Korea, Republic of, 2College of Pharmacy, Kyung Hee University, Seoul, Korea, Republic of
OBJECTIVES: This study evaluated the contribution of environmental stressors, specifically air pollution and noise levels, to predicting autonomic nervous system (ANS) disorders using machine learning (ML) models and SHapley Additive exPlanations (SHAP).
METHODS: Data from the UK Biobank were used to develop ML models predicting ANS disorders, defined by ICD-10 codes (G90.0-G90.9) for cases diagnosed after 2011. Sociodemographic factors, lifestyle behaviors (e.g., smoking, alcohol consumption, physical activity), air pollution indices (NO₂, NOx, PM10, PM2.5), and noise pollution metrics were included. Variables with over 50% missing data were excluded, and remaining missing values were imputed using the K-nearest-neighbors method. The dataset was split 7:3 for training and validation, and class imbalance was addressed through random under-sampling. Variables were standardized, and backward elimination was used for feature selection. Model performance for logistic regression, random forest, XGBoost, LightGBM, stacking, and multilayer perceptron was evaluated using accuracy and area under the receiver operating characteristic (ROC) curve (AUC). SHAP plots were employed to assess variable contributions in the best-performing model.
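The preprocessing and modeling steps described in METHODS can be sketched roughly as follows. This is a minimal illustration with scikit-learn on synthetic data, not the authors' code: the data, the choice of two base learners, and all hyperparameters are assumptions, and backward elimination is omitted for brevity. The sketch fits the imputer and scaler on the training portion only and leaves the validation set at its natural class ratio; the abstract does not state either detail.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.impute import KNNImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for the cohort (NOT UK Biobank data): roughly 5% positives.
X, y = make_classification(n_samples=4000, n_features=14, n_informative=6,
                           weights=[0.95, 0.05], random_state=0)
X[rng.random(X.shape) < 0.05] = np.nan  # inject about 5% missing values

# 7:3 split, stratified so both parts keep the same case fraction.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Random under-sampling of the training set: keep every case and an
# equally sized random subset of non-cases.
pos = np.flatnonzero(y_train == 1)
neg = rng.choice(np.flatnonzero(y_train == 0), size=pos.size, replace=False)
keep = np.concatenate([pos, neg])
X_bal, y_bal = X_train[keep], y_train[keep]

# Stacking: base learners feed a logistic-regression meta-learner.
stack = StackingClassifier(
    estimators=[("lr", LogisticRegression(max_iter=1000)),
                ("rf", RandomForestClassifier(n_estimators=100, random_state=0))],
    final_estimator=LogisticRegression(max_iter=1000), cv=5)

# KNN imputation, then standardization, then the stacked model.
model = make_pipeline(KNNImputer(n_neighbors=5), StandardScaler(), stack)
model.fit(X_bal, y_bal)

proba = model.predict_proba(X_val)[:, 1]
accuracy = accuracy_score(y_val, model.predict(X_val))
auc = roc_auc_score(y_val, proba)
print(f"accuracy={accuracy:.3f}  AUC={auc:.3f}")
```

Wrapping imputation and scaling in one pipeline keeps validation rows from influencing the fitted preprocessing, which is one common way to avoid leakage.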
RESULTS: Among 380,128 participants, the mean age was 73.35 years (SD: 7.95), 54% were female, and 7,015 had ANS disorders. After backward elimination, 14 variables remained. All six ML models showed acceptable performance, with accuracy and ROC AUC both around 0.7. The best-performing model was the stacking model, with an accuracy of 0.705 and an AUC of 0.714. In the SHAP analysis, age contributed the most to prediction, followed by ethnic background, PM2.5, and PM10.
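The reported counts imply a strongly imbalanced outcome, which is the reason for the under-sampling step. The short sketch below works out the case fraction from the two reported counts and shows one simple form of random under-sampling on toy labels. The helper `random_under_sample` and the toy label list are hypothetical and only illustrate the idea.

```python
import random

n_total, n_cases = 380_128, 7_015   # counts reported in RESULTS
prevalence = n_cases / n_total
# Accuracy of a trivial rule that always predicts "no ANS disorder".
majority_accuracy = 1 - prevalence
print(f"prevalence={prevalence:.4f}  majority-class accuracy={majority_accuracy:.4f}")


def random_under_sample(labels, seed=0):
    """Return indices of all cases plus an equal-sized random draw of non-cases."""
    rng = random.Random(seed)
    cases = [i for i, label in enumerate(labels) if label == 1]
    controls = [i for i, label in enumerate(labels) if label == 0]
    return sorted(cases + rng.sample(controls, len(cases)))


# Toy labels with a similar case fraction (18 cases among 1,000 records).
toy_labels = [1] * 18 + [0] * 982
kept = random_under_sample(toy_labels)
n_kept_cases = sum(toy_labels[i] for i in kept)
print(len(kept), n_kept_cases)
```

With under 2% cases, a model that never predicts a case would already score about 98% accuracy on unbalanced data, so accuracy is only informative once the classes are balanced or when it is read alongside AUC.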
CONCLUSIONS: We applied interpretable ML approaches to evaluate the contribution of environmental stressors to predicting ANS disorders. Particulate matter (PM2.5 and PM10) ranked among the top contributors to prediction, after age and ethnic background. These findings have important implications for public health, supporting the development of evidence-based policies to mitigate the health impacts of environmental stressors, including air pollution and noise.
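The SHAP ranking reported above rests on Shapley values, which attribute a single prediction to individual features. The sketch below computes exact Shapley values by brute force for a toy three-feature risk function. Everything in it is hypothetical: `toy_risk`, its coefficients, and the feature values are invented for illustration, and a single baseline vector replaces the background dataset that SHAP implementations normally average over.

```python
from itertools import combinations
from math import exp, factorial


def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x; features outside a coalition take baseline values."""
    n = len(x)

    def value(subset):
        z = [x[j] if j in subset else baseline[j] for j in range(n)]
        return f(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size: |S|! (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_risk(z):
    """Hypothetical logistic risk in age and PM2.5; noise has a zero coefficient."""
    age, pm25, noise = z
    return 1 / (1 + exp(-(0.05 * (age - 60) + 0.10 * (pm25 - 10) + 0.0 * noise)))


x = [75.0, 14.0, 58.0]          # one hypothetical participant
baseline = [60.0, 10.0, 55.0]   # hypothetical reference values
phi = shapley_values(toy_risk, x, baseline)
for name, contribution in zip(["age", "PM2.5", "noise"], phi):
    print(f"{name}: {contribution:+.4f}")
```

The contributions sum to the difference between the prediction for `x` and the prediction for the baseline, and a feature the model ignores receives a value of zero. Brute force costs time exponential in the number of features, which is why SHAP libraries rely on model-specific or sampling-based approximations.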
Conference/Value in Health Info
2025-05, ISPOR 2025, Montréal, Quebec, CA
Value in Health, Volume 28, Issue S1
Code
EPH53
Topic
Epidemiology & Public Health
Topic Subcategory
Public Health
Disease
SDC: Neurological Disorders, SDC: Systemic Disorders/Conditions (Anesthesia, Auto-Immune Disorders (n.e.c.), Hematological Disorders (non-oncologic), Pain)