Icten Z, Burns SM, Menzin JA
Boston Health Economics, Boston, MA, USA
OBJECTIVES : Detecting Parkinson disease (PD) early, during its prodromal period, can facilitate timely treatment and mitigate symptoms and risks. This study aimed to identify predictors of incident PD using Medicare administrative claims data and machine learning techniques.
METHODS : Medicare Part A/B claims (5% sample) from 2010Q1-2015Q4 were used to identify incident PD cases in 2015, based on ICD-9/10 diagnosis codes, among patients ≥65 years old with continuous enrollment during the two-year baseline period before their PD diagnosis date (index date). Controls were selected in a 3:1 ratio to cases; they met the same eligibility criteria, had no evidence of a PD diagnosis, and were assigned a randomly selected encounter date as the index date, matched on month and year to the case's index date. Features, including demographics, comorbidities, medication and procedure utilization, and service location variables, were extracted from the baseline period. Data were partitioned using a 60%/20%/20% split to train models, tune them, and test performance on unseen data. Traditional and regularized logistic regression, k-nearest neighbor, XGBoost, support vector machine, and random forest models were built, and the best model was selected using the area under the ROC curve (AUC). Accuracy, recall, precision, and F1 score were also assessed.
RESULTS : The study population included 4,575 cases and 13,725 controls (mean age: 74.8 years; 56% female). XGBoost was the best-performing model (on unseen data, AUC: 83.1%; accuracy: 79.6%; recall: 65.1%; precision: 58.2%; F1: 0.61). Features with the highest weights in the XGBoost model were age, gender, Charlson Comorbidity Index, motor symptoms (tremor, abnormal gait, involuntary movements), diagnoses related to autonomic dysfunction (urinary incontinence), diagnostic procedures (CT scan, MRI), hypertension, cancer diagnostics (mammography), ophthalmologic disease, anxiety, and psychosis.
CONCLUSIONS : Our study identified predictors of incident PD with good discrimination on unseen data (AUC: 83.1%). The findings are broadly consistent with established risk factors and indicate opportunities for further research.
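The partitioning and evaluation steps described in the methods can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation. The abstract does not say whether the 60%/20%/20% split was stratified by case status, so stratification is an assumption here, as are all function names (split_60_20_20, auc, classification_metrics, f1_score, select_best), the model names, and the toy AUC values used for selection.

```python
import random


def split_60_20_20(labels, seed=0):
    """Split indices 60/20/20 into (train, tune, test), stratified by label.

    Stratification is an assumption; the abstract only states the proportions.
    """
    rng = random.Random(seed)
    train, tune, test = [], [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        a = len(idx) * 6 // 10  # end of the training portion
        b = len(idx) * 8 // 10  # end of the tuning portion
        train += idx[:a]
        tune += idx[a:b]
        test += idx[b:]
    return train, tune, test


def auc(y_true, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # tied scores share the average 1-based rank
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    pos_rank_sum = sum(r for r, y in zip(ranks, y_true) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def classification_metrics(y_true, y_pred):
    """Accuracy, recall, precision, and F1 from binary labels and predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "recall": recall,
        "precision": precision,
        "f1": f1_score(precision, recall),
    }


def select_best(tune_auc):
    """Return the model name with the highest tuning-set AUC."""
    return max(tune_auc, key=tune_auc.get)


# Cohort sizes from the abstract: 4,575 cases (1) and 13,725 controls (0).
labels = [1] * 4575 + [0] * 13725
train, tune, test = split_60_20_20(labels)
print(len(train), len(tune), len(test))  # 10980 3660 3660

# Toy AUC values for hypothetical models, for illustration only (not study results).
print(select_best({"model_a": 0.70, "model_b": 0.76}))  # model_b

# The reported precision (58.2%) and recall (65.1%) imply the reported F1.
print(round(f1_score(0.582, 0.651), 2))  # 0.61
```

In this sketch the test partition is set aside until a model has been chosen on the tuning partition, which is what "performance on unseen data" requires. The last line only checks that the reported F1 follows arithmetically from the reported precision and recall.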
Conference/Value in Health Info
2020-05, ISPOR 2020, Orlando, FL, USA
Epidemiology & Public Health, Health Service Delivery & Process of Care, Methodological & Statistical Research, Real World Data & Information Systems
Artificial Intelligence, Machine Learning, Predictive Analytics, Disease Management, Health & Insurance Records Systems