PREDICTIVE MODEL OF PARKINSON'S DISEASE IN LARGE ELECTRONIC HEALTH RECORDS DATABASE

Author(s)

Kabadi S1, Lee A2, Kuhn M1, Gray D3
1Pfizer, Inc., Groton, CT, USA, 2University of Florida, Gainesville, FL, USA, 3Pfizer, Inc., Cambridge, MA, USA

OBJECTIVES: To analyze de-identified electronic health record (EHR) data to predict a diagnosis of Parkinson’s disease (PD). METHODS: Patients ≥ 30 years of age with evidence of continuous activity from January 1, 2012 to December 31, 2013 were eligible for inclusion (n = 3,057,540). PD cases (n = 2,097) were identified by two diagnoses for PD (ICD-9: 332.0) in calendar year 2013; controls (n = 2,548,563) had no diagnosis for PD. A “training” dataset (n = 1,912,996) was used for model development and a “test” dataset (n = 637,664) was reserved to confirm model performance. Sixty demographic, clinical diagnosis, and healthcare resource utilization (HRU) variables derived from calendar year 2012 were entered into logistic regression (LR), classification and regression tree (CART), and random forest (RF) models. The LR and CART models used the full dataset; downsampling was applied to the RF model to handle class imbalance. Variable importance was estimated, and predictive accuracy was evaluated using area under the curve (AUC). RESULTS: When applied to the training data, the LR model (AUC = 0.84) fit better than the CART (AUC = 0.53) and RF (AUC = 0.72) models. Age, sex, diagnosis of postural instability, and diagnosis of sleep disorders were important variables in predicting a PD diagnosis. The number of levodopa prescriptions written and the number of visits to a general practitioner in the year prior to diagnosis were important HRU variables. LR model performance metrics were acceptable when applied to the test dataset (AUC = 0.85, specificity = 0.75, sensitivity = 0.81). CONCLUSIONS: Data mining methods can be used to identify patients with PD using 60 variables in EHR data with acceptable AUC, sensitivity, and specificity. Sleep disorders may be more predictive of PD in the year prior to diagnosis than previous research suggests.
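
The abstract does not include code or say which software was used, so the following is only an illustrative sketch of three ideas named in METHODS and RESULTS: downsampling the majority class, AUC, and sensitivity/specificity at a cutoff. The function names (downsample, auc, sensitivity_specificity), the cutoff of 0.35, and the eight-row toy dataset are all assumptions made for illustration; they are not the authors' implementation or data.

```python
import random


def downsample(rows, labels, seed=0):
    """Randomly drop majority-class rows until both classes are the same size.

    Hypothetical helper: the abstract only says downsampling was applied
    to the RF model, not how it was done.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    keep = sorted(minority + rng.sample(majority, len(minority)))
    return [rows[i] for i in keep], [labels[i] for i in keep]


def auc(scores, labels):
    """AUC as the chance that a random case outscores a random control.

    Ties count as half a win. This pairwise form is quadratic in the
    sample size, so it suits small examples only.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(scores, labels, threshold):
    """Classify score >= threshold as a predicted case, then compute both rates."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Toy data (invented): 2 cases and 6 controls with made-up model scores.
toy_labels = [1, 1, 0, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.4, 0.8, 0.3, 0.2, 0.1, 0.35, 0.05]

toy_auc = auc(toy_scores, toy_labels)
toy_sens, toy_spec = sensitivity_specificity(toy_scores, toy_labels, 0.35)
balanced_rows, balanced_labels = downsample(toy_scores, toy_labels)

print("AUC:", round(toy_auc, 3))
print("sensitivity:", round(toy_sens, 3), "specificity:", round(toy_spec, 3))
print("balanced class counts:", balanced_labels.count(1), balanced_labels.count(0))
```

With roughly 2,000 cases among about 2.5 million patients, a model can look accurate while rarely flagging a case, which is presumably why the abstract reports AUC, sensitivity, and specificity rather than overall accuracy.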

Conference/Value in Health Info

2016-05, ISPOR 2016, Washington DC, USA

Value in Health, Vol. 19, No. 3 (May 2016)

Code

PND3

Topic

Epidemiology & Public Health

Topic Subcategory

Safety & Pharmacoepidemiology

Disease

Neurological Disorders
