Predicting Antimicrobial Resistance in Uncomplicated Urinary Tract Infections Using Machine Learning

Speaker(s)

Kponee-Shovein K¹, Cheng WY¹, Marijam A², Schwab P³, Gao C¹, Indacochea D¹, Ferrinho D³, Mitrani-Gold FS³, Pinheiro L¹, Royer J⁴, Joshi AV³
¹Analysis Group, Inc., Boston, MA, USA, ²GSK, Wavre, Belgium, ³GSK, Collegeville, PA, USA, ⁴Analysis Group, Inc., Montreal, QC, Canada

OBJECTIVES:

Urinary tract infections (UTIs) are among the most common bacterial infections worldwide, with 80% classified as uncomplicated (uUTIs). Over 50% of patients with uUTI are prescribed non-guideline-based antimicrobial treatment, potentially contributing to antimicrobial resistance (AMR) and increased healthcare costs. Here, we present the methodology underlying our study, which used machine learning to develop and validate robust models estimating the probability of resistance to the antibiotic classes commonly prescribed for uUTI.

METHODS:

This predictive modeling study used retrospective Optum Electronic Health Record (EHR) data, including lab results, from 1 October 2015 to 29 February 2020. Data from female patients aged ≥12 years who were diagnosed with uUTI (via positive Escherichia coli urine culture), had antibiotic susceptibility test results (index date), and had ≥12 months of EHR activity prior to the index date were split into training and testing cohorts based on index dates. Least absolute shrinkage and selection operator (LASSO) and random forest (RF) models were evaluated as candidate predictive models. Both algorithms were developed on training cohort data to estimate the probability of AMR to each antibiotic class and then assessed on the separate testing data. Multiple imputation by chained equations was used to impute missing data, and nested cross-validation was used to select the optimal algorithm, avoiding the optimism observed in standard k-fold cross-validation. The statistical strength of selected features was calculated by bootstrapping the k-fold cross-validation procedure to account for adaptive feature selection.
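The nested cross-validation step can be illustrated with a minimal, dependency-free sketch. It is not the study's implementation: the synthetic data, the function names (`k_folds`, `fit_knn`, `accuracy`, `nested_cv`), and the toy nearest-neighbour classifier standing in for LASSO or RF are all assumptions made for illustration. The point is the control flow: hyperparameters are tuned only on inner folds, so each outer test fold stays untouched until the final score.

```python
import random
from statistics import mean

def k_folds(n, k, seed):
    """Shuffle positions 0..n-1 and deal them into k disjoint folds."""
    positions = list(range(n))
    random.Random(seed).shuffle(positions)
    return [positions[i::k] for i in range(k)]

def fit_knn(X, y, k):
    """Toy stand-in model (hypothetical): 1-D k-nearest neighbours."""
    return (list(zip(X, y)), k)

def accuracy(model, X, y):
    """Fraction of correct majority-vote predictions."""
    points, k = model
    hits = 0
    for x, label in zip(X, y):
        nearest = sorted(points, key=lambda p: abs(p[0] - x))[:k]
        vote = 1 if 2 * sum(lab for _, lab in nearest) > len(nearest) else 0
        hits += (vote == label)
    return hits / len(X)

def nested_cv(X, y, grid, fit, score, outer_k=5, inner_k=3, seed=0):
    """Return one held-out score per outer fold."""
    outer = k_folds(len(X), outer_k, seed)
    outer_scores = []
    for i, test in enumerate(outer):
        # Outer training set: every fold except the current test fold.
        train = [j for f, fold in enumerate(outer) if f != i for j in fold]

        def inner_score(param):
            # Same inner folds for every candidate, so the comparison is fair.
            scores = []
            for val_pos in k_folds(len(train), inner_k, seed + 1 + i):
                held = set(val_pos)
                fit_idx = [train[p] for p in range(len(train)) if p not in held]
                val_idx = [train[p] for p in val_pos]
                model = fit([X[j] for j in fit_idx], [y[j] for j in fit_idx], param)
                scores.append(score(model, [X[j] for j in val_idx],
                                    [y[j] for j in val_idx]))
            return mean(scores)

        # Tune on inner folds only, then refit on the full outer training set.
        best = max(grid, key=inner_score)
        model = fit([X[j] for j in train], [y[j] for j in train], best)
        outer_scores.append(score(model, [X[j] for j in test],
                                  [y[j] for j in test]))
    return outer_scores

# Synthetic data (assumption): one feature, two balanced classes.
rng = random.Random(42)
y = [i % 2 for i in range(60)]
X = [rng.gauss(2.0 * label, 1.0) for label in y]

scores = nested_cv(X, y, grid=[1, 3, 5], fit=fit_knn, score=accuracy)
print(round(mean(scores), 3))
```

To compare candidate algorithms in this style, one would run `nested_cv` once per algorithm with its own `fit` function and hyperparameter grid, then compare the mean outer-fold scores.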

RESULTS:

LASSO was selected over RF because it slightly outperformed RF on the area under the receiver operating characteristic curve (AUROC) and was judged to be more interpretable.
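For readers less familiar with the comparison metric, the area under the receiver operating characteristic curve equals the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it directly from that definition. The function name `auroc`, the label coding (1 = resistant, 0 = susceptible), and the example values are assumptions for illustration, not study data.

```python
def auroc(labels, scores):
    """AUROC via pairwise comparison of positive and negative scores."""
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    # A positive outranking a negative counts 1, a tie counts 0.5.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Illustrative values only: 1 = resistant, 0 = susceptible (assumed coding).
example_labels = [0, 0, 1, 1]
example_scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(example_labels, example_scores))  # 0.75
```

The pairwise form is quadratic in the number of cases, which is fine for a sketch; rank-based implementations in statistical libraries give the same value more efficiently.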

CONCLUSIONS:

This predictive algorithm for AMR among patients with uUTI improves upon existing models because the Optum EHR dataset is larger than the datasets used in other published models in the United States, enabling greater statistical power and generalizability.

Code

MSR41

Topic

Methodological & Statistical Research

Topic Subcategory

Artificial Intelligence, Machine Learning, Predictive Analytics

Disease

No Additional Disease & Conditions/Specialized Treatment Areas

ISPOR Europe 2022

6 - 9 November