Using Machine Learning to Estimate Perceptions and Future Prescribing Intentions of Health Care Providers: A Novel Application of Integrated Primary and Secondary Data in Understanding HCP Decision-Making
Author(s)
Amod Athavale, BS, MS, PhD, Brittany Smith, MPH, MBA, Jonathan Jenkins, MS, MA
Trinity Life Sciences, Waltham, MA, USA
OBJECTIVES: This study is designed to demonstrate a novel application of machine learning to estimate healthcare provider (HCP) perceptions and future prescribing intentions, enriching administrative claims data with survey data.
METHODS: A representative sample of HCPs from a large US claims database was surveyed on their perceptions (e.g., preference for treatment attributes, impact of practice setting on decisions) and intended prescribing behaviors towards pipeline therapies within a specific therapeutic area. Survey responses were linked to longitudinal prescribing records in the claims data using National Provider Identifiers (NPIs). Machine learning models (Random Forest, AdaBoost, GBM, XGBoost, LightGBM, CatBoost) were trained on this linked dataset to classify HCPs on each survey metric, using metrics derived from historical claims data as predictors. Model performance was assessed on a hold-out test set using accuracy, precision, recall, and F1-score. The final model for each survey metric was then applied to the full claims database to predict perceptions and intentions for non-surveyed HCPs.
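The linkage and evaluation steps described above can be illustrated with a minimal sketch. It is not the study's implementation: the NPIs, claims features, and predictions are hypothetical placeholders, the survey metric is assumed to be binary (the abstract does not state the label structure), and the function names `link_by_npi` and `binary_metrics` are introduced here for illustration only. The predictions stand in for the hold-out output of any of the classifiers named above.

```python
from typing import Dict, List, Sequence, Tuple


def link_by_npi(
    survey: Dict[str, int], claims: Dict[str, List[float]]
) -> Dict[str, Tuple[List[float], int]]:
    """Inner-join survey labels to claims features on NPI.

    HCPs without a claims record are dropped from the linked dataset.
    """
    return {npi: (claims[npi], label) for npi, label in survey.items() if npi in claims}


def binary_metrics(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for a binary label (assumed binary)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical survey labels (1 = intends to prescribe) keyed by NPI.
survey = {"1000000001": 1, "1000000002": 0, "1000000003": 1}
# Hypothetical claims-derived features; the third surveyed HCP has no claims record.
claims = {"1000000001": [12.0, 0.4], "1000000002": [3.0, 0.1]}
linked = link_by_npi(survey, claims)
print(sorted(linked))  # NPIs present in both sources

# Placeholder hold-out labels and model predictions for eight HCPs.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
print(binary_metrics(y_true, y_pred))
```

With these placeholder values there are three true positives, three true negatives, one false positive, and one false negative, so all four metrics equal 0.75. For a survey metric with more than two response categories, per-class or averaged variants of precision, recall, and F1 would be needed instead.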
RESULTS: This is an ongoing study, and results are expected in advance of the conference. Model performance metrics and descriptive statistics for predicted survey metrics will be reported. Associations between predicted perceptions/intentions and observed prescribing patterns in the full claims dataset will be assessed to evaluate the validity of the predictions. Further analyses will explore the distribution of predicted perceptions and intentions across HCP subgroups defined by practice setting and past prescribing behavior.
CONCLUSIONS: Administrative claims databases are a valuable source of real-world data for health outcomes research, but they lack information on HCP perceptions and prescribing intentions, which is often critical to understanding and characterizing observed behaviors. This approach enriches claims data with estimated HCP perceptions and future intentions. It has the potential to elucidate the rationale behind HCP decision-making and to help improve the effectiveness of medical education.
Conference/Value in Health Info
2025-05, ISPOR 2025, Montréal, Quebec, CA
Value in Health, Volume 28, Issue S1
Code
MSR70
Topic
Methodological & Statistical Research
Topic Subcategory
Artificial Intelligence, Machine Learning, Predictive Analytics
Disease
No Additional Disease & Conditions/Specialized Treatment Areas