ESTIMATING HETEROGENEOUS TREATMENT EFFECTS FOR PHARMACOEPIDEMIOLOGIC STUDIES: A SCOPING REVIEW FOR CAUSAL FOREST MODELS
Author(s)
Yinan Wang, PhD, MPP, Zhengxuan Li, Jieni Li, PhD, MPH, Rajender R. Aparasu, PhD, FAPhA.
University of Houston, Houston, TX, USA.
OBJECTIVES: Causal forest (CF) models are increasingly used to evaluate heterogeneous treatment effects (HTEs) in healthcare. This study aimed to comprehensively review the applications of CF models in pharmacoepidemiologic research.
METHODS: We conducted a systematic literature search in PubMed and Embase to identify peer-reviewed studies published through December 2025. We used the search terms "causal forest", "causal random forest", "causal machine learning", and "generalized random forest" to identify English-language studies that used the CF algorithm and evaluated or compared pharmaceutical interventions. We excluded conference abstracts, reviews, brief reports, and commentaries. Data extraction focused on the data source, model construction, and key findings.
RESULTS: Of the 261 unique studies screened, 61 full-text articles were assessed for eligibility, and 35 studies met the inclusion criteria. Sixteen studies (45.7%) used data from randomized controlled trials, and 17 studies (48.6%) used claims or electronic health record data and applied propensity score methods to balance baseline characteristics. The most common implementation (48.6%) used the “grf” package with hyperparameters tuned by cross-validation, followed by “EconML”. The presentation of results varied considerably across studies. About two-thirds of studies (65.7%) reported a ranking of the features that drive HTE. Fewer than half (45.7%) developed patient subgroups based on estimated individual treatment effects, and only 11 (31.4%) presented the most representative causal tree to define feature-based patient subgroups. Thirteen studies (37.1%) did not evaluate model performance, and six (17.1%) evaluated it using calibration.
CONCLUSIONS: CF applications in pharmacoepidemiology vary widely in model development and in the presentation of HTE findings. More work is needed on how to select the best model and how to identify informative subgroups when evaluating HTE.
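The percentages in RESULTS can be checked against the 35 included studies. The sketch below is only an illustrative arithmetic check, not part of the authors' analysis: it uses the counts stated explicitly in the abstract (16, 17, 11, 13, and 6), and the labels and the helper name `pct` are hypothetical.

```python
# Illustrative check of the reported proportions; not the authors' code.
TOTAL_INCLUDED = 35  # studies meeting the inclusion criteria (from RESULTS)

# Only counts that the abstract states explicitly are listed here.
reported_counts = {
    "randomized controlled trial data": 16,
    "claims or electronic health record data": 17,
    "presented the most representative tree": 11,
    "no model performance evaluation": 13,
    "calibration-based evaluation": 6,
}


def pct(count, total=TOTAL_INCLUDED):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


percentages = {label: pct(n) for label, n in reported_counts.items()}

for label, value in percentages.items():
    print(f"{label}: {value}%")
```

Rounding to one decimal place throughout keeps the figures consistent with each other, which is why six of 35 studies is reported as 17.1% rather than 17.14%.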
Conference/Value in Health Info
2026-05, ISPOR 2026, Philadelphia, PA, USA
Value in Health, Volume 29, Issue S6
Code
MSR97
Topic
Methodological & Statistical Research
Topic Subcategory
Artificial Intelligence, Machine Learning, Predictive Analytics
Disease
No Additional Disease & Conditions/Specialized Treatment Areas