Identifying Treatment Patterns for Diffuse Large B-Cell Lymphoma in Real-World Data Using Unsupervised Machine Learning

Author(s)

Wang Y1, Vanness D2
1Pennsylvania State University, State College, PA, USA, 2Pennsylvania State University, University Park, PA, USA

OBJECTIVES: Because many hematological oncology treatments delivered in practice do not precisely match treatment guidelines, researchers cannot rely on guidelines alone to identify treatment patterns and detect switches in therapy observed in real-world data. We explore whether unsupervised machine learning may be useful for identifying treatment patterns for patients with diffuse large B-cell lymphoma (DLBCL).

METHODS: Using 2007-2022 electronic health record data (TriNetX), we identified 7,321 patients with DLBCL (ICD-10-CM diagnosis codes C83.3) whose initial therapy was not one used exclusively in the second line. We identified 30 drugs used for DLBCL treatment from National Comprehensive Cancer Network (NCCN) and American Cancer Society (ACS) guidance and the literature. For each patient, drugs delivered within a 7-day window were grouped into multi-drug encounters. Multiple Correspondence Analysis (MCA) identified dimensions comprising weighted combinations of drugs co-occurring within encounters. Mini-Batch K-Means then clustered encounters on their MCA dimensions. We used the Bayesian Information Criterion (BIC) to determine the optimal number of clusters. Sensitivity analyses varied the time window for grouping drugs, the number of MCA dimensions, and the criterion for selecting the optimal number of clusters.
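
The encounter-grouping step could be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the 7-day window is anchored, so the sketch assumes that an encounter opens at the earliest ungrouped administration and absorbs every administration dated fewer than 7 days later. The function names (`group_encounters`, `indicator_matrix`) and the patient records are hypothetical.

```python
from datetime import date, timedelta


def group_encounters(administrations, window_days=7):
    """Group one patient's (date, drug) records into multi-drug encounters.

    Assumption (not stated in the abstract): an encounter opens at the
    earliest ungrouped administration and absorbs every administration
    dated fewer than `window_days` days after that opening date.
    """
    encounters = []
    window_start = None
    for day, drug in sorted(administrations):
        # Open a new encounter when the current window has elapsed.
        if window_start is None or day - window_start >= timedelta(days=window_days):
            window_start = day
            encounters.append({"start": day, "drugs": set()})
        encounters[-1]["drugs"].add(drug)
    return encounters


def indicator_matrix(encounters, drug_list):
    """Build one row per encounter with one 0/1 column per drug."""
    return [[int(d in enc["drugs"]) for d in drug_list] for enc in encounters]


# Hypothetical administration records for a single patient.
records = [
    (date(2020, 1, 1), "rituximab"),
    (date(2020, 1, 2), "cyclophosphamide"),
    (date(2020, 1, 2), "doxorubicin"),
    (date(2020, 1, 22), "rituximab"),
    (date(2020, 1, 22), "cyclophosphamide"),
    (date(2020, 6, 1), "gemcitabine"),
    (date(2020, 6, 5), "oxaliplatin"),
]

drugs = sorted({drug for _, drug in records})
encounters = group_encounters(records)
matrix = indicator_matrix(encounters, drugs)

for enc, row in zip(encounters, matrix):
    print(enc["start"], sorted(enc["drugs"]), row)
```

An encounter-by-drug indicator matrix of this kind is the sort of categorical input that MCA accepts; the dimension reduction and clustering steps themselves are not shown here. Changing `window_days` corresponds to the time-window sensitivity analysis described above.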

RESULTS: Our base case approach identified meaningful treatment patterns that distinguished between recognizable first-line and second-line therapies. Reducing the number of MCA dimensions or expanding the drug grouping window reduced the optimal number of clusters, undesirably assigning some recognized first-line and second-line therapies to the same cluster. Replacing BIC with the Akaike Information Criterion (AIC) yielded results similar to the base case in some scenarios but dramatically increased the optimal number of clusters in others.
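
The tendency of AIC to select more clusters than BIC is consistent with the standard definitions of the two criteria, shown below. The abstract does not say how the likelihood was defined for the K-Means solutions, so the symbols are generic assumptions: p is the number of free parameters (which grows with the number of clusters), n is the number of observations (presumably encounters), and \hat{L} is the maximized likelihood.

```latex
\begin{align}
\mathrm{AIC} &= 2p - 2\ln\hat{L} \\
\mathrm{BIC} &= p\ln n - 2\ln\hat{L} \\
\mathrm{BIC} - \mathrm{AIC} &= p\,(\ln n - 2) > 0
  \quad\text{whenever } n > e^{2} \approx 7.39
\end{align}
```

For any realistically sized sample, BIC therefore charges more per added parameter than AIC, so minimizing AIC can favor solutions with more clusters.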

CONCLUSIONS: Unsupervised machine learning is a promising approach for identifying meaningful treatment patterns. However, results are sensitive to the choice of learning parameters, which requires careful consideration.

Conference/Value in Health Info

2023-05, ISPOR 2023, Boston, MA, USA

Value in Health, Volume 26, Issue 6, S2 (June 2023)

Code

MSR55

Topic

Methodological & Statistical Research

Topic Subcategory

Artificial Intelligence, Machine Learning, Predictive Analytics

Disease

No Additional Disease & Conditions/Specialized Treatment Areas
