Modeling Price Premiums of Oncology Drugs in Germany: A Cross-Validated Analysis Using XGBoost
Author(s)
Federico Felizzi, BSc, MS, PhD¹, Volkan Beykoz, MSc².
¹Global HEOR Franchise Lead - Oncology, Menarini, Zurich, Switzerland; ²Menarini Stemline, Zurich, Switzerland.
OBJECTIVES: Pharmaceutical pricing in Germany is tightly linked to the benefit assessments conducted by the G-BA, based on IQWiG evaluations. In oncology, where clinical and economic stakes are high, anticipating how such assessments influence pricing can support evidence-based negotiation strategies.
METHODS: We used a dataset from the AMNOG Monitor, restricted to oncology drugs with completed G-BA assessments. The model used eight features: scores for four clinical domains (mortality, morbidity, quality of life, and safety), the mean price of active comparators at launch, orphan drug status, the G-BA's probability of additional benefit, and the highest accepted study design (e.g., RCT, indirect comparison). The outcome was the log-transformed price premium. An XGBoost regressor (XGBRegressor) with the linear booster was trained with L1 and L2 regularization to reduce overfitting. Model performance was evaluated via stratified 5-fold cross-validation, which preserves the distribution of price premiums across folds and so mitigates sampling bias in a modestly sized dataset.
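The abstract does not say how folds were stratified on a continuous outcome. One common approach, shown here as a minimal sketch and not as the authors' implementation, is to bin the outcome into quantiles and spread each bin evenly across folds. The function names (`stratified_folds`, `cross_validated_r2`), the number of bins, the choice to pool out-of-fold predictions before computing R², and the `MeanModel` placeholder are all assumptions made for illustration.

```python
import random
from typing import Callable, List, Sequence


def stratified_folds(y: Sequence[float], k: int = 5, n_bins: int = 5,
                     seed: int = 0) -> List[int]:
    """Assign each sample a fold id in [0, k) so each fold spans the range of y."""
    rng = random.Random(seed)
    order = sorted(range(len(y)), key=lambda i: y[i])  # indices sorted by outcome
    folds = [0] * len(y)
    bin_size = -(-len(y) // n_bins)  # ceiling division
    for start in range(0, len(y), bin_size):
        members = order[start:start + bin_size]  # one quantile bin
        rng.shuffle(members)
        offset = rng.randrange(k)  # avoid always favouring fold 0
        for j, idx in enumerate(members):
            folds[idx] = (j + offset) % k
    return folds


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def cross_validated_r2(X: Sequence[Sequence[float]], y: Sequence[float],
                       make_model: Callable[[], object], k: int = 5,
                       n_bins: int = 5, seed: int = 0) -> float:
    """Pooled out-of-fold R^2; make_model() must return an object with fit/predict."""
    folds = stratified_folds(y, k=k, n_bins=n_bins, seed=seed)
    oof = [0.0] * len(y)  # out-of-fold predictions
    for f in range(k):
        train = [i for i in range(len(y)) if folds[i] != f]
        test = [i for i in range(len(y)) if folds[i] == f]
        model = make_model()
        model.fit([X[i] for i in train], [y[i] for i in train])
        preds = model.predict([X[i] for i in test])
        for i, p in zip(test, preds):
            oof[i] = float(p)
    return r2_score(y, oof)


class MeanModel:
    """Placeholder baseline (hypothetical): always predicts the training mean."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


if __name__ == "__main__":
    # Synthetic data only; no values from the AMNOG dataset are used here.
    rng = random.Random(1)
    X_demo = [[rng.random() for _ in range(8)] for _ in range(50)]
    y_demo = [sum(row) + rng.gauss(0, 0.1) for row in X_demo]
    print(cross_validated_r2(X_demo, y_demo, MeanModel))
```

With xgboost installed and the inputs converted to NumPy arrays, `make_model` could instead return `xgboost.XGBRegressor(booster="gblinear", reg_alpha=..., reg_lambda=...)`, where `reg_alpha` and `reg_lambda` are the L1 and L2 penalty strengths. The abstract does not report the values used, so they are left unspecified here.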
RESULTS: The model achieved a cross-validated R² of 0.547, indicating moderate predictive performance. Feature importance analysis showed that mortality and morbidity scores, along with comparator price and orphan status, were the most influential variables. Clinical endpoints related to quality of life and safety contributed less to price prediction, while study design quality and additional benefit probability had moderate but consistent impact.
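For reference, the coefficient of determination is conventionally defined as below. The abstract does not state whether R² was computed on pooled out-of-fold predictions or averaged over the five folds; the pooled form is assumed here, with $y_i$ the observed log-transformed price premium, $\hat{y}_i$ the prediction for drug $i$ from the model that did not see it during training, and $\bar{y}$ the mean of the observed outcomes.

```latex
R^2 = 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}
               {\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2}
```

Under this convention, a value of 0.547 means the held-out predictions reduce the squared error by about 55% relative to always predicting the mean log price premium.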
CONCLUSIONS: This analysis shows that structured clinical features can help anticipate negotiated oncology drug prices in Germany. Combining regularized gradient boosting with stratified cross-validation gives a more stable estimate of model performance than a single train-test split, because every observation is held out for validation exactly once. While the model captures meaningful patterns, its moderate predictive accuracy also highlights limitations in the available data. To improve performance, future research should explore alternative modeling approaches that may better capture complex interactions and latent drivers of pricing decisions.
Conference/Value in Health Info
2025-11, ISPOR Europe 2025, Glasgow, Scotland
Value in Health, Volume 28, Issue S2
Code
MSR149
Topic
Economic Evaluation, Methodological & Statistical Research
Disease
Oncology