Cartography of Biostatistics and Machine Learning Methods to Identify Prognostic Factors

Author(s)

Mounier L, Civet A, Dupin J, Pau D, Esnault C
Roche, Boulogne-Billancourt, 92, France

OBJECTIVES:

With the emergence of machine learning (ML), new opportunities arise for the identification of prognostic factors. Although review articles exist on biostatistical methods for identifying prognostic factors, the opportunities offered by ML algorithms are rarely considered. The overall purpose is to gather the literature and map all methods from these two fields that can be applied to identify prognostic factors.

METHODS:

A literature review based on relevant keywords was performed in Google Scholar and PubMed. The criteria used for selecting methodological papers included the date of publication and the number of citations. An iterative selection process was then conducted to deepen the search and identify new keywords, leading to more specific papers.

RESULTS:

Fifteen papers published after 2010 were selected from the literature to create a map covering three fields: feature extraction, feature selection, and subgroup discovery. For independent features, feature selection methods fall into three families: (1) filter (e.g. univariate and multivariate analysis), (2) wrapper (based on e.g. sequential search, random search, or exponential search), and (3) embedded (e.g. lasso, ridge, elastic net). These methods can also be combined into hybrid approaches. Dedicated methods exist for structured features. Feature extraction includes methods that transform the variables or the dataset; to identify prognostic factors, it must be paired with interpretative methods. Finally, subgroup discovery includes many exploratory data mining techniques that uncover patterns associated with an outcome.
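As an illustration of the difference between a filter and an embedded method, the following is a minimal sketch and not one of the reviewed implementations. It assumes a continuous outcome and synthetic data in which only one candidate feature is informative; the feature names (`x0`, `x1`, `x2`), the penalty value `lam=0.5`, and all helper functions are hypothetical. The filter step ranks features by univariate correlation with the outcome, while the embedded step fits a lasso by coordinate descent and keeps the features with nonzero coefficients. Wrapper methods are omitted for brevity.

```python
import random


def center(values):
    """Subtract the mean so that no intercept term is needed."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def pearson(x, y):
    """Pearson correlation of two centered, equal-length lists."""
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    return sxy / (sxx * syy) ** 0.5


def filter_rank(features, y):
    """Filter: rank features by absolute univariate correlation with y."""
    scores = {name: abs(pearson(col, y)) for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)


def soft_threshold(value, lam):
    """Shrink value toward zero by lam; values within [-lam, lam] become 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_coefficients(features, y, lam, n_sweeps=50):
    """Embedded: lasso fitted by cyclic coordinate descent.

    Minimizes (1 / 2n) * ||y - X b||^2 + lam * ||b||_1 on centered data.
    """
    n = len(y)
    beta = {name: 0.0 for name in features}
    residual = list(y)
    for _ in range(n_sweeps):
        for name, col in features.items():
            # Covariance of the feature with the residual that excludes it.
            rho = sum(c * (r + beta[name] * c) for c, r in zip(col, residual)) / n
            scale = sum(c * c for c in col) / n
            updated = soft_threshold(rho, lam) / scale
            shift = updated - beta[name]
            residual = [r - shift * c for r, c in zip(residual, col)]
            beta[name] = updated
    return beta


# Synthetic data (assumption): only x0 drives the outcome.
rng = random.Random(0)
n_patients = 200
raw = {name: [rng.gauss(0, 1) for _ in range(n_patients)]
       for name in ("x0", "x1", "x2")}
outcome = [2.0 * a + rng.gauss(0, 0.5) for a in raw["x0"]]

features = {name: center(col) for name, col in raw.items()}
y = center(outcome)

ranking = filter_rank(features, y)
beta = lasso_coefficients(features, y, lam=0.5)
selected = [name for name, b in beta.items() if b != 0.0]

print("filter ranking:", ranking)
print("lasso-selected:", selected)
```

The filter scores each feature in isolation and leaves the cut-off to the analyst, whereas the lasso performs selection as part of model fitting: the penalty sets the coefficients of uninformative features exactly to zero. For time-to-event outcomes, which are common in prognostic studies, the squared-error loss would have to be replaced by a suitable survival loss.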

CONCLUSIONS:

This research gives an overview of all existing approaches to identifying prognostic factors, in both biostatistics and ML, and highlights the great diversity of these approaches. This cartography should help data experts go beyond what is usually done in studies.

Conference/Value in Health Info

2022-11, ISPOR Europe 2022, Vienna, Austria

Value in Health, Volume 25, Issue 12S (December 2022)

Code

MSR99

Topic

Methodological & Statistical Research, Study Approaches

Topic Subcategory

Artificial Intelligence, Machine Learning, Predictive Analytics, Literature Review & Synthesis

Disease

No Additional Disease & Conditions/Specialized Treatment Areas
