Limitations and Opportunities for Identifying Outcome Prognostic Factors in the Context of Small Samples
Author(s)
Mounier L, Civet A, Pau D, Dupin J, Esnault C
Roche, Boulogne-Billancourt, 92, France
OBJECTIVES:
Identifying prognostic factors is essential for advancing clinical research, but studies of rare diseases or genomic mutations are limited by small numbers of patients, for which current methods can be difficult to apply. This work addresses the need for a synthesis of the limitations and opportunities of up-to-date statistical and machine learning methods in the context of small samples.
METHODS:
A literature review based on keywords relevant to small sample sizes was performed on Google Scholar and PubMed. Methodological papers were selected according to date of publication and number of citations. An iterative selection process was used to deepen the search and identify new keywords, leading to more specific publications.
RESULTS:
The following limitations of small sample sizes were identified in the relevant literature: the precision and stability of results, and high dimensionality. Solutions to the problem of unreliable estimates include methods that avoid using the same data in the training and test phases, sampling methods (e.g. bootstrapping), penalization methods, ensemble methods, and the use of external information (e.g. cross-referencing of multiple data sources, a priori knowledge in Bayesian analysis). Finally, for dealing with high dimensionality, feature extraction, feature selection, and sparsity-based methods (Sparse (Group) PLS, Sparse (Group) Lasso) appear better suited.
CONCLUSIONS:
This synthesis shows that specific statistical and machine learning methods for identifying prognostic factors exist, with the potential to overcome each type of limitation associated with small sample sizes. However, further research is required to compare them more thoroughly and to investigate innovative solutions.
Conference/Value in Health Info
2022-11, ISPOR Europe 2022, Vienna, Austria
Value in Health, Volume 25, Issue 12S (December 2022)
Code
MSR129
Topic
Methodological & Statistical Research, Study Approaches
Topic Subcategory
Artificial Intelligence, Machine Learning, Predictive Analytics, Confounding, Selection Bias Correction, Causal Inference, Literature Review & Synthesis
Disease
No Additional Disease & Conditions/Specialized Treatment Areas