Overfitting Mitigation in Neural Networks: Could Ergodic Regularization Be the Future of AI?
Author(s)
Yovani Torres Favier, BSc, MSc1, Antonio Monleon, BSc, MSc, PhD2, Carlos Crespo, BEc, BSc, MASc, MBA, MFE, MSc, PhD1.
1Axentiva Solutions, Barcelona, Spain, 2University of Barcelona, Barcelona, Spain.
OBJECTIVES: Overfitting in artificial neural networks (ANNs), where models learn noise instead of true signal, poses significant challenges in biomedical applications with limited sample sizes. Traditional regularization methods such as weight decay and dropout often prove insufficient for complex medical datasets, where robust generalization is critical for accelerating patient access. This study aimed to develop a novel theoretical framework, based on ergodic principles, that prevents overfitting.
METHODS: We established conditions under which ANNs become ergodic transformations, ensuring that generalization gaps vanish. We derived an "ergodic regularizer" that penalizes deviations from volume preservation by controlling the Jacobian determinant of network layers. We evaluated this approach across multilayer perceptron, convolutional, recurrent, and transformer architectures on the Wisconsin Breast Cancer and diabetes datasets, comparing against standard L2 weight decay and dropout regularization. Analyses were implemented in Python.
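The abstract does not specify the exact form of the ergodic regularizer. The minimal Python sketch below assumes a penalty of the form sum over square linear layers of (log |det W_l|)^2, which is zero exactly when each such layer preserves volume (|det J| = 1); the names ergodic_penalty and lambda_erg are hypothetical and used only for illustration, not the authors' implementation.

    # Sketch of a volume-preservation ("ergodic") penalty under the assumption
    # that the regularizer targets the log-determinant of square linear layers.
    import torch
    import torch.nn as nn

    def ergodic_penalty(model: nn.Module) -> torch.Tensor:
        """Penalize deviation of each square linear layer from volume
        preservation (|det W| = 1), using slogdet for numerical stability."""
        penalty = torch.zeros(())
        for module in model.modules():
            if isinstance(module, nn.Linear) and module.weight.shape[0] == module.weight.shape[1]:
                _, logabsdet = torch.linalg.slogdet(module.weight)
                penalty = penalty + logabsdet.pow(2)
        return penalty

    # Usage: add the penalty to the task loss with a tunable weight lambda_erg.
    model = nn.Sequential(nn.Linear(30, 30), nn.Tanh(), nn.Linear(30, 30))
    x, y = torch.randn(16, 30), torch.randn(16, 30)
    lambda_erg = 1e-2  # hypothetical regularization strength
    loss = nn.functional.mse_loss(model(x), y) + lambda_erg * ergodic_penalty(model)
    loss.backward()

In this sketch the penalty is driven to zero only when every square weight matrix has unit absolute determinant, which is one way to encode the volume-preservation property described in the abstract; the full method as described also covers non-square and non-linear layers, which this illustration does not attempt to reproduce.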
RESULTS: Across all tested architectures, ergodic regularization consistently outperformed traditional methods. In breast cancer feature reconstruction, the ergodic approach achieved 38% lower test error than standard regularization. In diabetes density estimation, ergodic regularization produced significantly higher log-likelihood scores, indicating better capture of the underlying data distribution without overfitting. The method was particularly strong in scenarios with limited training data.
CONCLUSIONS: Ergodic regularization proved an effective approach to mitigating overfitting across a variety of machine learning problems. By enforcing volume-preservation properties, the method ensures that ANNs learn genuine data patterns rather than dataset-specific artifacts, showing particular promise for real-world scientific applications where robust generalization is essential for clinical reliability and for impact on the drug discovery pipeline.
Conference/Value in Health Info
2025-11, ISPOR Europe 2025, Glasgow, Scotland
Value in Health, Volume 28, Issue S2
Code
MSR161
Topic
Methodological & Statistical Research, Real World Data & Information Systems, Study Approaches
Topic Subcategory
Artificial Intelligence, Machine Learning, Predictive Analytics, Confounding, Selection Bias Correction, Causal Inference
Disease
No Additional Disease & Conditions/Specialized Treatment Areas