Applying Machine Learning Techniques to Identify Undiagnosed Patients with Nonalcoholic Steatohepatitis (NASH)

Author(s)

Baser O1, Mete F2, Yapar N2, Baser E3
1City University of New York, New York, NY, USA, 2Columbia Data Analytics, New York, NY, USA, 3Columbia Data Analytics, New York, UNITED STATES

OBJECTIVES: Nonalcoholic steatohepatitis (NASH) is liver inflammation and damage caused by a buildup of fat in the liver. NASH is underdiagnosed because patients are often asymptomatic or present with non-specific symptoms. This study aimed to develop a machine learning model that identifies patients in a veterans' health system who likely have NASH but are undiagnosed.

METHODS: Machine learning models were built with scikit-learn, a Python machine learning module. The study population was selected from Veterans Health Administration data and consisted of patients with NASH-prone conditions. Patients were labeled with 150 condition category flags and split into actual positive NASH cases, actual negative NASH cases, and unlabeled cases. The study population was then randomly divided into a training subset and a testing subset. The training subset was used to fit 30 candidate models and to select the highest performing model, and the testing subset was used to evaluate the performance of the best machine learning model.
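The workflow described in the methods (random train/test split, comparison of candidate models on the training subset, evaluation of the selected model on the testing subset, then scoring of unlabeled cases) can be sketched with scikit-learn roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the data are synthetic, only two candidate models stand in for the 30 described, and the use of cross-validated F1 as the selection criterion, the 70/30 split, and the 0.5 probability cutoff are all assumptions that the abstract does not specify.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in for the cohort: 150 binary condition-category flags.
# (Assumption: real flags would come from diagnosis codes, not random data.)
X_cont, y = make_classification(
    n_samples=2000, n_features=150, n_informative=20,
    weights=[0.9, 0.1], random_state=0,
)
X = (X_cont > 0).astype(int)

# Treat the last 500 rows as "unlabeled" cases (their labels are never used).
X_labeled, y_labeled = X[:1500], y[:1500]
X_unlabeled = X[1500:]

# Random split of the labeled cases into training and testing subsets.
X_train, X_test, y_train, y_test = train_test_split(
    X_labeled, y_labeled, test_size=0.3, stratify=y_labeled, random_state=0,
)

# Hypothetical candidate models; the abstract does not list the 30 it compared.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Select the best candidate using the training subset only (assumed criterion: F1).
cv_scores = {
    name: cross_val_score(model, X_train, y_train, cv=3, scoring="f1").mean()
    for name, model in candidates.items()
}
best_name = max(cv_scores, key=cv_scores.get)
best_model = candidates[best_name].fit(X_train, y_train)

# Evaluate the selected model once on the held-out testing subset.
y_pred = best_model.predict(X_test)
metrics = {
    "precision": precision_score(y_test, y_pred, zero_division=0),
    "recall": recall_score(y_test, y_pred, zero_division=0),
    "accuracy": accuracy_score(y_test, y_pred),
}

# Score unlabeled cases and count those above the assumed 0.5 cutoff.
unlabeled_prob = best_model.predict_proba(X_unlabeled)[:, 1]
likely_nash = int((unlabeled_prob >= 0.5).sum())

print(best_name, metrics, likely_nash)
```

Keeping model selection inside the training subset, and touching the testing subset only once, is what lets the reported test metrics serve as an unbiased estimate of performance on new patients.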

RESULTS: The study population consisted of 30,415 actual positive NASH cases, 265,965 actual negative NASH cases, and 181,375 unlabeled cases. In the best performing model, the precision, recall, and accuracy were 0.90, 0.82, and 0.88, respectively. The best performing model estimated that the number of patients likely to have NASH was about 6 times the number of patients directly identified as NASH-positive through a claims analysis in the study population. The most important features in assigning NASH probability were the presence or absence of diagnosis codes related to obesity or diabetes.
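For readers less familiar with the reported metrics, the sketch below shows how precision, recall, and accuracy are computed from confusion-matrix counts, and derives the cohort size and the positive share among labeled cases from the figures above. The confusion-matrix counts are hypothetical, chosen only to illustrate the formulas; they are not the study's counts and do not reproduce its reported values.

```python
def precision(tp: int, fp: int) -> float:
    """Share of predicted positives that are true positives."""
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    """Share of actual positives that the model identifies."""
    return tp / (tp + fn)


def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Share of all cases that are classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


# Hypothetical confusion-matrix counts (illustration only, not study data).
tp, fp, fn, tn = 80, 20, 20, 880
example = {
    "precision": precision(tp, fp),
    "recall": recall(tp, fn),
    "accuracy": accuracy(tp, fp, fn, tn),
}

# Cohort figures as reported in the abstract.
positives, negatives, unlabeled = 30_415, 265_965, 181_375
cohort_total = positives + negatives + unlabeled
positive_share_of_labeled = positives / (positives + negatives)

print(example)
print(cohort_total, round(positive_share_of_labeled, 3))
```

Because positives make up only about a tenth of the labeled cases, accuracy alone can look high even for a weak model, which is why precision and recall are reported alongside it.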

CONCLUSIONS: The prevalence of NASH is increasing, but more concerning is the disproportionate increase in patients with advanced fibrosis, hepatocellular carcinoma, and hepatic decompensation. In the United States, NASH is currently the leading indication for liver transplant in women and in those over 65 years of age. Machine learning techniques can help identify undiagnosed patients so that upcoming treatments can be applied broadly to delay disease progression.

Conference/Value in Health Info

2023-05, ISPOR 2023, Boston, MA, USA

Value in Health, Volume 26, Issue 6, S2 (June 2023)

Code

MSR39

Topic

Methodological & Statistical Research

Topic Subcategory

Artificial Intelligence, Machine Learning, Predictive Analytics

Disease

Urinary/Kidney Disorders
