ASSESSING THE CONCORDANCE BETWEEN SELF-REPORTED NICOTINE PRODUCT USE AND EMR DIAGNOSIS CODES IN THE ALL OF US RESEARCH PROGRAM
Author(s)
David Goldfarb, PhD, MPH1, Mary Catherine Minnig, PhD, MS1, Shivani Aggarwal, PhD, MS2.
1Landmark Science, Inc, New York, NY, USA, 2Landmark Science, Inc, Los Angeles, CA, USA.
OBJECTIVES: Ascertaining nicotine/tobacco product use (e.g., cigarettes, e-cigarettes, smokeless tobacco) is critical for studying respiratory- and cancer-related outcomes but can be difficult to measure using secondary data sources. Standardized ontologies capturing nicotine/tobacco use are routinely used but have limited granularity. We evaluated concordance of nicotine/tobacco use from standardized ontologies recorded in electronic medical records (EMRs) with self-report survey data on nicotine/tobacco use using the All of Us (AoU) database.
METHODS: We conducted an observational retrospective analysis of adult (≥18 years) AoU participants with completed self-report questionnaires and linked EMR data. Nicotine/tobacco use was defined using ICD-9-CM, ICD-10-CM, and SNOMED CT concepts for nicotine/tobacco dependence. Concordance between self-reported nicotine/tobacco use and codes was evaluated, along with degree of missingness of nicotine/tobacco codes among those with self-reported use.
RESULTS: Among 571,624 patients with survey data between June 2016 and September 2023, 33,665 (5.9%) had EMR data with nicotine/tobacco diagnosis codes within one year of surveys. The most prevalent codes were “nicotine dependence, unspecified, uncomplicated” (52.9%), for which self-reported cigarette, e-cigarette, and smokeless tobacco use was 87.1%, 39.1%, and 17.7%, respectively, and “nicotine dependence, cigarette, uncomplicated” (61.4%), for which self-reported product use was similar. No codes explicitly captured e-cigarette use. Cigarette codes had 87.8-97.7% agreement with self-reported cigarette use. Agreement was poorer for other products: 27.2-51.2% of patients with non-specific nicotine/tobacco codes reported e-cigarette use, and 70.5-72.4% of patients with chewing tobacco codes reported smokeless tobacco use. Among patients with EMR data and no recorded nicotine/tobacco codes within one year of surveys (N=90,637), 30.8%, 15.5%, and 8.6% reported cigarette, e-cigarette, and smokeless tobacco use, respectively.
CONCLUSIONS: Standardized nicotine/tobacco dependence codes underestimated cigarette, e-cigarette, and smokeless tobacco use and rarely distinguished specific nicotine/tobacco products, limiting accurate product-level exposure ascertainment. Standardized ontologies alone should be used with caution to discern nicotine/tobacco exposure levels in real-world studies. Researchers should additionally consider triangulation with unstructured EMR or survey data.
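ILLUSTRATIVE SKETCH: The two measures reported above (the share of coded patients who self-reported each product, and self-reported use among patients with no nicotine/tobacco code) could be computed roughly as follows. This is a minimal sketch and not the authors' analysis code: the record layout, the function names, and the toy records are all hypothetical, and the ICD-10-CM codes F17.200 and F17.210 are used only as example labels for the "unspecified" and "cigarette" dependence codes.

```python
from collections import defaultdict

# Hypothetical product labels; the abstract reports these three product groups.
PRODUCTS = ("cigarette", "e_cigarette", "smokeless")


def agreement_by_code(records, products=PRODUCTS):
    """Percent of patients carrying each code who self-reported each product."""
    coded = defaultdict(int)                      # patients per code
    hits = defaultdict(lambda: defaultdict(int))  # code -> product -> count
    for rec in records:
        for code in rec["codes"]:
            coded[code] += 1
            for product in products:
                if product in rec["self_report"]:
                    hits[code][product] += 1
    return {
        code: {p: 100.0 * hits[code][p] / n for p in products}
        for code, n in coded.items()
    }


def self_report_without_codes(records, products=PRODUCTS):
    """Percent of patients with no nicotine/tobacco code who self-reported each product."""
    uncoded = [rec for rec in records if not rec["codes"]]
    if not uncoded:
        return {p: 0.0 for p in products}
    return {
        p: 100.0 * sum(p in rec["self_report"] for rec in uncoded) / len(uncoded)
        for p in products
    }


# Toy records for illustration only; these are not All of Us data.
toy_records = [
    {"codes": {"F17.210"}, "self_report": {"cigarette"}},
    {"codes": {"F17.210"}, "self_report": {"cigarette", "e_cigarette"}},
    {"codes": {"F17.200"}, "self_report": {"e_cigarette"}},
    {"codes": {"F17.200"}, "self_report": {"cigarette"}},
    {"codes": set(), "self_report": {"smokeless"}},
    {"codes": set(), "self_report": set()},
]

by_code = agreement_by_code(toy_records)
no_code = self_report_without_codes(toy_records)
```

In this sketch a patient with several codes contributes to each code's denominator, so per-code percentages can overlap, as the code prevalences in the results do.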
Conference/Value in Health Info
2026-05, ISPOR 2026, Philadelphia, PA, USA
Value in Health, Volume 29, Issue S6
Code
RWD98
Topic
Real World Data & Information Systems
Topic Subcategory
Health & Insurance Records Systems, Reproducibility & Replicability
Disease
SDC: Oncology, SDC: Respiratory-Related Disorders (Allergy, Asthma, Smoking, Other Respiratory), STA: Multiple/Other Specialized Treatments