EXPLORING ISSUES IN ANALYZING NATIONAL DATABASES USING LOGISTIC REGRESSION: APPLICATION OF MEDICAL EXPENDITURE PANEL SURVEY

Author(s)

Althemery AU, Lai L, Alfaifi A
Nova Southeastern University, Davie, FL, USA

OBJECTIVES: Most national data sets use a complex stratified multistage probability design, including cluster, strata, and weight adjustments, to extrapolate study results to the national level. Survey procedures are available in Statistical Analysis System (SAS) 9.4, but several issues can arise if they are not used appropriately. Moreover, no clear agreement exists on how to detect multicollinearity in logistic regression or how to generate ROC curves within these recent survey procedures. The study investigated three main issues that arise when applying logistic regression to nationally representative multistage survey data: subgroup analysis in multistage sampling design data, multicollinearity in logistic regression, and receiver operating characteristic (ROC) curves for survey procedures.
METHODS: The current study reviewed, discussed, and compared the available principles and techniques. First, results from three procedure statements for subpopulation analyses in SAS were contrasted. Second, two methods for detecting multicollinearity were applied: linear regression, and the weight matrix adjusted by the maximum likelihood algorithm. Lastly, ROC curves for survey logistic regression were generated using direct and indirect procedures. A cohort of patients diagnosed with high blood cholesterol was obtained from the 2012 Medical Expenditure Panel Survey (MEPS) and used to provide examples of the reviewed statistical techniques.
RESULTS: Analyses run without the domain statement yielded potentially overestimated point estimates and standard errors. The tolerance and variance inflation factor (VIF) values used to detect multicollinearity changed slightly after adjusting the weight matrix. However, the two methods agreed that none of the tested independent factors were collinear. ROC curves that accounted for national estimation were successfully generated and offered similar but more reliable estimates.
CONCLUSIONS: Accounting for total population weights when analyzing a subgroup in national databases is important. New methods are required for exploring multicollinearity in survey logistic regression procedures.
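
The multicollinearity comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' SAS code. It assumes the standard identity that the VIF of predictor j equals the j-th diagonal element of the inverse correlation matrix of the predictors, and it assumes that a weighted variant can be obtained by replacing the ordinary correlation matrix with a weighted one. The function name `vif`, the toy data, and the weights are hypothetical; in a logistic regression setting the weights would typically come from the fitted model rather than being supplied by hand.

```python
import numpy as np

def vif(X, weights=None):
    """Variance inflation factors for the columns of X.

    Uses VIF_j = [R^-1]_jj, where R is the correlation matrix of the
    predictors. If `weights` is given, R is the weighted correlation
    matrix (an assumption of this sketch, not the abstract's procedure).
    """
    X = np.asarray(X, dtype=float)
    cov = np.cov(X, rowvar=False, aweights=weights)
    sd = np.sqrt(np.diag(cov))
    corr = cov / np.outer(sd, sd)
    return np.diag(np.linalg.inv(corr))

# Hypothetical toy predictors: x1 and x2 are uncorrelated.
x1 = np.array([1.0, 0.0, -1.0, 0.0])
x2 = np.array([0.0, 1.0, 0.0, -1.0])
X_indep = np.column_stack([x1, x2])

# x3 is almost a linear combination of x1 and x2, so it is collinear.
x3 = x1 + x2 + np.array([0.1, -0.1, 0.1, -0.1])
X_coll = np.column_stack([x1, x2, x3])

vif_indep = vif(X_indep)
vif_coll = vif(X_coll)

# Hypothetical non-uniform weights standing in for model-based weights.
w = np.array([0.2, 0.25, 0.15, 0.1])
vif_weighted = vif(X_coll, weights=w)

print("independent:", vif_indep)
print("collinear:", vif_coll)
print("collinear, weighted:", vif_weighted)
```

A common rule of thumb treats a VIF above 10 (tolerance below 0.1) as a sign of collinearity. Comparing the unweighted and weighted values side by side mirrors the comparison reported in the RESULTS, where the two sets of diagnostics differed slightly but led to the same conclusion.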

Conference/Value in Health Info

2016-05, ISPOR 2016, Washington DC, USA

Value in Health, Vol. 19, No. 3 (May 2016)

Code

SY3

Topic

Methodological & Statistical Research, Real World Data & Information Systems

Topic Subcategory

Confounding, Selection Bias Correction, Causal Inference, Reproducibility & Replicability

Disease

Cardiovascular Disorders, Diabetes/Endocrine/Metabolic Disorders
