To assess the effect of inducing covariation among simulated high-density lipoprotein cholesterol (HDL-C), triglyceride, and total cholesterol values on Framingham risk equation results.
National Health and Nutrition Examination Survey (NHANES) data were used to estimate means and standard deviations for HDL-C, triglyceride, and total cholesterol for all patients with type 2 diabetes (N = 293) and all patients with metabolic syndrome (N = 2303). NHANES data were also used to estimate correlations between HDL-C, triglyceride, and total cholesterol. Data were simulated and bootstrapped in 1000 replications, each matching the number of patients in the corresponding NHANES sample. Four-year risks of coronary heart disease were estimated using the Framingham risk equation, which includes a nonlinear Weibull function. The differences in means with and without correlation were compared to zero to determine whether omitting correlation was associated with bias. The ratios of variances with and without correlation were compared to one to determine whether omitting correlation was associated with a different level of precision. All simulation results were compared with bootstrapping results.
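The comparison described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the study's code: the means, standard deviations, and correlations are placeholders rather than the NHANES estimates, the multivariate normal distribution is an assumption (the text does not name the sampling distribution), and `toy_risk` is a hypothetical Weibull-type function standing in for the Framingham equation, whose coefficients are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(2024)

# Placeholder inputs, NOT the NHANES estimates used in the study.
# Column order: HDL-C, triglycerides, total cholesterol (mg/dL).
means = np.array([45.0, 180.0, 210.0])
sds = np.array([12.0, 90.0, 40.0])
corr = np.array([
    [1.0, -0.4, 0.1],
    [-0.4, 1.0, 0.3],
    [0.1, 0.3, 1.0],
])

n_patients = 293  # diabetic sample size reported in the text
n_reps = 1000     # number of replications reported in the text

cov_corr = np.outer(sds, sds) * corr  # covariance with induced correlation
cov_indep = np.diag(sds ** 2)         # same variances, zero covariances


def toy_risk(x):
    """Hypothetical nonlinear risk; a stand-in, not the Framingham equation."""
    z = (x - means) / sds  # standardize each lipid value
    lp = -0.5 * z[..., 0] + 0.2 * z[..., 1] + 0.4 * z[..., 2]
    return 1.0 - np.exp(-np.exp(-2.0 + lp))  # Weibull-type transform


def replicate_means(draws):
    """Mean predicted risk within each replication (averages over patients)."""
    return toy_risk(draws).mean(axis=1)


shape = (n_reps, n_patients)
sim_corr = rng.multivariate_normal(means, cov_corr, size=shape)
sim_indep = rng.multivariate_normal(means, cov_indep, size=shape)

# Bootstrap: resample whole patient rows from one "observed" sample, which
# keeps each patient's three values together and so preserves correlation.
observed = rng.multivariate_normal(means, cov_corr, size=n_patients)
idx = rng.integers(0, n_patients, size=shape)
boot = observed[idx]

risk_corr = replicate_means(sim_corr)
risk_indep = replicate_means(sim_indep)
risk_boot = replicate_means(boot)

# Bias check: difference in means, compared to zero.
mean_diff = risk_corr.mean() - risk_indep.mean()
# Precision check: ratio of variances, compared to one.
var_ratio = risk_corr.var(ddof=1) / risk_indep.var(ddof=1)

print(f"difference in means (correlated - uncorrelated): {mean_diff:.5f}")
print(f"variance ratio (correlated / uncorrelated):      {var_ratio:.3f}")
print(f"bootstrap mean risk:                             {risk_boot.mean():.5f}")
```

Because the bootstrap resamples entire rows, it needs no correlation matrix at all, whereas the parametric simulation reproduces the dependence only if the covariance matrix is supplied explicitly.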
Bootstrapping maintained the correlation in the original data. For Framingham equations that do not include triglycerides, inducing correlation led to more precise estimates that were closer to the bootstrapped estimates. With the Framingham equation for women that includes triglycerides, the correlated simulation data produced less precise estimates than the uncorrelated data, and the uncorrelated data were more precise than the bootstrapped results.
Not inducing correlation can affect results that combine multiple simulated parameters using nonlinear functions. Researchers engaged in modeling should consider the value of inducing correlation in their simulated data.
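One way to see why correlation matters for nonlinear functions is the standard second-order Taylor (delta-method) approximation. It is offered here as a general illustration, not as a derivation from the study; the symbols $f$ (the risk function), $X$ (the vector of simulated inputs), $\mu$ (its mean), and $\Sigma$ (its covariance matrix) are introduced for this sketch only.

```latex
\begin{align}
\mathbb{E}[f(X)] &\approx f(\mu) + \frac{1}{2}\sum_{i}\sum_{j}
  \frac{\partial^2 f}{\partial x_i\,\partial x_j}\bigg|_{\mu}\,\Sigma_{ij}, \\
\operatorname{Var}[f(X)] &\approx \sum_{i}\sum_{j}
  \frac{\partial f}{\partial x_i}\bigg|_{\mu}\,
  \frac{\partial f}{\partial x_j}\bigg|_{\mu}\,\Sigma_{ij}.
\end{align}
```

Setting the off-diagonal entries $\Sigma_{ij}$ ($i \neq j$) to zero removes every cross term. Under this approximation the mean shifts whenever a mixed second derivative is nonzero, and the variance changes whenever two inputs both have nonzero first derivatives. The direction of each change depends on the signs of the derivatives and of the correlations, so omitting correlation can either overstate or understate precision depending on the equation.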