Am Heart J Plus. 2026 Jan 10;63:100719. doi: 10.1016/j.ahjo.2026.100719. eCollection 2026 Mar.
ABSTRACT
BACKGROUND: Cardiovascular diseases (CVDs) remain the leading cause of mortality worldwide, arising from complex interactions among demographic, clinical, behavioral, and social determinants. Leveraging large, nationally representative datasets such as the 2022 Behavioral Risk Factor Surveillance System (BRFSS) offers a unique opportunity to identify emerging risk patterns, monitor population-level disparities, and inform more targeted prevention strategies.
METHODS: A total of 221,643 participants from the BRFSS 2022 survey were included after excluding records with missing data. Thirteen key predictors spanning demographics, chronic conditions, and social factors were selected. Data were split into training (80%) and testing (20%) sets. Four machine learning models (XGBoost, Random Forest, Logistic Regression, and Naive Bayes) were developed and evaluated using stratified 10-fold cross-validation. Model performance was assessed via accuracy, F1 score, precision, sensitivity, and ROC AUC. Class imbalance was addressed with the Synthetic Minority Over-sampling Technique (SMOTE). SHAP (SHapley Additive exPlanations) values were used to quantify feature importance and support model interpretation.
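As an illustration of the threshold-based metrics named above, the following is a minimal sketch, not the authors' code. It assumes binary labels coded 1 (CVD) and 0 (no CVD); the function name `binary_metrics` and the toy label vectors are hypothetical and are not drawn from BRFSS.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    sensitivity: float
    f1: float


def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, sensitivity (recall), and F1 for 0/1 labels.

    Hypothetical helper for illustration; a ratio with a zero denominator
    is reported as 0.0.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Confusion-matrix counts, treating 1 as the positive (CVD) class.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return BinaryMetrics(accuracy, precision, sensitivity, f1)


# Toy labels (made up for illustration, not study data).
toy_true = [1, 1, 1, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 0, 1, 0, 0, 0, 0]
toy_metrics = binary_metrics(toy_true, toy_pred)
```

Because accuracy counts true negatives and F1 does not, the two can diverge when the positive class is rare, which is one reason to report F1, precision, and sensitivity alongside accuracy.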
RESULTS: XGBoost demonstrated the best predictive performance (accuracy 94.2%, F1 score 85.3%, ROC AUC 0.94). SHAP analysis highlighted age ≥ 65, male gender, and diabetes as the strongest predictors, with additional contributions from kidney disease, employment status, and social isolation. Protective effects were observed for never smoking and higher education. Stratified analyses revealed that while overweight/obesity (BMI ≥25) was generally associated with higher CVD prevalence, the association was attenuated in older adults, smokers, and those with diabetes or kidney disease, suggesting illness-related weight loss, frailty, and behavioral confounding. These subgroup insights contextualize the apparent "BMI paradox" observed in the aggregate data.
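The stratified comparison described above can be sketched as a prevalence ratio computed within each subgroup. This is a hedged illustration only: the field names (`age_group`, `high_bmi`, `cvd`), the helper functions, and the toy counts are all assumptions made up for this example, and they do not reproduce the study's data or analysis.

```python
from collections import defaultdict


def stratified_prevalence(records, stratum_key, exposure_key, outcome_key):
    """Return outcome prevalence for each (stratum, exposure) cell."""
    counts = defaultdict(lambda: [0, 0])  # cell -> [cases, total]
    for rec in records:
        cell = (rec[stratum_key], rec[exposure_key])
        counts[cell][0] += int(rec[outcome_key])
        counts[cell][1] += 1
    return {cell: cases / total for cell, (cases, total) in counts.items()}


def prevalence_ratios(prevalence):
    """Exposed / unexposed prevalence ratio per stratum (skips undefined ratios)."""
    ratios = {}
    for stratum in {cell[0] for cell in prevalence}:
        exposed = prevalence.get((stratum, True))
        unexposed = prevalence.get((stratum, False))
        if exposed is not None and unexposed:
            ratios[stratum] = exposed / unexposed
    return ratios


def make_rows(age_group, high_bmi, cases, total):
    """Build toy respondent rows; the first `cases` rows have the outcome."""
    return [
        {"age_group": age_group, "high_bmi": high_bmi, "cvd": i < cases}
        for i in range(total)
    ]


# Made-up toy counts chosen only to show an attenuated association
# in the older stratum; they are NOT BRFSS estimates.
toy_records = (
    make_rows("<65", True, 2, 10)
    + make_rows("<65", False, 1, 10)
    + make_rows("65+", True, 3, 10)
    + make_rows("65+", False, 3, 10)
)

toy_prevalence = stratified_prevalence(toy_records, "age_group", "high_bmi", "cvd")
toy_ratios = prevalence_ratios(toy_prevalence)
```

A ratio near 1 in one stratum and above 1 in another is the pattern that a single pooled estimate would obscure. An analysis of real BRFSS data would also need to apply the survey weights, which this sketch omits.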
CONCLUSIONS: Findings from the BRFSS 2022 highlight both established and emerging determinants of CVD risk, including the modifying effects of comorbidities, social isolation, and BMI-related heterogeneity. Beyond algorithmic performance, these results underscore the value of national surveillance data for informing applied, actionable strategies in CVD prevention and risk stratification.
PMID:41631000 | PMC:PMC12860618 | DOI:10.1016/j.ahjo.2026.100719