Clean Balance-Ensemble CHD: A Balanced Ensemble Learning Framework for Accurate Coronary Heart Disease Prediction

Merlin Sofia S; D. Ravindran; G. Arockia Sahaya Sheela

doi:10.58414/SCIENTIFICTEMPER.2025.16.10.05

Authors

Merlin Sofia S, Research Scholar, Department of Computer Science, St. Joseph’s College (Autonomous), Affiliated to Bharathidasan University, Tiruchirappalli, Tamil Nadu, India.
D. Ravindran, Associate Professor, Department of Computer Science, St. Joseph’s College (Autonomous), Affiliated to Bharathidasan University, Tiruchirappalli, Tamil Nadu, India.
G. Arockia Sahaya Sheela, Assistant Professor, Department of Computer Science, St. Joseph’s College (Autonomous), Affiliated to Bharathidasan University, Tiruchirappalli, Tamil Nadu, India.

Abstract

Coronary Heart Disease (CHD) remains one of the leading causes of death worldwide, which makes early and reliable prediction methods necessary to support timely medical intervention. Traditional machine learning approaches frequently struggle with noisy and imbalanced datasets, leading to biased predictions and reduced diagnostic reliability. To address these limitations, this paper proposes the CleanBalance-EnsembleCHD algorithm, which combines data cleaning, class balancing, and ensemble learning to improve prediction accuracy. The goal is to reduce noise, handle imbalance, and combine the strengths of multiple classifiers to detect CHD more effectively. For noise reduction, the methodology employs Edited Nearest Neighbor (ENN) and the Iterative Partitioning Filter (IPF); if imbalance persists, the Synthetic Minority Oversampling Technique (SMOTE) is applied. Five classifiers, namely Rotation Forest, LogitBoost, Multilayer Perceptron, Logistic Model Trees (LMT), and Random Forest, were trained, and the best-performing models were selected for integration into a weighted soft-voting ensemble. The experimental evaluation on a CHD dataset with a slight initial class imbalance (maj/min ratio: 1.038, Gini index: 0.4998) revealed significant improvements. After ENN and IPF cleaning, the dataset was reduced from 1011 to 853 instances with near-balanced classes (class 1: 414, class 0: 439). Individual classifiers performed well, with accuracies of 97.36% (Rotation Forest), 94.72% (LogitBoost), 96.04% (Multilayer Perceptron), 97.95% (LMT), and 98.53% (Random Forest). The top three models (Random Forest, LMT, and Rotation Forest) were then combined into an ensemble that outperformed all individual models on the test set, with Accuracy: 99.42%, F1-score: 0.9939, and MCC: 0.9884. These findings show that CleanBalance-EnsembleCHD provides superior predictive reliability, leading to noise-resistant and balanced decision-making.
Finally, the proposed framework provides a powerful and interpretable solution for early CHD detection, with the potential to help clinicians with risk assessment and medical decision support.
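
The two quantitative ideas in the abstract, the class-balance diagnostics (maj/min ratio and Gini index) and weighted soft voting, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names are hypothetical, the per-model probabilities are invented for a single illustrative sample, and using each model's reported accuracy as its voting weight is an assumption, since the abstract does not state how the weights were derived.

```python
def imbalance_ratio(counts):
    """Majority-to-minority class ratio (1.0 means perfectly balanced)."""
    return max(counts.values()) / min(counts.values())


def gini_index(counts):
    """Gini impurity of the class distribution (0.5 is the binary maximum)."""
    total = sum(counts.values())
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


def weighted_soft_vote(probabilities, weights):
    """Weighted average of per-model class probabilities for one sample.

    probabilities: dict of model name -> list of class probabilities
    weights:       dict of model name -> non-negative voting weight
    """
    total_weight = sum(weights[name] for name in probabilities)
    n_classes = len(next(iter(probabilities.values())))
    combined = [0.0] * n_classes
    for name, probs in probabilities.items():
        for k, p in enumerate(probs):
            combined[k] += weights[name] * p / total_weight
    return combined


# Class counts after ENN and IPF cleaning, as reported in the abstract.
cleaned_counts = {0: 439, 1: 414}
ratio = imbalance_ratio(cleaned_counts)
gini = gini_index(cleaned_counts)

# ASSUMPTION: weights set to the reported individual accuracies.
weights = {"RandomForest": 0.9853, "LMT": 0.9795, "RotationForest": 0.9736}

# HYPOTHETICAL probabilities [P(no CHD), P(CHD)] for one sample.
sample_probs = {
    "RandomForest": [0.20, 0.80],
    "LMT": [0.35, 0.65],
    "RotationForest": [0.55, 0.45],
}
combined = weighted_soft_vote(sample_probs, weights)
predicted = max(range(len(combined)), key=combined.__getitem__)

print(f"maj/min ratio: {ratio:.3f}, Gini index: {gini:.4f}")
print(f"combined probabilities: {combined}, predicted class: {predicted}")
```

For the cleaned class counts, the ratio comes out at roughly 1.06 and the Gini index at roughly 0.4996, close to the binary maximum of 0.5. In the voting example, the averaged probabilities can favor a class even when one model disagrees, which is the behavior soft voting is meant to provide.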

How to Cite

Sofia S, M., Ravindran, D., & Sheela, G. A. S. (2025). Clean Balance-Ensemble CHD: A Balanced Ensemble Learning Framework for Accurate Coronary Heart Disease Prediction. The Scientific Temper, 16(10), 4870–4878. https://doi.org/10.58414/SCIENTIFICTEMPER.2025.16.10.05
