Acta Scientific Computer Sciences

Research Article Volume 6 Issue 3

Integration of Machine Learning Techniques for Heart Disease Prediction

Adebisi Abraham Owodunni*, Tareq Al-Jaber and Zhibao Mian

Computer Science (Artificial Intelligence and Data Science), University of Hull, Hull, England, United Kingdom

*Corresponding Author: Adebisi Abraham Owodunni, Computer Science (Artificial Intelligence and Data Science), University of Hull, Hull, England, United Kingdom.

Received: February 02, 2024; Published: February 22, 2024

Abstract

The heart is vital to human life, yet according to Global Burden of Disease research, 43% of deaths are attributed to heart disease [2]. By 2030, deaths from cardiovascular disease are projected to reach 23.6 million, with heart disease as the leading cause [3]. According to the World Health Organization (WHO), it claims 10 million lives globally each year. Conventional ways of detecting the disease, such as angiography and electrocardiograms, are well established. However, they are not only expensive for the average person but also come with several side effects, and over 17 million individuals have lost their lives owing to lack of expertise and incapacitation [4]. According to a WHO survey, doctors accurately predict heart disease only 67% of the time. Hence the need for a noninvasive and more efficient technique that leverages data science, specifically machine learning (ML). This research uses ML techniques to: (i) classify heart disease and, by comparing the classifiers' metrics, predict heart disease in individuals; (ii) investigate the most relevant features and risk factors contributing to the prediction of heart disease; (iii) evaluate the performance of the developed models using appropriate metrics; and (iv) provide insights and recommendations for healthcare professionals to improve early diagnosis and intervention strategies. Four classifiers are involved: XGBoost, Random Forest (RF), Logistic Regression (LR), and Support Vector Machine (SVM), all applied to the Framingham heart disease dataset. Different models were built after handling missing values and outliers in the dataset. Before the dataset was balanced, LR and RF gave the best performance, with an accuracy of 85% each. The dataset was then balanced by resampling, and important features were selected using the XGBoost classifier, Sequential Feature Selection (SFS), and KBest methods, respectively; these steps improved model performance.
Ensemble techniques (AdaBoost and Bagging) were then adopted, and the AdaBoost model (with the RF classifier) achieved an accuracy of 93%. Hyperparameter tuning was carried out with RandomizedSearchCV and GridSearchCV, but neither outperformed the AdaBoost model. Lastly, the balanced dataset was split into training and test sets in an 80:20 ratio, and a model was trained on the training set and evaluated on the test set. With the Random Forest classifier, this gave the same 93% accuracy as the AdaBoost model but better scores on the other metrics: CV score 0.9110, R2 score 0.7078, AUC 0.98, RMSE 0.2701, and MAE 0.0730.
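The evaluation metrics listed above are closely linked when a classifier outputs hard 0/1 labels: each misclassification contributes exactly 1 to both the absolute and the squared error, so MAE equals the error rate, RMSE is its square root, and R2 depends only on the error rate and the class balance. The sketch below illustrates this relationship in plain Python. It assumes the metrics are computed on hard labels, and the helper name `binary_metrics` and the 1,000-sample test set are hypothetical, chosen only so the numbers land near the reported ones; it is not the authors' code or data.

```python
import math


def binary_metrics(y_true, y_pred):
    """Accuracy, MAE, RMSE and R2 for hard 0/1 labels and 0/1 predictions."""
    n = len(y_true)
    misses = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    error_rate = misses / n
    mae = error_rate  # |t - p| is 1 on a miss and 0 on a hit
    mse = error_rate  # (t - p)^2 equals |t - p| for 0/1 values
    positive_share = sum(y_true) / n
    label_variance = positive_share * (1 - positive_share)
    return {
        "accuracy": 1 - error_rate,
        "mae": mae,
        "rmse": math.sqrt(mse),
        "r2": 1 - mse / label_variance,
    }


# Hypothetical balanced test set: 500 negatives, 500 positives, 73 misses in total.
y_true = [0] * 500 + [1] * 500
y_pred = [1] * 36 + [0] * 464 + [0] * 37 + [1] * 463

metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under these assumptions an error rate of 0.073 yields an accuracy of 0.927, an RMSE of about 0.2702, and an R2 of about 0.708, which is close to the figures reported in the abstract and suggests that accuracy, MAE, RMSE, and R2 there largely restate one underlying error rate on a roughly balanced test set. The CV score and AUC carry separate information, since they depend on fold variation and on predicted scores rather than on hard labels from a single split.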

Keywords: Random Forest Classifier; Logistic Regression Classifier; Sequential Feature Selection; AdaBoost and Bagging; Support Vector Machine Classifier; XGBoost classifier

References

  1. Bhatt CM., et al. “Effective Heart Disease Prediction Using Machine Learning Techniques”. Algorithms 16 (2023): 88.
  2. Estes C., et al. “Modeling NAFLD disease burden in China, France, Germany, Italy, Japan, Spain, United Kingdom, and United States for the period 2016-2030”. Journal of Hepatology 69 (2018): 896-904.
  3. Purushottam Saxena K and Sharma R. “Efficient Heart Disease Prediction System”. Procedia Computer Science 85 (2016): 962-969.
  4. Vardhan Shorewala. “Early detection of coronary heart disease using ensemble techniques”. Informatics in Medicine Unlocked 26 (2021): 100655.
  5. Mozaffarian D., et al. “Heart disease and stroke statistics—2015 update: A report from the American Heart Association”. Circulation 131 (2015): e29-e322.
  6. Avinash Golande and Pavan Kumar T. “Heart Disease Prediction Using Effective Machine Learning Techniques”. International Journal of Recent Technology and Engineering 8 (2019): 944-950.
  7. H Schmidt. “Chronic disease prevention and health promotion” (2016).
  8. Apte C S. “Improve study of Heart Disease prediction system using Data Mining Classification techniques”.
  9. Fahd Saleh Alotaibi. “Implementation of Machine Learning Model to Predict Heart Failure Disease”. International Journal of Advanced Computer Science and Applications (IJACSA) 6 (2019).
  10. Li J., et al. “Work stress and cardiovascular disease: A life course perspective”. Journal of Occupational Health 58 (2016): 216-219.
  11. Hasan N and Bao Y. “Comparing different feature selection algorithms for cardiovascular disease prediction”. Health and Technology 11 (2020): 49-62.
  12. Karthiga A S., et al. “Early Prediction of Heart Disease Using Decision Tree Algorithm”. International Journal of Advanced Research in Basic Engineering Sciences and Technology (2017).
  13. Theresa Princy R and J Thomas. “Human heart Disease Prediction System using Data Mining Techniques”. International Conference on Circuit Power and Computing Technologies, Bangalore (2016).
  14. Amanda H Gonsalves., et al. “Prediction of Coronary Heart Disease using Machine Learning: An Experimental Analysis”. In Proceedings of the 2019 3rd International Conference on Deep Learning Technologies (ICDLT '19). Association for Computing Machinery, New York, NY, USA (2019): 51-56.
  15. Nagaraj M Lutimath., et al. “Prediction Of Heart Disease using Machine Learning”. International Journal Of Recent Technology and Engineering 2S10 (2019): 474-477.
  16. Mohamed AAA., et al. “Parasitism—Predation algorithm (PPA): A novel approach for feature selection”. Ain Shams Engineering Journal 11 (2020): 293-308.
  17. Anjan Nikhil Repaka., et al. “Design And Implementation Heart Disease Prediction Using Naives Bayesian”. International Conference on Trends in Electronics and Information (ICOEI 2019).
  18. Muhammad Y., et al. “Early and accurate detection and diagnosis of heart disease using intelligent computational model”. Scientific Reports 10 (2020): 19747.
  19. Reddy KVV., et al. “An Efficient Prediction System for Coronary Heart Disease Risk Using Selected Principal Components and Hyperparameter Optimization”. Applied Sciences 13 (2023): 118.
  20. Alam Z and Rahman MS. “A Random Forest based predictor for medical data classification using feature ranking”. Informatics in Medicine Unlocked 15 (2019): 100180.

Citation

Citation: Adebisi Abraham Owodunni., et al. “Application of Machine Learning Techniques for the Prediction of Heart Disease”. Acta Scientific Computer Sciences 6.3 (2024): 13-23.

Copyright

Copyright: © 2024 Adebisi Abraham Owodunni., et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.



