TY - JOUR
T1 - Predicting Ischemic Heart Disease and Determining Its Risk Factors
T2 - A Comparison of Various Classification Methods in Machine Learning
AU - Yaqoob, Muhammad
AU - Iqbal, Farhat
N1 - Publisher Copyright: © 2025, Thai Statistical Association. All rights reserved.
PY - 2025/7
Y1 - 2025/7
AB - This study aimed to identify the important risk factors for ischemic heart disease (IHD) in the population of Balochistan and to determine the most accurate machine learning (ML) algorithm for predicting IHD. Data on common IHD risk factors were collected from 300 individuals (100 IHD cases and 200 controls). The risk factors included marital status, physical activity, socioeconomic position, type of cooking oil, diet, body mass index (BMI), blood pressure, random blood sugar, history of known disease, and cholesterol level. We employed four classification methods: linear discriminant analysis (LDA), artificial neural networks (ANN), naïve Bayes (NB), and random forest (RF). The data were randomly partitioned into training (70%) and testing (30%) sets. The methods were evaluated on accuracy, sensitivity, specificity, positive and negative predictive values, and area under the receiver operating characteristic curve. ANN was the most accurate classification method, with an accuracy of 88.89%, followed by NB, LDA, and RF, with accuracies of 86.67%, 85.56%, and 84.44%, respectively. Moreover, in most classification methods, blood pressure, cholesterol level, physical activity, diet, BMI, and family history emerged as important risk factors for IHD. These results indicate that ML methods, especially ANN, can be used to predict IHD status accurately and to identify its important risk factors.
KW - Ischemic heart disease
KW - machine learning algorithms
KW - prediction
KW - risk factors
UR - https://www.scopus.com/pages/publications/105009374234
M3 - Article
AN - SCOPUS:105009374234
SN - 1685-9057
VL - 23
SP - 677
EP - 691
JO - Thailand Statistician
JF - Thailand Statistician
IS - 3
ER -