Abstract

Efforts have been made to address the adverse impact of heart disease on society by improving its treatment and diagnosis. This study uses the Jordan University Hospital (JUH) Heart Dataset to develop and evaluate machine-learning models for predicting heart disease. The primary objective is to enhance prediction accuracy through a comprehensive approach that includes data preprocessing, feature selection, and model development. Several artificial intelligence techniques, namely, random forest, SVM, decision tree, naive Bayes, and k-nearest neighbours (KNN), were explored with particle swarm optimization (PSO) for feature selection, where PSO selects the most informative subset of the 58 available features. Experiments on a dataset comprising 486 heart disease patients at JUH yielded a classification accuracy of 94.3% with our proposed system, aligning with state-of-the-art performance, while the alternative algorithms in our study achieved accuracies ranging from 85% to 90%. Notably, our research utilized a distinct dataset provided by the corresponding author. These results emphasize the superior accuracy of the proposed system, particularly the SVM classifier with PSO, and have substantial implications for early disease detection, diagnosis, and tailored treatment, potentially aiding medical professionals in making well-informed decisions and improving patient outcomes in regions like Jordan, where cardiovascular diseases are a leading cause of mortality.

1. Introduction

Heart disease is a leading global cause of mortality, and its timely diagnosis presents significant challenges due to overlapping symptoms with other health conditions. This complexity can complicate treatment significantly when detection is delayed. Heart disease encompasses various cardiac conditions, making symptom recognition challenging. The similarity of symptoms to other ailments often leads to misdiagnosis, potentially exacerbating the patient’s condition and posing life-threatening risks.

Several factors, such as smoking, poor nutrition, high blood pressure, and sedentary lifestyles, contribute to the development of heart disease. Diagnosis typically involves various tests, including blood tests, X-rays, electrocardiography, echocardiography, and invasive procedures such as cardiac catheterization and biopsy. However, these processes are time-consuming, costly, and demanding for both patients and healthcare providers, with no guarantee of pinpointing the exact type of heart disease.

The rise in heart attack rates among young individuals, financial burdens, limitations of existing medical tools, and diagnostic challenges highlight the need for innovative solutions. Computerized systems have emerged as promising alternatives to traditional methods, offering faster and more efficient heart disease risk prediction. Machine learning can potentially enhance early detection and diagnosis, addressing the challenges posed by heart disease more effectively.

This study investigates integrating multiple artificial intelligence techniques, including random forest, SVM, decision tree, naive Bayes, and k-nearest neighbours (KNN), coupled with particle swarm optimization (PSO) for feature selection. The aim is to predict the presence of heart disease and assess the classifiers’ accuracy. The analysis uses a dataset comprising 486 patient records obtained from JUH (Jordan University Hospital).

The research is organized as follows. Section 2 delves into the background of the terminology and related works. Section 3 outlines the research methodology. Section 4 examines the obtained results. Section 5 unveils insights and innovations in heart disease prediction. Finally, Section 6 presents the conclusion of the work.

2. Background

2.1. Understanding Heart Diseases: Overview and Diagnosis

Heart disease is a broad term encompassing various medical conditions that affect the components of the heart, the muscular organ responsible for tirelessly pumping blood throughout the body. The heart's efficient function is vital for distributing oxygen and nutrients while eliminating waste products. It relies on a network of coronary arteries to maintain its oxygen supply. Any damage or interference with this intricate system disrupts the heart's function and impacts the entire body.

Coronary artery disease (CAD) is the most common heart disease, characterized by the gradual narrowing of the arteries of the heart and body due to fatty deposits called plaques. This narrowing restricts blood flow and oxygen supply, often leading to symptoms such as angina (chest pain) during physical activity. In severe cases, plaque rupture can result in unstable angina (UA) or a heart attack (myocardial infarction, MI).

Heart diseases can also manifest as arrhythmias, such as atrial fibrillation (AF), which involves irregular heart rhythms. AF can vary in duration and symptoms, including palpitations, fainting, or chest pain.

Recognizing heart disease symptoms can be challenging as they may initially be nonspecific, including fatigue, shortness of breath, dizziness, or nausea. Accurate diagnosis requires medical testing to differentiate heart-related symptoms from those caused by other factors.

Standard diagnostic methods include the following:
(i) Medical history: information is gathered on risk factors such as age, coronary artery disease (CAD), diabetes, and smoking to understand the patient's background and potential contributing factors.
(ii) Physical examination: a thorough evaluation of signs and symptoms related to heart disease, including listening to heart sounds, measuring blood pressure, and assessing clinical indicators.
(iii) Laboratory blood tests: measuring cardiac biomarkers such as troponin and CK-MB to detect proteins indicative of muscle cell damage, particularly in chest pain or acute coronary syndrome cases.
(iv) Electrocardiogram (ECG): a noninvasive test that records the heart's electrical activity and rhythm, helping identify abnormalities, damage, or ischemia.

These diagnostic tools play a crucial role in determining the presence and nature of heart disease, guiding further management and treatment decisions.

2.2. Machine Learning in the Medical Domain

Machine-learning algorithms have become indispensable tools for analyzing medical datasets, particularly in the context of medical diagnosis. The digital revolution has facilitated the collection and storage of medical data, with modern hospitals equipped with various monitoring devices and data systems accumulating extensive datasets. These datasets are now readily accessible within information systems, enabling the application of machine-learning techniques for comprehensive healthcare analysis and decision-making.

Machine-learning technology is particularly well-suited for analyzing medical data and has found extensive use in specialized diagnostic problems. By inputting patient records with known correct diagnoses into a computer program, a learning algorithm can be employed to train a machine-learning model. This process enables the development of a clinical decision support system (CDSS), a derived classifier that aids physicians in diagnosing new patients, enhancing diagnostic speed, accuracy, and reliability. Additionally, CDSS can be a valuable tool for training nonspecialist individuals in diagnosing patients with specific medical conditions.

Machine learning is crucial in improving medical diagnosis and decision-making, making healthcare processes more efficient and effective.

2.3. Supervised and Unsupervised Learning

Supervised learning involves training algorithms using labelled data with known desired outputs. These algorithms learn patterns and relationships in the data, enabling them to make predictions or classify new data based on the learned patterns. Examples of supervised learning algorithms include regression, classification, and decision trees.

In contrast, unsupervised learning involves training algorithms on unlabelled data with unknown desired output. These algorithms learn to identify patterns and relationships in the data and can be used for tasks like grouping or clustering similar data points. Unsupervised learning algorithms encompass clustering, anomaly detection, and dimensionality reduction techniques.

Supervised learning is commonly used for tasks such as image recognition, speech recognition, and natural language processing, where the algorithm learns to map inputs to specific outputs. On the other hand, unsupervised learning is valuable for exploratory data analysis, helping identify hidden patterns and relationships in the data.

2.3.1. Support Vector Machines (SVMs)

Support vector machines (SVMs) are a well-established machine-learning methodology for classification and regression tasks. SVMs construct a hyperplane or a set of hyperplanes within a high-dimensional feature space, primarily oriented toward classification objectives. The central aim of SVMs lies in pinpointing the hyperplane that maximizes the margin between distinct classes, thereby forming a resilient decision boundary. In the realm of heart disease detection, SVMs are deployed to categorize patients into various disease groups, relying on diverse features, as shown in Figure 1.
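To make this concrete, the following minimal Python sketch (illustrative only, on synthetic data rather than the JUH records) shows how such an SVM classifier is typically trained and evaluated with scikit-learn; the sample and feature counts echo those reported later in this paper but are otherwise assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a tabular heart-disease dataset.
X, y = make_classification(n_samples=486, n_features=19, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# Scaling matters for SVMs because the margin is distance-based.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
```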

2.3.2. Random Forests (RF)

Random forests are an ensemble learning approach that combines multiple decision trees to make predictions. Each decision tree is constructed using a random subset of training data and features, reducing the risk of overfitting and enhancing generalization. Random forests are adept at handling high-dimensional data, capturing complex feature interactions, and providing rankings of feature importance. They have found widespread use in heart disease detection due to their strong performance and interpretability. The algorithm is shown in Figure 2.
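As a hedged illustration of the feature-importance ranking mentioned above (not the authors' implementation, and again on synthetic data), a random forest can be fitted and its impurity-based importances inspected as follows:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=486, n_features=19, n_informative=8,
                           random_state=0)
rf = RandomForestClassifier(n_estimators=200, random_state=0)
rf.fit(X, y)

# Rank features by impurity-based importance, one of the strengths noted above.
order = np.argsort(rf.feature_importances_)[::-1]
for i in order[:5]:
    print(f"feature {i}: importance {rf.feature_importances_[i]:.3f}")
```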

2.3.3. Decision Trees (DTs)

Decision trees represent a straightforward yet potent machine-learning algorithm that employs a tree-like structure for decision-making. Internal nodes correspond to features, while leaf nodes signify classes or predictions. Decision trees recursively split the data based on features, creating decision rules for predicting the target variable. They offer intuitiveness, ease of interpretation, and versatility in handling categorical and continuous features. Decision trees are applied in heart disease detection, offering transparent decision-making processes and insights into feature importance. The algorithm is shown in Figure 3.
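The transparency noted above can be demonstrated directly: scikit-learn can print the learned if-else rules of a fitted tree. A minimal sketch on synthetic data with hypothetical feature names:

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=486, n_features=6, random_state=0)
dt = DecisionTreeClassifier(max_depth=3, random_state=0)  # shallow tree stays readable
dt.fit(X, y)

# Print the learned if-else decision rules; this is the main appeal of DTs.
print(export_text(dt, feature_names=[f"f{i}" for i in range(6)]))
```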

2.3.4. Naive Bayes

Naive Bayes is a probabilistic machine-learning technique founded on Bayes' theorem. It operates under the assumption of conditional independence among features given the class label; despite this simplifying assumption, naive Bayes classifiers have demonstrated effectiveness across various domains, including text classification and medical diagnosis. In heart disease detection, naive Bayes algorithms classify patients into distinct disease categories based on their features. The algorithm is shown in Figure 4.
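A brief sketch of this probabilistic behaviour, again on synthetic data, using the Gaussian variant in scikit-learn; the class probabilities it reports follow from Bayes' theorem under the feature-independence assumption:

```python
from sklearn.datasets import make_classification
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=486, n_features=10, random_state=0)
nb = GaussianNB()
nb.fit(X, y)

# Per-class posterior probabilities and the resulting hard labels.
print(nb.predict_proba(X[:3]))
print(nb.predict(X[:3]))
```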

2.3.5. K-Nearest Neighbours (KNNs)

K-nearest neighbours algorithm is a nonparametric machine-learning method suitable for classification and regression tasks. Using a specified distance metric, KNN classifies a new instance by identifying the K-nearest neighbours within the training data. The class label of the new instance is determined through a majority vote among its K-nearest neighbours. KNN is straightforward to implement, although its performance may be sensitive to the choice of distance metric and the value of K. In heart disease detection, KNN is applied, mainly when dealing with nonlinear class distributions. The algorithm is shown in Figure 5.
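Because performance depends on the choice of K, a small sweep such as the following sketch is a common way to pick it (synthetic data; the distance metric is left at scikit-learn's Euclidean default):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=486, n_features=19, random_state=0)

# KNN is sensitive to k, so compare a few values under cross-validation.
for k in (3, 5, 9, 15):
    knn = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=k))
    score = cross_val_score(knn, X, y, cv=10).mean()
    print(f"k={k}: mean CV accuracy {score:.3f}")
```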

2.4. Related Work

In [1], Qureshi et al. introduced a novel approach for brain tumour detection using an ultra-light deep learning architecture (UL-DLA) and the gray-level co-occurrence matrix (GLCM) for extracting in-depth and textural features, respectively. The resulting hybrid feature space (HFC) and a support vector machine performed exceptionally on a T1-weighted MRI dataset, achieving a 99.23% average detection rate and an F1 measure of 0.99.

In [2], a study optimized epileptic seizure recognition using deep learning, achieving a notable test accuracy of 0.993 with the Conv1D + LSTM architecture. The investigation provided insights into the variable responses of different deep learning models to feature scaling, PCA, and feature selection methods, aiming to enhance epileptic seizure recognition for improved patient outcomes.

A study published in [3] investigated the use of machine learning to predict pelvic tilt and lumbar angle in women experiencing urinary incontinence. AdaBoost exhibited high accuracy (R2 = 0.944) for pelvic tilt prediction, suggesting potential advancements in assessing pelvic floor dysfunction. In [4], convolutional neural networks (CNNs) were employed to classify monkeypox skin lesions, achieving a remarkable 95.3% accuracy after optimization with the grey wolf optimizer (GWO). This proposed approach offers an effective method for expedited and accurate monkeypox diagnosis, with significant implications for public health outcomes.

The paper in [5] introduced a machine-learning framework for predicting the hepatitis C virus in Egyptian healthcare workers, showcasing improved accuracies after sequential forward selection (SFS). After hyperparameter tuning with only four features, the random forest (RF) classifier achieved 94.88% accuracy.

In [6], data mining techniques, including decision trees, naïve Bayes, and neural networks, were employed to predict hepatitis C virus (HCV) infection using medical profiles, revealing valuable insights for training healthcare professionals and demonstrating the effectiveness of each technique across diverse datasets.

The paper in [7] applied an end-to-end machine-learning paradigm, leveraging automated machine learning (AutoML), for landslide susceptibility mapping in the Three Gorges Reservoir area. The AutoML-based stacked ensemble model outperformed classical ML approaches, achieving the highest AUC at 0.954. This user-friendly solution provides an efficient alternative for landslide susceptibility mapping, particularly for practitioners with limited ML expertise.

In [8], the earthworm optimization algorithm-optimized support vector regression (EOA-SVR) demonstrated superior performance in accurate reservoir landslide displacement prediction, surpassing other metaheuristics and highlighting its potential for reliable predictions in medium- and long-term landslide early warning systems.

The study in [9] addressed ground fissures through a comprehensive approach integrating field investigation and an analytic hierarchy process for hazard evaluation. The method effectively assessed ground fissure hazards, aiding risk zoning and predictions for informed urban and rural planning.

In [10], supervised classification and remote sensing were employed to analyze landslide evolution in the Mianyuan River Basin after the 2008 Wenchuan earthquake. The random forest algorithm exhibited an 87% average accuracy in landslide identification, emphasizing the importance of long-term monitoring after significant seismic events.

Similarly, the study in [11] used remote sensing and supervised classification to examine a decade of postearthquake landslide evolution in the Mianyuan River Basin after the 2008 Wenchuan earthquake. The random forest algorithm achieved an 87% average accuracy, revealing temporal patterns and underscoring the importance of prolonged monitoring for understanding landslide dynamics following major seismic events.

In [12], a nonlocal adaptive hysteresis smoothing (NLAHS) method was introduced for image denoising, showing significant quality improvements over existing methods based on objective and subjective criteria. The path-based NLAHS (PBNLAHS) method further improved block similarity determination.

The study in [13] highlighted the role of data analytics and machine learning in managing extensive healthcare data, focusing on Alzheimer’s disease (AD) and mild cognitive impairment (MCI). Employing feature selection methods such as T-test and genetic algorithm, coupled with support vector machine (SVM) classification, the study achieved high accuracy in categorizing patients based on PET scan images.

The systematic review in [14] assessed the use of AI classification algorithms in coronary artery disease (CAD) detection, with ML methods showing success in CAD diagnosis. The findings suggested the potential of ML models for quicker and more precise CAD diagnosis, reducing errors and associated healthcare costs.

In [15], the challenge of diagnosing gynaecological diseases, including pregnancy and conditions such as polycystic ovarian syndrome, ovarian cysts, and menopause, was addressed. A novel method using artificial neural networks (ANN) to predict pregnancy success achieved a remarkable 96.5% classification accuracy, accommodating patients with various hormone levels and infertility-related factors.

Chronic diseases, particularly diabetes, were discussed in [16], emphasizing the need for timely diagnosis to avoid long-term complications. A binary decision tree methodology optimized testing sequences for more effective patient evaluation, demonstrating high efficiency with a 95.4% accuracy rate when applied to 50 patients.

Studies in [17, 18] focused on renal failure, proposing a decision support system based on a multilayer perceptron (MLP) neural network with 32 input variables. The system's architecture was optimized iteratively, yielding promising results for quick and accurate prediction of renal failure causes. Neural networks were also suggested for diagnosing pharyngitis in [19–21], achieving a reported correct diagnosis rate of 95.4%.

Another study [22] aimed to enhance cardiovascular disease (CVD) prediction using machine-learning and advanced feature selection techniques. The proposed model, combining data collection, preprocessing, and hybrid classifiers, achieved an impressive accuracy of 99.05%, highlighting the importance of early diagnosis for CVD.

In [23], an electrocardiogram (ECG) classification approach using machine learning demonstrated high binary and multiclass classification accuracy, indicating its potential in diagnosing cardiac arrhythmias.

The study in [24] aimed to predict heart conditions using various classifiers and feature selection techniques. The extreme gradient boosting classifier achieved the highest accuracy of 81%, emphasizing the importance of early heart disease detection.

The article in [25] proposed an efficient system for diagnosing heart disease using machine-learning techniques, with the SVM classifier showing promising results, demonstrating the potential of intelligent systems for heart disease identification.

In [26], a majority voting ensemble method for predicting heart disease achieved an overall accuracy of 90%, potentially serving as a valuable tool for doctors’ diagnoses.

The research in [27] explored machine-learning algorithms for heart disease prediction, proposing a hybrid approach combining feature selection and optimization techniques with various classifiers. The optimized model achieved an impressive accuracy of 99.65%, outperforming other methods.

The study in [28] utilized multiple machine-learning algorithms to predict heart failure, with the decision tree classifier achieving the highest accuracy.

The significant impact of heart disease on mortality rates was highlighted based on [29, 30], emphasizing the importance of early prediction and diagnosis.

The paper in [31] introduced the HRFLM method, combining random forest and linear models for heart disease prediction. The SVM classifier performed best, emphasizing the importance of accurate prediction in heart disease management.

The research in [32] focused on predicting patient survival in heart failure cases using various classification models. The Extra Tree classifier combined with the synthetic minority oversampling technique (SMOTE) achieved high accuracy, emphasizing the potential of machine learning in healthcare.

The study in [33] proposed a novel diagnostic system for heart disease using logistic regression and K-nearest neighbours algorithms. The KNN algorithm outperformed other models, achieving an accuracy of 87.5%.

The research in [34] developed a heart disease prediction model using machine-learning techniques, combining outlier detection, data balancing, and prediction methods. The model achieved high accuracies, demonstrating potential for clinical decision support.

The study in [35] aimed to predict heart disease using machine-learning algorithms. Random forest proved the most effective model, showing potential as a decision support system for medical practitioners in clinical settings.

The studies in [10, 27] explored the application of machine-learning algorithms for heart disease prediction and classification, proposing a hybrid approach incorporating feature selection using fast correlation-based feature selection (FCBF), feature optimization through particle swarm optimization (PSO) and ant colony optimization (ACO), and various classification algorithms. The work utilized the heart disease dataset from the UCI machine-learning repository, containing 14 attributes related to factors such as age, gender, chest pain type, blood pressure, and cholesterol levels.

This comprehensive research pipeline included data preprocessing, FCBF-based feature selection, PSO and ACO-based feature optimization, and classification using multiple machine-learning algorithms. Evaluation metrics such as accuracy, precision, recall, and F1-score were used. The hybrid approach achieved a maximum classification accuracy of 99.65% with the optimized model by FCBF, PSO, and ACO, outperforming classifiers without optimization, and those optimized solely by FCBF. The study discussed results in detail, emphasizing effectiveness, accuracy, precision, and recall, and included confusion matrices to evaluate classifier performance. Comparative analysis with existing methods underscored the superior accuracy of the proposed hybrid approach.

Based on the information presented in the related work, the strengths and weaknesses of the methodologies employed in the studies were assessed. Areas needing further research were identified, along with suggestions for enhancements.

2.4.1. Strengths
(1) Improved Accuracy: many studies demonstrated high accuracy rates in predicting heart disease using machine learning, suggesting the potential effectiveness of these methods in assisting healthcare professionals with cardiovascular diagnosis and management.
(2) Feature Selection: several studies employed feature selection techniques to identify crucial attributes for heart disease prediction. This not only streamlines data but also enhances the efficiency and interpretability of the models.
(3) Comparative Analysis: some studies conducted comparative analyses of multiple machine-learning algorithms, aiding researchers in identifying the most accurate and suitable models for heart disease prediction. This contributes to the selection of robust methodologies.
(4) Practical Implications: certain studies discussed the practical implications of their models, envisioning integration into clinical decision support systems or deployment in community health centres, showcasing the potential real-world applications.
2.4.2. Weaknesses
(1) Limited Dataset Size: some studies relied on relatively small datasets, potentially limiting the generalizability of their results. Exploring more extensive and diverse datasets could enhance the robustness and applicability of the developed models.
(2) Specific Feature Selection Techniques: certain studies used specialized feature selection methods such as Relief and LASSO, which are effective for their datasets but potentially less adaptable to different data or feature sets. A more standardized approach may improve generalizability.
(3) Missing Data and Imbalanced Datasets: studies inconsistently addressed missing data and imbalanced class distribution, potentially impacting model accuracy and generalizability. A more uniform approach to handling such issues could improve overall model reliability.
(4) Limited Exploration of Real-time Applications: while some studies hinted at real-time potential, they did not extensively explore or validate the models in real-time settings. Further investigations into the feasibility and performance of proposed models in real-time scenarios are recommended.
2.4.3. Research Gap and Suggestions for Improvement
(1) Diverse and Larger Datasets: future research should consider employing more varied and extensive datasets to enhance model generalizability and reliability, ensuring that the models perform well across various populations and settings.
(2) Standardization and Comparison: adopting a standardized approach to feature selection, machine-learning algorithms, and evaluation metrics would facilitate more meaningful comparisons and benchmarking across different methodologies, aiding in identifying best practices.
(3) Handling Missing Data and Imbalanced Datasets: addressing the challenges of missing data and imbalanced class distribution consistently across studies can significantly improve model accuracy and reliability, promoting more robust and trustworthy predictions.
(4) Exploration of Real-time Applications: further investigations should delve into the feasibility and performance of proposed models in real-time scenarios, potentially incorporating streaming data or continuous monitoring to assess the models' adaptability to dynamic healthcare environments.

3. Research Methodology

This section elucidates the methodology for predicting heart disease using machine-learning algorithms, providing a comprehensive account of the procedures and steps involved in crafting and assessing our proposed model. We delve into the data collection process, data preprocessing techniques, feature selection methods, and model training and evaluation, and present an overview of the entire methodology.

Our primary objective is to enhance the accuracy and efficiency of heart disease prediction by utilizing machine-learning techniques. By developing a robust and finely tuned model, we aim to aid medical professionals in promptly detecting and diagnosing heart disease, ultimately leading to timely interventions and improved patient outcomes.

To realize this objective, we collect the disease dataset from JUH, which encompasses pertinent attributes and instances for heart disease prediction. We harness a variety of machine-learning algorithms, including random forest, support vector machine (SVM), naive Bayes (NB), decision tree (DT), and k-nearest neighbours (KNN), to construct and assess our prediction model.

The methodology used to develop and evaluate the machine-learning model for predicting heart disease encompasses several pivotal steps, including data collection, data preprocessing, feature selection, model construction, model assessment, and the computation of performance metrics. The pseudocode for our proposed method is shown in Figure 6.

The pseudocode for the proposed algorithm is shown in Figure 7.

3.1. JUH Dataset Collection

The dataset preparation process for the heart disease diagnosis system is described as follows. When a patient visits the clinic, various factors must be considered for accurate diagnosis and treatment decisions, including observations, patient responses to questions, physical examinations, and lab results.

This research used a heart disease dataset from JUH in Amman, Jordan, for system testing and training. The dataset comprised a total of 486 cases. Out of these, 324 instances are associated with patients diagnosed with heart disease. The remaining 162 cases pertain to patients without heart disease who have previously visited cardio clinics. It included 58 variables essential for diagnosing heart diseases, categorized as follows:
(1) Basic patient information (e.g., age and gender)
(2) Patient medical history (10 factors)
(3) Reported symptoms (16 factors)
(4) Physical examination findings (10 factors)
(5) Blood lab results (7 factors)
(6) ECG (electrocardiogram) findings (12 factors)

The dataset comprised predominantly binary attributes, denoting the presence or absence of features. However, specific characteristics, such as smoking and heart rhythm, had multiple values to indicate severity. Gender and numerical values, such as age, blood pressure, and pulse, were also part of the dataset. Figure 8 illustrates the datasheet used for gathering patient variables, highlighting the prevalence of binary values for many attributes and, where an attribute was present, recording its severity as severe or mild.

3.2. Dataset Preprocessing
3.2.1. Data Exploration

In this phase, the dataset’s structure is examined, encompassing factors such as the number of rows and columns, variable data types, and summary statistics. This exploration aids in comprehending the dataset’s overall composition and revealing any missing or erroneous data.

3.2.2. Data Cleaning

This step addresses missing data, employing imputation techniques or, where data are extensively missing, removing rows or columns as warranted. It also involves detecting and eliminating duplicate entries.

3.2.3. Feature Engineering

Feature engineering entails crafting new variables or transforming existing ones to extract more meaningful insights. It might involve deriving additional features from existing ones, like calculating the Body Mass Index (BMI) from available height and weight variables.
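For instance, a BMI column could be derived as follows; the column names here are hypothetical, since the JUH datasheet layout is not reproduced in this paper:

```python
import pandas as pd

# Hypothetical height/weight columns illustrating the BMI derivation above.
df = pd.DataFrame({"height_cm": [170, 158, 182], "weight_kg": [82, 61, 95]})
df["bmi"] = df["weight_kg"] / (df["height_cm"] / 100) ** 2
print(df)
```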

3.2.4. Handling Missing Values

Strategies for addressing missing data in the dataset involve imputation methods such as mean, median, and mode or advanced techniques such as regression or multiple imputation. Alternatively, rows or columns with extensive missing values can be removed.
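A minimal sketch of median imputation with scikit-learn's SimpleImputer, on illustrative values (the paper does not specify which strategy was ultimately applied to the JUH data):

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Two illustrative vitals columns with missing entries.
X = np.array([[120.0, np.nan], [135.0, 88.0], [np.nan, 92.0]])

# Median imputation is robust to outliers in vitals such as blood pressure.
imputer = SimpleImputer(strategy="median")
print(imputer.fit_transform(X))
```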

3.2.5. Feature Scaling and Encoding

Standardization (mean normalization) or normalization (scaling to a specific range, like 0 to 1) handles variables with differing scales in the dataset. This process promotes uniformity in the scales of the variables, preventing variables with larger values from exerting undue influence.
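Both transformations are available in scikit-learn; the following sketch contrasts them on illustrative age and blood-pressure values:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Illustrative columns: age (years) and systolic blood pressure (mmHg).
X = np.array([[63.0, 145.0], [41.0, 120.0], [58.0, 160.0]])

print(MinMaxScaler().fit_transform(X))    # normalization to the range [0, 1]
print(StandardScaler().fit_transform(X))  # standardization: zero mean, unit variance
```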

The encoding of the 58 variables involved a systematic approach (illustrated in the sketch following this list):
(i) For numerical variables such as age, systolic and diastolic blood pressure, pulse, and lab results, the original numerical format was retained without alteration.
(ii) Variables with two independent attributes, including gender, patient history attributes (excluding smoking), some symptoms, and specific physical examination attributes, were encoded with binary values (0, 1). For example, 0 denoted female and 1 denoted male; otherwise, 1 indicated the attribute's presence and 0 its absence.
(iii) Variables graded by severity, such as specific symptoms and all ECG findings, required encoding with three values (0, 0.5, and 1). These values represented the absence of the attribute (0), a mild presence (0.5), and a severe presence (1).
(iv) Variables with three independent attributes, such as smoking, were encoded using ternary values (0, 0.5, and 1). Here, 0 signified a nonsmoker, 0.5 indicated an ex-smoker, and 1 represented a current smoker.
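Under the rules above, a minimal pandas sketch of the encoding might look as follows (column names and records are hypothetical, not the actual JUH schema):

```python
import pandas as pd

# Hypothetical raw records following the encoding rules listed above.
df = pd.DataFrame({
    "gender": ["female", "male"],
    "chest_pain": ["severe", "mild"],       # severity-graded variable
    "smoking": ["ex-smoker", "current"],    # three-level variable
    "age": [54, 61],                        # numerical variables kept as-is
})
df["gender"] = df["gender"].map({"female": 0, "male": 1})
df["chest_pain"] = df["chest_pain"].map({"absent": 0, "mild": 0.5, "severe": 1})
df["smoking"] = df["smoking"].map({"non-smoker": 0, "ex-smoker": 0.5, "current": 1})
print(df)
```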

Figure 9 displays the dataset’s distribution based on age, highlighting the concentration of individuals affected in the fifth decade of life.

3.3. Data Feature Selection

In our cardiac disease databases, there are 58 features. However, few are crucial for precise prediction, given the prevalence of irrelevant and redundant attributes, which can lead to unreliable outcomes, increased expenses, and prolonged analysis periods. Feature optimization techniques strive to pinpoint the optimal subset of features by minimizing or maximizing the objective function while adhering to resource constraints. Utilizing optimization methods such as particle swarm optimization (PSO) enhances classifier performance in heart disease prediction. PSO iteratively refines candidate solutions by adjusting parameters, serving as a strategic approach to extracting key features from the dataset.

Following the application of wrapper methods, specifically PSO, to the heart disease dataset, 19 predominant features are identified, including chest pain type (cp), resting electrocardiographic results (restecg), maximum heart rate achieved (thalach), exercise-induced angina (exang), oldpeak, number of major vessels (ca), thal, and class. This reduction from 58 to 19 attributes streamlines the dataset for more efficient analysis. The specification of the PSO used is shown in Table 1. These parameter values were determined through empirical experimentation, sensitivity analysis, and alignment with the nature of the heart disease prediction problem. Sensitivity analysis revealed that variations in these parameters did not significantly impact the overall performance, thus affirming the robustness of our chosen configuration.

Following the application of PSO algorithms to the dataset features, 19 features associated with heart disease were identified. These features, obtained through PSO, include age, sex, family history of heart disease, coronary artery, cholesterol, fasting blood sugar, smoking, resting blood pressure, chest pain, exercise-induced angina, autonomic symptoms, heart rate achieved, cardiac enzymes, resting ECG, oldpeak, slope, anemia, associated symptoms, and disease type.
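For readers who want to reproduce the idea, the following is a self-contained sketch of wrapper-style feature selection with a binary PSO. It is not the authors' implementation: the swarm size, iteration count, and coefficients are illustrative rather than the Table 1 settings, and the data are synthetic. Each particle encodes a feature mask whose fitness is the cross-validated accuracy of an SVM on the selected columns.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Synthetic stand-in for the 486-patient, 58-feature JUH data.
X, y = make_classification(n_samples=486, n_features=58, n_informative=12,
                           random_state=0)

def fitness(mask):
    """Cross-validated SVM accuracy on the features picked by the mask."""
    cols = mask.astype(bool)
    if not cols.any():
        return 0.0
    return cross_val_score(SVC(), X[:, cols], y, cv=3).mean()

n_particles, dim, iters = 20, X.shape[1], 15
w, c1, c2 = 0.7, 1.5, 1.5  # inertia and acceleration coefficients (illustrative)
pos = (rng.random((n_particles, dim)) < 0.5).astype(float)  # binary feature masks
vel = rng.normal(0.0, 0.1, (n_particles, dim))
pbest = pos.copy()
pbest_fit = np.array([fitness(p) for p in pos])
gbest = pbest[pbest_fit.argmax()].copy()

for _ in range(iters):
    r1, r2 = rng.random((2, n_particles, dim))
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    # A sigmoid transfer function turns velocities into bit-flip probabilities.
    pos = (rng.random((n_particles, dim)) < 1.0 / (1.0 + np.exp(-vel))).astype(float)
    fit = np.array([fitness(p) for p in pos])
    improved = fit > pbest_fit
    pbest[improved], pbest_fit[improved] = pos[improved], fit[improved]
    gbest = pbest[pbest_fit.argmax()].copy()

print("selected features:", np.flatnonzero(gbest))
print("best CV accuracy :", pbest_fit.max())
```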

3.4. Cross-Validation and Hyperparameter Tuning

Cross-validation assesses a machine-learning model’s performance on unseen data, mitigating overfitting. K-fold cross-validation partitions data into k equally sized folds (k = 10 in this research), using each as a validation set. Model training and evaluation occur k times, and performance averages across all folds yield an unbiased estimate [21]. The dataset is split into 70% for training and 30% for testing.
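A sketch of this protocol, combining the 70/30 split with 10-fold cross-validation on the training portion, assuming scikit-learn and synthetic data:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=486, n_features=19, random_state=0)

# Hold out 30% for final testing, matching the split described above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0)

# 10-fold cross-validation (k = 10 in this research) on the training portion.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(SVC(), X_train, y_train, cv=cv)
print(f"mean CV accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```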

3.5. Classification

Several classification algorithms are employed (compared in the sketch after this list):
(i) Support vector machine (SVM): seeks an optimal hyperplane that separates the classes; practical for linear and nonlinear data with high-dimensional features [37].
(ii) Random forest (RF): an ensemble method combining decision trees; robust against overfitting, accommodates categorical and numerical data, and captures complex relationships.
(iii) Decision tree (DT): makes decisions using if-else conditions; interpretable and handles both variable types, but susceptible to overfitting.
(iv) Naive Bayes: a probabilistic classifier assuming feature independence; efficient, suitable for text classification, and able to handle high-dimensional data.
(v) K-nearest neighbours (KNN): nonparametric; classifies based on the nearest neighbours; simple, but sensitive to the choice of k, memory-intensive, and computationally demanding.
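A compact way to compare the five classifiers under identical 10-fold cross-validation, again sketched on synthetic data (scaling is included for the margin- and distance-based models):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=486, n_features=19, random_state=0)

models = {
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "Random forest": RandomForestClassifier(random_state=0),
    "Decision tree": DecisionTreeClassifier(random_state=0),
    "Naive Bayes": GaussianNB(),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
}
for name, model in models.items():
    acc = cross_val_score(model, X, y, cv=10).mean()
    print(f"{name}: {acc:.3f}")
```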

3.6. Model Evaluation

Model evaluation is crucial for gauging performance. Common metrics include the following.

3.6.1. Confusion Matrix Elements

The confusion matrix has four elements:
(1) True positives (TP): the number of instances where the model correctly predicts the positive outcome
(2) False positives (FP): the number of instances where the model incorrectly predicts the positive outcome
(3) True negatives (TN): the number of instances where the model correctly predicts the negative outcome
(4) False negatives (FN): the number of instances where the model incorrectly predicts the negative outcome

3.6.2. Using the Confusion Matrix

The confusion matrix can be used to calculate several different metrics that can be used to evaluate the performance of a classification model. Some of the most common metrics include the following.

(1) Accuracy. The percentage of all correct predictions; it quantifies the ratio of correctly classified patients (TP + TN) to the total number of patients: Accuracy = (TP + TN)/(TP + TN + FP + FN) (equation (1)).

(2) Precision. The percentage of correct positive predictions, indicating how reliable the model's positive calls are: Precision = TP/(TP + FP) (equation (2)).

(3) ROC (Receiver Operating Characteristic). The ROC curve is one of the most essential evaluation tools for checking any classification model's performance. It plots two metrics against each other: the true positive rate (TPR, or recall) on the y-axis and the false positive rate (FPR) on the x-axis (equation (3)).

TPR (recall) measures, out of all positive cases, how many were predicted correctly: TPR = TP/(TP + FN) (equation (4)).

FPR measures, out of all negative cases, how many were incorrectly predicted as positive: FPR = FP/(FP + TN).

(4) F1-Score. The harmonic mean of precision and recall; it reaches its maximum when precision equals recall: F1 = 2 × (Precision × Recall)/(Precision + Recall) (equation (5)).

The interpretability of the F1-score alone is limited, so we use it together with other evaluation metrics that give a complete picture; otherwise, we cannot tell whether the classifier is maximizing precision or recall.
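All of these quantities can be computed directly from predictions, as in the following sketch with illustrative labels and scores:

```python
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score, roc_auc_score)

y_true  = [1, 0, 1, 1, 0, 0, 1, 0]                    # illustrative ground truth
y_pred  = [1, 0, 1, 0, 0, 1, 1, 0]                    # illustrative predictions
y_score = [0.9, 0.2, 0.8, 0.4, 0.1, 0.6, 0.7, 0.3]    # probabilities for ROC AUC

# Confusion-matrix elements, in scikit-learn's tn, fp, fn, tp order.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TP={tp} FP={fp} TN={tn} FN={fn}")
print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall   :", recall_score(y_true, y_pred))     # TPR
print("F1-score :", f1_score(y_true, y_pred))
print("ROC AUC  :", roc_auc_score(y_true, y_score))
```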

3.7. Experiment Setup

To ensure a successful experiment, it is crucial to establish a well-equipped environment for machine-learning work. The setup comprises the following components.

Programming Language Selection: choosing a suitable programming language such as Python, which provides a wide range of machine-learning libraries and frameworks, is recommended. Python libraries such as scikit-learn, TensorFlow, and PyTorch offer the necessary tools and functions for modelling and training machine-learning algorithms.

Integrated Development Environment (IDE) or Text Editor: an IDE or text editor streamlines the coding and debugging processes, enhancing efficiency in developing and fine-tuning the models.

The experiments were implemented on an HP machine with 8 GB RAM, an Intel Core i5 processor, and the Windows 10 operating system.

4. Results and Discussion

This section delves into the results and discussions surrounding the application of machine-learning algorithms in heart disease prediction. The objective is to assess the performance of developed models and gauge their effectiveness in predicting heart disease. The section initiates with an overview of the experimental setup, encompassing the dataset, machine-learning algorithms, and evaluation metrics. It subsequently presents the results of each algorithm and engages in a comparative analysis and discussion of the outcomes. This section addresses the study’s limitations and suggests potential avenues for future research.

The JUH Heart Disease dataset served as the foundational data source in this experiment, offering pertinent attributes and instances for heart disease prediction. The dataset underwent preprocessing, and feature selection was executed using particle swarm optimization (PSO). The heart disease prediction task was undertaken through five machine-learning algorithms: support vector machine (SVM), random forest, decision tree, naive Bayes, and k-nearest neighbors (KNN). These models underwent training and evaluation via cross-validation, and their performance was quantified using accuracy, precision, recall, and F1-score metrics.

We employed the JUH Heart Dataset, comprising diverse health data and symptoms, to craft a machine-learning model to predict the presence or absence of heart disease in our study. We rigorously assessed the performance of five distinct classifiers, namely, random forest, SVM, decision tree, naive Bayes, and k-nearest neighbors (KNN). To provide a comprehensive assessment, we compared the results from two distinct scenarios: one without integrating particle swarm optimization (PSO) and another incorporating PSO, which was utilized for feature selection.

Our study used the JUH Heart Dataset, which contains extensive information about patients’ health conditions, symptoms, and medical histories. We meticulously preprocessed this dataset and executed feature selection via PSO to identify the most pertinent attributes for heart disease prediction. Subsequently, we conducted training and evaluation exercises for the classifiers, investigating two scenarios: one without PSO and another with PSO. The results of our heart disease prediction analysis, conducted on the JUH Dataset, are detailed in Figure 10.

The results highlight the performance of the classifiers in the context of heart disease prediction, both with and without the utilization of particle swarm optimization (PSO) for feature selection.

SVM demonstrated an accuracy of 91.8% without PSO, which improved to 94.3% with the incorporation of PSO. This substantial improvement underscores the value of feature selection through PSO in enhancing the accuracy of heart disease prediction. Random forest likewise benefited from the selected features, with its ensemble learning approach contributing to its strong performance.

For the decision tree, the accuracy reached 85.34% without PSO, which improved to 85.71% with PSO. The decision tree algorithm demonstrated reasonable performance, with the enhancement achieved with PSO underscoring the significance of feature selection in identifying the most pertinent attributes for heart disease prediction.

Naive Bayes yielded an accuracy of 88.23% without PSO, which improved to 89.3% with PSO. As a probabilistic classifier assuming feature independence, naive Bayes exhibited lower performance than other algorithms. Nevertheless, it still reaps the benefits of the selected features, which contributes to improved accuracy.

KNN achieved an accuracy of 88.60% without PSO, which increased to 90.13% with PSO. KNN, a nonparametric classifier reliant on the majority vote of neighbouring instances, exhibited notable improvement with PSO. This enhancement suggests that the selected features aided in more accurate neighbour identification, ultimately leading to improved predictive outcomes.

Particle swarm optimization (PSO), inspired by collective behaviours in nature such as bird flocking and fish schooling, effectively explores the feature space to identify the optimal subset of features for classification. PSO’s unique capacity to balance exploration and exploitation within the feature space allows it to pinpoint the most informative features crucial for heart disease prediction.

By selecting PSO features, irrelevant or redundant features are systematically pruned, diminishing noise and enhancing overall model performance. PSO adeptly identifies the optimal equilibrium between the number of features and classification accuracy, homing in on the most pertinent attributes essential for precise predictions.

Particle swarm optimization (PSO) demonstrates its true strength when applied to datasets with many dimensions and features. It tackles the dimensionality challenge, proficiently choosing a subset of features that are relevant to the task. Thus, PSO proves instrumental in feature selection, enabling the classifiers in this study to attain heightened accuracy in predicting heart disease, surpassing outcomes achieved by utilizing all available features.

Figure 11 shows results for SVM with PSO across the ten cross-validation folds, which are promising. The algorithm consistently achieves high accuracy levels, ranging from 92% to 96% across different folds. This level of consistency suggests that the model is robust and not overfitting to specific subsets of the data.

The average accuracy of approximately 94.3% indicates that SVM with PSO performs well overall in accurately classifying the data. This suggests that the combination of support vector machine (SVM) and particle swarm optimization (PSO) is effective in this context and can be considered a strong candidate for the given task.

The summarized technical details, illustrated in Figure 12, encompass performance metrics such as recall, precision, and F1-score for the implemented models. The figure elucidates that SVM excels in both recall and precision, establishing itself as a robust performer in accurately and comprehensively identifying positive cases. Naive Bayes exhibits commendable performance in the F1-score, indicating a well-balanced trade-off between precision and recall. However, the decision tree, K-NN, and random forest show diverse performance levels, with the decision tree registering the lowest F1-score among them. This suggests that further optimization may be needed for the decision tree model, or alternative models might be more suitable for the specific task.

Crucial metrics such as receiver operating characteristic area under the curve (ROC AUC) were utilized to evaluate heart disease prediction models. The results highlight naive Bayes as the top performer with the highest ROC AUC of 0.934, showcasing its robust discriminatory ability between positive and negative instances. Support vector machine (SVM) also demonstrates strong performance, boasting an ROC AUC of 0.896 and underscoring its effectiveness in classification tasks. K-nearest neighbors (K-NNs) and decision tree achieve competitive ROC AUC values of 0.89 and 0.884, respectively. At the same time, the random forest model, with a slightly lower ROC AUC of 0.865, still exhibits noteworthy discriminatory capabilities. These findings offer valuable insights into the strengths of each model, aiding in selecting an appropriate algorithm tailored to the specific needs of heart disease prediction.

In general, there are a few reasons why SVM may outperform the other algorithms on this dataset. First, SVM handles nonlinear relationships in the data well through its kernel functions. Second, SVM copes well with high-dimensional data. Third, SVM is a relatively robust algorithm, meaning it is not overly sensitive to noise in the data.

It is important to note that the accuracy of a machine-learning algorithm can vary depending on the dataset. For example, if the dataset is very small or the classes are not well balanced, the algorithm may be unable to learn the relationships in the data.

Overall, the results presented above show that SVM is a strong algorithm for this classification task. However, it is essential to evaluate the performance of different algorithms on the specific dataset at hand before choosing one.

This comparative analysis with other researchers underscores the substantial potential of machine-learning algorithms in accurately predicting heart disease. It also emphasizes the critical role of selecting suitable algorithms, feature selection techniques, and optimization approaches to attain heightened accuracy. Employing the SVM classifier, our proposed methodology showcases superior performance, surpassing other research studies regarding heart disease prediction accuracy.

Numerous research studies have explored the application of machine-learning algorithms in heart disease prediction, each achieving varying degrees of accuracy. Table 2 shows comparison studies of different approaches, including the approach used in this study.

Our proposed method has shown an impressive accuracy of 94.3% in predicting heart disease using the JUH Heart Dataset and the SVM classifier. Our approach’s accuracy aligns with the range of results reported by other researchers, which highlights its effectiveness. However, it is worth noting that some studies may have reported higher accuracy rates but might have used different datasets or algorithms. It is essential to mention that the dataset used in our study was collected from JUH, a hospital affiliated with medical training at both postgraduate and undergraduate levels.

5. Unravelling the Insights and Innovations in Heart Disease Prediction

This study examines machine-learning algorithms applied to heart disease prediction, leveraging the JUH Heart Disease dataset. The results showcase the efficacy of distinct methodologies, providing valuable insights into the landscape of predictive modelling for cardiovascular health.

5.1. SVM Dominance and Feature Selection Impact

The remarkable accuracy exhibited by the support vector machine (SVM) underscores its prowess in deciphering intricate relationships within the dataset. Integrating particle swarm optimization (PSO) for feature selection significantly improves SVM’s performance. The ensemble approach employed by random forest further accentuates the influence of selected features, solidifying its standing among the top-performing models.

5.2. PSO as a Catalyst for Accuracy Enhancement

Particle swarm optimization emerges as a pivotal player in this study, dynamically navigating the feature space to identify optimal subsets crucial for heart disease prediction. The results demonstrate the substantial improvement achieved in accuracy when PSO is integrated into the modelling process. By systematically pruning irrelevant features, PSO diminishes noise and enhances the overall robustness of the models.

5.3. Comparative Analysis of Models

The comparative analysis sheds light on the varying performance levels among machine-learning algorithms. While SVM excels in both recall and precision, naive Bayes achieves a well-balanced trade-off between precision and recall, as reflected in its commendable F1-score. Decision tree, K-NN, and random forest exhibit diverse performance, suggesting further optimization, especially for decision tree or exploring alternative models.

5.4. ROC AUC as a Discriminatory Metric

The evaluation using the receiver operating characteristic area under the curve (ROC AUC) provides a nuanced perspective on the models’ discriminatory capabilities. Naive Bayes emerges as the top performer, showcasing robust discrimination between positive and negative instances. SVM’s strong performance and competitive showings by K-NN and decision tree further enrich the understanding of these models’ classification tasks.

5.5. Innovations and Contributions

The study introduces several innovations, including utilizing PSO for feature selection, significantly enhancing model interpretability and accuracy. The demonstrated superiority of SVM, validated through rigorous cross-validation, contributes to the growing evidence supporting its effectiveness in heart disease prediction.

5.5.1. Limitations
(1) Retrospective Data: the reliance on retrospective data from hospital databases introduces potential selection bias and limits generalizability. Future research could benefit from prospective studies and larger, more diverse datasets to improve the robustness and applicability of the findings.
(2) Additional Evaluation Metrics: while the research focuses on prediction model accuracy, other metrics such as sensitivity, specificity, and AUC-ROC could provide more comprehensive insights into model performance. Incorporating these metrics in future research could enhance the assessment of predictive capabilities.

6. Conclusion and Future Work

The primary objective of the research was to improve heart disease prediction through machine-learning models utilizing the JUH Heart Dataset. The study achieved notable accuracy rates by employing rigorous methodologies encompassing data preprocessing, feature selection, and model development. Remarkably, in conjunction with particle swarm optimization (PSO) for feature selection, the SVM classifier exhibited exceptional performance, signifying its effectiveness in categorizing patients based on their heart disease risk, achieving an accuracy of 94.3%. These findings hold significant implications for advancing early detection, diagnosis, and personalized treatment strategies for heart disease. However, addressing limitations and pursuing future research directions is imperative for further advancements in practical machine-learning applications for heart disease prediction.

The study underscored the efficacy of machine-learning algorithms, with a specific emphasis on the superiority of SVM, in accurately categorizing patients according to their heart disease risk. Incorporating feature selection techniques, notably, PSO, substantially enhanced model performance, resulting in high accuracy rates surpassing other algorithms in the literature.

The research outcomes hold significant implications for the medical field, offering valuable support for early identification of heart disease, risk assessment, and personalized treatment planning. The developed models possess the potential to aid healthcare professionals in decision-making, optimizing resource allocation, and facilitating the efficient delivery of healthcare services. However, it is crucial to acknowledge the study’s identified limitations, encompassing concerns related to data representativeness, data dimensionality, and model interpretability. Consequently, there is an imperative need for further research to address these limitations and enhance the robustness and applicability of the developed models in real-world healthcare settings.

Future advancements could lie in the following:
(1) Ensemble Methods: combining the strengths of multiple classifiers, such as SVM and random forest, through stacking or bagging techniques could further enhance accuracy and robustness against unforeseen data fluctuations.
(2) Multimodal Data Fusion: integrating diverse data sources beyond the JUH Heart Dataset, such as genetic information, wearable device data, and detailed lifestyle factors, could paint a more holistic picture of patients' risk profiles, leading to personalized risk assessments and predictive models.
(3) Explainable AI (XAI): building trust and facilitating clinical adoption require interpretable models that healthcare professionals can understand and explain to patients. Exploring XAI techniques such as feature importance analysis and saliency maps could bridge this gap.
(4) Real-World Validation: moving beyond simulations, prospective studies in clinical settings could assess the impact of these models on real-world decision-making and patient outcomes, ensuring their effectiveness in everyday healthcare scenarios.

Furthermore, delving into specific research areas such as XAI concepts, taxonomies, and challenges associated with responsible AI development, or exploring radiogenomic classification using multiomics datasets, could offer groundbreaking advancements in precision medicine and early disease diagnosis [41].

Incorporating these details strengthens the future-research agenda with specific examples and highlights the broader potential of the present work in areas such as XAI and multiomics analysis, providing a clear roadmap for future exploration and underscoring the significance of this research for the medical field [42].

Data Availability

The data used to support the findings of this study are available from the corresponding author upon reasonable request.

Disclosure

This research stems from a master’s student thesis. All the collaborating doctors actively participated in and contributed to this study.

Conflicts of Interest

The authors declare that they have no conflicts of interest.