Performance Comparison of Three Classifiers for Fetal Health Classification Based on Cardiotocographic Data
Abstract:
The global child mortality rate, which has been declining steadily, is estimated at around 26 deaths per 1,000 live births in 2022. Several of the United Nations Sustainable Development Goals take the declining child mortality rate into account, which illustrates how far humanity has come. Cardiotocograms (CTGs) are a simple and affordable tool that most professionals choose to help reduce infant and maternal mortality. In this research, three well-known classification methods are applied to CTG data and their results are compared. Random forest outperformed the other two classifiers, with an accuracy of 94.3%.
1. Introduction
All mothers want an uncomplicated pregnancy, a regular delivery, and a healthy child. Delivery problems affect both the mother and the fetus, so choosing the right mode of delivery is of the utmost importance. The most widely used technique for detecting fetal distress during the antepartum and intrapartum periods is cardiotocography (CTG). Four essential factors are included in the relevant datasets: baseline fetal heart rate (BL), accelerations (ACC), decelerations (DCL), and variability. Based on these variables, doctors can determine whether the fetal condition is normal, suspicious, or pathological.
Machine learning (ML) is the field of study that enables computers to learn without being explicitly programmed. It is one of the most exciting recent technological developments. Supervised, unsupervised, and reinforcement learning are the three primary types of ML [1]. In this paper, three classification techniques are used: support vector machine, random forest, and multilayer perceptron [2].
This paper employs CTG data to monitor the health of the fetus, because CTG data enable the detection of fetal problems and the choice of medical intervention before the infant suffers permanent injury. Our investigation was carried out using a number of well-known ML techniques, of which the most accurate were random forest, support vector machine, and multilayer perceptron. To the best of our knowledge, these methods have not been compared in previous studies. The accuracy of the models used in this study is significantly higher than that of earlier studies, indicating that these models are more reliable. Their robustness was demonstrated by numerous model comparisons.
By 2030, the UN wants all countries to end preventable infant and child deaths, with a goal of reducing under-five mortality to at least as low as 25 per 1,000 live births. In addition to infant mortality, maternal mortality, which includes deaths during pregnancy and after delivery, accounted for 295,000 deaths in 2017. The majority of these deaths (94%) occurred in areas with limited resources, and most were preventable [3].
Pregnancy typically lasts nine months and is divided into trimesters of three months each. Each trimester marks a new stage in fetal development. Prenatal screenings and routine medical exams are essential. Fetal conditions develop while the unborn child is growing in the womb. These conditions are categorized as congenital, which means that they exist from birth. Some fetal illnesses are genetic, i.e., they are inherited from one's parents, but most prenatal illnesses have no known cause. Prenatal care professionals use modern testing techniques to identify fetal anomalies. Early detection is crucial to ensure that a mother and her unborn child receive the best medical treatment possible.
Certain birth defects may be improved by fetal surgery. These highly challenging operations are carried out by specialists while the child is still in the womb. Treatment for some fetal conditions can begin as soon as the baby is born. Unfortunately, not all fetal disorders can be cured. Chest and lung diseases, chromosomal disorders, extremity and skeletal abnormalities, gastrointestinal abnormalities, heart disease, neurological conditions, tumors, and growths are some of the most complicated prenatal conditions.
Cardiotocograms (CTGs) are a quick and affordable way for medical professionals to assess fetal health and take action to lower infant and maternal mortality rates. A CTG device emits ultrasound pulses and analyzes their echoes, shedding light on fetal heart rate (FHR), fetal movements, uterine contractions, and other parameters. Here, the authors use these parameters to create a model that can categorize the fetus as normal, suspect, or pathological [4], [5].
2. Methods and Materials
The dataset consists of 2,126 records of features extracted from cardiotocogram exams, which were classified by three expert obstetricians into three classes:
· Normal
· Suspect
· Pathological
The normal fetal heart rate (FHR) baseline is variously given as 110 bpm to 150 bpm or 110 bpm to 160 bpm, as shown in Figure 1. We therefore have 2,126 FHR records, each with values for fetal accelerations, fetal movement, and so on, from which to build a multiclass model that classifies CTG features into the three fetal health states.
Some of the features are:
i) Fetal accelerations
ii) Uterine contraction
iii) Short term variability
iv) Histograms
There were no null values, and all columns other than the target, fetal_health, are floats. We therefore quickly checked for duplicate records and then moved on to a brief exploratory data analysis (EDA). Because there are many variables, we only checked that the data are reasonably balanced. First, we defined a plotting function that produces publication-ready figures, and then we drew a count plot of the target classes, as shown in Figure 2.
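The checks described above can be sketched in a few lines of standard-library Python. This is only an illustration on made-up records: the column names `baseline_value` and `accelerations` and all numeric values are hypothetical, and only `fetal_health` is a name taken from the text.

```python
from collections import Counter

# Hypothetical toy records standing in for the CTG table (values are made up).
records = [
    {"baseline_value": 120.0, "accelerations": 0.000, "fetal_health": 2.0},
    {"baseline_value": 132.0, "accelerations": 0.006, "fetal_health": 1.0},
    {"baseline_value": 133.0, "accelerations": 0.003, "fetal_health": 1.0},
    {"baseline_value": 132.0, "accelerations": 0.006, "fetal_health": 1.0},  # exact duplicate
    {"baseline_value": 134.0, "accelerations": 0.000, "fetal_health": 3.0},
]

def count_nulls(rows):
    """Count missing (None) values per column."""
    nulls = Counter()
    for row in rows:
        for key, value in row.items():
            nulls[key] += value is None
    return dict(nulls)

def drop_duplicates(rows):
    """Keep only the first occurrence of each identical record."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

def class_counts(rows, target="fetal_health"):
    """Number of records per target class (the data behind a count plot)."""
    return Counter(row[target] for row in rows)

null_counts = count_nulls(records)
unique_records = drop_duplicates(records)
balance = class_counts(unique_records)
print(null_counts, len(unique_records), dict(balance))
```

The class counts are the numbers a count plot would display, so an imbalance is visible here before any figure is drawn.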
Clearly, the data are imbalanced, but we do not plan to upsample until initial modeling is complete. Instead of drawing a pair plot of all variables, we plot a correlation matrix of the Pearson correlation coefficients, as shown in Figure 3. Note that correlation does not imply causation. The correlation matrix also gives an early indication of which features the k-best feature selection will later rank as most important [6], [7].
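As a reminder of what each cell of such a correlation matrix contains, the Pearson coefficient can be computed directly. The sketch below uses hypothetical column names and made-up values; it is not the plotting code used for Figure 3.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Pairwise Pearson coefficients for a dict of named columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Hypothetical toy columns (values are made up for illustration).
columns = {
    "feature_a": [1.0, 2.0, 3.0, 4.0],
    "feature_b": [2.0, 4.0, 6.0, 8.0],   # perfectly correlated with feature_a
    "feature_c": [8.0, 6.0, 4.0, 2.0],   # perfectly anti-correlated with feature_a
}
matrix = correlation_matrix(columns)
```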
We applied k-best feature selection with f_classif as the score function and visualized the resulting scores as a bar chart using the seaborn library, as shown in Figure 4 [8].
Next, we selected the features that scored more than 200 and collected them in a list. We appended the label column name to this list and used it to create a new data frame containing only the selected features, as shown in Figure 5.
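The selection step can be illustrated with a hand-written one-way ANOVA F statistic, which is the quantity an f_classif-style score function is based on. This is a simplified sketch on made-up data, not the code behind Figures 4 and 5; the feature names are hypothetical, while the threshold of 200 and the `fetal_health` label come from the text.

```python
def anova_f(values, labels):
    """One-way ANOVA F statistic of one feature across the target classes."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    n_total, k = len(values), len(groups)
    grand_mean = sum(values) / n_total
    # Variation of the class means around the grand mean.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    # Variation of the samples around their own class mean.
    ss_within = sum(
        (v - sum(g) / len(g)) ** 2 for g in groups.values() for v in g
    )
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

# Hypothetical toy data: three classes, three samples each.
labels = [1, 1, 1, 2, 2, 2, 3, 3, 3]
features = {
    "informative": [1.0, 1.1, 0.9, 5.0, 5.1, 4.9, 9.0, 9.1, 8.9],
    "noisy":       [1.0, 5.0, 9.0, 1.1, 5.1, 9.1, 0.9, 4.9, 8.9],
}

scores = {name: anova_f(col, labels) for name, col in features.items()}
selected = [name for name, score in scores.items() if score > 200]
columns_for_new_frame = selected + ["fetal_health"]  # label column appended last
```

A feature whose class means are far apart relative to its within-class spread receives a large score, which is why the threshold keeps `informative` and drops `noisy` in this toy case.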
We were left with six features that were selected as the most important. With the number of features reduced, we drew a quick pair plot to spot differences between the classes, as shown in Figure 6.
First, the data are split so that a scaler can be fitted on the training set and then applied to the unseen (test) set. We hold out 25% of the data for testing, as shown in Figure 7. The data are then scaled with a standard scaler using the formula $Z=\frac{X-\mu}{\sigma}$, where $\mu$ and $\sigma$ are the mean and standard deviation of each feature in the training set. This puts all features on a comparable scale for the modeling that follows.
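Because the classes are imbalanced, the split is stratified on the target so that each class keeps its proportion in both the training and test sets.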
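A minimal sketch of a stratified 25% hold-out followed by standardization is given below. It assumes a single made-up feature, a fixed random seed, and the population standard deviation for $\sigma$; the function names are illustrative and do not correspond to the code behind Figure 7.

```python
import random
import statistics
from collections import Counter

def stratified_split(labels, test_fraction=0.25, seed=1):
    """Return (train_idx, test_idx), holding out test_fraction of each class."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)

def standardize(values, mu, sigma):
    """Apply Z = (X - mu) / sigma element-wise."""
    return [(x - mu) / sigma for x in values]

# Hypothetical toy data: an imbalanced target and one feature.
labels = [1.0] * 8 + [2.0] * 4 + [3.0] * 4
feature = [float(110 + 3 * i) for i in range(16)]

train_idx, test_idx = stratified_split(labels)
train = [feature[i] for i in train_idx]
test = [feature[i] for i in test_idx]

# Fit the scaler on the training data only, then reuse it for the test data.
mu = statistics.fmean(train)
sigma = statistics.pstdev(train)
train_scaled = standardize(train, mu, sigma)
test_scaled = standardize(test, mu, sigma)
test_class_counts = Counter(labels[i] for i in test_idx)
```

Fitting $\mu$ and $\sigma$ on the training portion alone keeps information from the test set out of the preprocessing step.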
In this paper, three classifiers are used to classify the cardiotocographic data, as follows [9], [10], [11].
Support Vector Machine (SVM) - An SVM generates the optimal line or decision boundary that divides n-dimensional space into classes, so that new data points can be placed in the proper category in the future. The (soft-margin) SVM classifier is computed by minimizing an expression of the form:
$\left[\frac{1}{n} \sum_{i=1}^{n} \max\left(0,\, 1-y_i\left(\mathbf{w}^{T} \mathbf{x}_i-b\right)\right)\right]+\lambda\|\mathbf{w}\|^2$
We focus on the soft-margin classifier because choosing a sufficiently small value for $\lambda$ yields the hard-margin classifier for linearly separable input data.
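To make the objective concrete, the short sketch below evaluates it for a given weight vector and bias on a handful of made-up points. It assumes binary labels in {-1, +1}, as the formula does; how the three-class problem is reduced to binary problems is not specified in the text and is not shown here.

```python
def soft_margin_objective(w, b, X, y, lam):
    """Mean hinge loss plus lam * ||w||^2 for labels y_i in {-1, +1}."""
    n = len(X)
    hinge_total = 0.0
    for x_i, y_i in zip(X, y):
        score = sum(w_j * x_j for w_j, x_j in zip(w, x_i)) - b
        hinge_total += max(0.0, 1.0 - y_i * score)
    return hinge_total / n + lam * sum(w_j ** 2 for w_j in w)

# Hypothetical toy points; the last one lies on the wrong side of the boundary.
X = [[2.0, 0.0], [0.0, 2.0], [-2.0, 0.0], [0.5, 0.0]]
y = [1, 1, -1, -1]
value = soft_margin_objective(w=[1.0, 1.0], b=0.0, X=X, y=y, lam=0.1)
```

Only the misclassified point contributes to the hinge term here (1.5 / 4 = 0.375), and the regularization term adds 0.1 * 2 = 0.2, for a total of about 0.575.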
In this paper, support vector classifiers (SVC) generate hyperplanes for separation and score each sample on a yes (1) or no (0) basis, as shown in the confusion matrix in Figure 8. Each decision depends on where a data point lands relative to the decision boundary. The F1 score, the harmonic mean of precision and recall, lets us monitor both quantities with a single value.
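The relationship between a confusion matrix and the per-class precision, recall, and F1 score can be sketched as follows. The matrix values are hypothetical and are not those of Figure 8.

```python
def per_class_metrics(cm):
    """Return (precision, recall, f1) per class.

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    k = len(cm)
    metrics = []
    for c in range(k):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(k)) - tp  # predicted c, truly another class
        fn = sum(cm[c]) - tp                       # truly c, predicted another class
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics.append((precision, recall, f1))
    return metrics

# Hypothetical 3-class confusion matrix (rows: true class, columns: predicted).
cm = [
    [8, 2, 0],
    [1, 6, 1],
    [0, 1, 3],
]
metrics = per_class_metrics(cm)
```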
Random Forest - Random forest constructs decision trees from several samples of the data and uses their majority vote for classification and their average for regression. One of its most important characteristics is that, in both regression and classification, it can handle data sets with continuous as well as categorical variables. It is reported to outperform other algorithms in classification tasks [12].
Random forest is an ensemble method that fits several weak decision trees and combines their predictions to form a forest of largely uncorrelated trees. Such a forest should be able to predict more accurately than any individual tree. For this dataset, the random forest classifier gives better results than existing methods, as shown in the confusion matrix in Figure 9.
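The voting step that turns individual tree outputs into a forest prediction can be sketched as below. The per-tree predictions are made up, and the sketch covers only hard majority voting, not how the trees are trained.

```python
from collections import Counter

def majority_vote(tree_predictions):
    """Combine per-tree class predictions into one prediction per sample.

    tree_predictions is a list with one entry per tree, each entry being
    that tree's predicted class for every sample.
    """
    forest_prediction = []
    for votes in zip(*tree_predictions):
        forest_prediction.append(Counter(votes).most_common(1)[0][0])
    return forest_prediction

# Hypothetical predictions of three trees for four samples.
tree_predictions = [
    [1, 2, 3, 1],
    [1, 2, 1, 1],
    [2, 2, 3, 1],
]
forest_prediction = majority_vote(tree_predictions)
```

Each tree errs on a different sample in this toy case, yet the vote recovers a consistent answer for all four, which is the intuition behind preferring uncorrelated trees.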
Multilayer Perceptron (MLP) - A multilayer perceptron (MLP) is a feed-forward artificial neural network [13], [14], [15]. It has three types of layers: input, hidden, and output. The input layer receives the signal to be processed, and the output layer is responsible for classification and prediction. An MLP is designed to approximate any continuous function and can solve problems that are not linearly separable. The number of nodes is determined by (2/3 * input feature count) + (number of outputs + 2). The hidden layer sizes were then set using factors of 2/3 for the first layer and 1/2 for the second layer. Several activation functions can be parametrized and tuned with a grid search. The performance of the multilayer perceptron is shown as a confusion matrix in Figure 10.
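One possible reading of this sizing rule is sketched below. It is an interpretation, not a confirmed description of the procedure: it assumes that the 2/3 and 1/2 factors are applied to the node count given by the formula, and the rounding choices are also assumptions. Under this reading, six input features and three output classes give hidden layers of 6 and 4 nodes, which matches the hidden_layer_sizes value reported in the Results section.

```python
def hidden_layer_sizes(n_features, n_outputs):
    """Return (node_count, (first_layer, second_layer)) under an assumed reading.

    node_count follows (2/3 * input feature count) + (number of outputs + 2).
    Applying 2/3 and 1/2 to node_count, and the rounding used, are assumptions.
    """
    node_count = round(2 * n_features / 3) + (n_outputs + 2)
    first_layer = round(2 * node_count / 3)
    second_layer = node_count // 2  # rounded down (assumption)
    return node_count, (first_layer, second_layer)

# Six selected features and three fetal health classes, as in the text.
node_count, layers = hidden_layer_sizes(n_features=6, n_outputs=3)
```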
3. Results
In the SVC grid search, the best parameters were: {'C': 10, 'degree': 3, 'gamma': 0.1, 'kernel': 'rbf', 'random_state': 1}. The classification report is shown in Table 1.
Best accuracy: 92.4%
Classification report:
| Class | Precision | Recall | F1-score | Support |
| 1.0 | 0.95 | 0.97 | 0.96 | 494 |
| 2.0 | 0.81 | 0.75 | 0.78 | 88 |
| 3.0 | 0.88 | 0.81 | 0.84 | 52 |
In the random forest grid search, the best parameters were: {'criterion': 'entropy', 'max_depth': 11, 'n_estimators': 200, 'random_state': 1}. The classification report is shown in Table 2.
Best accuracy: 94.3%
Classification report:
| Class | Precision | Recall | F1-score | Support |
| 1.0 | 0.95 | 0.99 | 0.97 | 494 |
| 2.0 | 0.85 | 0.73 | 0.79 | 88 |
| 3.0 | 0.93 | 0.81 | 0.87 | 52 |
In the multilayer perceptron grid search, the best parameters were: {'activation': 'relu', 'hidden_layer_sizes': (6, 4), 'learning_rate': 'constant', 'learning_rate_init': 0.001, 'max_iter': 1000, 'random_state': 1, 'solver': 'adam'}. The classification report is shown in Table 3.
Best accuracy: 91.5%
Classification report:
| Class | Precision | Recall | F1-score | Support |
| 1.0 | 0.95 | 0.96 | 0.96 | 494 |
| 2.0 | 0.70 | 0.72 | 0.71 | 88 |
| 3.0 | 0.82 | 0.71 | 0.76 | 52 |
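Because the classes are imbalanced, an unweighted (macro) average of the per-class F1 scores is a useful complement to accuracy. The sketch below computes it from the F1 values reported in the three classification reports; the macro average itself is an additional summary introduced here for illustration and is computed from the rounded table entries.

```python
# Per-class F1 scores for classes 1.0, 2.0, 3.0, copied from the reports above.
reported_f1 = {
    "Support vector machine": [0.96, 0.78, 0.84],
    "Random forest": [0.97, 0.79, 0.87],
    "Multilayer perceptron": [0.96, 0.71, 0.76],
}

def macro_average(values):
    """Unweighted mean, so each class counts equally regardless of support."""
    return sum(values) / len(values)

macro_f1 = {name: macro_average(f1s) for name, f1s in reported_f1.items()}
best_model = max(macro_f1, key=macro_f1.get)

for name, score in sorted(macro_f1.items(), key=lambda item: -item[1]):
    print(f"{name}: macro F1 = {score:.3f}")
```

The ranking by macro F1 agrees with the ranking by accuracy, with random forest first and the multilayer perceptron last.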
4. Conclusion and Future Scope
A mother must take care of her own health as well as monitor the health of her baby. Several tests are suggested during pregnancy to follow fetal growth and development. One of these tests is the cardiotocogram, which is used to check the health state of the fetus in the uterus.
In this paper, CTG data are used for fetal health monitoring. The dataset consists of 2,126 records of features extracted from cardiotocogram exams. Using k-best feature selection, we extracted the most important features from the data set, which were then classified by three classifiers: support vector machine, random forest, and multilayer perceptron. The accuracies obtained were 92.4% for the support vector machine, 94.3% for random forest, and 91.5% for the multilayer perceptron. Comparing the three classifiers, we observed that random forest performed best on the cardiotocography data.
In future work, dimensionality reduction techniques can be applied to the data as a pre-processing step. The dataset used in this paper is not very large; performance may improve with a larger dataset. To reduce dimensionality and increase accuracy, we will use principal component analysis (PCA) and linear discriminant analysis (LDA). Both algorithms aim to retain as much information as possible after reducing the number of features in the dataset.
The data used to support the research findings are available from the corresponding author upon request.
It gives us great pleasure to express our deepest sense of gratitude and sincere thanks to our Department of Electronics and Communication, Jaypee Institute of Information Technology for providing us an opportunity to present our work.
The authors declare no conflict of interest.