Performance Analysis of Hybrid SOM and AdaBoost Classifiers for Diagnosis of Hypertensive Retinopathy
Wiharto, Department of Informatics, Universitas Sebelas Maret, Surakarta, Indonesia, wiharto@staff.uns.ac.id
Esti Suryani, Department of Informatics, Universitas Sebelas Maret, Surakarta, Indonesia, estisuryani@staff.uns.ac.id
Murdoko Susilo, Department of Informatics, Universitas Sebelas Maret, Surakarta, Indonesia, murdokosusilo@student.uns.ac.id
Received: 17/Jan/2021    Revised: 26/Mar/2021    Accepted: 21/Apr/2021
Abstract
The diagnosis of hypertensive retinopathy (CAD-RH) can be made by observing the tortuosity of the retinal vessels. Tortuosity is a feature that can distinguish normal from abnormal blood vessels. This study aims to analyze the performance of a CAD-RH system based on extraction of tortuosity features of the retinal blood vessels. The system uses a segmentation method based on self-organizing map (SOM) clustering, combined with feature extraction, feature selection, and the ensemble Adaptive Boosting (AdaBoost) classification algorithm. Feature extraction is performed using fractal analysis with the box-counting method, lacunarity with the gliding-box method, and invariant moments. Feature selection uses the information gain method to rank all extracted features, which are then selected by referring to their gain values. The best system performance is obtained with 2 clusters, using the fractal dimension, lacunarity at box sizes 2^2 to 2^9, and invariant moments M1 and M3. In this configuration the system achieves 84% sensitivity, 88% specificity, a positive likelihood ratio (LR+) of 7.0, and 86% area under the curve (AUC). The model also outperforms a number of ensemble algorithms, such as bagging and random forest. Based on these results, this model can be an alternative for CAD-RH, with performance in the good category.
Keywords: Hypertensive Retinopathy; Self-organizing Maps; Segmentation; Adaboost; Classification; Information Gain.
1- Introduction
Hypertension can be detected through regular blood pressure checks. It can cause severe health complications and increases the risk of heart disease, stroke, and sometimes death. Hypertension can also damage the retina and the blood vessels around it, a condition called hypertensive retinopathy. In hypertensive retinopathy, the blood vessel walls thicken, which can disrupt blood flow to the retina and, in turn, cause vision problems.
Hypertensive retinopathy can be detected by analyzing the retina of the eye. The analysis can be carried out directly by a clinician or with the aid of a computer. Computer-aided diagnosis processes the retinal image captured by a fundus camera. A hypertensive retinopathy diagnosis model generally comprises preprocessing, segmentation, feature extraction, classification, and performance analysis stages [1].
Important parts of the diagnosis process are the segmentation, feature extraction, and classification stages, and many methods are available for each. Segmentation, for example, has several approaches, one of which is clustering [2]. Retinal image segmentation has been studied extensively, as in Wiharto et al. [3], which analyzed the effect of the number of clusters on segmentation performance using the fuzzy c-means clustering algorithm and also compared the mean and median methods for determining the threshold used to separate blood vessels from the background. Similarly, Wiharto et al. [4] performed segmentation with the k-means algorithm, determining the threshold from the mean of the cluster centers.
The clustering approach used for blood vessel segmentation is not only feature-clustering-based but also neural-network-based [2]. Neural-network-based methods include self-organizing maps (SOM), as used by Wiharto et al. [5], where retinal blood vessels were segmented using SOM combined with a threshold set to the median of the cluster centers. Clustering-based segmentation was also used by Shafiei et al. [6] for detecting lung cancer tumors in CT scan images, with segmentation performed by SLIC (simple linear iterative clustering), which is based on k-means clustering. Lupascu et al. [7] reported that SOM is better than k-means for retinal vessel segmentation. The clustering capability of SOM is also described by Budayan et al. [8], and in general image segmentation with SOM and FCM is better than with k-means [9]–[11].
The next stage in CAD-RH is feature extraction. Since the segmentation stage focuses on the retinal blood vessels, the feature extraction stage uses methods such as the fractal dimension and lacunarity. Fractal dimension and lacunarity have been associated with hypertension and with arterial and venous retinal blood vessels [12]–[15]. Accordingly, a number of studies have used these features for CAD-RH. Wiharto et al. [16] used the fractal dimension and lacunarity, with the fractal dimension computed by the box-counting method. The fractal dimension was also used by Syahputra et al. [17], combined with invariant moments; both studies used threshold-based segmentation. Invariant moments were also used by Narasimhan et al. [18], combined with gray-level features. The same feature extraction as Syahputra et al. [17] was used by Hutson et al. [19], but for CAD of diabetic retinopathy.
The last stage in CAD-RH, after feature selection, is classification. Classification methods that have been used in CAD-RH models include artificial neural networks, decision trees, naïve Bayes, support vector machines, and ensemble learning [16]–[18], [20], [21]. These methods have a number of drawbacks, one of which is overfitting. Ensemble methods can mitigate overfitting and include algorithms such as random forest and AdaBoost. AdaBoost has shown better performance than random forest [22] and better resistance to overfitting [23].
Studies on the segmentation and classification stages show that clustering-based segmentation, specifically with SOM, can provide excellent performance with AUC values above 90%, and that SOM performs better than segmentation using a combination of the Frangi filter and Otsu thresholding. However, it has not yet been tested whether segmentation with SOM produces features that can be used optimally for CAD-RH. Feature extraction in previous studies includes the fractal dimension, lacunarity, and invariant moments, with threshold-based segmentation. In addition, most previous studies used classification algorithms at the classification stage that cannot handle overfitting.
Referring to these studies, this study analyzes the performance of a CAD-RH system in which segmentation is clustering-based, namely SOM. Feature extraction uses the fractal dimension, lacunarity, and invariant moments; because of the large number of features, feature ranking is needed to select the features used in classification. The ranking method used is information gain, and the classification algorithm is AdaBoost. CAD-RH performance is measured using accuracy, sensitivity, specificity, area under the curve (AUC), positive likelihood ratio (LR+), and negative likelihood ratio (LR-).
2- Material and Method
2-1- Dataset
This research uses a number of stages, as shown in Figure 1. The core stages are divided into 6 parts: preprocessing, segmentation, feature extraction, feature selection, classification, and performance analysis. The study uses publicly available data, namely the STARE (STructured Analysis of the REtina) dataset, which consists of 50 images: 25 healthy retinas and 25 retinas identified with hypertensive retinopathy.
2-2- Preprocessing
Retinal image preprocessing is performed to handle noise, poor contrast, and irregular blood vessel width [24]. The images in the dataset are color images, so they must first be converted to gray-level images. This follows Dey et al. [24] and Kande et al. [25], who converted the retinal image to a gray-level image before segmenting it by clustering. The retinal image is split into its three color channels, and the green channel is taken as the gray-level image; the green channel yields significant information about the blood vessels and retinal structures because it has the best light reflection [26]. The preprocessing stage is very important: without it, blood vessel segmentation is of low quality, because many blood vessel pixels are interpreted as background, which in turn affects the diagnosis result.
Next, the green channel image is inverted (converted to negative intensity), and CLAHE is applied to the negative image to highlight the characteristics of the blood vessels. The optic disc is removed by applying a morphological opening with a ball-shaped structuring element of size 17x17 to the CLAHE result and then subtracting the opened image from the CLAHE result, which yields the optic-disc-removed image.
Fig. 1: Research Method
The next process is background subtraction. The optic-disc-removed image is processed with a 3x3 median filter, and a morphological opening with a 29x29 disk-shaped structuring element is applied to the median-filtered image. The opened image is then subtracted from the median-filtered image, producing the background-subtracted image. Background subtraction makes the background darker and the blood vessels more prominent, and it smooths the image texture. However, it also makes the image appear darker, because the pixel-wise subtraction lowers the pixel values. To increase the brightness, the image contrast is enhanced with contrast stretching.
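As an illustration of this preprocessing chain, the following OpenCV sketch strings the steps together. It is a minimal sketch, not the authors' implementation: the CLAHE clip limit and tile size are assumptions, elliptical structuring elements stand in for the ball and disk shapes, and the function name preprocess_retina is illustrative.

```python
import cv2

def preprocess_retina(bgr_image):
    """Minimal sketch of the preprocessing steps described above."""
    # 1. Take the green channel as the gray-level image.
    green = bgr_image[:, :, 1]
    # 2. Invert and apply CLAHE to highlight the vessels (clip/tile values are assumptions).
    inverted = cv2.bitwise_not(green)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8)).apply(inverted)
    # 3. Optic disc removal: 17x17 opening, then subtract the opening from the CLAHE image.
    se17 = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (17, 17))
    od_removed = cv2.subtract(clahe, cv2.morphologyEx(clahe, cv2.MORPH_OPEN, se17))
    # 4. Background subtraction: 3x3 median filter, 29x29 opening,
    #    then subtract the opening from the median-filtered image.
    median = cv2.medianBlur(od_removed, 3)
    se29 = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (29, 29))
    subtracted = cv2.subtract(median, cv2.morphologyEx(median, cv2.MORPH_OPEN, se29))
    # 5. Contrast stretching to restore brightness.
    return cv2.normalize(subtracted, None, 0, 255, cv2.NORM_MINMAX)
```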
2-3- Segmentation
After preprocessing, segmentation is carried out using the SOM clustering method, with a neighborhood of 3 and 200 iterations. SOM clustering of the retinal image produces k cluster centers; in SOM, the centroid of a cluster is the weight vector of the corresponding neuron. The blood vessel image is then obtained by thresholding the contrast-stretched image, using the median of the cluster centroids produced by the SOM as the threshold [3], [5]. The result of thresholding is a binary image. Finally, an area opening with a radius of 30 is applied to remove small regions from the binary image.
The last step in the segmentation stage is to combine the result of the area opening with the mask image by multiplying the two images pixel by pixel. This removes the circular border of the retinal image. The mask image is created by converting the retinal image to grayscale and setting every pixel with a value greater than 45 to 1 and all other pixels to 0.
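A minimal sketch of this segmentation step is given below, assuming the third-party MiniSom library for the self-organizing map (the paper does not name an implementation). The neighborhood of 3 and the 200 iterations follow the text; the learning rate, the sigma-to-neighborhood mapping, and the use of skimage's remove_small_objects to approximate the radius-30 area opening are assumptions.

```python
import numpy as np
from minisom import MiniSom                          # third-party SOM implementation (assumed)
from skimage.morphology import remove_small_objects

def som_vessel_segmentation(stretched, mask, n_clusters=2, iterations=200):
    """Cluster pixel intensities with a SOM and threshold at the median centroid."""
    pixels = stretched.reshape(-1, 1).astype(float) / 255.0
    # A 1 x n_clusters map: one weight vector (cluster centroid) per neuron.
    som = MiniSom(1, n_clusters, 1, sigma=3.0, learning_rate=0.5)  # sigma mirrors the neighborhood of 3
    som.train_random(pixels, iterations)
    centroids = som.get_weights().reshape(-1)
    threshold = np.median(centroids) * 255.0          # median of the cluster centroids [3], [5]
    binary = stretched > threshold
    # Remove small regions (area opening) and keep only the circular field of view.
    cleaned = remove_small_objects(binary, min_size=30 * 30)
    return cleaned.astype(np.uint8) * mask
```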
2-4- Feature Extraction
The CAD-RH stage after segmentation is feature extraction. The segmented image is processed with the fractal dimension, lacunarity, and invariant moment methods to obtain image features. Fractal-based feature extraction is used to characterize the retinal vascular pattern: one of the signs of hypertensive retinopathy is tortuosity of the blood vessels, which affects the vessel pattern and can be analyzed with fractal analysis, both the fractal dimension and lacunarity [27]. To strengthen the feature set, invariant moments are added to capture shape information; this method is insensitive to image changes caused by rotation, scaling, and translation [28].
2-4-1- Fractal Dimension
A fractal is a simple geometric object that can be broken down into parts, each of which is a smaller copy of the original shape [29]. This study uses the box-counting method to calculate the fractal dimension of an image. Box-counting divides the image into smaller squares of a given size.
The steps of the box-counting method, following Backes and Bruno [30], are:
a. The image is divided into squares of size r. The value of r takes the values 2^k for k = 0, 1, 2, ..., where 2^k cannot exceed the image size.
b. Count the number N of boxes that contain part of the object in the image; N depends strongly on r.
c. Calculate log(1/r) and log(N).
d. Fit a straight line to the log(1/r) and log(N) values.
e. Calculate the slope of this straight line using the following equations.
The fractal dimension D is the slope of the fitted line. With $x_i = \log(1/r_i)$ and $y_i = \log N(r_i)$ for n box sizes,

$$D = \lim_{r \to 0} \frac{\log N(r)}{\log(1/r)} \qquad (1)$$

$$\text{slope} = \frac{n\sum_i x_i y_i - \sum_i x_i \sum_i y_i}{n\sum_i x_i^2 - \left(\sum_i x_i\right)^2} \qquad (2)$$

2-4-2- Lacunarity

Lacunarity is computed with the gliding-box method [31]. A box of size r glides over the segmented vessel image, and Q(M, r) denotes the probability that a box of size r contains M object (vessel) pixels. The first and second moments of this distribution are

$$Z^{(1)}(r) = \sum_M M\, Q(M, r), \qquad Z^{(2)}(r) = \sum_M M^2\, Q(M, r) \qquad (3)$$

and the lacunarity at box size r is

$$\Lambda(r) = \frac{Z^{(2)}(r)}{\left[Z^{(1)}(r)\right]^2} \qquad (4)$$

In this study, lacunarity is computed for box sizes 2^1 to 2^9, giving nine lacunarity features (see Table 1).
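A NumPy sketch of these two fractal-analysis features is given below. The box-size grid and function names are illustrative; the slope fit implements Eq. (2) via np.polyfit, and the gliding-box loop implements Eqs. (3)-(4).

```python
import numpy as np

def box_counting_dimension(binary, sizes=(1, 2, 4, 8, 16, 32, 64, 128, 256)):
    """Fractal dimension of a binary vessel image by box counting (Eqs. 1-2)."""
    counts = []
    h, w = binary.shape
    for r in sizes:
        # Count r x r boxes that contain at least one vessel pixel.
        grid = binary[:h - h % r, :w - w % r].reshape(h // r, r, w // r, r)
        counts.append(np.count_nonzero(grid.any(axis=(1, 3))))
    x = np.log(1.0 / np.asarray(sizes, dtype=float))
    y = np.log(np.asarray(counts, dtype=float))
    slope, _ = np.polyfit(x, y, 1)                 # least-squares slope = fractal dimension
    return slope

def gliding_box_lacunarity(binary, r):
    """Lacunarity at box size r with the gliding-box method (Eqs. 3-4)."""
    h, w = binary.shape
    masses = np.array([binary[i:i + r, j:j + r].sum()
                       for i in range(h - r + 1)
                       for j in range(w - r + 1)], dtype=float)
    z1, z2 = masses.mean(), (masses ** 2).mean()   # first and second moments of the box mass
    return z2 / (z1 ** 2)
```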
2-4-3- Invariant Moment

For the segmented binary image f(x, y), the raw moment of order (p + q) and the central moment are

$$m_{pq} = \sum_x \sum_y x^p y^q f(x, y) \qquad (5)$$

$$\mu_{pq} = \sum_x \sum_y (x - \bar{x})^p (y - \bar{y})^q f(x, y), \qquad \bar{x} = \frac{m_{10}}{m_{00}}, \; \bar{y} = \frac{m_{01}}{m_{00}} \qquad (6)$$

The normalized central moment is defined as

$$\eta_{pq} = \frac{\mu_{pq}}{\mu_{00}^{\gamma}}, \qquad \gamma = \frac{p + q}{2} + 1 \qquad (7)$$

From the normalized central moments, the seven Hu invariant moments [32] used as features are

$$M_1 = \eta_{20} + \eta_{02} \qquad (8)$$

$$M_2 = (\eta_{20} - \eta_{02})^2 + 4\eta_{11}^2 \qquad (9)$$

$$M_3 = (\eta_{30} - 3\eta_{12})^2 + (3\eta_{21} - \eta_{03})^2 \qquad (10)$$

$$M_4 = (\eta_{30} + \eta_{12})^2 + (\eta_{21} + \eta_{03})^2 \qquad (11)$$

$$M_5 = (\eta_{30} - 3\eta_{12})(\eta_{30} + \eta_{12})\left[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2\right] + (3\eta_{21} - \eta_{03})(\eta_{21} + \eta_{03})\left[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right] \qquad (12)$$

$$M_6 = (\eta_{20} - \eta_{02})\left[(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right] + 4\eta_{11}(\eta_{30} + \eta_{12})(\eta_{21} + \eta_{03}) \qquad (13)$$

$$M_7 = (3\eta_{21} - \eta_{03})(\eta_{30} + \eta_{12})\left[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2\right] - (\eta_{30} - 3\eta_{12})(\eta_{21} + \eta_{03})\left[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right] \qquad (14)$$
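OpenCV implements Hu's seven invariants directly, so the moment features can be sketched in a few lines; cv2.moments and cv2.HuMoments are existing OpenCV functions, while the wrapper name is illustrative.

```python
import cv2
import numpy as np

def invariant_moments(binary):
    """Seven Hu invariant moments M1..M7 of the segmented binary vessel image."""
    m = cv2.moments(binary.astype(np.uint8), binaryImage=True)
    return cv2.HuMoments(m).flatten()   # array [M1, ..., M7]
```

Together with the fractal dimension and the nine lacunarity values, this yields the 17-feature vector per image summarized in Table 1.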
Fig. 2: Retina Image & Gray Image (panels: original image, green channel, blue channel, red channel).

Fig. 3: Optic Disc Removal (panels: inverted image, CLAHE output, morphology output, optic disc removal output).

Fig. 4: Output contrast stretch (panels: median filter output, opening output, background subtraction, contrast stretch).
3- Results and Discussion

The segmentation process was carried out on 50 retinal images using the SOM clustering algorithm, with a neighborhood of 3, 200 iterations, and numbers of cluster centers from 2 to 10. The segmentation result for 5 clusters is shown in Figure 5, and the segmentation results for 2 to 7 clusters are shown in Figure 6.
The next stage is the feature extraction process, whose output is shown in Table 1, taking 5 clusters as an example. Table 1 lists the average value of each feature for the hypertensive retinopathy and normal classes, together with the results of statistical testing at a 95% significance level. The next step is feature selection using the information gain algorithm; its results are shown in Table 2.
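A minimal sketch of the information-gain ranking used for feature selection is given below. The paper does not describe how the continuous features are discretized, so a median split per feature is assumed here; the function names are illustrative.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy of a discrete label vector."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(feature, labels):
    """Information gain of one feature, discretized by a median split (assumption)."""
    split = feature > np.median(feature)
    h_after = 0.0
    for side in (split, ~split):
        if side.any():
            h_after += side.mean() * entropy(labels[side])
    return entropy(labels) - h_after

def rank_features(X, y, names):
    """Rank features by information gain, as in Table 2."""
    gains = [information_gain(X[:, j], y) for j in range(X.shape[1])]
    order = np.argsort(gains)[::-1]
    return [(names[j], gains[j]) for j in order]
```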
Fig. 5: Output segmentation process (panels: thresholding output, opening output, mask, segmentation output).
Fig. 6: The results of segmentation based on the number of clusters (ΣCluster = 2 to 7).
The next result is from the classification process. The classification results are measured with the performance parameters sensitivity, specificity, positive likelihood ratio, and area under the curve, shown in Table 3. Table 3 reports the performance for each number of clusters, together with the best number of features.
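A sketch of the classification and evaluation step follows, using scikit-learn's AdaBoostClassifier (whose default base learner is a depth-1 decision tree, i.e. a decision stump) with cross-validated predictions. The fold count and the use of cross_val_predict are assumptions, since the paper does not detail its validation protocol.

```python
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import confusion_matrix, roc_auc_score

def evaluate_adaboost(X, y, n_folds=10):
    """AdaBoost on the selected features, reporting SN, SP, LR+, and AUC as in Table 3."""
    clf = AdaBoostClassifier(n_estimators=50)          # default base learner: decision stump
    y_pred = cross_val_predict(clf, X, y, cv=n_folds)  # out-of-fold predictions
    tn, fp, fn, tp = confusion_matrix(y, y_pred).ravel()
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    lr_positive = sensitivity / (1.0 - specificity)    # positive likelihood ratio
    auc = roc_auc_score(y, y_pred)                     # AUC from the binary decisions
    return sensitivity, specificity, lr_positive, auc
```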
Table 1: Average feature extraction results

| No | Feature | RH | Normal | p-value |
|----|---------|----------|----------|----------|
| 1 | FD | 1.602208 | 1.575276 | 0.001122 |
| 2 | Λ(2^1) | 5.012032 | 6.029760 | 0.001182 |
| 3 | Λ(2^2) | 4.301156 | 5.183120 | 0.001376 |
| 4 | Λ(2^3) | 3.356580 | 4.024624 | 0.002155 |
| 5 | Λ(2^4) | 2.432772 | 2.831768 | 0.007441 |
| 6 | Λ(2^5) | 1.822812 | 2.018344 | 0.063592 |
| 7 | Λ(2^6) | 1.433240 | 1.529700 | 0.220680 |
| 8 | Λ(2^7) | 1.114265 | 1.172235 | 0.229945 |
| 9 | Λ(2^8) | 0.861410 | 0.890279 | 0.058108 |
| 10 | Λ(2^9) | 0.499498 | 0.499787 | 0.591386 |
| 11 | M1 | 0.582954 | 0.668611 | 0.018811 |
| 12 | M2 | 0.012519 | 0.011148 | 0.531156 |
| 13 | M3 | 0.004236 | 0.011822 | 0.019364 |
| 14 | M4 | 0.005215 | 0.010684 | 0.192822 |
| 15 | M5 | 0.000015 | 0.000048 | 0.667529 |
| 16 | M6 | 0.000232 | 0.000031 | 0.328706 |
| 17 | M7 | 0.000002 | -0.000047 | 0.614520 |
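The p-values in Table 1 come from a two-sample comparison of the RH and normal groups (the t-test mentioned in the discussion). A one-line check with SciPy, assuming the standard independent-samples t-test:

```python
from scipy.stats import ttest_ind

def feature_p_value(rh_values, normal_values):
    """Two-sample t-test between RH and normal feature values (one row of Table 1)."""
    t_stat, p_value = ttest_ind(rh_values, normal_values)
    return p_value
```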
Table 2: Information gain results

| Rank | Score (2 clusters) | Feature (2 clusters) | Score (5 clusters) | Feature (5 clusters) |
|------|----------|---------|----------|---------|
| 1 | 0.395816 | FD | 0.270252 | FD |
| 2 | 0.360657 | Λ(2^1) | 0.222020 | Λ(2^1) |
| 3 | 0.327324 | Λ(2^2) | 0.208735 | Λ(2^3) |
| 4 | 0.327324 | Λ(2^3) | 0.182119 | M1 |
| 5 | 0.236453 | Λ(2^4) | 0.182119 | Λ(2^4) |
| 6 | 0.156513 | M1 | 0.173600 | Λ(2^2) |
| 7 | 0.156513 | Λ(2^5) | 0.124511 | M3 |
| 8 | 0.124511 | Λ(2^7) | 0.106740 | M6 |
| 9 | 0.124511 | M3 | 0.096311 | Λ(2^5) |
| 10 | 0.117188 | Λ(2^8) | 0.087804 | Λ(2^7) |
| 11 | 0.106740 | Λ(2^6) | 0.087736 | Λ(2^8) |
| 12 | 0.082296 | M5 | 0.087736 | M4 |
| 13 | 0.076591 | M2 | 0.085438 | M5 |
| 14 | 0.068648 | Λ(2^9) | 0.085024 | M7 |
| 15 | 0.052821 | M6 | 0.069342 | Λ(2^6) |
| 16 | 0.041203 | M4 | 0.051262 | Λ(2^9) |
| 17 | 0.025695 | M7 | 0.041203 | M2 |
The CAD-RH system model, a hybrid of SOM and AdaBoost, performs best when the number of SOM clusters is 2. In that configuration the AUC reaches 86%, which falls in the good category [40], but it requires a relatively large number of features, namely 11. Besides 2 clusters, the CAD-RH performance with 5 clusters also achieves an AUC above 80%. The advantage of 5 clusters is that it requires only 3 features, namely the fractal dimension and the lacunarity at box sizes 2^1 and 2^3; its weakness is that the specificity is considerably lower than with 2 clusters. Referring to the statistical test using the t-test at a 95% significance level shown in Table 1, the selected 11 features and 3 features show significant differences between positive and negative hypertensive retinopathy. This indicates that the ranking produced by information gain is consistent with the t-test results.
Table 3: Classification results for each number of clusters

| Number of clusters | Number of features | LR+ | SN (%) | SP (%) | AUC (%) |
|----|----|------|----|----|----|
| 2 | 11 | 7.00 | 84 | 88 | 86 |
| 3 | 4 | 2.18 | 96 | 56 | 76 |
| 4 | 2 | 2.11 | 76 | 64 | 70 |
| 5 | 3 | 4.20 | 84 | 80 | 82 |
| 6 | 17 | 2.86 | 80 | 72 | 76 |
| 7 | 5 | 3.17 | 76 | 76 | 76 |
| 8 | 17 | 2.38 | 76 | 68 | 72 |
| 9 | 3 | 1.25 | 80 | 36 | 58 |
| 10 | 17 | 1.08 | 52 | 52 | 52 |
With 2 clusters, the CAD-RH system reaches an AUC of 86%, meaning that if the system were used to screen 100 patients, it would correctly identify about 86 of them; with 5 clusters, the AUC of 82% corresponds to about 82 correctly identified patients. The 2-cluster and 5-cluster configurations have the same sensitivity. Sensitivity is the ability of the CAD-RH system to identify patients who are positive for hypertensive retinopathy as positive, and when the system is used for initial screening this parameter is vital. In this hybrid model, the highest sensitivity occurs with 3 clusters, namely 96%, but the corresponding specificity is very low, so the system's ability to identify negative patients as negative is poor.
The 5-cluster results show that the invariant moment features do not provide a significant additional performance gain. The tests also show that the tortuous vascular pattern can be captured well with fractal analysis, namely the fractal dimension and lacunarity. This is supported by the feature selection results with the information gain method, where the invariant moments M1-M7 have relatively low gain values compared to the fractal dimension and lacunarity. This condition also confirms the relationship between hypertensive retinopathy and the fractal dimension and lacunarity [12], [13].
The hybrid of SOM and AdaBoost in the CAD-RH system shows relatively good performance, both with 2 clusters and with 5 clusters: the AUC is in the range of 80%-90%, which is categorized as good [40]. Referring to McGee [39], further performance parameters of a diagnostic system are the positive (LR+) and negative (LR-) likelihood ratios, which are not limited to a 0-100 scale. For 2 clusters with 11 features, LR+ = 7 and LR- = 0.182, while for 5 clusters with 3 features, LR+ = 4.2 and LR- = 0.2, as shown in Table 3. A higher LR+ is better, while a lower LR- is better. This again shows that the performance with 2 clusters is better.
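As a consistency check, the likelihood ratios follow directly from sensitivity and specificity:

$$\mathrm{LR}^{+} = \frac{\text{sensitivity}}{1 - \text{specificity}}, \qquad \mathrm{LR}^{-} = \frac{1 - \text{sensitivity}}{\text{specificity}}$$

so for 2 clusters LR+ = 0.84 / 0.12 = 7.0 and LR- = 0.16 / 0.88 ≈ 0.182, and for 5 clusters LR+ = 0.84 / 0.20 = 4.2 and LR- = 0.16 / 0.80 = 0.20, matching the values above and in Table 3.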
Fig. 7 : Comparison of Algorithms
The AdaBoost algorithm in the CAD-RH system shows better capability than other ensemble algorithms; the comparison is shown in Figure 7. The closest competitor is random forest, which reaches an AUC of 80% with 2 clusters. Compared with AdaBoost at 5 clusters, random forest is superior on the LR+ parameter but has a lower AUC. This difference is caused by random forest's higher specificity but lower sensitivity, so AdaBoost with 5 clusters is still better than random forest overall. This is also supported by the studies of Stella et al. [41] and Prastyo et al. [42], although in different cases: those studies compared several algorithms, including random forest, AdaBoost, and support vector machines (SVM), and found that AdaBoost performed better than random forest and SVM.
4- Conclusions
The CAD-RH system model with SOM and AdaBoost performs well with both 2 and 5 clusters. The AUC is 86% for 2 clusters and 82% for 5 clusters. In terms of the number of features, achieving an AUC above 80% requires 3 features with 5 clusters, whereas 2 clusters requires 11 features. Referring to this performance, the hybrid SOM and AdaBoost model can be an alternative for the initial diagnosis of hypertensive retinopathy.
Acknowledgments
We thank Universitas Sebelas Maret for providing the MRG research grant under contract number 260/UN27.22/HK.07.00/2021. We also thank the many parties who have helped us complete this research.
References
[1] W. Wiharto and E. Suryani, “The Review of Computer Aided Diagnostic Hypertensive Retinopathy Based on The Retinal Image Processing,” in The 2nd Sriwijaya international Conference on Science, Engineering, and Technology [SICEST], Palembang, Indonesia, 2018, vol. 690, pp. 1–9. doi: 10.1088/1757-899X/620/1/012099.
[2] F. Garcia-Lamont, J. Cervantes, A. López, and L. Rodriguez, “Segmentation of images by color features: A survey,” Neurocomputing, vol. 292, pp. 1–27, May 2018, doi: 10.1016/j.neucom.2018.01.091.
[3] W. Wiharto and E. Suryani, “The Analysis Effect of Cluster Numbers On Fuzzy C-Means Algorithm for Blood Vessel Segmentation of Retinal Fundus Image,” in IEEE The 2nd International Conference on Information and Communications Technology, Yogyakarta, Indonesia, 2019, pp. 1–5. doi: 10.1109/ICOIACT46704.2019.8938583.
[4] W. Wiharto and E. Suryani, “The Segmentation Analysis of Retinal Image Based on K-means Algorithm for Computer-Aided Diagnosis of Hypertensive Retinopathy,” Indonesian Journal of Electrical Engineering and Informatics (IJEEI), vol. 8, no. 2, pp. 419–426, 2020, doi: 10.11591/ijeei.v8i2.1287.
[5] W. Wiharto, E. Suryani, and M. Susilo, “The Hybrid Method of SOM Artificial Neural Network and Median Thresholding for Segmentation of Blood Vessels in the Retina Image Fundus,” International Journal of Fuzzy Logic and Intelligent Systems, vol. 19, no. 4, pp. 323–331, 2019.
[6] F. Shafiei and S. Fekri-Ershad, “Detection of Lung Cancer Tumor in CT Scan Images Using Novel Combination of Super Pixel and Active Contour Algorithms,” Traitement du Signal, vol. 37, no. 6, pp. 1029–1035, Dec. 2020, doi: 10.18280/ts.370615.
[7] C. Lupascu and D. Tegolo, “Automatic unsupervised segmentation of retinal vessels using self-organizing maps and k-means clustering,” in Computational Intelligence Methods for …, Berlin, Heidelberg, 2011, vol. 6685, pp. 263–274. doi: 10.1007/978-3-642-21946-7_21.
[8] C. Budayan, I. Dikmen, and M. T. Birgonul, “Comparing the performance of traditional cluster analysis, self-organizing maps and fuzzy C-means method for strategic grouping,” Expert Systems with Applications, vol. 36, no. 9, pp. 11772–11781, 2009, doi: 10.1016/j.eswa.2009.04.022.
[9] S. Arumugadevi and V. Seenivasagam, “Comparison of Clustering Methods for Segmenting Color Images,” Indian Journal of Science and Technology, vol. 8, no. 7, pp. 670–677, 2015, doi: 10.17485/ijst/2015/v8i7/62862.
[10] O. A. Abbas, “Comparisons Between Data Clustering Algorithms,” The International Arab Journal of Information Technology, vol. 5, no. 3, pp. 320–325, 2008.
[11] K. K. Jassar, “Comparative Study and Performance Analysis of Clustering Algorithms,” IJCA Proceedings on International Conference on ICT for Healthcare, vol. ICTHC 2015, no. 1, pp. 1–6, 2016.
[12] P. Zhu et al., “The relationship of retinal vessel diameters and fractal dimensions with blood pressure and cardiovascular risk factors,” PLoS ONE, vol. 9, no. 9, pp. 1–10, 2014, doi: 10.1371/journal.pone.0106551.
[13] N. Popovic, M. Radunovic, J. Badnjar, and T. Popovic, “Fractal dimension and lacunarity analysis of retinal microvascular morphology in hypertension and diabetes,” Microvascular Research, vol. 118, no. 2018, pp. 36–43, 2018, doi: 10.1016/j.mvr.2018.02.006.
[14] H. A. Crystal et al., “Association of the Fractal Dimension of Retinal Arteries and Veins with Quantitative Brain MRI Measures in HIV-Infected and Uninfected Women,” PLoS ONE, vol. 11, no. 5, pp. 1–11, 2016, doi: 10.1371/journal.pone.0154858.
[15] E. V. L. Costa and R. A. Nogueira, “Fractal, multifractal and lacunarity analysis applied in retinal regionsof diabetic patients with and without non-proliferative diabetic retinopathy,” Fractal Geometry and Nonlinear Anal in Med and Biol, vol. 1, no. 3, pp. 112–119, 2016, doi: 10.15761/FGNAMB.1000118.
[16] W. Wiharto, E. Suryani, and M. Yahya Kipti, “Assessment of Early Hypertensive Retinopathy using Fractal Analysis of Retinal Fundus Image,” TELKOMNIKA (Telecommunication Computing Electronics and Control), vol. 16, no. 1, pp. 445-454, 2018, doi: 10.12928/telkomnika.v16i1.6188.
[17] M. F. Syahputra, I. Aulia, R. F. Rahmat, and others, “Hypertensive retinopathy identification from retinal fundus image using probabilistic neural network,” in 2017 International Conference on Advanced Informatics, Concepts, Theory, and Applications (ICAICTA), 2017, pp. 1–6.
[18] K. Narasimhan, V. C. Neha, and K. Vijayarekha, “Hypertensive retinopathy diagnosis from fundus images by estimation of AVR,” Procedia Engineering, vol. 38, no. 2012, pp. 980–993, 2012, doi: 10.1016/j.proeng.2012.06.124.
[19] N. Hutson, A. Karan, J. A. Adkinson, P. Sidiropoulos, I. Vlachos, and L. Iasemidis, “Classification of Ocular Disorders Based on Fractal and Invariant Moment Analysis of Retinal Fundus Images,” in 2016 32nd Southern Biomedical Engineering Conference (SBEC), Shreveport, LA, USA, Mar. 2016, pp. 57–58. doi: 10.1109/SBEC.2016.21.
[20] U. G. Abbasi and U. M. Akram, “Classification of blood vessels as arteries and veins for diagnosis of hypertensive retinopathy,” in 2014 10th International Computer Engineering Conference: Today Information Society What’s Next?, ICENCO 2014, Giza, Egypt, 2014, pp. 5–9. doi: 10.1109/ICENCO.2014.7050423.
[21] B. K. Triwijoyo, W. Budiharto, and E. Abdurachman, “The Classification of Hypertensive Retinopathy using Convolutional Neural Network,” Procedia Computer Science, vol. 116, pp. 166–173, 2017, doi: 10.1016/j.procs.2017.10.066.
[22] X. Miao and J. S. Heaton, “A comparison of random forest and Adaboost tree in ecosystem classification in east Mojave Desert,” in 2010 18th International Conference on Geoinformatics, Beijing, China, Jun. 2010, pp. 1–6. doi: 10.1109/GEOINFORMATICS.2010.5567504.
[23] C. A. Lupascu, D. Tegolo, and E. Trucco, “FABC: Retinal Vessel Segmentation Using AdaBoost,” IEEE Trans. Inform. Technol. Biomed., vol. 14, no. 5, pp. 1267–1274, Sep. 2010, doi: 10.1109/TITB.2010.2052282.
[24] N. Dey, A. B. Roy, M. Pal, and A. Das, “FCM Based Blood Vessel Segmentation Method for Retinal Images,” International Journal of Computer Science and Network (IJCSN), vol. 1, no. 3, pp. 1–5, 2012.
[25] G. B. Kande, T. S. Savithri, and P. Subbaiah, “Segmentation of vessels in fundus images using spatially weighted fuzzy c-means clustering algorithm,” International journal of computer science and network security, vol. 7, no. 12, pp. 102–109, 2007.
[26] R. A. Aras, T. Lestari, H. Adi Nugroho, and I. Ardiyanto, “Segmentation of retinal blood vessels for detection of diabetic retinopathy: A review,” Communications in Science and Technology, vol. 1, no. 2016, pp. 33–41, 2016, doi: 10.21924/cst.1.1.2016.13.
[27] Ş. Talu, C. Vlǎduţiu, L. A. Popescu, C. A. Lupaşcu, Ş. C. Vesa, and S. D. Ţǎlu, “Fractal and lacunarity analysis of human retinal vessel arborisation in normal and amblyopic eyes,” Human and Veterinary Medicine, vol. 5, no. 2, pp. 45–51, 2013.
[28] T. Acharya and K. R. Ray, Image Processing: Principles and Applications. USA: John Wiley & Sons, 2005.
[29] B. B. Mandelbrot, The fractal geometry of nature, vol. 173. WH freeman New York, 1983.
[30] A. R. Backes and O. M. Bruno, “A new approach to estimate fractal dimension of texture images,” in In: Elmoataz A., Lezoray O., Nouboud F., Mammass D. (eds) Image and Signal Processing. ICISP 2008. Lecture Notes in Computer Science, Berlin, Heidelberg, 2008, vol. 5099, pp. 136–143. doi: 10.1007/978-3-540-69905-7_16.
[31] C. Allain and M. Cloitre, “Characterizing the lacunarity of random and deterministic fractal sets,” Physical Review A, vol. 44, no. 6, pp. 3552–3558, 1991, doi: 10.1103/PhysRevA.44.3552.
[32] M.-K. Hu, “Visual pattern recognition by moment invariants,” IRE transactions on information theory, vol. 8, no. 2, pp. 179–187, 1962.
[33] Z. Wu et al., “Application of image retrieval based on convolutional neural networks and Hu invariant moment algorithm in computer telecommunications,” Computer Communications, vol. 150, pp. 729–738, Jan. 2020, doi: 10.1016/j.comcom.2019.11.053.
[34] S. Lefkovits and L. Lefkovits, “Gabor Feature Selection Based on Information Gain,” Procedia Engineering, vol. 181, no. 2017, pp. 892–898, 2017, doi: 10.1016/j.proeng.2017.02.482.
[35] Y. Freund and R. E. Schapire, “Experiments with a New Boosting Algorithm,” in Proceedings of the Thirteenth International Conference on International Conference on Machine Learning, The University of Virginia, 1996, pp. 148–156.
[36] G. Eibl and K. Pfeiffer, “Multiclass Boosting for Weak Classifiers,” Journal of Machine Learning Research, vol. 6, no. 7, pp. 189–210, 2005.
[37] Y. Freund and R. E. Schapire, “A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting,” Journal of Computer and System Sciences, vol. 55, no. 1, pp. 119–139, Aug. 1997, doi: 10.1006/jcss.1997.1504.
[38] E. Ramentol, Y. Caballero, R. Bello, and F. Herrera, “SMOTE-RSB *: A hybrid preprocessing approach based on oversampling and undersampling for high imbalanced data-sets using SMOTE and rough sets theory,” Knowledge and Information Systems, vol. 33, no. 2, pp. 245–265, 2012, doi: 10.1007/s10115-011-0465-6.
[39] S. McGee, “Simplifying likelihood ratios,” J Gen Intern Med, vol. 17, no. 8, pp. 647–650, Aug. 2002, doi: 10.1046/j.1525-1497.2002.10750.x.
[40] F. Gorunescu, Data Mining: Concepts, Models and Techniques. Berlin, Heidelberg: Springer, 2011.
[41] M. Stella and S. Kumar, “Prediction and Comparison using AdaBoost and ML Algorithms with Autistic Children Dataset,” IJERT, vol. 9, no. 7, pp. 133–136, Jul. 2020, doi: 10.17577/IJERTV9IS070091.
[42] P. H. Prastyo, I. G. Paramartha, M. S. M. Pakpahan, and I. Ardiyanto, “Predicting Breast Cancer: A Comparative Analysis of Machine Learning Algorithms,” PROC. INTERNAT. CONF. SCI. ENGIN., vol. 3, no. 2020, pp. 455–459, 2020.
Wiharto is an Associate professor of Computer Science at Department of Informatics, Sebelas Maret University, Surakarta, Indonesia. He received his Ph.D. degree in Biomedical Engineering (Medical Informatics) from Gadjah Mada University, Indonesia in 2017. He is conducting research activities in the areas of Artificial Intelligence, Computational Intelligence, Medical Imaging, Clinical Decision Support System and Data Mining.
Esti Suryani received a Bachelor of Science (B.S.) from Gadjah Mada University, Yogyakarta, Indonesia, in 2002 and a Master’s Degree in Computer Science (M.Cs.) from Gadjah Mada University, Yogyakarta, Indonesia, in 2006, and is presently working as an Assistant Professor in the Department of Informatics, Faculty of Mathematics and Natural Sciences, Sebelas Maret University, Surakarta, Indonesia, with research interests in image processing, fuzzy logic, and data security.
Murdoko Susilo received a Bachelor of Science (B.S.) from Sebelas Maret University, Surakarta, Indonesia, in 2020. His current research areas are medical image processing, data mining, artificial intelligence, and clinical decision support systems.