Foreground-Background Segmentation using K-Means Clustering Algorithm and Support Vector Machine
Subject areas: IT Strategy
Masoumeh Rezaei 1, Mansoureh Rezaei 2, Masoud Rezaei 3
1 - Faculty of Electrical and Computer Engineering, University of Sistan and Baluchestan, Zahedan, Iran
2 - Computer Engineering Department, Faculty of Engineering, Yazd University, Yazd, Iran
3 - Faculty of Electrical Engineering, K. N. Toosi University of Technology, Tehran, Iran
Keywords: Foreground-Background Segmentation, Support vector machine, k-means clustering, saliency map
Abstract:
Foreground-background image segmentation is an important research problem and one of the main tasks in computer vision, where its purpose is to detect variations in image sequences. It provides candidate objects for further attentional selection, e.g., in video surveillance. In this paper, we introduce an automatic and efficient foreground-background segmentation method. The proposed method starts by detecting visually salient image regions with a saliency map that uses the Fourier transform and a Gaussian filter. Each point in the map is then classified as salient or non-salient using a binary threshold. Next, a hole-filling operator fills holes in the resulting image, and the area-opening method removes small objects from it. For better separation of the foreground and background, erosion and dilation operators are applied to shrink and expand the resulting region, respectively, which yields the foreground and background samples. Because the number of these samples is large, K-means clustering is used as a sampling technique to restrict the computational effort to the region of interest. The K cluster centers of each region are used to train a Support Vector Machine (SVM). The SVM, a powerful binary classifier, then segments the region of interest from the background. The proposed method is applied to a benchmark dataset of 1000 images, and the experimental results show that it compares favorably with several other foreground-background segmentation methods in terms of ER, VI, GCE, and PRI.
[1] X. Y. Wang, W. W. Sun, Z. F. Wu, H. Y. Yang, "Color image segmentation using PDTDFB domain hidden Markov tree model", Applied Soft Computing, Vol. 29, 2015, pp. 138-152.
[2] A. Dirami, K. Hammouche, M. Diaf, P. Siarry, "Fast multilevel thresholding for image segmentation through a multiphase level set method", Signal Processing, Vol. 93, No. 1, 2013, pp. 139-153.
[3] H. Cai, Z. Yang, X. Cao, W. Xia, X. Xu, "A new iterative triclass thresholding technique in image segmentation", IEEE transactions on image processing, Vol. 23, No. 3, 2014, pp.1038-1046.
[4] L. U. Ambata and E. P. Dadios, "Foreground Background Separation and Tracking", International Conference on Humanoid, Nanotechnology, Information Technology, Communication and Control, Environment (HNICEM), 2019, pp. 1-6.
[5] F. A. Khan, M. Nawaz, M. Imran, A. U. Rahman and F. Qayum, "Foreground detection using motion histogram threshold algorithm in high-resolution large datasets", Multimedia Systems, 2020, pp. 1-12.
[6] M. Castillo-Martinez, F. J. Gallegos-Funes, B. E. Carvajal-Gamez, G. Urriolagoitia-Sosa, A. J. Rosales-Silva, "Color index based thresholding method for background and foreground segmentation of plant images", Computers and Electronics in Agriculture, Vol. 178, 2020, p. 105783.
[7] J. Canny, "A computational approach to edge detection", IEEE Transactions on pattern analysis and machine intelligence, Vol. 8, No. 6, 1986, pp. 679-698.
[8] J. M. Prewitt, "Object enhancement and extraction", Picture processing and Psychopictorics, Vol. 10, No. 1, 1970, pp. 15-19.
[9] R. C. Gonzalez and R. E. Woods, "Digital image processing", 2002.
[10] T. Uemura and G. Koutaki and K. Uchimura, "T. Uemura and G. Koutaki and K. Uchimura", International Journal of Innovative computing, Information and control, Vol. 7, No. 10, 2011, pp. 6073-6083.
[11] D. Díaz-Pernil, A. Berciano, F. Peña-Cantillana and M. A. Gutiérrez-Naranjo, "Segmenting images with gradient-based edge detection using membrane computing", Pattern Recognition Letters, Vol. 34, No. 8, 2013, pp. 846-855.
[12] C. Panagiotakis, I. Grinias and G. Tziritas, "Natural image segmentation based on tree equipartition, bayesian flooding and region merging", IEEE Transactions on Image Processing, Vol. 20, No. 8, 2011, pp. 2276-2287.
[13] J. Ning, D. Zhang and C. Wu and F. Yue, "Automatic tongue image segmentation based on gradient vector flow and region merging", Neural Computing and Applications, Vol. 21, No. 8, 2012, pp. 1819-1826.
[14] L. Wang, H. Wu and C. Pan, "Region-based image segmentation with local signed difference energy", Pattern Recognition Letters, Vol. 34, No. 6, 2013, pp. 637-645.
[15] S. E. Ebadi and E. Izquierdo, "Foreground segmentation with tree-structured sparse RPCA", IEEE transactions on pattern analysis and machine intelligence, Vol. 40, No. 9, 2017, pp. 2273-2280.
[16] Y. Boykov, O. Veksler and R. Zabih, "Fast approximate energy minimization via graph cuts", IEEE Transactions on pattern analysis and machine intelligence, Vol. 23, No. 11, 2001, pp. 1222-1239.
[17] T.M. Nguyen and Q. J. Wu, "Fast and robust spatially constrained Gaussian mixture model for image segmentation", IEEE transactions on circuits and systems for video technology, Vol. 23, No. 4, 2012, pp. 621-635.
[18] O. O. Karadag and F. T. Y. Vural, "Image segmentation by fusion of low level and domain specific information via Markov Random Fields", Pattern Recognition Letters, Vol. 46, 2014, pp. 75-82.
[19] N. Dhanachandra, K. Manglem and Y. J. Chanu, "Image segmentation using K-means clustering algorithm and subtractive clustering algorithm", Procedia Computer Science, Vol. 54, 2015, pp. 764-771.
[20] Y. Yang, Y. Wang and X. Xue, "A novel spectral clustering method with superpixels for image segmentation", Optik, Vol. 127, No. 1, 2016, pp. 161-167.
[21] L. A. Lim and H. Y. Keles, "Foreground segmentation using convolutional neural networks for multiscale feature encoding", Pattern Recognition Letters, Vol. 112, 2018, pp. 256-262.
[22] A. Shahbaz and K. H. Jo, "Deep Foreground Segmentation using Convolutional Neural Network", IEEE 28th International Symposium on Industrial Electronics (ISIE), 2019, p. 103334.
[23] P. Patil and S. Murala, "Fggan: A cascaded unpaired learning for background estimation and foreground segmentation", IEEE Winter Conference on Applications of Computer Vision (WACV), 2019, pp. 1770-1778.
[24] D. Sakkos and E. S. Ho and H. P. Shum, "Illumination-aware multi-task GANs for foreground segmentation", IEEE Access, Vol. 7, 2019, pp. 10976-10986.
[25] J. Liang and Y. Xue and J. Wang, "Genetic programming based feature construction methods for foreground object segmentation", Engineering Applications of Artificial Intelligence, Vol. 89, 2020, p. 103334.
[26] Z. Yu, H. S. Wong and G. Wen, "A modified support vector machine and its application to image segmentation", Image and Vision Computing, Vol. 29, No. 1, 2011, pp. 29-40.
[27] X. Y. Wang, Q. Y. Wang, H. Y. Yang and J. Bu, "Color image segmentation using automatic pixel classification with support vector machine", Neurocomputing, Vol. 74, No. 18, 2011, pp. 3898-3911.
[28] X. Bai and W. Wang, "Saliency-SVM: An automatic approach for image segmentation", Neurocomputing, Vol. 136, 2014, pp. 243-255.
[29] M. K. Sangale and N. B. Kadu, "Real-time Foreground Segmentation and Boundary Matting for Live Videos using SVM".
[30] C. Tang and M. O. Ahmad and C. Wang, "Foreground segmentation in video sequences with a dynamic background", 11th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI), 2018, pp. 1-6.
[31] L. U. Shuhan and S. J. YE, "Using an image segmentation and support vector machine method for identifying two locust species and instars", Journal of Integrative Agriculture, Vol. 19, No. 5, 2020, pp. 1301-1313.
[32] N. Dhanachandra, K. Manglem and Y. Chanu, "Image segmentation using K-means clustering algorithm and subtractive clustering algorithm", Procedia Computer Science, Vol. 54, 2015, pp. 764-771.
[33] W. Chen, C. He, C. Ji, M. Zhang and S. Chen, "An improved K-means algorithm for underwater image background segmentation", Multimedia Tools and Applications, Vol. 80, No. 14, 2021, pp. 21059-21083.
[34] Y. Yang, H. Bilen, Q. Zou, W. Y. Cheung, X. Ji, "Learning Foreground-Background Segmentation from Improved Layered GANs", In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (pp. 2524-2533), 2022.
[35] B. E. Boser, I. M. Guyon and V. N. Vapnik, "A training algorithm for optimal margin classifiers", In Proceedings of the fifth annual workshop on Computational learning theory, 1992, pp. 144-152.
[36] V. N. Vapnik, "Statistical learning theory", Wiley, New York, 1998.
[37] M. A. Aizerman, "Theoretical foundations of the potential function method in pattern recognition learning", Automation and remote control, Vol. 25, 1964, pp. 821-837.
[38] S. Lloyd, "Least squares quantization in PCM", IEEE transactions on information theory, Vol. 28, No. 2, 1982, pp. 129-137.
[39] C. Guo and L. Zhang, "A novel multiresolution spatiotemporal saliency detection model and its applications in image and video compression", IEEE transactions on image processing, Vol. 19, No. 1, 2009, pp. 185-198.
[40] P. Soille, "Morphological image analysis: principles and applications", Springer Science and Business Media, 2013.
[41] R. Achanta and Sh. Hemami and F. Estrada and S. Susstrunk, "Frequency-tuned salient region detection", IEEE conference on computer vision and pattern recognition, 2009, pp. 1597-1604.
[42] R. Unnikrishnan, C. M. Pantofaru and M. Hebert, "Toward objective evaluation of image segmentation algorithms", IEEE transactions on pattern analysis and machine intelligence, Vol. 29, No. 6, 2007, pp.929-944.
[43] M. Meila,"Comparing clusterings—an information based distance",Journal of multivariate analysis, Vol. 98, No. 5, 2007, pp. 873-895.
[44] D. Martin and C. Fowlkes and D. Tal and J. Malik, "A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics", Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001, Vol. 2, 2001, pp. 416-423.
http://jist.acecr.org ISSN 2322-1437 / EISSN: 2345-2773
Journal of Information Systems and Telecommunication
Received: 10 Jun 2021 / Revised: 04 Apr 2022 / Accepted: 06 May 2022
1- Introduction
Image segmentation is one of the most significant image processing tasks in image analysis and understanding. Its main goal is to find objects of interest in a given image using image characteristics such as color, texture, and gray level. Typically, image segmentation methods fall into six categories: edge detection-based methods, histogram thresholding-based methods, graph-based methods, region-based methods, statistical model-based methods, and machine learning-based methods [1].
Histogram thresholding-based approaches assume that adjacent pixels whose values lie within a certain range belong to the same class. Because of their intuitive properties, simplicity of implementation, and computational speed, these techniques are widely used [2-6]. Edge detection-based approaches assume that pixel values change quickly at the boundary between two regions. Common edge detectors include Canny [7], Prewitt [8], and Sobel [9]. The output of an edge detector provides candidates for the region boundaries. These algorithms are only suitable for simple, noise-free images [10,11]. Region-based approaches assume that adjacent pixels in the same region have similar visual characteristics [12-15]. With these methods, pixels can be grouped into homogeneous regions that might correspond to an object.
In graph-based methods, an image is treated as a weighted graph whose nodes are the pixels and whose edges carry the similarities between them. Image segmentation is then posed as the problem of partitioning this graph into components while minimizing a cost function [1]. Graph cuts, one of the most important graph-based methods, was introduced in 2001 [16].
Statistical model-based methods use a statistical model that characterizes pixel values [17, 18]. Machine learning-based methods use machine learning techniques for image segmentation [19-25]. In the last decade, several classification techniques have been used successfully in image segmentation. In particular, SVM, a binary classifier, can be used for this purpose. In 2011, the Fast Support Vector Machine (FSVM), a modified SVM, was introduced for image segmentation [26]. This method is trained on object and background pixels selected by the user. In the same year, Wang et al. applied Fuzzy C-Means-SVM (FCM-SVM) to color image segmentation [27]. In this method, training samples are selected randomly from the FCM clustering results. Its drawbacks are that the number of FCM clusters must be set in advance and that the random selection of training samples affects the performance of the final segmentation.
The saliency-SVM (SSVM) method combines visual saliency detection with SVM classification [28]. In this method, a trimap of the given image is extracted from the saliency map to estimate the prominent locations of the objects. Positive and negative training sets are selected automatically for SVM training through histogram analysis of the trimap. The entire salient object is then segmented using the trained SVM classifier. In 2018, Sangale and Kadu introduced a real-time foreground-background segmentation technique using C-1SVM (Competing 1-class Support Vector Machines) [29]. The method first trains local C-1SVMs at every pixel location. It then relabels each pixel using the learned C-1SVMs, performs matting along the foreground boundary, and finally applies global optimization.
Tang et al. applied SVM to foreground segmentation in video sequences [30]. They introduced a novel feature image and used it in the framework of a support vector machine. In 2020, an SVM-based method was proposed for identifying two locust species and instars [31]. Its authors used the GrabCut method and principal component analysis to extract eight features from 73 features of locusts. However, the image segmentation and feature extraction in this method are complicated, which makes fully automatic identification difficult.
In addition to the SVM-based methods, other methods have been introduced in this field. Dhanachandra et al. used K-means to segment the foreground area from the background [32]. Subtractive clustering is applied to generate the centroids based on the potential value of the data points. In this method, partial contrast stretching is used to improve the quality of the image, and a median filter is applied to improve the segmented image.
In 2021, Chen et al. proposed an improved K-means algorithm for underwater image background segmentation [33]. The method addresses the improper determination of the value of K and reduces the effect of the initial centroid positions in the conventional K-means algorithm when the gray levels of a grayscale image are quantized.
In recent years, researchers have proposed deep learning methods for image segmentation. In 2022, Yang et al. proposed a generative adversarial network for foreground-background segmentation [34]. The method avoids trivial decompositions by maximizing the mutual information between generated images and latent variables.
In this paper, we propose a novel and efficient foreground-background segmentation method. An SVM segments the region of interest from the background, and K-means clustering selects the SVM training data, which restricts the computational effort to the region of interest. Before K-means is applied, the salient regions are first selected using a saliency map and some efficient operators.
The rest of this paper is organized as follows. Section 2 explains the SVM and K-means methods in detail. Section 3 describes the proposed method. Section 4 presents the experimental results. Finally, Section 5 concludes the paper.
2- Primary Concepts
This section describes SVM and K-means, the two methods on which the proposed method builds.
2-1- Support Vector Machine
SVM was proposed by Vapnik and coworkers [35]. It is a supervised learning method that originated from statistical learning theory. The main idea behind SVM is to separate two classes with the hyperplane that maximizes the margin between them. Maximizing the margin amounts to minimizing the structural risk rather than the empirical risk, which is a distinctive property of SVM [36]. As a result, SVM can generalize better than methods that minimize only the empirical risk, such as conventional neural networks, and its strong generalization reduces the influence of noise. SVM can also perform non-linear classification using the kernel trick, in which a kernel implicitly maps the inputs into a high-dimensional feature space [37].
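Consider the problem of separating a data set of N points with input data $x_i \in \mathbb{R}^n$ and corresponding targets $y_i \in \{-1, +1\}$. In feature space, SVM models take the form:

$$f(x) = w^{T}\varphi(x) + b \qquad (1)$$

where $b$ is a bias term and $\varphi(\cdot)$ is a nonlinear function that maps the input space to a high-dimensional feature space. The dimension of this space is defined only implicitly and can be infinite. The SVM optimization problem for the linearly separable case is:

$$\min_{w,b} \; \frac{1}{2} w^{T} w \quad \text{s.t.} \quad y_i\left(w^{T}\varphi(x_i) + b\right) \ge 1, \quad i = 1, \dots, N \qquad (2)$$

and for the non-separable case is:

$$\min_{w,b,\xi} \; \frac{1}{2} w^{T} w + C \sum_{i=1}^{N} \xi_i \quad \text{s.t.} \quad y_i\left(w^{T}\varphi(x_i) + b\right) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, \dots, N \qquad (3)$$

where $\xi_i$ are slack variables and C is the regularization parameter that controls the tradeoff between the complexity of the model and the training error; it must be specified a priori. A larger C assigns a higher penalty to errors. The Lagrangian dual problem for the SVM is:

$$\max_{\alpha} \; \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j K(x_i, x_j) \quad \text{s.t.} \quad \sum_{i=1}^{N} \alpha_i y_i = 0, \quad 0 \le \alpha_i \le C \qquad (4)$$

where $\alpha_i$ are non-negative Lagrange multipliers and $K(x_i, x_j) = \varphi(x_i)^{T}\varphi(x_j)$ is the kernel function. Finally, the classification problem is solved using a quadratic programming package. New data can be classified as follows:

$$f(x) = \sum_{i=1}^{N_s} \alpha_i y_i K(x, x_i) + b \qquad (5)$$

$$y(x) = \operatorname{sign}\left(f(x)\right) \qquad (6)$$

where $N_s$ is the total number of support vectors.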
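As an illustration of the classification rule, the following minimal sketch evaluates a kernel decision function of the form "weighted sum of kernel values over the support vectors plus a bias" and takes its sign. It is not the authors' implementation. The RBF kernel, the value of `gamma`, and the hand-picked support vectors, multipliers, and bias are all assumptions chosen for illustration; in practice they would come from solving the dual problem with a quadratic programming solver.

```python
import math

def rbf_kernel(x, z, gamma=0.5):
    """Gaussian (RBF) kernel K(x, z) = exp(-gamma * ||x - z||^2). The kernel choice is an assumption."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)

def svm_decision(x, support_vectors, labels, alphas, bias):
    """Evaluate f(x) = sum_i alpha_i * y_i * K(x, x_i) + b over the support vectors."""
    return sum(
        alpha * y * rbf_kernel(x, sv)
        for sv, y, alpha in zip(support_vectors, labels, alphas)
    ) + bias

def svm_predict(x, support_vectors, labels, alphas, bias):
    """Return the predicted class, +1 or -1, as the sign of the decision value."""
    return 1 if svm_decision(x, support_vectors, labels, alphas, bias) >= 0 else -1

# Hypothetical, hand-picked values (not the result of training).
# They satisfy the dual constraint sum_i alpha_i * y_i = 0.
support_vectors = [(0.0, 0.0), (2.0, 2.0)]
labels = [-1, 1]
alphas = [1.0, 1.0]
bias = 0.0

print(svm_predict((1.9, 2.1), support_vectors, labels, alphas, bias))   # prints 1
print(svm_predict((0.1, -0.1), support_vectors, labels, alphas, bias))  # prints -1
```

Only the support vectors enter the sum, which is why reducing the training set to a few representative samples keeps classification cheap.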
2-2- K-Means Clustering Algorithm
K-means is one of the simplest learning algorithms for solving the clustering problem [38]. It partitions a collection of data into k disjoint clusters. K-means is an iterative algorithm with two alternating steps: in the first step, the k centroids are calculated, and in the second step, each point is assigned to the cluster whose centroid is nearest to it. With the Euclidean distance, this procedure minimizes the sum of squared distances from each point to its cluster centroid. The k-means clustering algorithm is as follows:
Initialize the number of clusters k and the cluster centers (e.g., randomly)
For each object,
    Calculate the distance d (e.g., Euclidean distance) to each cluster center
    Assign it to the closest cluster
End
Recalculate the positions of the cluster centers
Repeat the process until the termination condition is satisfied
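The steps above can be sketched in a few lines of plain Python. This is a minimal illustration, assuming random initialization from the data points, squared Euclidean distance, and "centers stop changing" as the termination condition; the function name `kmeans` and its parameters are hypothetical.

```python
import random

def kmeans(points, k, iterations=100, seed=0):
    """Cluster a list of equal-length tuples into k groups and return the k centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialize centers with k distinct data points
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        new_centers = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else centers[c]
            for c, cluster in enumerate(clusters)
        ]
        if new_centers == centers:  # termination: centers no longer move
            break
        centers = new_centers
    return centers

samples = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
print(sorted(kmeans(samples, k=2)))
```

When K-means is used as a sampling technique, only the returned centers are kept, so a large set of pixels is summarized by k representative feature vectors.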
3- The Proposed Method
The proposed method starts with the detection of visually salient image regions. Saliency detection identifies the pixels or regions that stand out relative to their neighbors, that is, what is prominent or noticeable. A saliency map is used to find the region of interest in a given image. Many algorithms have been proposed for calculating the saliency map. In this paper, the saliency map is computed as follows [39]:

$$SM(i,j) = g(i,j) * \left\| F^{-1}\left[ \exp\left( \sqrt{-1}\; p(u,v) \right) \right] \right\|^{2} \qquad (7)$$

where $F^{-1}$ is the inverse Fourier transform, $p(u,v)$ is the phase spectrum of the Fourier transform of the gray-level image, $g$ is a 2D Gaussian filter, and $*$ denotes convolution. A small standard deviation is used for the Gaussian filter, as shown in Fig. 1 (c), so that the boundaries of salient objects remain well defined. Then, the binarization threshold t is applied to classify each point of the map as salient or non-salient (Fig. 1 (d)):

$$BM(i,j) = \begin{cases} 1, & SM(i,j) \ge t \\ 0, & \text{otherwise} \end{cases} \qquad (8)$$
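A phase-spectrum saliency map followed by binarization can be sketched with NumPy as shown here. This is an illustrative sketch, not the authors' code: the function names, the separable Gaussian smoothing, the value of `sigma`, and the use of the mean saliency as the threshold t are assumptions, since the text does not state how t is chosen. The image must be at least as wide and tall as the Gaussian kernel.

```python
import numpy as np

def gaussian_kernel_1d(sigma):
    """Normalized 1D Gaussian kernel truncated at about three standard deviations."""
    radius = max(1, int(round(3 * sigma)))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()

def phase_saliency(gray, sigma=1.0):
    """Saliency map from the phase spectrum of a 2D gray-level image."""
    spectrum = np.fft.fft2(gray)
    phase = np.angle(spectrum)                        # keep the phase, discard the amplitude
    reconstruction = np.fft.ifft2(np.exp(1j * phase))
    energy = np.abs(reconstruction) ** 2
    kernel = gaussian_kernel_1d(sigma)
    # Separable Gaussian smoothing: filter the rows, then the columns.
    smoothed = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="same"), 1, energy)
    smoothed = np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="same"), 0, smoothed)
    return smoothed

def binarize(saliency, t):
    """Label each point as salient (1) or non-salient (0) using threshold t."""
    return (saliency >= t).astype(np.uint8)

# Synthetic example: a small bright square on a dark background.
image = np.zeros((32, 32))
image[12:16, 18:22] = 1.0
saliency = phase_saliency(image, sigma=1.0)
mask = binarize(saliency, t=saliency.mean())  # the threshold choice here is an assumption
print(mask.sum(), "salient pixels")
```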
In the next step, the hole-filling algorithm suggested by Soille [40] is applied. Hole-filling operators are available in most photo-editing programs and are used to fill holes in a given image. Morphologically, a hole is a set of background pixels that cannot be reached by filling in the background from the edge of the image. Fig. 1 (e) shows the result of the hole-filling operation.
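The morphological definition of a hole translates directly into a flood fill from the image border: every background pixel that the fill cannot reach is a hole. The sketch below is a minimal plain-Python illustration, assuming a binary mask stored as a list of lists and 4-connectivity for the background; it is not the reconstruction-based algorithm from [40].

```python
from collections import deque

def fill_holes(mask):
    """Return a copy of a 0/1 mask in which enclosed background regions are set to 1."""
    h, w = len(mask), len(mask[0])
    reachable = [[False] * w for _ in range(h)]
    queue = deque()
    # Seed the fill with every background pixel on the image border.
    for r in range(h):
        for c in range(w):
            if (r in (0, h - 1) or c in (0, w - 1)) and mask[r][c] == 0:
                reachable[r][c] = True
                queue.append((r, c))
    # Breadth-first flood fill through the background (4-connectivity).
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w and mask[nr][nc] == 0 and not reachable[nr][nc]:
                reachable[nr][nc] = True
                queue.append((nr, nc))
    # Foreground stays foreground; unreachable background pixels are holes and get filled.
    return [
        [1 if mask[r][c] == 1 or not reachable[r][c] else 0 for c in range(w)]
        for r in range(h)
    ]

ring = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
for row in fill_holes(ring):
    print(row)
```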
The resulting binary image contains a small extra spot that needs to be removed. The area-opening method removes objects in the binary image that are too small: all connected components with fewer than P pixels are removed from the image. In all experiments in this paper, P is set to 50. The resulting image is shown in Fig. 1 (f).
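Area opening can be sketched as connected-component labeling followed by a size test. The plain-Python sketch below assumes 8-connectivity, which the text does not specify, and uses a hypothetical function name `area_open`; the parameter `min_pixels` plays the role of P.

```python
def area_open(mask, min_pixels):
    """Remove connected components (8-connectivity) with fewer than min_pixels pixels."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            if mask[r][c] != 1 or seen[r][c]:
                continue
            # Collect one connected component with a depth-first search.
            component = []
            stack = [(r, c)]
            seen[r][c] = True
            while stack:
                cr, cc = stack.pop()
                component.append((cr, cc))
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = cr + dr, cc + dc
                        if 0 <= nr < h and 0 <= nc < w and mask[nr][nc] == 1 and not seen[nr][nc]:
                            seen[nr][nc] = True
                            stack.append((nr, nc))
            # Keep the component only if it is large enough.
            if len(component) >= min_pixels:
                for cr, cc in component:
                    out[cr][cc] = 1
    return out

spots = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
]
for row in area_open(spots, min_pixels=3):
    print(row)
```

In this toy example the isolated pixel is dropped while the four-pixel blob survives; with P = 50 the same test would be applied to full-size images.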
For better separation of the foreground and background, erosion and dilation operators are used to shrink and expand the resulting region (the white region in Fig. 1 (f)), respectively. Two images are produced in this step: one by applying the erosion operator (Fig. 1 (g)) and one by applying the dilation operator followed by negation (Fig. 1 (h)). The pixels in the white region of the first image (F) serve as foreground samples, and the pixels in the white region of the second image (B) serve as background samples. Because the number of these samples is large, K-means clustering is used as a sampling technique to restrict the computational effort to the region of interest. The K cluster centers of each region form the training set of the SVM, which reduces the complexity of the SVM classifier. For each training pixel, six features are extracted: the spatial coordinates i and j, the color values R, G, and B, and the saliency value SM(i,j). Finally, all pixels are classified by the trained SVM model.
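The construction of the F and B regions can be sketched as follows. This is a minimal plain-Python illustration that assumes a 3x3 square structuring element and treats pixels outside the image as background; the paper does not state the structuring element it uses, and the function names are hypothetical.

```python
def _neighborhood(mask, r, c):
    """Yield the 3x3 neighborhood values around (r, c); pixels outside the image count as 0."""
    h, w = len(mask), len(mask[0])
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            nr, nc = r + dr, c + dc
            yield mask[nr][nc] if 0 <= nr < h and 0 <= nc < w else 0

def erode(mask):
    """A pixel survives erosion only if its whole 3x3 neighborhood is foreground."""
    return [[1 if all(_neighborhood(mask, r, c)) else 0
             for c in range(len(mask[0]))] for r in range(len(mask))]

def dilate(mask):
    """A pixel is set by dilation if any pixel in its 3x3 neighborhood is foreground."""
    return [[1 if any(_neighborhood(mask, r, c)) else 0
             for c in range(len(mask[0]))] for r in range(len(mask))]

def foreground_background_masks(mask):
    """Return (F, B): the eroded region and the negation of the dilated region."""
    foreground = erode(mask)
    background = [[1 - v for v in row] for row in dilate(mask)]
    return foreground, background

# 9x9 mask with a 3x3 object in the middle.
region = [[1 if 3 <= r <= 5 and 3 <= c <= 5 else 0 for c in range(9)] for r in range(9)]
F, B = foreground_background_masks(region)
print(sum(map(sum, F)), "foreground samples,", sum(map(sum, B)), "background samples")
```

Pixels in the band between the eroded and dilated regions belong to neither F nor B, so uncertain boundary pixels are kept out of the training samples.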
The algorithm for the proposed method is as follows:
Step 1: Calculate the saliency map for each pixel.
Step 2: Classify each point in the map as salient or non-salient.
Step 3: Fill holes by applying the hole-filling operator.
Step 4: Remove objects that are too small with the area-opening operator.
Step 5: Identify the F and B regions by applying the erosion and dilation operators.
Step 6: Calculate k cluster centers for each region by applying K-means.
Step 7: Train the SVM with the resulting data.
Step 8: Classify all pixels with the trained SVM model.
4- Experimental Results
In our experiments, images from the benchmark dataset proposed by Achanta et al. [41] are used. The dataset consists of 1000 images. The results are evaluated in terms of PRI, VI, GCE, and ER. Each of these criteria reveals the capability of a segmentation method from a specific aspect. Consider a set of n elements $S = \{o_1, \dots, o_n\}$ and two partitions of S: $X = \{X_1, \dots, X_r\}$, a partition of S into r subsets, and $Y = \{Y_1, \dots, Y_s\}$, a partition of S into s subsets. The evaluation metrics are briefly described as follows:
I. Probabilistic Rand Index:
The Probabilistic Rand Index (PRI) was introduced as an extension of the Rand Index for evaluating segmentation approaches [42]. PRI is calculated as follows:

$$PRI = \frac{1}{\binom{n}{2}} \sum_{i<j} \left[ c_{ij}\, p_{ij} + (1 - c_{ij})(1 - p_{ij}) \right] \qquad (9)$$

where $c_{ij}$ is the event that pixels i and j have the same label and $p_{ij}$ is its probability.
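A direct, unoptimized sketch of this pairwise measure is shown below. It assumes that the same-label indicator comes from the segmentation under evaluation and that the probability is estimated as the fraction of ground-truth segmentations in which the two pixels share a label; the function name and the flat label lists are illustrative choices. The loop over all pixel pairs is quadratic, so it is meant for small examples only.

```python
from itertools import combinations

def probabilistic_rand_index(test_labels, ground_truths):
    """PRI of one segmentation against one or more ground-truth segmentations.

    test_labels: flat list of region labels, one per pixel.
    ground_truths: list of flat label lists with the same length.
    """
    total = 0.0
    pairs = 0
    for i, j in combinations(range(len(test_labels)), 2):
        # c: 1 if the tested segmentation puts pixels i and j in the same region.
        c = 1.0 if test_labels[i] == test_labels[j] else 0.0
        # p: fraction of ground truths that put pixels i and j in the same region.
        p = sum(1 for gt in ground_truths if gt[i] == gt[j]) / len(ground_truths)
        total += c * p + (1.0 - c) * (1.0 - p)
        pairs += 1
    return total / pairs

print(probabilistic_rand_index([0, 0, 1, 1], [[1, 1, 0, 0]]))  # prints 1.0
```

The example prints 1.0 because the two labelings describe the same partition; only co-membership of pixel pairs matters, not the label values.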
II. Variation of Information
Meila suggested the Variation of Information (VI), which measures the amount of information lost and gained in changing from one clustering to another [43]. VI is defined as follows:

$$VI(X,Y) = H(X) + H(Y) - 2I(X;Y) \qquad (10)$$

where H is the entropy of a segmentation and I is the mutual information between the two segmentations X and Y.
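The entropies and the mutual information can be estimated from label counts, as in the minimal sketch below. It assumes flat label lists of equal length and natural logarithms (the choice of logarithm base only rescales the value); the function name is hypothetical.

```python
import math
from collections import Counter

def variation_of_information(x, y):
    """VI(X, Y) = H(X) + H(Y) - 2 I(X; Y) for two flat label lists of equal length."""
    n = len(x)
    count_x = Counter(x)
    count_y = Counter(y)
    count_xy = Counter(zip(x, y))
    entropy_x = -sum((c / n) * math.log(c / n) for c in count_x.values())
    entropy_y = -sum((c / n) * math.log(c / n) for c in count_y.values())
    mutual_information = sum(
        (c / n) * math.log((c / n) / ((count_x[a] / n) * (count_y[b] / n)))
        for (a, b), c in count_xy.items()
    )
    return entropy_x + entropy_y - 2.0 * mutual_information

print(variation_of_information([0, 0, 1, 1], [0, 1, 0, 1]))
```

Identical partitions give a VI of zero, while the two independent two-way splits in the example give 2 ln 2, the largest value possible for two balanced binary labelings.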
III. Segmentation Error Rate
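The segmentation error rate is calculated as:

$$ER = \frac{N_f + N_m}{N_t} \times 100\% \qquad (11)$$

where $N_f$ and $N_m$ are the number of false-segmented image pixels and the number of miss-segmented image pixels, respectively, and $N_t$ is the total number of image pixels.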
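For binary masks, the error rate can be sketched as below. The sketch assumes that false-segmented pixels are background pixels labeled as foreground, that miss-segmented pixels are foreground pixels labeled as background, and that the count is normalized by the total number of pixels; these interpretations and the function name are assumptions made for illustration.

```python
def segmentation_error_rate(predicted, truth):
    """Percentage of wrongly labeled pixels for two flat 0/1 masks of equal length."""
    false_segmented = sum(1 for p, t in zip(predicted, truth) if p == 1 and t == 0)
    miss_segmented = sum(1 for p, t in zip(predicted, truth) if p == 0 and t == 1)
    return 100.0 * (false_segmented + miss_segmented) / len(truth)

print(segmentation_error_rate([1, 1, 0, 0], [1, 0, 1, 0]))  # prints 50.0
```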
IV. Global Consistency Error
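The Global Consistency Error (GCE) measures the extent to which one segmentation can be viewed as a refinement of the other [44]. For a given element $o_i$, consider the segments that contain $o_i$ in X and Y. These sets of pixels are denoted by $R(X, o_i)$ and $R(Y, o_i)$, respectively. The GCE can be defined as follows:

$$GCE(X,Y) = \frac{1}{n} \min\left\{ \sum_{i} E(X,Y,o_i), \; \sum_{i} E(Y,X,o_i) \right\}, \qquad E(X,Y,o_i) = \frac{\left| R(X,o_i) \setminus R(Y,o_i) \right|}{\left| R(X,o_i) \right|} \qquad (12)$$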
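The GCE can be computed from region sizes alone, because the size of the set difference between two regions equals the size of the first region minus the size of their intersection. The sketch below follows the usual definition from [44] under that observation; the function name and the flat label lists are illustrative assumptions.

```python
from collections import Counter

def global_consistency_error(x, y):
    """GCE between two segmentations given as flat label lists of equal length."""
    n = len(x)
    size_x = Counter(x)             # size of each region in X
    size_y = Counter(y)             # size of each region in Y
    size_xy = Counter(zip(x, y))    # size of each pairwise intersection
    # Local refinement error of X with respect to Y, summed over all pixels.
    error_xy = sum((size_x[a] - size_xy[(a, b)]) / size_x[a] for a, b in zip(x, y))
    # The same error measured in the other direction.
    error_yx = sum((size_y[b] - size_xy[(a, b)]) / size_y[b] for a, b in zip(x, y))
    return min(error_xy, error_yx) / n

print(global_consistency_error([0, 0, 0, 0], [0, 0, 1, 1]))  # prints 0.0
```

The example gives zero because the second segmentation is a pure refinement of the first, which is the case the measure is designed to tolerate.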
In the first experiment, the proposed method was applied to six images. The segmentation results are shown in Fig. 2, which indicates that they are close to the ground-truth segmentations in all cases. For a quantitative evaluation, the above criteria were calculated and are reported in Table 1.
The proposed method is also compared with other segmentation methods. The results for those methods are taken from [28]. The segmented images for five images are shown in Fig. 3. Qualitatively, the proposed method outperforms the other methods, and its segmented images are the closest to the ground-truth images.
The methods are also evaluated quantitatively in terms of ER, VI, GCE, and PRI (Tables 2-5). The results indicate that the proposed method is superior to the other methods in most cases. Only in some cases does SSVM lead to better results. For example, for image #7, SSVM performs better than the proposed method in terms of VI, GCE, and PRI. It can also be seen visually in Fig. 1 that the result of SSVM for this image is better than that of the proposed method.
For a broader comparison, the quantitative results on the whole dataset are reported in Table 6. The numbers are the average values of ER, GCE, PRI, and VI over the whole dataset. As reported in Table 6, the proposed method is superior to the other segmentation methods in three metrics, whereas the SSVM and ISVM methods outperform it in terms of VI. Unlike criteria based on pair counting, VI is not directly related to the relationships between pairs of points. Instead, VI is based on the relationship between a point and its cluster in each of the two clusterings. Note that this is neither a direct advantage nor a disadvantage compared with criteria based on pair counts.
The proposed method is simple and efficient, and the results on the whole dataset demonstrate its superiority in terms of ER, GCE, and PRI.
5- Conclusion
We have developed a novel and efficient foreground-background segmentation method using SVM with K-means clustering. First, salient data are selected automatically using a saliency map and some simple, efficient operators. K-means is then applied as a sampling technique to restrict the computational effort to the region of interest, and the SVM model is trained on the resulting data. On a benchmark dataset of 1000 images, the proposed approach outperformed several other SVM-based segmentation methods in terms of ER, GCE, and PRI, although SSVM and ISVM achieved better average VI. Although the proposed method shows satisfying results on these criteria, it has one drawback: its two free parameters (K and P) must be set by trial and error. In future work, we plan to apply a Convolutional Neural Network to learn saliency features directly.
References
[1] X. Y. Wang, W. W. Sun, Z. F. Wu, H. Y. Yang, "Color image segmentation using PDTDFB domain hidden Markov tree model", Applied Soft Computing, Vol. 29, 2015, pp. 138-152.
[2] A. Dirami, K. Hammouche, M. Diaf, P. Siarry, P., "Fast multilevel thresholding for image segmentation through a multiphase level set method", Signal processing, 93(1), 2013, pp. 139-153.
[3] H. Cai, Z. Yang, X. Cao, W. Xia, X. Xu, "A new iterative triclass thresholding technique in image segmentation", IEEE transactions on image processing, Vol. 23, No. 3, 2014, pp.1038-1046.
[4] L. U. Ambata and E. P. Dadios, "Foreground Background Separation and Tracking", International Conference on Humanoid, Nanotechnology, Information Technology, Communication and Control, Environment (HNICEM), 2019, pp. 1-6.
[5] F. A. Khan, M. Nawaz, M. Imran, A. U. Rahman and F. Qayum, "Foreground detection using motion histogram threshold algorithm in high-resolution large datasets", Multimedia Systems, 2020, pp. 1-12.
[6] M. Castillo-Martinez, F. J. GallegosFunes, B. E. Carvajal- Gamez, G. Urriolagoitia-Sosa, A. J. Rosales-Silva, "Color index based thresholding method for background and foreground segmentation of plant images", Computers and Electronics in Agriculture, Vol. 178, 2020, p. 105783.
[7] J. Canny, "A computational approach to edge detection", IEEE Transactions on pattern analysis and machine intelligence, Vol. 93, No. 6, 1986, pp. 679-698.
[8] J. M. Prewitt, "Object enhancement and extraction", Picture processing and Psychopictorics, Vol. 10, No. 1, 1970, pp. 15-19.
[9] R. C. Gonzalez and R. E. Woods, "Digital image processing", 2002.
[10] T. Uemura and G. Koutaki and K. Uchimura, "T. Uemura and G. Koutaki and K. Uchimura", International Journal of Innovative computing, Information and control, Vol. 7, No. 10, 2011, pp. 6073-6083.
[11] D. Díaz-Pernil, A. Berciano, F. Peña-Cantillana and M. A. Gutiérrez-Naranjo, "Segmenting images with gradient-based edge detection using membrane computing", Pattern Recognition Letters, Vol. 34, No. 8, 2013, pp. 846-855.
[12] C. Panagiotakis, I. Grinias and G. Tziritas, "Natural image segmentation based on tree equipartition, bayesian flooding and region merging", IEEE Transactions on Image Processing, Vol. 20, No. 8, 2011, pp. 2276-2287.
[13] J. Ning, D. Zhang and C. Wu and F. Yue, "Automatic tongue image segmentation based on gradient vector flow and region merging", Neural Computing and Applications, Vol. 21, No. 8, 2012, pp. 1819-1826.
[14] L. Wang, H. Wu and C. Pan, "Region-based image segmentation with local signed difference energy", Pattern Recognition Letters, Vol. 34, No. 6, 2013, pp. 637-645.
[15] S. E. Ebadi and E. Izquierdo, "Foreground segmentation with tree-structured sparse RPCA", IEEE transactions on pattern analysis and machine intelligence, Vol. 40, No. 9, 2017, pp. 2273-2280.
[16] Y. Boykov, O. Veksler and R. Zabih, "Fast approximate energy minimization via graph cuts", IEEE Transactions on pattern analysis and machine intelligence, Vol. 23, No. 11, 2001, pp. 1222-1239.
[17] T.M. Nguyen and Q. J. Wu, "Fast and robust spatially constrained Gaussian mixture model for image segmentation", IEEE transactions on circuits and systems for video technology, Vol. 23, No. 4, 2012, pp. 621-635.
[18] O. O. Karadag and F. T. Y. Vural, "Image segmentation by fusion of low level and domain specific information via Markov Random Fields", Pattern Recognition Letters, Vol. 46, 2014, pp. 75-82.
[19] N. Dhanachandra, K. Manglem and Y. J. Chanu, "Image segmentation using K-means clustering algorithm and subtractive clustering algorithm", Procedia Computer Science, Vol. 54, 2015, pp. 764-771.
[20] Y. Yang, Y. Wang and X. Xue, "A novel spectral clustering method with superpixels for image segmentation", Optik, Vol. 127, No. 1, 2016, pp. 161-167.
[21] L. A. Lim and H. Y. Keles, "Foreground segmentation using convolutional neural networks for multiscale feature encoding", Pattern Recognition Letters, Vol. 112, 2018, pp. 256-262.
[22] A. Shahbaz and K. H. Jo, "Deep Foreground Segmentation using Convolutional Neural Network", IEEE 28th International Symposium on Industrial Electronics (ISIE), 2019, p. 103334.
[23] P. Patil and S. Murala, "Fggan: A cascaded unpaired learning for background estimation and foreground segmentation", IEEE Winter Conference on Applications of Computer Vision (WACV), 2019, pp. 1770-1778.
[24] D. Sakkos and E. S. Ho and H. P. Shum, "Illumination-aware multi-task GANs for foreground segmentation", IEEE Access, Vol. 7, 2019, pp. 10976-10986.
[25] J. Liang and Y. Xue and J. Wang, "Genetic programming based feature construction methods for foreground object segmentation", Engineering Applications of Artificial Intelligence", Vol. 89, 2020, p. 103334.
[26] Z. Yu, H. S. Wong and G. Wen, "A modified support vector machine and its application to image segmentation", Image and Vision Computing, Vol. 29, No. 1, 2011, pp. 29-40.
[27] X. Y. Wang, Q. Y. Wang, H. Y. Yang and J. Bu, "Color image segmentation using automatic pixel classification with support vector machine", Neurocomputing, Vol. 74, No. 18, 2011, pp. 3898-3911.
[28] X. Bai and W. Wang, "Saliency-SVM: An automatic approach for image segmentation", Neurocomputing, Vol. 136, 2014, pp. 243-255.
[29] M. K. Sangale and N. B. Kadu, "Real-time Foreground Segmentation and Boundary Matting for Live Videos using SVM".
[30] C. Tang and M. O. Ahmad and C. Wang, "Foreground segmentation in video sequences with a dynamic background", 11th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI), 2018, pp. 1-6.
[31] L. U. Shuhan and S. J. YE, "Using an image segmentation and support vector machine method for identifying two locust species and instars", Journal of Integrative Agriculture, Vol. 19, No. 5, 2020, pp. 1301-1313.
[32] N. Dhanachandra, K. Manglem, and Y. Chanu, "Image segmentation using K-means clustering algorithm and subtractive clustering algorithm". Procedia Computer Science, 54, 2015, pp. 764-771.
[33] W. Chen, C. He, C. Ji, M. Zhang, S. and Chen, "An improved K-means algorithm for underwater image background segmentation", Multimedia Tools and Applications, 80(14), 2021, pp. 21059-21083.
[34] Y. Yang, H. Bilen, Q. Zou, W. Y. Cheung, X. Ji, "Learning Foreground-Background Segmentation from Improved Layered GANs", In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (pp. 2524-2533), 2022.
[35] B. E. Boser, I. M. Guyon and V. N. Vapnik, "A training algorithm for optimal margin classifiers", In Proceedings of the fifth annual workshop on Computational learning theory, 1992, pp. 144-152.
[36] V. N. Vapnik, "Statistical learning theory", Wiley, New York, 1998.
[37] M. A. Aizerman,"Theoretical foundations of the potential function method in pattern recognition learning",Automation and remote control, Vol. 25, 1964, pp.821-837.
[38] S. Lloyd,"Least squares quantization in PCM", IEEE transactions on information theory, Vol. 28, No. 2, 1982, pp. 129-137.
[39] C. Guo and L. Zhang, "A novel multiresolution spatiotemporal saliency detection model and its applications in image and video compression", IEEE transactions on image processing, Vol. 19, No. 1, 2009, pp. 185-198.
[40] P. Soille, "Morphological image analysis: principles and applications", Springer Science and Business Media, 2013.
[41] R. Achanta and Sh. Hemami and F. Estrada and S. Susstrunk, "Frequency-tuned salient region detection",
IEEE conference on computer vision and pattern recognition, 2009, pp. 1597-1604.
[42] R. Unnikrishnan, C. M. Pantofaru and M. Hebert, "Toward objective evaluation of image segmentation algorithms", IEEE transactions on pattern analysis and machine intelligence, Vol. 29, No. 6, 2007, pp.929-944.
[43] M. Meila,"Comparing clusterings—an information based distance",Journal of multivariate analysis, Vol. 98, No. 5, 2007, pp. 873-895.
[44] D. Martin and C. Fowlkes and D. Tal and J. Malik, "A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics", Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001, Vol. 2, 2001, pp. 416-423.
* Masoumeh Rezaei