Proposing an FCM-MCOA Clustering Approach Stacked with Convolutional Neural Networks for Analysis of Customers in Insurance Company
Subject areas: Data Mining
Motahareh Ghavidel 1, Meisam Yadollahzadeh Tabari 2, Mehdi Golsorkhtabaramiri 3
1 - Department of Computer Engineering, Islamic Azad University, Babol Branch, Babol, Iran.
2 - Department of Computer Engineering, Islamic Azad University, Babol Branch, Babol, Iran.
3 - Department of Computer Engineering, Islamic Azad University, Babol Branch, Babol, Iran.
Keywords: Customer Clustering, Fuzzy C-Means, Cuckoo Optimization, Convolutional Neural Networks
Abstract:
To create a customer-based marketing strategy, it is necessary to analyze customer data properly so that customers can be segmented and their future behavior predicted. Customer datasets in any business are usually high-dimensional, contain many instances, and include both labeled (supervised) and unlabeled (unsupervised) data. Companies today try to satisfy their customers as much as possible, which requires examining customers from several aspects. Data mining algorithms are among the practical methods businesses use to extract the required knowledge from customers' demographic and behavioral data. This paper presents a hybrid clustering algorithm that combines the Fuzzy C-Means (FCM) method with the Modified Cuckoo Optimization Algorithm (MCOA). Since customer data analysis plays a key role in ensuring a company's profitability, The Insurance Company (TIC) dataset is used for the experiments and performance evaluation. We compare the convergence of the proposed FCM-MCOA approach with conventional optimization methods such as the Genetic Algorithm (GA) and Invasive Weed Optimization (IWO). Moreover, we propose a customer classifier based on Convolutional Neural Networks (CNNs). Simulation results show that FCM-MCOA converges faster than conventional clustering methods. In addition, the results indicate that the accuracy of the CNN-based classifier exceeds 98%. The CNN-based classifier converges after a few iterations, which is fast compared with conventional classifiers such as Decision Tree (DT), Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and Naive Bayes (NB).
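For readers unfamiliar with the clustering component, the following is a minimal sketch of the standard Fuzzy C-Means alternating updates only. It is not the authors' FCM-MCOA implementation: the abstract does not describe how MCOA is coupled with FCM, so no cuckoo optimization step is shown. The function names, the fuzzifier default `m=2.0`, the stopping tolerance, and the optional `init_centers` argument are all assumptions made for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def update_memberships(points, centers, m=2.0, eps=1e-12):
    """Standard FCM membership update: u_ij is proportional to (1 / d_ij^2)^(1/(m-1))."""
    power = 1.0 / (m - 1.0)
    memberships = []
    for p in points:
        # eps guards against division by zero when a point sits on a center.
        inv = [(1.0 / max(sq_dist(p, c), eps)) ** power for c in centers]
        total = sum(inv)
        memberships.append([v / total for v in inv])
    return memberships


def update_centers(points, memberships, n_clusters, m=2.0):
    """Each center is the mean of all points weighted by u_ij^m."""
    dim = len(points[0])
    centers = []
    for j in range(n_clusters):
        weights = [u[j] ** m for u in memberships]
        total = sum(weights)
        centers.append(
            [sum(w * p[k] for w, p in zip(weights, points)) / total for k in range(dim)]
        )
    return centers


def objective(points, centers, memberships, m=2.0):
    """FCM objective: sum over points and clusters of u_ij^m * d_ij^2."""
    return sum(
        u[j] ** m * sq_dist(p, c)
        for p, u in zip(points, memberships)
        for j, c in enumerate(centers)
    )


def fuzzy_c_means(points, n_clusters, m=2.0, max_iter=100, tol=1e-6,
                  init_centers=None, seed=0):
    """Alternate membership and center updates until the objective stabilizes."""
    if init_centers is None:
        # Assumption: random data points as initial centers. A hybrid method
        # could instead search for centers with a metaheuristic.
        rng = random.Random(seed)
        init_centers = rng.sample(points, n_clusters)
    centers = [list(c) for c in init_centers]
    previous = float("inf")
    for _ in range(max_iter):
        memberships = update_memberships(points, centers, m)
        centers = update_centers(points, memberships, n_clusters, m)
        current = objective(points, centers, memberships, m)
        if abs(previous - current) < tol:
            break
        previous = current
    return centers, memberships, current


if __name__ == "__main__":
    # Toy two-dimensional data, not the TIC dataset.
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centers, memberships, cost = fuzzy_c_means(data, n_clusters=2)
    print(centers, cost)
```

Because plain FCM is sensitive to its initial centers and can settle in a local minimum of the objective, pairing it with a population-based optimizer is a common design choice; the sketch leaves that part out.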