A Feature Selection Algorithm in Online Stream Dataset Based on Multivariate Mutual Information
Subject Areas: Electrical and Computer Engineering
Maryam Rahmaninia, Parham Moradi
Keywords: feature selection, online stream dataset, mutual information, joint random variables
Abstract:
Today, in many real-world applications such as social networks, we are faced with data streams in which new data appears at every moment. Since the efficiency of most data mining algorithms decreases as the dimensionality of the data grows, analyzing such data has become one of the most important recent issues. Online stream feature selection is an effective approach that aims to remove redundant features and keep relevant ones, which reduces the size of the data and improves the accuracy of online data mining methods. Online stream feature selection methods face several critical issues, including the unavailability of the entire feature set before the algorithm starts, scalability, stability, classification accuracy, and the size of the selected feature set. So far, existing methods have only been able to address a few of these issues simultaneously. To this end, in this paper we present an online feature selection method called MMIOSFS that provides a better tradeoff between these challenges using mutual information. In the proposed method, the feature set is first mapped to a single new feature using the joint random variables technique; then the mutual information between this new feature and the class label is computed as the degree of relationship between the feature set and the class. The efficiency of the proposed method was compared to several online feature selection algorithms from different categories. The results show that the proposed method usually achieves a better tradeoff between the mentioned challenges.
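The following is a minimal sketch, not the authors' implementation, of the core idea stated in the abstract: a set of discrete features is collapsed into one joint random variable, and the mutual information between that joint variable and the class label is used as the relevance score of the feature subset. It assumes the features are already discretized, uses scikit-learn's mutual_info_score for the estimate, and omits the streaming selection logic of MMIOSFS itself; the function names are illustrative only.

import numpy as np
from sklearn.metrics import mutual_info_score

def joint_variable(features):
    # features: array of discrete values, shape (n_samples, n_features).
    # Each distinct combination of feature values becomes one symbol of
    # the joint random variable (encoded as an integer index).
    _, joint = np.unique(features, axis=0, return_inverse=True)
    return joint

def subset_relevance(features, labels):
    # Mutual information between the joint variable of the subset and the
    # class label, i.e. the degree of relationship of the subset as a whole.
    return mutual_info_score(labels, joint_variable(features))

# Illustrative usage on a toy discrete dataset:
X = np.array([[0, 1], [1, 1], [0, 0], [1, 0]])
y = np.array([0, 1, 0, 1])
print(subset_relevance(X, y))

Scoring the subset through its joint variable, rather than summing per-feature scores, is what lets such a measure capture interactions among features; the price is that the number of joint symbols grows with the subset size, so in practice the estimate is applied to small candidate subsets.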