    • List of Articles: Feature selection, online stream dataset, mutual information, joint random variables

      • Open Access Article

        1 - A Feature Selection Algorithm in Online Stream Dataset Based on Multivariate Mutual Information
        Maryam Rahmaninia, Parham Moradi
        Today, many real-world applications, such as social networks, produce data streams in which new data appears at every moment. Since the efficiency of most data mining algorithms decreases as data dimensionality increases, analyzing such data has recently become one of the most important issues. Online stream feature selection is an effective approach that removes redundant features and keeps relevant ones, which reduces the size of the data and improves the accuracy of online data mining methods. Online stream feature selection methods face several critical issues: unavailability of the entire feature set before the algorithm starts, scalability, stability, classification accuracy, and the size of the selected feature set. So far, existing methods have been able to address only a few of these issues simultaneously. To this end, in this paper we present an online feature selection method called MMIOSFS that provides a better tradeoff between these challenges using mutual information. In the proposed method, the feature set is first mapped to a new feature using the joint random variables technique; the mutual information between this new feature and the class label is then computed as the degree of relationship between the feature set and the class. The efficiency of the proposed method was compared with several online feature selection algorithms from different categories. The results show that the proposed method usually achieves a better tradeoff between the mentioned challenges.
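The joint random variable idea described in the abstract can be illustrated with a minimal sketch. This is not the MMIOSFS algorithm itself, whose details the abstract does not give. It only assumes discrete features, treats each sample's tuple of feature values as one value of a new joint variable, and estimates mutual information with the class label from empirical frequencies. The function names `joint_variable` and `mutual_information` and the toy data are hypothetical, chosen for illustration.

```python
from collections import Counter
from math import log2


def joint_variable(columns):
    """Map several discrete feature columns to one joint random variable.

    Each sample's value is the tuple of its values across the columns.
    (Illustrative assumption: features are already discrete.)
    """
    return list(zip(*columns))


def mutual_information(x, y):
    """Empirical mutual information I(X; Y) in bits for two discrete sequences."""
    n = len(x)
    count_x = Counter(x)
    count_y = Counter(y)
    count_xy = Counter(zip(x, y))
    # Sum p(a, b) * log2( p(a, b) / (p(a) * p(b)) ) over observed pairs.
    return sum(
        (c / n) * log2(c * n / (count_x[a] * count_y[b]))
        for (a, b), c in count_xy.items()
    )


# Toy data (hypothetical): the label is the XOR of the two features.
f1 = [0, 0, 1, 1]
f2 = [0, 1, 0, 1]
labels = [0, 1, 1, 0]

mi_single = mutual_information(f1, labels)
mi_joint = mutual_information(joint_variable([f1, f2]), labels)
print(mi_single)  # 0.0 bits: f1 alone carries no information about the label
print(mi_joint)   # 1.0 bit: the joint variable determines the label
```

In this toy case each feature on its own has zero mutual information with the label, while the joint variable recovers the full one bit, which is the kind of multivariate dependence that a pairwise relevance score would miss.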