    • List of Articles Ensemble Classifier

      • Open Access Article

        1 - An Experimental Study on Performance of Text Representation Models for Sentiment Analysis
        Sajjad Jahanbakhsh Gudakahriz, Amir Masoud Eftekhari Moghaddam, Fariborz Mahmoudi
        Sentiment analysis in social networks has been an active research field since 2000 and is highly useful in the decision-making processes of various domains and applications. The goal of sentiment analysis is to analyze opinion texts posted on social networks and other web-based resources and to extract the necessary information from them. Data collected from social networks and websites have no structured format, and this lack of structure is the main challenge in handling such data. To overcome it, the texts must first be represented by a text representation model so that their content can be analyzed. Research on text modeling started a few decades ago, and various models have been proposed since then. The main purpose of this paper is to evaluate the efficiency and effectiveness of a number of common, well-known text representation models for sentiment analysis. The evaluation uses these models for sentiment classification with ensemble methods: after preprocessing, the texts are represented by the selected models and classified by an ensemble classifier. The selected models are TF-IDF, LSA, Word2Vec, and Doc2Vec, and the evaluation measures are Accuracy, Precision, Recall, and F-Measure. The results show that, in general, the Doc2Vec model performs better than the other models in sentiment analysis, with a best accuracy of 0.72.
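        The four evaluation measures named in this abstract can all be derived from the counts of a confusion matrix. The following is a minimal sketch, assuming binary sentiment labels (1 = positive, 0 = negative); the function name `binary_metrics` and the toy labels are hypothetical and are not taken from the paper.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F-measure for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators when a class is never predicted.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Hypothetical toy labels, for illustration only (not data from the study).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
scores = binary_metrics(y_true, y_pred)
print(scores)
```

        The F-measure here is the balanced F1 variant; the abstract does not state which variant or averaging scheme the authors used.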
      • Open Access Article

        2 - An Effective Method of Feature Selection in Persian Text for Improving the Accuracy of Detecting Request in Persian Messages on Telegram
        Zahra Khalifeh Zadeh, Mohammad Ali Zare Chahooki
        In recent years, the volume of data received from social media has increased exponentially. Social media have become valuable sources of information for many analysts and for businesses seeking to grow. Automatic document classification is an essential step in extracting knowledge from these sources. In automatic text classification, words are treated as a set of features. Selecting useful features from each text reduces the size of the feature vector and improves classification performance. Many algorithms have been applied to the automatic classification of text. Although the methods proposed for other languages are applicable and comparable, classification and feature selection in Persian text have not been studied sufficiently. The present research is conducted on Persian text, and the introduction of a Persian dataset is part of its contribution. This article presents an innovative approach to improving the performance of Persian text classification. The authors extracted 85,000 Persian messages from the Idekav-system, which is a Telegram search engine. The new idea for processing and classifying this textual data is to expand the feature vector by adding selected features using the most widely used feature selection methods based on local and global filters. The new feature vector is then filtered by a secondary feature selection phase, which selects the more appropriate features among those added in the first step to enhance the effect of applying wrapper methods on classification performance. In the third step, combined filter-based methods and a combination of the results of different learning algorithms are used to achieve higher accuracy. After the three selection stages, the proposed method increased accuracy up to 0.945 and reduced training time and computation on the Persian dataset.
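        The abstract does not name the specific filters it combines. As an illustration of what a filter-based feature selection step can look like, the sketch below scores terms with the chi-square statistic, a commonly used filter; this choice is an assumption and not the authors' method, and the function names and toy messages are hypothetical.

```python
from collections import Counter


def chi_square_scores(docs, labels, target):
    """Score each term by its chi-square statistic against one target class."""
    n = len(docs)
    n_target = sum(1 for y in labels if y == target)
    df_all = Counter()     # documents containing the term
    df_target = Counter()  # target-class documents containing the term
    for tokens, y in zip(docs, labels):
        for term in set(tokens):
            df_all[term] += 1
            if y == target:
                df_target[term] += 1

    scores = {}
    for term, df in df_all.items():
        a = df_target[term]   # target docs with the term
        b = df - a            # other docs with the term
        c = n_target - a      # target docs without the term
        d = n - n_target - b  # other docs without the term
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def top_k_terms(scores, k):
    """Return the k highest-scoring terms, ties broken alphabetically."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Hypothetical tokenized messages (English stand-ins for illustration).
docs = [
    ["please", "send", "price"],
    ["send", "address", "please"],
    ["nice", "weather", "today"],
    ["weather", "is", "nice"],
]
labels = ["request", "request", "other", "other"]
scores = chi_square_scores(docs, labels, target="request")
selected = top_k_terms(scores, 4)
print(selected)
```

        Chi-square is symmetric, so it ranks terms that signal the absence of a request as highly as terms that signal its presence; both kinds are useful to a downstream classifier.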
      • Open Access Article

        3 - Semi-Supervised Ensemble Using Confidence Based Selection Metric in Non-Stationary Data Streams
        Shirin Khezri, Jafar Tanha, Ali Ahmadi, Arash Sharifi
        In this article, we propose a novel Semi-Supervised Ensemble classifier using a Confidence Based Selection metric, named SSE-CBS. The proposed approach uses both labeled and unlabeled data and aims to react to different types of concept drift. SSE-CBS combines an accuracy-based weighting mechanism known from block-based ensembles with the incremental nature of the Hoeffding Tree. The proposed algorithm is experimentally compared with state-of-the-art stream methods, including supervised, semi-supervised, single-classifier, and block-based ensemble methods, in different drift scenarios. Among the compared algorithms, SSE-CBS outperforms the other semi-supervised ensemble approaches. Experimental results show that SSE-CBS is suitable for scenarios involving many types of drift with limited labeled data.
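        The accuracy-based weighting mechanism mentioned above can be illustrated generically. The sketch below is not SSE-CBS: it omits the Hoeffding Tree members, the confidence-based selection metric, and the use of unlabeled data. It only shows, under those simplifying assumptions, how ensemble members might be weighted by their accuracy on the most recent labeled block and then combined by weighted vote. All names and the toy data are hypothetical.

```python
from collections import defaultdict


def block_accuracy(model, block):
    """Fraction of (x, y) pairs in the block that the model labels correctly."""
    if not block:
        return 0.0
    correct = sum(1 for x, y in block if model(x) == y)
    return correct / len(block)


def weighted_vote(models, weights, x):
    """Return the label with the largest total weight among member votes."""
    totals = defaultdict(float)
    for model, weight in zip(models, weights):
        totals[model(x)] += weight
    # Iterating over sorted labels makes tie-breaking deterministic.
    return max(sorted(totals), key=totals.get)


# Hypothetical ensemble members: simple threshold rules on a single feature.
def strict_rule(x):
    return int(x > 0.5)


def loose_rule(x):
    return int(x > 0.2)


def always_zero(x):
    return 0


models = [strict_rule, loose_rule, always_zero]

# Most recent labeled block from the stream (toy data, illustration only).
latest_block = [(0.1, 0), (0.3, 0), (0.6, 1), (0.9, 1)]

# Re-weight every member by its accuracy on the latest block.
weights = [block_accuracy(m, latest_block) for m in models]
print(weights)
print(weighted_vote(models, weights, 0.3), weighted_vote(models, weights, 0.7))
```

        Recomputing the weights on each new block lets members that fit the current concept dominate the vote, which is how block-based ensembles typically adapt after a drift.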