• List of Articles: Ensemble Classifier

      • Open Access Article

        1 - An Experimental Study on Performance of Text Representation Models for Sentiment Analysis
        Sajjad Jahanbakhsh Gudakahriz Amir Masoud Eftekhari Moghaddam Fariborz Mahmoudi
        Sentiment analysis in social networks has been an active research field since 2000 and is highly useful in decision-making across various domains and applications. The goal of sentiment analysis is to analyze opinion texts posted on social networks and other web-based resources and to extract the necessary information from them. The data collected from social networks and websites is unstructured, and this lack of structure is the main challenge in handling it. To overcome this challenge, the texts must first be represented in a text representation model; the required analysis can then be carried out. Research on text modeling started a few decades ago, and various models have since been proposed. The main purpose of this paper is to evaluate the efficiency and effectiveness of a number of common, well-known text representation models for sentiment analysis. The evaluation uses these models for sentiment classification with ensemble methods: after preprocessing, the texts are represented by the selected models and classified by an ensemble classifier. The selected models are TF-IDF, LSA, Word2Vec, and Doc2Vec, and the evaluation measures are Accuracy, Precision, Recall, and F-Measure. The results show that, in general, the Doc2Vec model performs better than the other models in sentiment analysis, with a best accuracy of 0.72.
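To make the idea of a text representation model concrete, the following is a minimal sketch of TF-IDF, the simplest of the four models named in the abstract. It is not the authors' implementation: the function name `tf_idf`, the toy documents, and the specific weighting variant (relative term frequency times the natural log of inverse document frequency) are assumptions chosen for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical, already-preprocessed opinion texts.
docs = [
    "good phone great battery".split(),
    "bad phone poor battery".split(),
    "great camera good screen".split(),
]
vectors = tf_idf(docs)
```

Terms that occur in fewer documents, such as "camera" here, receive a larger weight than terms shared by several documents, such as "phone". The resulting vectors are what a downstream classifier, including an ensemble, would consume.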
      • Open Access Article

        2 - An Effective Method of Feature Selection in Persian Text for Improving the Accuracy of Detecting Request in Persian Messages on Telegram
        Zahra Khalifeh Zadeh Mohammad Ali Zare Chahooki
        In recent years, the volume of data received from social media has increased exponentially, and social media have become a valuable source of information for analysts and for businesses seeking to grow. Automatic document classification is an essential step in extracting knowledge from these sources. In automatic text classification, words are treated as a set of features. Selecting useful features from each text reduces the size of the feature vector and improves classification performance. Many algorithms have been applied to automatic text classification. Although the methods proposed for other languages are applicable and comparable, classification and feature selection in Persian text have not been studied sufficiently. The present research is conducted on Persian text, and the introduction of a Persian dataset is part of its contribution. This article presents an approach to improving the performance of Persian text classification. The authors extracted 85,000 Persian messages from the Idekav-system, a Telegram search engine. The idea proposed for processing and classifying this textual data is to expand the feature vector by adding selected features, using the most widely used feature selection methods based on local and global filters. The new feature vector is then filtered by a secondary feature selection phase, which selects the more appropriate features among those added in the first step in order to enhance the effect of applying wrapper methods on classification performance. In the third step, combined filter-based methods and a combination of the results of different learning algorithms are used to achieve higher accuracy. At the end of the three selection stages, the proposed method increased accuracy up to 0.945 and reduced training time and computation on the Persian dataset.
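As a rough illustration of filter-based feature selection, the sketch below scores each term with the chi-square statistic and keeps the top-scoring terms. The abstract does not say which filters the authors used, so chi-square is an assumption here, as are the function names `chi_square_scores` and `select_top_k` and the toy English messages standing in for the Persian dataset.

```python
from collections import Counter


def chi_square_scores(docs, labels, positive):
    """Score each term by the chi-square statistic of its 2x2 term/class table."""
    n = len(docs)
    n_pos = sum(1 for y in labels if y == positive)
    n_neg = n - n_pos
    in_pos, in_neg = Counter(), Counter()
    for doc, y in zip(docs, labels):
        target = in_pos if y == positive else in_neg
        target.update(set(doc))  # count each term once per document
    scores = {}
    for term in set(in_pos) | set(in_neg):
        a = in_pos[term]   # positive documents containing the term
        b = in_neg[term]   # negative documents containing the term
        c = n_pos - a      # positive documents without the term
        d = n_neg - b      # negative documents without the term
        denom = (a + c) * (b + d) * (a + b) * (c + d)
        scores[term] = n * (a * d - c * b) ** 2 / denom if denom else 0.0
    return scores


def select_top_k(scores, k):
    """Return the k highest-scoring terms, breaking ties alphabetically."""
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Hypothetical messages: two requests and two non-requests.
docs = [
    "please send price list".split(),
    "please share contact".split(),
    "meeting was great".split(),
    "great weather today".split(),
]
labels = ["request", "request", "other", "other"]
scores = chi_square_scores(docs, labels, positive="request")
top = select_top_k(scores, 2)
```

On this toy data the two terms that perfectly separate the classes, "please" and "great", receive the highest score. A multi-stage scheme like the one described in the abstract would apply further selection passes to the output of a step like this.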
      • Open Access Article

        3 - A Semi-Supervised Ensemble Algorithm Using a Selection Criterion Based on a Confidence Score Threshold in Non-Stationary Data Streams
        Shirin Khezri Jafar Tanha Ali Ahmadi Arash Sharifi
        This paper presents a semi-supervised ensemble classification algorithm, called SSE-CBS, that uses a selection criterion based on a confidence score threshold in non-stationary environments. The proposed approach uses both labeled and unlabeled data to cope with various types of concept drift in data streams. SSE-CBS combines the well-known accuracy-based weighting mechanism of block-based ensemble algorithms with the incremental nature of the Hoeffding tree algorithm. The proposed algorithm is compared experimentally with 8 state-of-the-art approaches, including supervised, semi-supervised, single-classifier, and block-based ensemble classification models, on a variety of datasets. According to the experimental results, SSE-CBS achieves the best average classification accuracy among the semi-supervised approaches and performs well in environments with concept drift and limited labeled data.
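The two mechanisms named in the abstract, accuracy-based weighting of ensemble members on a labeled block and confidence-threshold selection of unlabeled instances, can be sketched roughly as follows. This is not SSE-CBS itself: the members are hypothetical threshold rules rather than Hoeffding trees, the confidence score is assumed to be the winning label's share of the total vote weight, and all function names and the threshold value are illustrative.

```python
from collections import defaultdict


def block_accuracy(model, block):
    """Fraction of (x, y) pairs in the labeled block that the model gets right."""
    return sum(model(x) == y for x, y in block) / len(block)


def weighted_vote(models, weights, x):
    """Return (label, confidence); confidence is the winner's share of total weight."""
    votes = defaultdict(float)
    for model, w in zip(models, weights):
        votes[model(x)] += w
    total = sum(votes.values())
    label = max(votes, key=votes.get)
    return label, (votes[label] / total if total else 0.0)


def pseudo_label(models, weights, unlabeled, threshold):
    """Keep only unlabeled instances whose vote confidence reaches the threshold."""
    selected = []
    for x in unlabeled:
        label, confidence = weighted_vote(models, weights, x)
        if confidence >= threshold:
            selected.append((x, label))
    return selected


# Hypothetical ensemble members: threshold rules on one numeric feature.
models = [lambda x: int(x > 0.3), lambda x: int(x > 0.5), lambda x: int(x > 0.7)]

# Weight each member by its accuracy on the most recent labeled block.
labeled_block = [(0.1, 0), (0.4, 0), (0.6, 1), (0.9, 1)]
weights = [block_accuracy(m, labeled_block) for m in models]

# Select confidently classified unlabeled instances for later training.
selected = pseudo_label(models, weights, [0.2, 0.45, 0.65, 0.95], threshold=0.8)
```

In this toy run the members disagree on 0.45 and 0.65, so those instances fall below the threshold and are left out, while 0.2 and 0.95 are selected with their predicted labels. In a streaming setting the weights would be recomputed as each new labeled block arrives, which is how accuracy-weighted ensembles typically react to concept drift.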