    • List of Articles Concept drift

      • Open Access Article

        1 - Fast and accurate concept drift detection from event logs
        Mahdi Yaghoobi, Ali Sebti, Soheila Karbasi
        In organizations and large companies that use business process management systems (BPMSs), the process model can change because of upstream laws or market conditions, and BPMSs are flexible enough to accommodate these changes. The effects of these changes are recorded in storage devices and event logs, where they may appear suddenly or gradually. A change of season or the start of a new financial term can trigger such changes. This kind of change is called concept drift in the business process model. Timely detection and recognition of process concept drift can affect the decision making of managers and system administrators. Analyzing the event logs of a BPMS allows concept drift to be detected automatically. This paper presents an innovative method that introduces a modified distance function to identify concept drift. Experiments were performed on 72 datasets from the research literature, which included 648 concept drifts of 12 different types. The results show that the proposed method detects 98.18% of the drifts while being much faster than other state-of-the-art methods.
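The general idea of detecting drift by measuring a distance between adjacent windows of an event log can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify the modified distance function, so the sketch substitutes a plain total variation distance over directly-follows activity pairs, and the names `follows_profile`, `total_variation`, `detect_drifts`, `window`, and `threshold` are hypothetical.

```python
from collections import Counter

def follows_profile(traces):
    """Relative frequency of each directly-follows pair (a, b) in a window of traces."""
    counts = Counter()
    for trace in traces:
        counts.update(zip(trace, trace[1:]))
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()} if total else {}

def total_variation(p, q):
    """Total variation distance between two frequency profiles (0 = identical)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

def detect_drifts(traces, window=50, threshold=0.3):
    """Return trace indices where two adjacent windows differ by more than threshold.

    Assumption: a stand-in distance, not the paper's modified distance function.
    """
    drifts = []
    i = window
    while i + window <= len(traces):
        reference = follows_profile(traces[i - window:i])
        current = follows_profile(traces[i:i + window])
        if total_variation(reference, current) > threshold:
            drifts.append(i)
            # Jump ahead so the next reference window lies after the current window.
            i += 2 * window
        else:
            i += 1
    return drifts
```

With a log of 100 traces `("a", "b", "c")` followed by 100 traces `("a", "c", "b")`, the sketch reports a single drift shortly after the sliding window starts to overlap the change. The reported index lags the true change point, and both `window` and `threshold` would need tuning on real logs.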
      • Open Access Article

        2 - Incremental Opinion Mining Using Active Learning over a Stream of Documents
        F. Noorbehbahani
        Today, opinion mining is one of the most important applications of natural language processing, and it requires special methods to process documents because of the high volume of comments produced. Since users’ opinions on social networks and e-commerce websites constitute an evolving stream, applying a traditional non-incremental classification algorithm to opinion mining degrades the classification model as time passes. Moreover, because users’ comments are massive in volume, it is not possible to label enough comments to build training data for updating the learned model. Another issue in incremental opinion mining is concept drift, which must be supported to handle changing class distributions and evolving vocabulary. In this paper, a new incremental method for polarity detection is proposed that uses stream-based active learning to select the best documents to be labeled by experts and to update the classifier. The proposed method can detect and handle concept drift using limited labeled data without storing the documents. We compare our method with state-of-the-art incremental and non-incremental classification methods using credible datasets and standard evaluation measures. The evaluation results show the effectiveness of the proposed method for polarity detection of opinions.
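A minimal sketch of stream-based active learning for polarity detection is shown below. It is not the method from the paper: it assumes an incremental multinomial naive Bayes classifier, a simple uncertainty margin as the selection criterion, and a labeling budget, and it omits drift detection entirely. The names `OnlineNaiveBayes`, `run_stream`, `oracle`, `margin`, and `budget` are hypothetical, with `oracle` standing in for the human expert.

```python
import math
from collections import Counter

class OnlineNaiveBayes:
    """Incremental multinomial naive Bayes; keeps only counts, never the documents."""

    def __init__(self, classes=("neg", "pos")):
        self.classes = classes
        self.doc_counts = Counter()
        self.word_counts = {c: Counter() for c in classes}
        self.vocab = set()

    def update(self, words, label):
        self.doc_counts[label] += 1
        self.word_counts[label].update(words)
        self.vocab.update(words)

    def predict_proba(self, words):
        n_docs = sum(self.doc_counts.values())
        v = len(self.vocab) + 1  # Laplace smoothing, with one slot for unseen words
        log_scores = {}
        for c in self.classes:
            total = sum(self.word_counts[c].values())
            score = math.log((self.doc_counts[c] + 1) / (n_docs + len(self.classes)))
            for w in words:
                score += math.log((self.word_counts[c][w] + 1) / (total + v))
            log_scores[c] = score
        top = max(log_scores.values())
        exp = {c: math.exp(s - top) for c, s in log_scores.items()}
        z = sum(exp.values())
        return {c: e / z for c, e in exp.items()}

def run_stream(stream, oracle, margin=0.2, budget=0.5):
    """Ask the oracle for a label only when the model is uncertain and budget remains."""
    model = OnlineNaiveBayes()
    queried = 0
    for seen, words in enumerate(stream, start=1):
        proba = sorted(model.predict_proba(words).values(), reverse=True)
        uncertain = (proba[0] - proba[1]) < margin
        if uncertain and queried / seen < budget:
            model.update(words, oracle(words))  # the expert labels this document
            queried += 1
    return model, queried
```

Each document is seen once and then discarded, so memory grows only with the vocabulary. A drift-aware variant would additionally need to discount or reset old counts when the class distribution or vocabulary shifts.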
      • Open Access Article

        3 - Semi-Supervised Ensemble Using Confidence Based Selection Metric in Non-Stationary Data Streams
        Shirin Khezri, Jafar Tanha, Ali Ahmadi, Arash Sharifi
        In this article, we propose a novel Semi-Supervised Ensemble classifier using a Confidence Based Selection metric, named SSE-CBS. The proposed approach uses both labeled and unlabeled data and aims to react to different types of concept drift. SSE-CBS combines an accuracy-based weighting mechanism known from block-based ensembles with the incremental nature of the Hoeffding Tree. The proposed algorithm is experimentally compared with state-of-the-art stream methods, including supervised, semi-supervised, single classifiers, and block-based ensembles, in different drift scenarios. Among the compared algorithms, SSE-CBS outperforms the other semi-supervised ensemble approaches. Experimental results show that SSE-CBS is suitable for scenarios involving many types of drift with limited labeled data.
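The accuracy-based weighting used by block-based ensembles can be illustrated with a short sketch. It assumes that each ensemble member is a plain prediction function and that weights are simply each member's accuracy on the most recent labeled block. It does not reproduce SSE-CBS itself: the confidence-based selection of unlabeled data and the Hoeffding Tree learners are left out, and the function names are hypothetical.

```python
from collections import defaultdict

def block_accuracy(member, block):
    """Fraction of (x, y) examples in the block that the member predicts correctly."""
    if not block:
        return 0.0
    return sum(1 for x, y in block if member(x) == y) / len(block)

def reweight(members, block):
    """Assumed weighting rule: weight = accuracy on the newest labeled block."""
    return [block_accuracy(m, block) for m in members]

def weighted_vote(members, weights, x):
    """Predict the label with the largest total weight among the members' votes."""
    votes = defaultdict(float)
    for member, weight in zip(members, weights):
        votes[member(x)] += weight
    return max(votes, key=votes.get)
```

After a drift, members trained on the old concept score poorly on the newest block, so their votes count for less without being deleted outright.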
      • Open Access Article

        4 - Deep Extreme Learning Machine: A Combined Incremental Learning Approach for Data Stream Classification
        Javad Hamidzadeh, Mona Moradi
        Streaming data refers to data that is continuously generated in the form of fast, high-volume streams. This kind of data often arises in evolving environments where a change may affect the data distribution. Because data streams have a wide range of real-world applications, improving the performance of streaming analytics has become a hot topic for researchers. The proposed method integrates online ensemble learning into the extreme learning machine to improve data stream classification performance. The proposed incremental method does not need access to the samples of previous blocks. Also, following the AdaBoost approach, it can react to concept drift through its component weighting and component update mechanisms. The proposed method can adapt to changes, and it retains highly accurate classifiers to sustain its performance. The experiments were done on benchmark datasets. The proposed method achieves 0.90% average specificity, 0.69% average sensitivity, and 0.87% average accuracy, indicating its superiority over two competing methods.
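An AdaBoost-style component weighting and update step could look roughly like the sketch below. It assumes binary labels, uses the classic AdaBoost vote weight 0.5 * ln((1 - error) / error) computed on the newest block, and replaces the worst component when it is no better than chance. The paper's extreme learning machine components are not modeled here; `train_new` is a hypothetical callback that builds a new component from the current block only.

```python
import math

def adaboost_weight(error, eps=1e-9):
    """Classic AdaBoost vote weight: large for low error, zero at error 0.5."""
    error = min(max(error, eps), 1 - eps)  # clamp to avoid log(0) and division by zero
    return 0.5 * math.log((1 - error) / error)

def block_error(component, block):
    """Fraction of (x, y) examples in the block that the component misclassifies."""
    return sum(1 for x, y in block if component(x) != y) / len(block)

def update_ensemble(components, block, train_new):
    """Replace the worst component if it is no better than chance, then reweight all.

    Only the current block is used, so earlier blocks need not be stored.
    """
    components = list(components)
    errors = [block_error(c, block) for c in components]
    worst = max(range(len(components)), key=lambda i: errors[i])
    if errors[worst] >= 0.5:
        components[worst] = train_new(block)  # hypothetical trainer on the newest block
        errors[worst] = block_error(components[worst], block)
    weights = [adaboost_weight(e) for e in errors]
    return components, weights
```

Calling `update_ensemble` once per incoming block keeps the ensemble size fixed while shifting weight toward the components that fit the current concept.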