• List of Articles Text mining

      • Open Access Article

        1 - Proposing a Model for Extracting Information from Textual Documents, Based on Text Mining in E-learning
        Somayeh Ahari
        As computer networks become the backbone of science and the economy, enormous quantities of documents become available. Text mining techniques are used to extract useful information from this textual data. Text mining has become an important research area that discovers previously unknown information, facts, or new hypotheses by automatically extracting information from different written documents. It aims to disclose concealed information by means of methods that, on the one hand, can cope with the large number of words and structures in natural language and, on the other hand, can handle vagueness, uncertainty, and fuzziness. Text mining, also referred to as text data mining or knowledge discovery from textual databases and roughly equivalent to text analytics, is the process of deriving high-quality information, in the form of patterns or knowledge, from text documents. This research presents a survey of text mining techniques and applications in e-learning. Relevant studies in the field of e-learning were classified, and the related problems and solutions were then extracted. The paper first defines text mining, then describes the text mining process and its applications in the e-learning domain. Text mining techniques are then introduced, and the use of each in e-learning is considered. Finally, a model for information extraction by text mining techniques in the e-learning domain is proposed.
      • Open Access Article

        2 - Discover product defect reports from the text of users' online comments
        Narges Nematifard Muharram Mansoorizadeh Mahdi Sakhaei Nia
        With the development of Web 2.0 and social networks, customers and users can share their opinions about different products. These opinions can be used as a valuable resource for determining a product's position and its success in marketing. Extracting the reported shortcomings from the large volume of comments generated by users is one of the major problems in this field of research. By comparing the products of different manufacturers, customers and consumers express the strengths and weaknesses of the products in the form of positive and negative comments. Classifying comments based on positive and negative sentiment words in the text, without reference to documents containing a defect report, does not lead to accurate results, because defects are not reported solely in negative comments. A customer may feel positive about a product and still report a defect in their comment. Therefore, another challenge in this research field is the correct and accurate classification of opinions. To address these problems and challenges, this article provides an effective and efficient way to extract comments containing product defect reports from users' online comments. For this purpose, random forest classifiers were used to identify defect reports, and the unsupervised topic modeling technique latent Dirichlet allocation (LDA) was used to provide a summary of the defect reports. Data from the Amazon website were used to analyze and evaluate the proposed method. The results showed that random forest has acceptable performance for defect report detection even with a small amount of training data. The results and outputs extracted from documents containing defect reports, including a summary of the defect reports to facilitate manufacturers' decision making, automatic discovery of defect report patterns in the text, and identification of the product aspects with the most reported defects, demonstrate the capability of the latent Dirichlet allocation method.
      • Open Access Article

        3 - Using web analytics in forecasting the stock price of chemical products group in the stock exchange
        Amir Daee Omid Mahdi Ebadati E. Keyvan Borna
        Forecasting markets, including stock markets, has been attractive to researchers and investors because of their high volume of transactions and liquidity. The ability to predict prices makes it possible to achieve higher returns by reducing risk and avoiding financial losses. News plays an important role in the process of assessing current stock prices. The development of data mining methods, computational intelligence, and machine learning algorithms has led to the creation of new prediction models. The purpose of this study is to store news from news agencies and to use text mining methods and the support vector machine algorithm to predict the next day's stock price. For this purpose, the news published by 17 news agencies has been stored and categorized using a thematic language in Phoenician. Then, using text mining methods and the support vector machine algorithm with different kernels, the stock prices of the chemical products group in the stock exchange are forecast. The study uses 300,000 news items in political and economic categories and the stock prices of 25 selected companies over 122 trading days in the period from November to March 1997. The results show that the support vector machine model with a linear kernel predicts prices with an average accuracy of 83%. Using nonlinear kernels and the quadratic equation of the support vector machine, the average prediction accuracy rises to 85%, while other kernels show poorer results.
      • Open Access Article

        4 - An Effective Method of Feature Selection in Persian Text for Improving the Accuracy of Detecting Request in Persian Messages on Telegram
        Zahra Khalifeh Zadeh Mohammad Ali Zare Chahooki
        In recent years, the volume of data received from social media has increased exponentially. Social media have become valuable sources of information for many analysts and for businesses seeking to expand. Automatic document classification is an essential step in extracting knowledge from these sources of information. In automatic text classification, words are treated as a set of features. Selecting useful features from each text reduces the size of the feature vector and improves classification performance. Many algorithms have been applied to the automatic classification of text. Although the methods proposed for other languages are applicable and comparable, classification and feature selection in Persian text have not been studied sufficiently. The present research is conducted on Persian text, and the introduction of a Persian dataset is part of its contribution. This article presents an innovative approach to improving the performance of Persian text classification. The authors extracted 85,000 Persian messages from the Idekav-system, which is a Telegram search engine. The new idea presented for processing and classifying this textual data is to expand the feature vector by adding selected features, using the most widely used feature selection methods based on local and global filters. The new feature vector is then filtered by applying a secondary feature selection. This secondary phase selects the more appropriate features among those added in the first step, to enhance the effect of applying wrapper methods on classification performance. In the third step, combined filter-based methods and a combination of the results of different learning algorithms are used to achieve higher accuracy. At the end of the three selection stages, the proposed method increased accuracy up to 0.945 and reduced training time and computation on the Persian dataset.
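The filter-based selection step described above can be illustrated with a small sketch. The abstract does not name the specific filters, so this example assumes the chi-square statistic as the filter, treats the per-class score as the "local" score, and takes the maximum over classes as the "global" score. The function names and the toy English messages are hypothetical and are not taken from the paper or its dataset.

```python
from collections import Counter


def chi_square(n11, n10, n01, n00):
    """Chi-square statistic for a 2x2 term/class contingency table.

    n11: documents in the class that contain the term
    n10: documents outside the class that contain the term
    n01: documents in the class that lack the term
    n00: documents outside the class that lack the term
    """
    n = n11 + n10 + n01 + n00
    denom = (n11 + n01) * (n10 + n00) * (n11 + n10) * (n01 + n00)
    if denom == 0:
        return 0.0
    return n * (n11 * n00 - n10 * n01) ** 2 / denom


def select_features(docs, labels, k):
    """Rank terms by their best per-class chi-square score and keep the top k."""
    classes = sorted(set(labels))
    class_size = Counter(labels)
    doc_sets = [set(d.lower().split()) for d in docs]
    vocab = sorted(set().union(*doc_sets))
    scores = {}
    for term in vocab:
        best = 0.0
        for c in classes:
            # Local score: association between this term and class c.
            n11 = sum(1 for s, y in zip(doc_sets, labels) if term in s and y == c)
            n10 = sum(1 for s, y in zip(doc_sets, labels) if term in s and y != c)
            n01 = class_size[c] - n11
            n00 = len(docs) - class_size[c] - n10
            best = max(best, chi_square(n11, n10, n01, n00))
        # Global score: the maximum local score over all classes.
        scores[term] = best
    ranked = sorted(vocab, key=lambda t: (-scores[t], t))
    return ranked[:k], scores


# Hypothetical toy messages: two requests and two non-requests.
docs = ["please send price", "please send catalog",
        "nice weather today", "good weather today"]
labels = ["request", "request", "other", "other"]
top_terms, scores = select_features(docs, labels, k=4)
print(top_terms)
```

Terms that occur in every message of one class and in none of the other receive the highest score, while terms that appear only once rank lower and would be dropped from the feature vector.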
      • Open Access Article

        5 - “Technology Watch” via “Information Technology”
        Kiyarash Jahanpour
        Information is power, but knowledge is more powerful. The information in patents and papers is a good source of codified knowledge. Every day, more businesses use information from patents (as a main indicator of technology) and papers (as a principal indicator of science) to see what products and systems are appearing around the globe. In an era of rapidly expanding digital content, the overwhelming amount of data available on the web and the high speed of science and technology (S&T) progress make it difficult for experts to extract useful knowledge without powerful tools, and they need new ways of reviewing and managing vast quantities of textual information. “Technology watch” (TW) is a collective, voluntary process through which companies actively work with information. Its purpose is to gather, process, and integrate technical information. TW has at least three objectives: facilitating the innovation process; providing easy and cost-effective access to information; and answering technological questions and problems. “Technology Watch” maintains awareness of global S&T at all levels through a combination of human-based overt and IT-based approaches for analyzing and tracking the myriad S&T outputs. Powerful IT-based techniques, such as text mining, now exist to identify and extract relevant data from the S&T literature and are especially useful in making sense of disjointed and disparate data. Regarded by many as the next wave of knowledge discovery, text mining has very high commercial value.
      • Open Access Article

        6 - Modeling of Electronic Word of Mouth Marketing Based on Text Mining of User Comments: A New Approach to Social Commerce
        Elham Ramezani Ali Rajabzadeh Ghatary Vahid Baradaran Maryam Shoar
        The purpose of this article is to present an electronic word-of-mouth marketing model in social commerce, based on text mining of user comments on sales sites. Because research in this field is new and the variables of this type of marketing model are derived by text mining user comments, this research is exploratory and developmental. The research method combines qualitative and quantitative approaches. Previous studies were reviewed, and 11 thousand online customer comments on digital products were collected, preprocessed, and analyzed. Frequently repeated words with a positive label were selected, and the variables of the electronic word-of-mouth marketing model were then extracted using the Word2vec algorithm. The extracted model was fitted based on the opinions of experts and users of internet sales sites in Iran, collected with a questionnaire and analyzed with least squares statistical tools. Because the statistical population is unlimited, the sample size for the second phase was estimated at 384 using Cochran's formula. The final model was reviewed and presented using a structural equation approach with SmartPLS software. The results show that customer interaction, message quality, and the customer's mental image have a positive and significant impact on the platform and channel attractiveness of electronic word-of-mouth marketing. In turn, these two variables have a positive and significant impact on customer behavior and the business brand. This model emphasizes new dimensions of the variables of the electronic word-of-mouth marketing model that can be helpful for business owners and marketers.
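The sample size of 384 reported in this abstract matches what Cochran's formula for an unlimited population gives under the conventional defaults. The sketch below assumes a 95% confidence level (z = 1.96), maximum variability (p = 0.5), and a 5% margin of error. The abstract does not state these values, so they are assumptions, and the function name is hypothetical.

```python
def cochran_sample_size(z: float = 1.96, p: float = 0.5, e: float = 0.05) -> float:
    """Cochran's sample size for an unlimited population: n0 = z^2 * p * (1 - p) / e^2.

    z: standard normal quantile for the confidence level (assumed 1.96 for 95%)
    p: estimated proportion of the attribute in the population (assumed 0.5)
    e: desired margin of error (assumed 0.05)
    """
    return (z ** 2) * p * (1.0 - p) / (e ** 2)


n0 = cochran_sample_size()
print(round(n0, 2))  # 384.16
```

Under these assumed inputs the formula gives 384.16, which corresponds to the 384 respondents cited above when the fractional part is dropped.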
      • Open Access Article

        7 - Search Engine for Structured Event Retrieval from News Sources
        A. Mirzaeiyan S. Aliakbary
        Analysis of published news content is one of the most important issues in information retrieval. Much research has been conducted on analyzing individual news articles, although most news events in the media are published as several related articles. Event detection is the task of discovering and grouping documents that describe the same event. It also helps users navigate news spaces by presenting an understandable structure of news events. With the rapid growth of online news, the need for search engines that retrieve news events is felt more than ever. The main assumption of event detection is that the words associated with an event appear in the same time windows and in similar documents. Accordingly, in this research, we propose a retrospective, feature-pivot method that clusters words into groups according to semantic and temporal features. We then use these words to produce a time frame and a human-readable text description. The proposed method is evaluated on the All The News dataset, which consists of two hundred thousand articles from 15 news sources in 2016, and is compared to other methods. The evaluation shows that the proposed method outperforms previous methods in terms of precision and recall.
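The feature-pivot idea in this abstract, grouping words whose occurrences share the same time windows, can be sketched as follows. This is only an illustration under simplifying assumptions: it uses temporal features alone (the semantic features mentioned in the abstract are omitted), measures similarity with the cosine of per-window document counts, and groups words by single-link clustering with a fixed threshold. The function names, the threshold, and the toy documents are hypothetical and do not reproduce the authors' method.

```python
from collections import defaultdict
from math import sqrt


def word_time_series(docs, n_windows):
    """Map each word to its per-window document counts.

    docs: list of (window_index, text) pairs.
    """
    series = defaultdict(lambda: [0] * n_windows)
    for window, text in docs:
        for word in set(text.lower().split()):
            series[word][window] += 1
    return dict(series)


def cosine(u, v):
    """Cosine similarity between two count vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster_words(series, threshold=0.9):
    """Single-link grouping of words whose temporal profiles are similar."""
    words = sorted(series)
    parent = {w: w for w in words}

    def find(w):
        # Union-find lookup with path halving.
        while parent[w] != w:
            parent[w] = parent[parent[w]]
            w = parent[w]
        return w

    for i, a in enumerate(words):
        for b in words[i + 1:]:
            if cosine(series[a], series[b]) >= threshold:
                parent[find(a)] = find(b)

    groups = defaultdict(list)
    for w in words:
        groups[find(w)].append(w)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical toy corpus: (time window, article text).
docs = [
    (0, "earthquake rescue"),
    (0, "earthquake damage rescue"),
    (2, "election vote"),
    (2, "election results vote"),
]
series = word_time_series(docs, n_windows=3)
clusters = cluster_words(series)
print(clusters)
```

In this toy corpus the words fall into two groups, one active only in window 0 and one only in window 2. The windows in which a group's words are active give a simple time frame for the corresponding event.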
      • Open Access Article

        8 - Improving Opinion Aspect Extraction Using Domain Knowledge and Term Graph
        Mohammadreza Shams Ahmad Baraani Mahdi Hashemi
        With the advancement of technology, analyzing and assessing user opinions, as well as determining the user's attitude toward various aspects, has become a challenging and crucial issue. Opinion mining is the process of recognizing people’s attitudes from textual comments at three different levels: document level, sentence level, and aspect level. Aspect-based opinion mining analyzes people’s viewpoints on various aspects of a subject. The most important subtask of aspect-based opinion mining is aspect extraction, which is addressed in this paper. Most previous methods require labeled data or extensive language resources to extract aspects from the corpus, which can be time-consuming and costly to prepare. In this paper, we propose an unsupervised approach to aspect extraction that uses topic modeling and the Word2vec technique to integrate semantic information and domain knowledge based on a term graph. The evaluation results show that the proposed method not only outperforms previous methods in terms of aspect extraction accuracy, but also automates all steps and thus eliminates the need for user intervention. Furthermore, because it does not rely on language resources, it can be used in a wide range of languages.