• List of Articles web crawler

      • Open Access Article

        1 - Using web analytics in forecasting the stock price of chemical products group in the stock exchange
        Amir Daee, Omid Mahdi Ebadati E., Keyvan Borna
        Forecasting markets, including stock markets, has long attracted researchers and investors because of the high volume of transactions and liquidity. The ability to predict prices makes it possible to achieve higher returns by reducing risk and avoiding financial losses. News plays an important role in assessing current stock prices. The development of data mining methods, computational intelligence and machine learning algorithms has led to new prediction models. The purpose of this study is to store news published by news agencies and to use text mining methods and a support vector machine algorithm to predict the next day's stock price. For this purpose, the news published by 17 news agencies was stored and categorized using a thematic language in Phoenician. Then, using text mining methods, the support vector machine algorithm and different kernels, the stock price of the chemical products group in the stock exchange is predicted. The study uses 300,000 news items in political and economic categories and the stock prices of 25 selected companies over 122 trading days in the period from November to March 1997. The results show that the support vector machine model with a linear kernel predicts prices with an average accuracy of 83%. Using nonlinear kernels and the quadratic equation of the support vector machine, the prediction accuracy increases to an average of 85%, while other kernels show poorer results.
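The abstract compares a linear kernel with a quadratic one but does not spell out either. The sketch below is only an illustration under stated assumptions: news items are represented as simple term counts, and "quadratic" is read as a degree-2 polynomial kernel. The function names and the constant `c` are hypothetical and not taken from the paper.

```python
from collections import Counter


def term_counts(text):
    """Bag-of-words term counts for one news item (assumed representation)."""
    return Counter(text.lower().split())


def linear_kernel(a, b):
    """Dot product of two term-count vectors over their shared terms."""
    return sum(a[t] * b[t] for t in a.keys() & b.keys())


def quadratic_kernel(a, b, c=1.0):
    """Degree-2 polynomial kernel, (a.b + c)^2; c is an assumed constant."""
    return (linear_kernel(a, b) + c) ** 2


x = term_counts("oil prices rise")
y = term_counts("oil exports rise sharply")
print(linear_kernel(x, y))
print(quadratic_kernel(x, y))
```

A kernel of this kind is what a support vector machine uses to compare two documents. The quadratic form additionally rewards pairs of shared terms, which is one plausible reason a nonlinear kernel could outperform the linear one.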
      • Open Access Article

        2 - A Customized Web Spider for Why-QA Pairs Corpus Preparation
        Manvi Breja
        Given the growth of research on improving the performance of non-factoid question answering systems, there is a need for an open-domain non-factoid dataset. Some datasets are available for non-factoid and even how-type questions, but no appropriate dataset is available that comprises only open-domain why-type questions and covers the full range of question formats. Why-questions play a significant role and are asked in every domain. They are more complex and more difficult for a system to answer automatically because they seek the reasoning behind the task involved. They are prevalent and asked out of curiosity by real users, so answering them depends on the users’ needs, knowledge, context and experience. The paper develops a customized web crawler for gathering a set of why-questions, irrespective of domain, from five popular question answering websites available on the Web, viz. Answers.com, Yahoo! Answers, Suzan Verberne’s open-source dataset, Quora and Ask.com. Along with the questions, their category, document title and appropriate answer candidates are also maintained in the dataset. The distribution of why-questions according to their type and category is also illustrated. To the best of our knowledge, it is the first sufficiently large dataset of 2000 open-domain why-questions with their relevant answers, and it will help stimulate further research on improving the performance of non-factoid why-type question answering systems (why-QAS).
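The abstract does not say how the crawler decides which scraped questions count as why-questions. As a rough sketch, assuming each crawled record is a dictionary with `question`, `category`, `title` and `answers` fields (hypothetical names chosen to mirror the fields listed above), a simple filter might look like this:

```python
import re

# Assumption: a why-question starts with the word "why" and ends with "?".
WHY_PATTERN = re.compile(r"^\s*why\b", re.IGNORECASE)


def is_why_question(text):
    """Return True for text that begins with 'why' and ends with '?'."""
    return bool(WHY_PATTERN.match(text)) and text.rstrip().endswith("?")


def collect_why_questions(records):
    """Keep why-question records, dropping case-insensitive duplicates."""
    seen = set()
    kept = []
    for rec in records:
        question = rec["question"].strip()
        key = question.lower()
        if is_why_question(question) and key not in seen:
            seen.add(key)
            kept.append(rec)
    return kept


sample = [
    {"question": "Why is the sky blue?", "category": "Science",
     "title": "Sky colour", "answers": ["Rayleigh scattering."]},
    {"question": "How do magnets work?", "category": "Science",
     "title": "Magnets", "answers": []},
    {"question": "why is the sky blue?", "category": "Science",
     "title": "Duplicate", "answers": []},
]
print([rec["title"] for rec in collect_why_questions(sample)])
```

A prefix test like this misses why-questions phrased differently (for example "What is the reason that ..."), so a real corpus would likely need broader patterns or manual review.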