    • List of Articles Information Retrieval

      • Open Access Article

        1 - The Surfer Model with a Hybrid Approach to Ranking the Web Pages
        Javad Paksima, Homa Khajeh
        Users seeking results relevant to their queries come first. To meet their needs, thousands of webpages must be ranked, which requires an efficient algorithm that places relevant webpages at the top ranks. Given the vast amount of information on the World Wide Web, designing a ranking algorithm that returns results relevant to the user's query is highly important in information retrieval. In this paper, a ranking method with a hybrid approach is proposed, which considers both the content of pages and the connections between them. The proposed model is a smart surfer that hops from the current page to one of the externally linked pages based on their content. The page to hop to is selected with a probability obtained using learning automata together with the content of and links to pages. For a transition to another page, the content of the pages linked to it is used. As the surfer moves among the pages, the PageRank score of each page is calculated recursively. Two standard datasets, TD2003 and TD2004, both subsets of the LETOR3 dataset, were used to evaluate the proposed method. The results indicated that the proposed approach outperforms other methods introduced in this area.
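        The surfer idea in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract derives hop probabilities from learning automata, whereas this sketch simply assumes a precomputed, hypothetical `relevance` score per page and hops in proportion to it. The function name, the `damping` factor, and the handling of pages without outlinks are all assumptions.

```python
def content_biased_pagerank(links, relevance, damping=0.85, iterations=50):
    """PageRank-style scores for a surfer that prefers relevant outlinks.

    links:     dict mapping each page to a list of pages it links to
               (every target must also be a key of `links`).
    relevance: dict mapping each page to a non-negative content score
               (hypothetical stand-in for the learned hop probability).
    """
    pages = sorted(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Teleportation share: the surfer jumps to a random page.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links[page]
            if not targets:
                # Page with no outlinks: spread its score evenly.
                for q in pages:
                    new_rank[q] += damping * rank[page] / n
                continue
            total = sum(relevance[t] for t in targets)
            for t in targets:
                # Hop probability proportional to target relevance;
                # fall back to uniform if all targets score zero.
                prob = relevance[t] / total if total > 0 else 1.0 / len(targets)
                new_rank[t] += damping * rank[page] * prob
        rank = new_rank
    return rank


links = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
relevance = {"a": 0.5, "b": 0.9, "c": 0.1}
scores = content_biased_pagerank(links, relevance)
```

        With uniform relevance scores this reduces to ordinary PageRank; here page `b` ends up ranked above `c` because the surfer on `a` prefers the more relevant outlink.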
      • Open Access Article

        2 - Proposing an Information Retrieval Model Using Interval Numbers
        Hooman Tahayori, Farzad Ghahremani
        The recent expansion of the web demands more capable information retrieval systems that address users' information needs more accurately. Weighting the words and terms in documents plays an important role in any information retrieval system. Various term weighting methods have been proposed; however, it is not straightforward to assert which one is more effective than the others. In this paper, we propose a method that calculates the weights of the terms in documents and queries as interval numbers. The interval numbers are derived by aggregating the crisp weights calculated by existing weighting methods. The proposed method calculates an interval number as the overall relevancy of each document to the given query. We discuss three approaches for ranking the interval relevancy numbers. In experiments on the Cranfield and Medline datasets, we study the effects of weight normalization and of variations of term and document frequency, and show that an appropriate selection of basic term weighting methods, in conjunction with their aggregation into an interval number, considerably improves information retrieval performance. Through appropriate selection of basic weighting methods, we reached a MAP of 0.43323 and 0.54580 on the two datasets, respectively. The results show that the proposed method outperforms both any single basic weighting method and other, more complicated existing weighting methods.
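        A small sketch may help make the interval idea concrete. The abstract does not state how crisp weights are aggregated or which three ranking approaches are used, so everything below is an assumption: the interval is taken as the [min, max] of the crisp weights, relevancy is an interval dot product over query terms, and documents are ordered by interval midpoint. All function names are hypothetical.

```python
def to_interval(crisp_weights):
    """Aggregate crisp weights of one term into an interval (assumed: min and max)."""
    return (min(crisp_weights), max(crisp_weights))


def interval_relevance(doc_weights, query_weights):
    """Interval dot product of non-negative interval weights.

    Both arguments map a term to a (low, high) interval. Because all
    weights are assumed non-negative, the product of two intervals is
    simply (low * low, high * high).
    """
    low = high = 0.0
    for term, (q_low, q_high) in query_weights.items():
        d_low, d_high = doc_weights.get(term, (0.0, 0.0))
        low += q_low * d_low
        high += q_high * d_high
    return (low, high)


def midpoint(interval):
    """One possible scalar key for ranking intervals (an assumption)."""
    return (interval[0] + interval[1]) / 2.0


# Crisp weights for the term "wing" from three hypothetical weighting schemes.
doc = {"wing": to_interval([0.2, 0.5, 0.4])}
query = {"wing": to_interval([1.0, 0.8])}
relevance = interval_relevance(doc, query)
score = midpoint(relevance)
```

        Ranking by midpoint discards the interval width; an ordering that also penalizes wide (uncertain) intervals would be an equally reasonable choice under these assumptions.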
      • Open Access Article

        3 - Search Engine for Structured Event Retrieval from News Sources
        A. Mirzaeiyan, S. Aliakbary
        Analysis of published news content is one of the most important issues in information retrieval. Much research has been conducted to analyze individual news articles, while most news events in the media are published in the form of several related articles. Event detection is the task of discovering and grouping documents that describe the same event. It also helps users navigate news spaces by presenting an understandable structure of news events. With the rapid growth of online news, the need for search engines that retrieve news events is felt more than ever. The main assumption of event detection is that the words associated with an event appear in the same time windows and in similar documents. Accordingly, in this research, we propose a retrospective, feature-pivot method that clusters words into groups according to semantic and temporal features. We then use these words to produce a time frame and a human-readable text description. The proposed method is evaluated on the All The News dataset, which consists of two hundred thousand articles from 15 news sources in 2016, and is compared with other methods. The evaluation shows that the proposed method outperforms previous methods in terms of precision and recall.
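        The feature-pivot assumption (words of one event share time windows) can be sketched as follows. This is a simplified illustration, not the method evaluated in the paper: it uses only temporal features, whereas the abstract also mentions semantic ones. The function names, the cosine measure, the `threshold` value, and the single-link grouping are all assumptions.

```python
import math
from collections import defaultdict


def word_time_series(docs, n_windows):
    """Count, per word, how many documents mention it in each time window.

    docs: list of (window_index, set_of_words) pairs.
    """
    series = defaultdict(lambda: [0] * n_windows)
    for window, words in docs:
        for word in set(words):
            series[word][window] += 1
    return dict(series)


def cosine(u, v):
    """Cosine similarity of two equal-length count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cluster_words(series, threshold=0.8):
    """Group words whose time series are similar (single-link, union-find)."""
    words = sorted(series)
    parent = {w: w for w in words}

    def find(w):
        while parent[w] != w:
            parent[w] = parent[parent[w]]  # path compression
            w = parent[w]
        return w

    for i, a in enumerate(words):
        for b in words[i + 1:]:
            if cosine(series[a], series[b]) >= threshold:
                parent[find(a)] = find(b)

    groups = defaultdict(set)
    for w in words:
        groups[find(w)].add(w)
    return sorted(sorted(g) for g in groups.values())


docs = [
    (0, {"earthquake", "rescue"}),
    (0, {"earthquake", "rescue"}),
    (3, {"election", "vote"}),
    (3, {"election", "vote"}),
]
events = cluster_words(word_time_series(docs, n_windows=4))
```

        Each resulting word group stands for one candidate event; the windows where its words peak would give the time frame the abstract refers to.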
      • Open Access Article

        4 - Survey on the Applications of the Graph Theory in the Information Retrieval
        Maryam Piroozmand, Amir Hosein Keyhanipour, Ali Moeini
        Due to its power in modeling complex relations between entities, graph theory has been widely used in dealing with real-world problems. Meanwhile, information retrieval has emerged as one of the major problems in the area of algorithms and computation. As graph-based information retrieval algorithms have been shown to be efficient and effective, this paper aims to provide an analytical review of these algorithms and to propose a categorization of them. Briefly, graph-based information retrieval algorithms may be divided into three major classes. The first category includes algorithms that use a graph representation of the corresponding dataset within the information retrieval process. The second category contains semantic retrieval algorithms that utilize graph theory. The third category is associated with the application of graph theory to the learning-to-rank problem. The reviewed research works are analyzed by both frequency and publication time. An interesting finding of this review is that the third category is a relatively hot research topic in which only a limited number of recent research works have been conducted.
      • Open Access Article

        5 - On the use of Intelligent Information Retrieval in Patent Prior-Art Search
        Habibollah Asghari, Azadeh Shakery
        Patents play an important role in intellectual property protection, so in recent years considerable attention has been paid to patent and prior-art search. In the process of filing a patent application, searching the database of previous patents is of great importance. Patent examiners search a huge database of patents to find whether there is any similarity between the applicant's claim and previously registered patents. This process, called a patent invalidity search, is one of the important stages of patent registration. Because of the legal aspects of this process, the searcher should not miss any relevant patent document, so patent searching is essentially a recall-oriented task among information retrieval applications. In recent years, the use of intelligent information retrieval in this search process has been investigated by many researchers. In this paper, we investigate various information retrieval methods that have been proven effective in retrieving relevant results. The survey also focuses on query formulation and on how to transform a query patent into a search query. We therefore explore different factors of a successful transformation, such as how many query words should be used, where to extract query words from, how to weight them, and whether to use noun phrases instead of individual words. Furthermore, the survey covers studies that combine different features and have been proven to significantly improve retrieval performance.
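        The query-formulation factors listed above (how many words, from which part of the patent, with what weights) can be made concrete with a small sketch. It does not reproduce any surveyed method: the function `build_query`, its parameters, the section name `claims`, and the frequency-times-IDF weighting are hypothetical choices made only for illustration.

```python
import re
from collections import Counter


def build_query(patent, field="claims", k=5, idf=None, stopwords=frozenset()):
    """Turn one section of a query patent into a weighted keyword query.

    patent:    dict mapping section names to text (hypothetical layout).
    field:     which section to extract query words from.
    k:         how many query words to keep.
    idf:       optional dict of inverse document frequencies; terms not
               listed default to 1.0, so weights fall back to raw counts.
    stopwords: words to ignore.
    """
    tokens = [
        t for t in re.findall(r"[a-z]+", patent[field].lower())
        if t not in stopwords
    ]
    counts = Counter(tokens)
    idf = idf or {}
    weights = {term: count * idf.get(term, 1.0) for term, count in counts.items()}
    # Highest weight first; ties broken alphabetically for determinism.
    ranked = sorted(weights.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]


patent = {"claims": "A rotor blade comprising a blade root and a blade tip"}
query = build_query(patent, field="claims", k=2, stopwords=frozenset({"a", "and"}))
```

        Because prior-art search is recall-oriented, a larger `k` or a union of queries built from several sections may be preferable in practice; this sketch leaves that trade-off to the caller.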