  • Sketch-based summarization
    • List of Articles: Sketch-based summarization

      • Open Access Article

        1 - Improving Efficiency of Finding Frequent Subgraphs in Graph Stream Using gMatrix Summarization
        Masoud Kazemi, Seyed Hossein Khasteh, Hamidreza Rokhsati
        In many real-world systems, dealing with huge node domains and online streaming edges is unavoidable. Transportation systems, IP networks, and large social media platforms are quintessential examples of such scenarios. One of the most important open problems in dealing with massive graph streams is finding frequent subgraphs. Approaches such as count-min can store frequent nodes, but they model the structure of the main graph inaccurately. gMatrix is a recently developed approach that preserves the important properties of the main graph reasonably well. In this approach, different hash functions are used to store the basis of the streams in the main graph, so having the inverses of the hash functions is extremely useful when computing frequent subgraphs. However, gMatrix suffers from two main problems. First, it is not very accurate because of the high compression rate of the main graph; second, the cost of answering a query is high. In this thesis, we present a new approach based on gMatrix that reduces memory usage and answers queries in less time. The main contribution of the approach is to reduce the dependency among the hash functions, which results in fewer conflicts when the gMatrix is built. We use cosine similarity to estimate the degree of dependency and similarity among the hash functions. Our experimental results show improved performance in terms of algorithmic and time complexity.
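
        To make the ideas in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes integer node IDs, a simple modular hash family, and a count-min-style minimum over several hashed adjacency matrices as the edge-frequency query. The abstract does not say how cosine similarity is applied to hash functions, so comparing the bucket vectors that two hash functions produce on a sample of nodes is purely an illustrative assumption. All names (`GMatrixSketch`, `make_hash`, `hash_similarity`) are hypothetical.

```python
import math
import random

PRIME = 2_147_483_647  # Mersenne prime 2^31 - 1, used by the assumed hash family


def make_hash(width, rng):
    """Return a hash h(x) = ((a*x + b) mod PRIME) mod width (assumed family)."""
    a = rng.randrange(1, PRIME)
    b = rng.randrange(0, PRIME)
    return lambda x: ((a * x + b) % PRIME) % width


class GMatrixSketch:
    """Illustrative gMatrix-like summary: one width x width counter matrix per hash."""

    def __init__(self, width, depth, seed=0):
        rng = random.Random(seed)
        self.width = width
        self.hashes = [make_hash(width, rng) for _ in range(depth)]
        self.tables = [[[0] * width for _ in range(width)] for _ in range(depth)]

    def add_edge(self, u, v, weight=1):
        # Each streamed edge (u, v) increments one cell in every matrix.
        for h, table in zip(self.hashes, self.tables):
            table[h(u)][h(v)] += weight

    def edge_estimate(self, u, v):
        # Conflicts can only inflate a cell, so the minimum over matrices
        # is an upper bound on the true frequency (for non-negative weights).
        return min(t[h(u)][h(v)] for h, t in zip(self.hashes, self.tables))


def cosine_similarity(xs, ys):
    dot = sum(x * y for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum(x * x for x in xs))
    norm_y = math.sqrt(sum(y * y for y in ys))
    return dot / (norm_x * norm_y) if norm_x and norm_y else 0.0


def hash_similarity(h1, h2, sample_nodes):
    """Assumed dependency measure: cosine of the two bucket vectors on a node sample."""
    return cosine_similarity([h1(n) for n in sample_nodes],
                             [h2(n) for n in sample_nodes])


if __name__ == "__main__":
    sketch = GMatrixSketch(width=64, depth=3, seed=42)
    for edge in [(1, 2), (1, 2), (1, 2), (2, 3)]:
        sketch.add_edge(*edge)
    print("estimate for (1, 2):", sketch.edge_estimate(1, 2))
    print("similarity of first two hashes:",
          hash_similarity(sketch.hashes[0], sketch.hashes[1], range(1, 100)))
```

        Under these assumptions, a selection step could generate candidate hash functions and keep only those whose pairwise similarity falls below a chosen threshold before building the matrices; the threshold and the selection procedure are not specified in the abstract.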