• List of Articles: Similarity

      • Open Access Article

        1 - Improving Efficiency of Finding Frequent Subgraphs in Graph Stream Using gMatrix Summarization
        Masoud Kazemi, Seyed Hossein Khasteh, Hamidreza Rokhsati
        In many real-world frameworks, dealing with huge domains of nodes and online streaming edges is unavoidable. Transportation systems, IP networks, and large social media platforms are quintessential examples of such scenarios. One of the most important open problems when dealing with massive graph streams is finding frequent subgraphs. There are approaches, such as the Count-Min sketch, for storing frequent nodes; however, applying these methods results in inaccurate modelling of the structures of the underlying graph. gMatrix is one of the recently developed approaches that can fairly preserve the important properties of the main graph. In this approach, different hash functions are utilized to store the incoming edge stream of the main graph, so the inverses of the hash functions are extremely useful for computing frequent subgraphs. However, gMatrix mainly suffers from two problems: first, it is not very accurate due to the high compression rate of the main graph, and second, the cost of answering a query is high. In this study, we present a new approach based on gMatrix that reduces memory usage and answers queries in less time. The main contribution of the introduced approach is to reduce the dependency among the hash functions, which results in fewer collisions when the gMatrix is later constructed. We use cosine similarity to estimate the degree of dependency and similarity among hash functions. Our experimental results confirm the higher performance of the proposed algorithm in terms of time complexity.
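        A minimal sketch of how cosine similarity could be used to estimate the dependency between two hash functions, as mentioned in the abstract. The hash functions, sample keys, and bucket count below are hypothetical illustrations; the original gMatrix construction is not reproduced here.

```python
import numpy as np

def assignment_matrix(hash_fn, keys, num_buckets):
    """One-hot bucket assignment for each sampled key (rows: keys, cols: buckets)."""
    m = np.zeros((len(keys), num_buckets))
    for i, k in enumerate(keys):
        m[i, hash_fn(k) % num_buckets] = 1.0
    return m

def hash_dependency(h1, h2, keys, num_buckets):
    """Cosine similarity between the flattened one-hot assignments.

    With this encoding the score equals the fraction of keys that both hashes
    send to the same bucket: 1.0 means fully dependent, while values near
    1/num_buckets indicate near-independent behaviour on the sample.
    """
    a = assignment_matrix(h1, keys, num_buckets).ravel()
    b = assignment_matrix(h2, keys, num_buckets).ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical hash functions evaluated over a sample of node identifiers.
keys = list(range(10_000))
h1 = lambda k: (k * 2654435761) & 0xFFFFFFFF   # Knuth-style multiplicative hash
h2 = lambda k: hash((k, "salt-b"))             # Python's built-in tuple hash
print(f"dependency estimate: {hash_dependency(h1, h2, keys, 64):.3f}")
```

        Pairs of hash functions whose score stays close to 1/num_buckets are good candidates to use together, since they rarely collide on the same keys.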
      • Open Access Article

        2 - A Persian Fuzzy Plagiarism Detection Approach
        Shima Rakian, Faramarz Safi Esfahani, Hamid Rastegari
        Plagiarism is a common problem in all organizations that deal with electronic content. At present, plagiarism detection tools only detect word-by-word or exact copies, and paraphrased passages are often missed. One of the successful and applicable methods for paraphrase detection is the fuzzy approach. In this study, a new fuzzy approach, called Persian Fuzzy Plagiarism Detection (PFPD), is proposed to detect external plagiarism in Persian texts. The proposed approach compares paraphrased texts with the aim of recognizing text similarities. External plagiarism detection evaluates a query document against a document collection. To avoid unnecessary comparisons, the tool compares suspicious documents hierarchically at different levels. The method adapts the fuzzy model to the Persian language and improves on previous methods for evaluating the degree of similarity between two sentences. Experiments on three corpora, TMC, Irandoc, and a corpus extracted from prozhe.com, were performed to gain confidence in the performance of the proposed method. The results show that using the proposed method for candidate document retrieval and for text similarity evaluation increases precision, recall, and F-measure by 22.41, 17.61, and 18.54 percent on average, respectively, compared with one of the best previous fuzzy methods.
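        A rough illustration of the hierarchical comparison idea described above: a cheap document-level n-gram filter prunes the collection before an expensive sentence-level pass. The thresholds, n-gram length, and Jaccard scoring are stand-ins chosen for the example; PFPD's actual fuzzy similarity rules are not reproduced.

```python
def char_ngrams(text, n=4):
    """Set of character n-grams used as a cheap text fingerprint."""
    return {text[i:i + n] for i in range(max(len(text) - n + 1, 0))}

def jaccard(a, b):
    return len(a & b) / len(a | b) if (a or b) else 0.0

def hierarchical_candidates(query_doc, collection, doc_threshold=0.1, sent_threshold=0.5):
    """Two-level comparison: documents failing the coarse filter are skipped;
    for the rest, every query sentence is scored against every candidate sentence."""
    q_profile = char_ngrams(query_doc)
    hits = []
    for doc_id, doc in collection.items():
        if jaccard(q_profile, char_ngrams(doc)) < doc_threshold:
            continue  # prune: no sentence-level comparison for this document
        for qs in query_doc.split("."):
            for ds in doc.split("."):
                score = jaccard(char_ngrams(qs), char_ngrams(ds))
                if score >= sent_threshold:
                    hits.append((doc_id, qs.strip(), ds.strip(), round(score, 2)))
    return hits

docs = {"d1": "The fuzzy model adapts to Persian text. Other content here.",
        "d2": "Completely unrelated sports report."}
print(hierarchical_candidates("A fuzzy model adapted to Persian text.", docs))
```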
      • Open Access Article

        3 - A fuzzy approach for reducing ambiguity in text similarity estimation (case study: Persian web contents)
        Hamid Ahangarbahan, Gholamali Montazer
        Finding similar web content is of great value in the academic community and in software systems. There are many methods and metrics in the literature for measuring the extent of text similarity among documents, with applications especially in plagiarism detection systems. However, most of them take into account neither the ambiguity inherent in comparing word or text pairs nor structural features. As a result, previous methods were not accurate enough to deal with vague information, so using structural features and considering the ambiguity inherent in words improves the identification of similar content. In this paper, a new method is proposed that takes lexical and structural features into consideration when measuring text similarity. After preprocessing and removing stopwords, each text is divided into general words and domain-specific knowledge words. Then, two fuzzy inference systems, one lexical and one structural, are designed to assess lexical and structural text similarity. The proposed method was evaluated on Persian paper abstracts from the International Conference on e-Learning and e-Teaching (ICELET) corpus. The results show that the proposed method achieves a precision of 75% and detects 81% of the similar cases.
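        To make the fuzzy-inference design concrete, here is a toy Mamdani-style inference that maps a lexical overlap score and a structural overlap score to a single similarity value. The membership functions and rule base are illustrative assumptions, not the ones used in the paper.

```python
import numpy as np

def tri(x, a, b, c):
    """Triangular membership function with support [a, c] and peak at b."""
    return np.maximum(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0)

def fuzzy_text_similarity(lexical, structural):
    """Toy Mamdani-style inference: two crisp scores in [0, 1] -> similarity in [0, 1].

    Rule base (illustrative only):
      R1: IF lexical is high AND structural is high THEN similarity is high
      R2: IF lexical is medium OR structural is medium THEN similarity is medium
      R3: IF lexical is low AND structural is low THEN similarity is low
    """
    def fuzzify(x):
        return {"low": tri(x, -0.5, 0.0, 0.5),
                "med": tri(x, 0.0, 0.5, 1.0),
                "high": tri(x, 0.5, 1.0, 1.5)}

    lex, st = fuzzify(lexical), fuzzify(structural)
    r_high = min(lex["high"], st["high"])   # AND -> min
    r_med = max(lex["med"], st["med"])      # OR  -> max
    r_low = min(lex["low"], st["low"])

    # Clip the output sets by the rule strengths, aggregate, and take the centroid.
    y = np.linspace(0.0, 1.0, 101)
    agg = np.maximum.reduce([np.minimum(r_low, tri(y, -0.5, 0.0, 0.5)),
                             np.minimum(r_med, tri(y, 0.0, 0.5, 1.0)),
                             np.minimum(r_high, tri(y, 0.5, 1.0, 1.5))])
    return float((y * agg).sum() / (agg.sum() + 1e-9))

print(round(fuzzy_text_similarity(0.8, 0.7), 2))  # inputs lean high -> output between medium and high (~0.54)
```

        In the paper's setting, one such inference system would score the lexical evidence and another the structural evidence, with their outputs combined into the final similarity decision.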
      • Open Access Article

        4 - De-lurking in Online Communities Using Repost Behavior Prediction Method
        Omid Reza Bolouki Speily
        Nowadays, with the advent of social networks, a big change has occurred in the structure of web-based services. Online communities (OCs) enable their users to access different types of information through an internet-based structure, anywhere and at any time. OC services are among the strategies used for the production and reposting of information by users interested in a specific area. In this respect, users become members of a particular domain at will and begin posting. Considering the networking structure, one of the major challenges these groups face is the lack of reposting behavior: most users of these systems take up a lurking position toward the posts in the forum. De-lurking is a type of social media behavior in which a user breaks an "online silence", or habit of passive thread viewing, to engage in a virtual conversation. One of the proposed ways to encourage de-lurking is the selection and display of influential posts for each individual. Influential posts are selected so that they are more likely to be reposted, based on each user's interests, knowledge, and characteristics. The present article introduces a new method for selecting k influential posts to increase the reposting of information. In terms of participation in OCs, users are divided into two groups: posters and lurkers. Some solutions are proposed to encourage lurking users to participate in reposting content. Evaluations based on actual repost data from Twitter and blogs indicate the effectiveness of the proposed method.
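        A small sketch of the "select k influential posts per user" step: rank candidate posts by cosine similarity between a user interest vector and post topic vectors, then keep the top k. The vectors and scoring rule are made up for illustration; the paper's repost behavior prediction model is more elaborate.

```python
import numpy as np

def top_k_influential_posts(user_interest, post_topics, k=3):
    """Return (post_index, score) pairs for the k posts a user is most likely to repost,
    using cosine similarity as a stand-in for the repost probability estimate."""
    u = user_interest / (np.linalg.norm(user_interest) + 1e-12)
    p = post_topics / (np.linalg.norm(post_topics, axis=1, keepdims=True) + 1e-12)
    scores = p @ u
    top = np.argsort(scores)[::-1][:k]
    return list(zip(top.tolist(), scores[top].round(3).tolist()))

# Toy usage: 5 candidate posts described over 4 topics, one lurking user's interest profile.
posts = np.array([[0.9, 0.1, 0.0, 0.0],
                  [0.0, 0.8, 0.2, 0.0],
                  [0.5, 0.5, 0.0, 0.0],
                  [0.0, 0.0, 1.0, 0.0],
                  [0.2, 0.0, 0.0, 0.8]])
user = np.array([0.7, 0.3, 0.0, 0.0])
print(top_k_influential_posts(user, posts, k=2))
```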
      • Open Access Article

        5 - DBCACF: A Multidimensional Method for Tourist Recommendation Based on Users’ Demographic, Context and Feedback
        Maral Kolahkaj, Ali Harounabadi, Alireza Nikravan Shalmani, Rahim Chinipardaz
        With the advent of Web 2.0 applications such as social networks, which allow users to share media, many opportunities have been provided for tourists to discover and visit attractive but unfamiliar Areas-of-Interest (AOIs). However, finding appropriate areas based on a user's preferences is very difficult due to issues such as the huge number of tourist areas and the limited visiting time. In addition, existing methods have so far failed to provide accurate tourist recommendations based on geo-tagged media because of problems such as data sparsity, the cold-start problem, treating two users with different habits as the same (symmetric similarity), and ignoring users' personal and contextual information. Therefore, in this paper, a method called "Demographic-Based Context-Aware Collaborative Filtering" (DBCACF) is proposed to address these problems and to extend the Collaborative Filtering (CF) method so that it provides personalized tourist recommendations without explicit user requests. DBCACF considers demographic and contextual information in combination with the users' historical visits to overcome the limitations of CF methods in dealing with multi-dimensional data. In addition, a new asymmetric similarity measure is proposed to overcome the limitations of symmetric similarity methods. The experimental results on a Flickr dataset indicate that the use of demographic and contextual information, and the addition of the proposed asymmetric scheme to the similarity measure, significantly improves the results compared with methods that use only user-item ratings and symmetric measures.
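        A minimal example of an asymmetric similarity between two users' visit histories: the overlap is normalized by the first user's history only, so the measure is direction-dependent. This particular formula is an assumption chosen for illustration, not necessarily the measure defined in DBCACF.

```python
def asymmetric_similarity(visits_u, visits_v):
    """Asymmetric overlap: how much of u's visit history is shared with v.

    Unlike symmetric measures (Jaccard, cosine), this generally gives
    sim(u, v) != sim(v, u), so a casual visitor and a heavy traveller who share
    the same few places are no longer treated as identical.
    """
    u, v = set(visits_u), set(visits_v)
    return len(u & v) / len(u) if u else 0.0

# A tourist with 3 visits vs. one with 8 visits who share the same 3 AOIs:
light = ["museum", "old_bridge", "bazaar"]
heavy = ["museum", "old_bridge", "bazaar", "citadel", "park", "lake", "tower", "mosque"]
print(asymmetric_similarity(light, heavy))  # 1.0   from the light user's point of view
print(asymmetric_similarity(heavy, light))  # 0.375 from the heavy user's point of view
```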
      • Open Access Article

        6 - Measuring Similarity for Directed Path in Geometric Data
        Mohammad Farshi, Zeinab Saeidi
        We consider the following similarity problem concerning the Fréchet distance. A directed path π is given as input, and a horizontal segment Q is defined at query time by the user. Our goal is to preprocess the directed path π into a data structure such that, based on the information stored in it, we can report the sub-path of π whose Fréchet distance to the horizontal query segment Q is minimum among all possible sub-paths. To the best of our knowledge, no theoretical results have been reported for this problem. In this paper, the first heuristic algorithm is proposed. Since no other algorithm exists for comparison, we evaluate the quality of the algorithm only experimentally, on several datasets.
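        For readers unfamiliar with the distance used above, here is the standard discrete Fréchet distance (the Eiter-Mannila dynamic program) between two sampled polylines. On densely sampled paths it approximates the continuous Fréchet distance from above; the paper's preprocessing structure and sub-path queries are not reproduced here.

```python
from functools import lru_cache
from math import hypot

def discrete_frechet(P, Q):
    """Discrete Fréchet distance between polylines P and Q given as lists of (x, y) points."""
    def d(i, j):
        return hypot(P[i][0] - Q[j][0], P[i][1] - Q[j][1])

    @lru_cache(maxsize=None)
    def c(i, j):
        if i == 0 and j == 0:
            return d(0, 0)
        if i == 0:
            return max(c(0, j - 1), d(0, j))
        if j == 0:
            return max(c(i - 1, 0), d(i, 0))
        return max(min(c(i - 1, j), c(i - 1, j - 1), c(i, j - 1)), d(i, j))

    return c(len(P) - 1, len(Q) - 1)

# Toy query: a directed path and a horizontal segment sampled at its two endpoints.
path = [(0, 0), (1, 1), (2, 0), (3, 1)]
segment = [(0, 0.5), (3, 0.5)]
print(f"{discrete_frechet(path, segment):.3f}")
```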
      • Open Access Article

        7 - Immortality of the Qur'an and its relation to the Qur'an commentators (Ahl al-Bayt (AS))
        Maryam Ishaqzadeh, Seyed Mohammad Noorollahi
        Among the features of the Qur'an is its immortality: while resting on fixed principles, the Qur'an progresses along with its time and place, and in every era it is rediscovered as something new and eternal. The proper approach is to link these attributes of the Qur'an together. In particular, the interpretive principles and detailed explanations of the Ahl al-Bayt (AS) show that the Qur'an expresses the general lines of the basic principles. Among the important issues that have been examined from almost every perspective in the Qur'anic texts and commentaries, it is this question of its grandeur that is addressed here from the angle of the eternity of the Qur'an. Certainly, part of the verses, owing to the shortcoming of words in expressing the deep and profound meanings of the Qur'an, are understood through the Qur'an itself and its direct guidance. Among the other features of the Qur'an that are rooted in its immortality, and that the language of the Ahl al-Bayt (AS) repeatedly emphasizes, is the recognition of its different kinds of verses; the reason is clear, for what must be known is what God intends. It is among the interpretive guidelines of the Ahl al-Bayt (AS) that a precise understanding of these kinds of verses, and of their interpretation, is required.
      • Open Access Article

        8 - Robust Persian Isolated Digit Recognition Based on LSTM and Speech Spectral Features
        Shima Tabibian
        One of the challenges of isolated Persian digit recognition is the similar pronunciation of some digits, such as "zero and three", "nine and two", and "five, seven and eight". This challenge leads to high substitution errors and reduces recognition accuracy. In this paper, a combined solution based on long short-term memory (LSTM) networks and hidden Markov models (HMM) is proposed to address this challenge. The proposed approach increases the recognition rate of Persian digits by 2 percent on average, and by 8 percent in the best case, compared to the HMM-based approach. In the rest of this work, since the mentioned challenge intensifies in noisy conditions, the robust recognition of Persian digits with similar pronunciation was considered. To increase the robustness of the LSTM-based recognizer, robust features extracted from the speech spectrum, such as spectral entropy, burst degree, bisector frequency, spectral flatness, first formant, and autocorrelation-based zero-crossing rate, were used. Using these features reduced the number of features for recognizing similar Persian digits from 39 coefficients to at most 4 and at least 1 coefficient, while improving the average robustness of the isolated digit recognizer in different noisy conditions (30 situations resulting from five noise types, white, pink, babble, factory, and car, and six signal-to-noise ratios of -5, 0, 5, 10, 15, and 20 decibels) by 10%, 13%, 15%, and 13% compared to HMM-based, LSTM-based, and deep belief network-based recognizers with Mel-cepstral coefficients and a convolutional neural network recognizer with Mel spectrogram features.
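        A brief sketch of how some of the spectral features listed above can be computed for a single speech frame with NumPy, using standard textbook definitions of spectral entropy, spectral flatness, and a plain zero-crossing rate. The paper's exact variants, such as the autocorrelation-based zero-crossing rate, burst degree, bisector frequency, and first formant, are not reproduced.

```python
import numpy as np

def frame_spectral_features(frame, eps=1e-12):
    """Spectral entropy, spectral flatness, and zero-crossing rate for one speech frame."""
    spectrum = np.abs(np.fft.rfft(frame * np.hanning(len(frame)))) ** 2

    p = spectrum / (spectrum.sum() + eps)               # power spectrum as a probability mass
    spectral_entropy = -np.sum(p * np.log2(p + eps)) / np.log2(len(p))

    spectral_flatness = (np.exp(np.mean(np.log(spectrum + eps)))
                         / (np.mean(spectrum) + eps))    # geometric / arithmetic mean

    zcr = np.mean(np.abs(np.diff(np.sign(frame)))) / 2   # plain (time-domain) zero-crossing rate

    return spectral_entropy, spectral_flatness, zcr

# Toy usage on a synthetic 25 ms frame at 16 kHz (a 440 Hz tone plus a little noise).
t = np.arange(400) / 16000
frame = np.sin(2 * np.pi * 440 * t) + 0.05 * np.random.randn(400)
print(frame_spectral_features(frame))
```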