    • List of Articles Parametric Clustering

      • Open Access Article

        1 - Word Sense Induction in Persian and English: A Comparative Study
        Masood Ghayoomi
        Words in natural language have forms and meanings, and there is not always a one-to-one match between them. As a result, a word can have more than one meaning, and a text processing system faces the challenge of determining the precise meaning of a target word in a sentence. Lexical resources or lexical databases, such as WordNet, can help, but because they are developed manually, they become outdated as time passes and the language changes. Moreover, lexical resources may be domain dependent, which makes them unusable for open-domain natural language processing tasks. These drawbacks are a strong motivation to use unsupervised machine learning approaches to induce word senses from natural data. To this end, clustering can be used such that each cluster represents a sense. In this paper, we study the performance of a word sense induction model with respect to three variables: a) the target language: in our experiments, we run the induction process on Persian and English; b) the type of clustering algorithm: both parametric clustering algorithms, including hierarchical and partitioning, and non-parametric clustering algorithms, including probabilistic and density-based, are used to induce senses; c) the context of the target words used to build the vectors for clustering: the input vectors are created based either on the whole sentence in which the target word occurs or on a limited window of words surrounding the target word. We evaluate the clustering performance externally. Moreover, we introduce a normalized, joint evaluation metric to compare the models. The experimental results for both Persian and English test data show that the window-based partitioning K-means algorithm obtained the best performance.
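The two context settings and the partitioning step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy sentences, the bag-of-words vectors, the window size of 2, and all function names (`window_context`, `sentence_context`, `to_vector`, `kmeans`) are assumptions made for illustration, and the K-means loop is a plain textbook version written with the standard library only.

```python
import random
from collections import Counter

def window_context(tokens, target_index, window=2):
    """Words within `window` positions of the target (target excluded)."""
    left = tokens[max(0, target_index - window):target_index]
    right = tokens[target_index + 1:target_index + 1 + window]
    return left + right

def sentence_context(tokens, target_index):
    """Every word of the sentence except the target."""
    return tokens[:target_index] + tokens[target_index + 1:]

def to_vector(context, vocabulary):
    """Bag-of-words count vector over a fixed vocabulary (an assumed representation)."""
    counts = Counter(context)
    return [float(counts[word]) for word in vocabulary]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, iterations=20, seed=0):
    """Plain K-means: assign each vector to its nearest centroid, then recompute centroids."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical toy data: (sentence, index of the ambiguous target word "bank").
examples = [
    ("she sat on the bank of the river".split(), 4),
    ("the river bank was muddy".split(), 2),
    ("he deposited money at the bank today".split(), 5),
    ("the bank approved the loan".split(), 1),
]

contexts = [window_context(tokens, i, window=2) for tokens, i in examples]
vocabulary = sorted({word for context in contexts for word in context})
vectors = [to_vector(context, vocabulary) for context in contexts]
labels = kmeans(vectors, k=2)  # each cluster label stands for one induced sense
```

Replacing `window_context` with `sentence_context` when building `contexts` gives the sentence-based variant, so the two settings can be compared with the same clustering step.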
      • Open Access Article

        2 - Identifying Primary User Emulation Attacks in Cognitive Radio Network Based on Bayesian Nonparametric Methods
        K. Akbari, J. Abouei
        Cognitive radio (CR) is widely considered a key technology for coping with the shortage of spectrum in wireless networks. One of the major challenges to the realization of CR networks is security. The most important security threat is the primary user emulation attack, in which a malicious user sends a signal that mimics the primary user's signal to deceive secondary users and prevent them from transmitting in the spectrum holes. In this way, while causing traffic in the CR network, the malicious user obtains a frequency band to send its own information. In this thesis, a method to identify primary user emulation attacks is proposed. In this method, primary users and malicious users are distinguished by clustering, and the number of active users in the CR network is also determined by clustering. Specifically, primary users are clustered using Dirichlet process mixture model classification based on the Bayesian nonparametric method. In addition, to achieve a higher convergence rate, the Chinese restaurant process is used for initialization and non-uniform sampling is applied to select the cluster parameters.
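The Chinese restaurant process mentioned in this abstract can be sketched as follows. This is only the standard CRP prior over cluster assignments, which lets the number of clusters grow with the data; it is not the authors' full Dirichlet process mixture inference, and it does not include their non-uniform sampling step. The function name, the concentration value `alpha=1.0`, and the number of observations are assumptions chosen for illustration.

```python
import random

def chinese_restaurant_process(n_customers, alpha, seed=0):
    """Draw cluster (table) assignments from a CRP with concentration alpha > 0."""
    rng = random.Random(seed)
    assignments = []   # table index chosen by each customer, in arrival order
    table_sizes = []   # number of customers currently seated at each table
    for _ in range(n_customers):
        # An existing table is chosen in proportion to its size,
        # a new table in proportion to alpha.
        weights = table_sizes + [alpha]
        table = rng.choices(range(len(weights)), weights=weights)[0]
        if table == len(table_sizes):
            table_sizes.append(1)      # open a new table (new cluster)
        else:
            table_sizes[table] += 1    # join an existing table
        assignments.append(table)
    return assignments, table_sizes

# Hypothetical usage: an initial clustering of 50 observations.
assignments, table_sizes = chinese_restaurant_process(50, alpha=1.0)
```

In a setting like the one described, each table would correspond to one candidate active user, so the number of users does not have to be fixed in advance. Larger values of `alpha` tend to produce more tables.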