    • List of Articles machine learning

      • Open Access Article

        1 - Survey different aspects of the problem phishing website detection and Review to existing Methods
        nafise langari
        One of the latest security threats in cyberspace, aimed at stealing personal and financial information, is created by phishers. Because there are various methods for detecting phishing and no up-to-date comprehensive study of the issue, the authors were motivated to review and analyze the proposed phishing detection methods in five categories: anti-phishing tool-based, data mining-based, heuristic-based, meta-heuristic-based and machine learning-based methods. The advantages and disadvantages of each method are extracted from the review and comparison. The outcomes of this study can help identify probable gaps in phishing detection problems for future research.
      • Open Access Article

        2 - Density Measure in Context Clustering for Distributional Semantics of Word Sense Induction
        Masood Ghayoomi
        Word Sense Induction (WSI) aims at inducing word senses from data without using prior knowledge. The absence of labeled data has motivated researchers to use clustering techniques for this task. There are two types of clustering algorithms: parametric and non-parametric. Although non-parametric clustering algorithms are more suitable for inducing word senses, their shortcomings make them unusable. Meanwhile, parametric clustering algorithms show competitive results, but they suffer from a major problem: they require a predefined, fixed number of clusters to be set in advance. The main contribution of this paper is to show that using the silhouette score, normally an internal evaluation metric, to measure the clusters’ density in a parametric clustering algorithm such as K-means captures words’ senses in the WSI task better than the state-of-the-art models. To this end, a word embedding approach is utilized to represent words’ contextual information as vectors. To capture the context in the vectors, we propose two modes of experiments: building the vectors from either the whole sentence or a limited number of surrounding words in the local context of the target word. The experimental results based on the V-measure evaluation metric show that the two modes of our proposed model beat the state-of-the-art models by 4.48% and 5.39%. Moreover, the average number of clusters and the maximum number of clusters in the outputs of our proposed models are relatively close to those of the gold data.
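
One way to read the silhouette idea in the abstract above is as a criterion for choosing the number of clusters. The sketch below is only an illustration under that assumption, not the paper's procedure: the 2-D points stand in for context embeddings, and the helper names `kmeans` and `mean_silhouette` are hypothetical.

```python
import math

def kmeans(points, k, iters=100):
    """Plain K-means with a deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        # Next centre: the point farthest from all centres chosen so far.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old centre if a cluster happens to become empty.
            updated.append(tuple(sum(axis) / len(members) for axis in zip(*members))
                           if members else centers[j])
        if updated == centers:
            break
        centers = updated
    return labels

def mean_silhouette(points, labels):
    """Mean silhouette score; higher means denser, better separated clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return 0.0  # the silhouette is undefined for a single cluster
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # singleton clusters contribute a score of 0
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for key, other in clusters.items() if key != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

# Toy "context vectors" (assumed data): three tight groups of four points.
contexts = [(0, 0), (0, 1), (1, 0), (1, 1),
            (8, 0), (8, 1), (9, 0), (9, 1),
            (4, 9), (4, 10), (5, 9), (5, 10)]

# Score each candidate K and keep the one with the highest mean silhouette.
scores = {k: mean_silhouette(contexts, kmeans(contexts, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
print(best_k, scores)
```

On this toy data the three-cluster solution scores highest, which is the behaviour one would hope for when the number of senses is not known in advance.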
      • Open Access Article

        3 - Predicting Student Performance for Early Intervention using Classification Algorithms in Machine Learning
        Kalaivani K Ulagapriya K Saritha A Ashutosh  Kumar
        The Student Performance Prediction System aims to find students who may require early intervention before they fail to graduate. It is mainly intended for teaching faculty members who want to analyze students' performance and results. It stores student details in a database and uses a machine learning model, together with (i) Python data analysis tools such as Pandas and (ii) data visualization tools such as Seaborn, to analyze the overall performance of the class. The proposed system predicts student performance through machine learning algorithms and data mining techniques. The data mining technique used here is classification, which classifies the students based on their attributes. The front end of the application is built with the React JS library and data visualization charts, and is connected to a backend database in which all student records are stored in MongoDB; the machine learning model is trained and deployed through Flask. In this process, the machine learning algorithm is trained on a dataset to create a model, and the output is predicted on the basis of that model. The three types of data used in machine learning are continuous, categorical and binary. In this study, a brief description and comparative analysis of various classification techniques is carried out using a student performance dataset. The six machine learning classification algorithms compared are Logistic Regression, Decision Tree, K-Nearest Neighbor, Naïve Bayes, Support Vector Machine and Random Forest. The Naïve Bayes classifier scores higher than the other techniques in terms of precision, recall and F1 score, with values of 0.93, 0.92 and 0.92 respectively.
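
The precision, recall and F1 metrics named in the abstract above can be computed from predicted and true labels as in the following sketch. The label vectors are invented for illustration and are not the study's data; the function name `precision_recall_f1` is hypothetical.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Binary precision, recall and F1 for the class marked `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Assumed toy labels: 1 = "needs early intervention", 0 = "on track".
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```

Here three of the four flagged students are true positives and one at-risk student is missed, so all three metrics come out at 0.75.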
      • Open Access Article

        4 - A Hybrid Machine Learning Approach for Sentiment Analysis of Beauty Products Reviews
        Kanika Jindal Rajni Aron
        Nowadays, social media platforms have become a mirror that reflects opinions and feelings about any specific product or event. Product reviews can enhance communication between entrepreneurs and their customers. These reviews need to be extracted and analyzed to predict the sentiment polarity, i.e., whether the review is positive or negative. This paper aims to predict the sentiments expressed in beauty product reviews extracted from Amazon and to improve the classification accuracy. The three phases of our work are data pre-processing, feature extraction using the Bag-of-Words (BoW) method, and sentiment classification using Machine Learning (ML) techniques. A Global Optimization-based Neural Network (GONN) is proposed for the sentiment classification. An empirical study is then conducted to analyze the performance of the proposed GONN and compare it with other machine learning algorithms, such as Random Forest (RF), Naive Bayes (NB), and Support Vector Machine (SVM). We further cross-validate these techniques with ten folds to identify the most accurate classifier. The models are also compared on the Precision-Recall (PR) curve to determine the best technique. Experimental results demonstrate that the proposed method achieves the best classification accuracy on our defined dataset. Specifically, we show that our approach trains textual sentiment classifiers better, thereby enhancing the accuracy of sentiment prediction.
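
The Bag-of-Words feature extraction step mentioned in the abstract above can be illustrated with a short sketch. The two review strings, the tokenizer and the helper names are assumptions made for the example; the paper's actual pre-processing is not described here.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z']+", text.lower())

def build_vocabulary(documents):
    """Sorted list of every distinct token across the documents."""
    return sorted({token for doc in documents for token in tokenize(doc)})

def vectorize(document, vocabulary):
    """Count vector: one entry per vocabulary word."""
    counts = Counter(tokenize(document))
    return [counts[word] for word in vocabulary]

# Assumed toy reviews, not taken from the Amazon dataset.
reviews = ["Great lipstick, great colour", "Bad smell, not great"]

vocabulary = build_vocabulary(reviews)
vectors = [vectorize(review, vocabulary) for review in reviews]
print(vocabulary)
print(vectors)
```

Each review becomes a fixed-length count vector over the shared vocabulary, which is the form a classifier such as RF, NB or SVM would take as input.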
      • Open Access Article

        5 - The Development of a Hybrid Error Feedback Model for Sales Forecasting
        Mehdi Farrokhbakht Foumani Sajad Moazami Goudarzi
        Sales forecasting is a significant issue in the industrial and service sectors; handled properly, it facilitates management decisions and reduces lost value. Sales forecasting is also one of the complicated problems in time series analysis and data mining because of the number of intervening parameters. Various models have been presented for this problem, each achieving acceptable results. However, improving these methods is still of interest to researchers. In this regard, the present study proposes a hybrid model with error feedback for sales forecasting. First, a forecast is produced using a supervised learning method. Then the residual values (model error) are determined and forecast using another learning method. Finally, the two trained models are combined and used consecutively for sales forecasting. In other words, the forecast is produced first, and the error is then estimated by the second model. The sum of the forecast and the estimated model error gives the final forecast. The computational results obtained from numerical experiments indicate that the proposed hybrid method outperforms the common models in the literature and reduces the forecasting error indicators.
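
The two-stage scheme described in the abstract above (forecast, then forecast the residual, then add the two) can be sketched as follows. The choice of models is an assumption for illustration only: a least-squares trend line as the first learner and a per-season mean of residuals as the second. The synthetic sales series and all names are hypothetical.

```python
def fit_line(ts, ys):
    """First-stage model: ordinary least-squares trend line."""
    n = len(ts)
    mean_t, mean_y = sum(ts) / n, sum(ys) / n
    slope = (sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys))
             / sum((t - mean_t) ** 2 for t in ts))
    return slope, mean_y - slope * mean_t

def fit_residual_model(residuals, period):
    """Second-stage model: mean residual per position in the cycle."""
    buckets = [[] for _ in range(period)]
    for t, r in enumerate(residuals):
        buckets[t % period].append(r)
    return [sum(b) / len(b) for b in buckets]

def rmse(actual, predicted):
    return (sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)) ** 0.5

# Assumed synthetic sales: linear growth plus a repeating 4-step pattern.
PERIOD = 4
pattern = [5, -5, 0, 0]
sales = [100 + 2 * t + pattern[t % PERIOD] for t in range(16)]
train_t, test_t = list(range(12)), list(range(12, 16))
train_y, test_y = sales[:12], sales[12:]

# Stage 1: fit the base forecaster and compute its training errors.
slope, intercept = fit_line(train_t, train_y)

def base(t):
    return intercept + slope * t

residuals = [y - base(t) for t, y in zip(train_t, train_y)]

# Stage 2: learn the error pattern, then add it back to the base forecast.
seasonal_error = fit_residual_model(residuals, PERIOD)
base_forecast = [base(t) for t in test_t]
hybrid_forecast = [base(t) + seasonal_error[t % PERIOD] for t in test_t]

base_rmse = rmse(test_y, base_forecast)
hybrid_rmse = rmse(test_y, hybrid_forecast)
print(base_rmse, hybrid_rmse)
```

On this toy series the hybrid forecast has a lower held-out RMSE than the trend line alone, because the second model recovers the periodic structure the first one cannot represent.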
      • Open Access Article

        6 - Word Sense Induction in Persian and English: A Comparative Study
        Masood Ghayoomi
        Words in natural language have forms and meanings, and there is not always a one-to-one match between them. This property causes words to have more than one meaning; as a result, a text processing system faces challenges in determining the precise meaning of a target word in a sentence. Lexical resources or lexical databases, such as WordNet, might help, but because they are developed manually, they become outdated with the passage of time and language change. Moreover, lexical resources might be domain dependent, which makes them unusable for open-domain natural language processing tasks. These drawbacks are a strong motivation to use unsupervised machine learning approaches to induce word senses from natural data. To this end, clustering can be utilized such that each cluster represents a sense. In this paper, we study the performance of a word sense induction model with respect to three variables: a) the target language: in our experiments, we run the induction process on Persian and English; b) the type of clustering algorithm: both parametric clustering algorithms, including hierarchical and partitioning, and non-parametric clustering algorithms, including probabilistic and density-based, are utilized to induce senses; c) the context of the target words used to capture information in the vectors created for clustering: the input vectors are built either from the whole sentence in which the target word occurs or from a limited window of words surrounding the target word. We evaluate the clustering performance externally. Moreover, we introduce a normalized, joint evaluation metric to compare the models. The experimental results for both Persian and English test data show that the window-based partitioning K-means algorithm obtains the best performance.
      • Open Access Article

        7 - Deep Learning-based Educational User Profile and User Rating Recommendation System for E-Learning
        Pradnya Vaibhav  Kulkarni Sunil Rai Rajneeshkaur Sachdeo Rohini Kale
        In the current era of online learning, recommendation systems for the eLearning process are quite important. Since the COVID-19 pandemic, eLearning has undergone a complete transformation. Existing eLearning recommendation systems rely on collaborative filtering or content-based filtering over historical data, students’ previous grades, results, or user profiles. Such systems select courses from these parameters in a generalized manner rather than on a personalized basis. Personalized recommendations, information relevancy, choosing the proper course, and recommendation accuracy are some of the open issues in eLearning recommendation systems. In this paper, existing conventional eLearning and course recommendation systems are studied in detail and compared with the proposed approach. We used the User Profile and User Rating dataset for course recommendation. K Nearest Neighbor, Support Vector Machine, Decision Tree, Random Forest, Naive Bayes, Linear Regression, Linear Discriminant Analysis, and Neural Network were among the machine learning techniques explored and deployed. The accuracy achieved by these algorithms ranges from 0.81 to 0.97. The proposed algorithm uses a hybrid approach that combines collaborative filtering and deep learning. It improves accuracy to 0.98, which indicates that the proposed model can provide personalized and accurate eLearning recommendations for the individual user.
      • Open Access Article

        8 - Detecting Human Activities Based on Motion Sensors in IOT Using Deep Learning
        Abbas Mirzaei fatemeh faraji
        Monitoring areas, locations and motion sensors in the Internet of Things (IoT) requires continuous supervision to detect human activities in different situations, which is a major challenge in terms of manpower and human error. Permanent human supervision of IoT motion sensors also seems impossible. The IoT is more than just a simple connection between devices and systems: IoT sensors and information systems help companies get a better view of system performance. This study presents a deep learning method based on a 30-layer deep neural network (DNN) for detecting human activity on the Fordham University Activity Diagnostic Data Set. The data set contains more than 1 million lines in six classes for detecting IoT activity. The proposed model achieved almost 90% and an error rate of 0.22 on the evaluation criteria, which indicates the good performance of deep learning in activity recognition.
      • Open Access Article

        9 - Deep Extreme Learning Machine: A Combined Incremental Learning Approach for Data Stream Classification
        Javad Hamidzadeh Mona Moradi
        Streaming data refers to data that is continuously generated in the form of fast, high-volume streams. This kind of data often arises in evolving environments where a change may affect the data distribution. Because data streams have a wide range of real-world applications, improving the performance of streaming analytics has become a hot topic for researchers. The proposed method integrates online ensemble learning into the extreme learning machine to improve data stream classification performance. The proposed incremental method does not need to access the samples of previous blocks. Following the AdaBoost approach, it can also react to concept drift through its component weighting and component update mechanisms. The proposed method adapts to changes while retaining highly accurate classifiers. The experiments were carried out on benchmark datasets. The proposed method achieves 0.90 average specificity, 0.69 average sensitivity, and 0.87 average accuracy, indicating its superiority over two competing methods.
      • Open Access Article

        10 - An Autoencoder based Emotional Stress State Detection Approach by using Electroencephalography Signals
        Jia Uddin
        Identifying hazards from human error is critical for industrial safety, since dangerous and reckless actions by industrial workers, as well as a lack of safety measures, are directly accountable for human-caused problems. Lack of sleep, poor nutrition, physical deformities, and weariness are some of the key factors that contribute to these risky and reckless behaviors, which might put a person in a perilous scenario. Such a scenario causes discomfort, worry, despair, cardiovascular disease, a rapid heart rate, and a slew of other undesirable outcomes. As a result, it would be advantageous to recognize people's mental states in order to provide better care for them. In recent years, researchers have been studying electroencephalogram (EEG) signals to determine a person's stress level at work. A full feature analysis across domains is necessary to develop a successful machine learning model using EEG inputs. By analyzing EEG data, this research designs a time-frequency based hybrid bag of features to determine human stress according to sex. This collection of characteristics includes features from two types of assessments: time-domain statistical analysis and frequency-domain wavelet-based feature assessment. The suggested two-layered autoencoder-based neural networks (AENN) are then used to identify the stress level from the hybrid bag of features. The experiment uses the DEAP dataset, which is freely available. The proposed method has a male accuracy of 77.09% and a female accuracy of 80.93%.
      • Open Access Article

        11 - Comparative Study of 5G Signal Attenuation Estimation Models
        Md Anoarul Islam Manabendra Maiti Judhajit Sanyal Quazi Md Alfred
        Wireless networks based on 4G and 5G technology offer users a plethora of options in terms of connectivity and multimedia content. However, such networks are prone to severe signal attenuation and noise in a number of scenarios. Significant research in recent years has consequently focused on establishing robust and accurate attenuation models to estimate channel noise and the resulting signal loss. The challenge, therefore, is to identify or develop accurate, computationally inexpensive models that can be implemented on available hardware to generate low-error estimates, and to validate the solutions experimentally. The present work surveys some of the most relevant recent work in this domain, with added emphasis on rain attenuation models and machine learning-based approaches, and offers a perspective on establishing a suitable dynamic signal attenuation model for high-speed wireless communication in outdoor as well as indoor environments. It also presents the performance evaluation of an autoregression-based machine learning model. Multiple versions of the model are compared on the basis of root mean square error (RMSE) for different orders of regression polynomials to find the best-fit solution. The accuracy of the proposed technique is then compared, in terms of RMSE, with corresponding moderate- and high-complexity machine learning techniques implementing adaptive spline regression and artificial neural networks, respectively. The proposed method is found to be quite accurate with low complexity, making it practically applicable in multiple scenarios.
      • Open Access Article

        12 - A Content-Based Image Retrieval System Using Semi-Supervised Learning and Frequent Patterns Mining
        Maral Kolahkaj
        Content-based image retrieval, also known as query by image content, is a sub-branch of machine vision that organizes and recognizes the content of digital images using visual features. This technology automatically searches a huge image database for images similar to the query image and returns the most similar images to the user by extracting visual features directly from the image data rather than from keywords and textual annotations. In this paper, a method is proposed that uses the wavelet transform and combines its features with a color histogram to reduce the semantic gap between low-level visual features and the high-level meanings of images. The final output is produced by applying this feature extraction method to the input images. In the next step, when the target user submits query images to the system, the most similar images are retrieved using semi-supervised learning that combines clustering and classification based on frequent pattern mining. The experimental results show that the proposed system achieves the highest level of effectiveness compared to other methods.
      • Open Access Article

        13 - Breast Cancer Classification Approaches - A Comparative Analysis
        Mohan Kumar Sunil Kumar Khatri Masoud Mohammadian
        Breast cancer is a difficult disease to treat because it weakens the patient's immune system. Cells turn cancerous for various reasons and spread very quickly, to the detriment of normal cells. Identifying specific immune signals for a range of cancers has recently attracted particular interest, and several methods for predicting cancer from proteomic datasets and peptides have been published in recent years. Accurately classifying the breast cancer subtype, and categorizing breast cancer treatments correctly, is a vital job. Computerized systems built on machine learning and artificial intelligence can save substantial time while reducing the likelihood of mistakes. Using the Wisconsin Breast Cancer Diagnostic (WBCD) dataset, this study evaluates the performance of various classification methods, including SVC, ETC, KNN, LR, and RF (random forest). Breast cancer can be detected and diagnosed using a variety of measurements in the WBCD data, which are discussed in detail in the article. The goal is to determine how well each algorithm performs in terms of precision, recall, and accuracy. Variations of the classification threshold were tested on the various algorithms, and SVM turned out to be very promising.
      • Open Access Article

        14 - Designing an Ensemble model for estimating the permeability of a hydrocarbon reservoir by petrophysical lithology Labeling
        abbas salahshoor Ahmad Gaeini Alireza shahin mossayeb kamari
        Permeability is one of the important characteristics of oil and gas reservoirs and is difficult to predict. Current approaches use experimental and regression models to predict permeability, which entail the time and high costs associated with laboratory measurements. Recently, machine learning algorithms have been used to predict permeability because of their better predictive ability. In this study, a new ensemble machine learning model for permeability prediction in oil and gas reservoirs is introduced. In this method, the input data are labeled using the lithology information of the logs and divided into a number of categories, and each category is modeled by a machine learning algorithm. Unlike previous studies that worked with individual models, here, by designing an ensemble model using the ETR, DTR and GBR algorithms and petrophysical data, we were able to dramatically improve prediction accuracy, as measured by mean squared error, and predict permeability with 99.82% accuracy. The results show that ensemble models greatly improve the accuracy of permeability prediction compared to individual models, and that separating the samples based on lithology information helped optimize the permeability estimate compared to previous studies.
      • Open Access Article

        15 - SQ-PUF: A Resistant PUF-Based Authentication Protocol against Machine-Learning Attack
        Abolfazl Sajadi Bijan Alizadeh
        Physically unclonable functions (PUFs) provide hardware that generates a unique challenge-response pattern for authentication and encryption purposes. An essential feature of these circuits is their unpredictability, meaning that an adversary cannot sufficiently predict future responses from previous observations. However, machine learning algorithms have been shown to be a severe threat to PUFs, since they are capable of accurately modeling their behavior. In this work, we analyze PUF security threats and propose a PUF-based authentication mechanism called SQ-PUF, which provides good resistance to machine learning attacks. To make the PUF harder to simulate or predict, we obfuscate the correlation between challenge-response pairs. Experimental results show that, unlike existing PUFs, the SQ-PUF model cannot be successfully attacked even with a large data set: the maximum prediction accuracy is 53%, indicating that the model is unpredictable. In addition, the uniformity of this model remains almost the same as the ideal value in A-PUF.
      • Open Access Article

        16 - An Analysis of Covid-19 Pandemic Outbreak on Economy using Neural Network and Random Forest
        Md. Nahid  Hasan Tanvir  Ahmed Md.  Ashik Md. Jahid  Hasan Tahaziba  Azmin Jia Uddin
        Pandemic disease outbreaks are causing a significant financial crisis that affects the worldwide economy. Machine learning techniques are urgently required to detect, predict and analyze the state of the economy for early economic planning and growth. Consequently, in this paper, we use machine learning classifiers and regressors to construct an early warning model for tackling the economic recession caused by the COVID-19 pandemic outbreak. A publicly available database created by the National Bureau of Economic Research (NBER) is used to validate the model; it contains information about national revenue, employment rate, and workers' earnings in the USA over 239 days (1 January 2020 to 12 May 2020). Techniques such as missing value imputation and k-fold cross-validation have been used to pre-process the dataset. Two machine learning classifiers, Multi-layer Perceptron Neural Network (MLP-NN) and Random Forest (RF), have been used to predict recession. Additionally, two machine learning regressors, Long Short-Term Memory (LSTM) and Random Forest (RF), have been used to estimate how much recession a country is facing as a result of positive COVID-19 test cases. Experimental results demonstrate that the MLP-NN and RF classifiers achieve averages of 88.33% and 85% for recession prediction (95%, 81%, 89% and 85%, 81%, 89% for revenue, employment rate and workers' earnings, respectively), and that the LSTM and RF regressors achieve average prediction accuracies of 90.67% and 93.67% (92%, 90%, 90% and 95%, 93%, 93%, respectively).
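
The reported averages in the abstract above follow directly from the three per-indicator figures. A small check, using only the numbers quoted in the abstract (the dictionary keys are labels chosen for this example):

```python
# Per-indicator percentages quoted in the abstract, in the order
# revenue, employment rate, workers' earnings.
reported = {
    "MLP-NN classifier": [95, 81, 89],
    "RF classifier": [85, 81, 89],
    "LSTM regressor": [92, 90, 90],
    "RF regressor": [95, 93, 93],
}

# Arithmetic mean of the three indicators, rounded to two decimals.
averages = {name: round(sum(values) / len(values), 2)
            for name, values in reported.items()}

for name, value in averages.items():
    print(f"{name}: {value}%")
```

The means come out at 88.33, 85.0, 90.67 and 93.67, matching the averages stated in the abstract.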
      • Open Access Article

        17 - Application of Machine Learning in the Telecommunications Industry: Partial Churn Prediction by using a Hybrid Feature Selection Approach
        Fatemeh Mozaffari Iman Raeesi Vanani Payam Mahmoudian Babak Sohrabi
        The telecommunications industry is one of the most competitive industries in the world. Because of the high cost of customer acquisition and the adverse effects of customer churn on a company's performance, customer retention has become an inseparable part of strategic decision-making and one of the main objectives of customer relationship management. Although customer churn prediction models are widely studied in various domains, several challenges remain in designing and implementing an effective model. This paper addresses the customer churn prediction problem with a practical approach. The experimental analysis was conducted on customer data gathered from available sources at a telecom company in Iran. First, partial churn was defined in a new way that exploits the status of customers based on criteria that can be measured easily in the telecommunications industry. This definition is also based on data mining techniques that can find the degree of similarity between assorted customers and active ones or churners. Moreover, a hybrid feature selection approach was proposed in which various feature selection methods, along with the wisdom of the crowd, were applied. It was found that the wisdom of the crowd can serve as a useful feature selection method. Finally, a predictive model was developed using advanced machine learning algorithms such as bagging, boosting, stacking, and deep learning. Partial customer churn was predicted with more than 88% accuracy by the Gradient Boosting Machine algorithm using 5-fold cross-validation. Comparative results indicate that the proposed model performs efficiently compared to those applied in previous studies.
      • Open Access Article

        18 - A new approach to IoT-based disease diagnosis using genetic algorithms and various classifiers
        seyed ebrahim dashti maryam nikpor mehdi nikpor mahbobe johari
        Medical information technology and health services bear directly on national welfare and people's livelihoods. The integration of cloud computing and the Internet of Things (IoT) promises a major breakthrough in modern medical applications. This study focuses on diabetes, a chronic disease that is one of the leading causes of death worldwide. It applies medical information technology to the IoT, particularly to medical monitoring and management applications. A model architecture for a cloud platform for remote monitoring and management of health information is proposed and analyzed, and an algorithm based on a genetic algorithm and hybrid classification is then proposed for diagnosing diabetes in medical monitoring. The results show that the proposed method outperforms the baseline methods, reaching an accuracy of 94%.
      • Open Access Article

        19 - Machine Learning-Based Security Resource Allocation for Defending against Attacks in the Internet of Things
        Nasim Navaei Vesal Hakami
        The Internet of Things (IoT) has become a focus of security attacks because of limited processing resources, heterogeneity, energy constraints in devices, and the lack of a single standard for implementing security mechanisms. This article presents a solution to the problem of allocating security resources to counter attacks in the IoT. The Security Resource Allocation (SRA) problem in IoT networks refers to the placement of security resources in the IoT infrastructure. Solving it requires accounting for the dynamic nature of the communication environment and the uncertainty of the attackers' actions. In traditional approaches to SRA, the attacker acts on his assumptions about the system conditions, while the defender collects system information with prior knowledge of the attacker's behavior and the targeted nodes. Unlike these traditional approaches, this research adopts a realistic approach to Dynamic Security Resource Allocation in the IoT that defends against attackers with unknown behavior. Because several security resources must be deployed in each learning period, the strategy space is combinatorial, and the SRAIoT problem is defined as a combinatorial adversarial multi-armed bandit problem. Since switching security resources is costly in real scenarios, this cost is included in the utility function, so the proposed framework considers both the switching cost and the earned reward. The simulation results show that the weak regret of the proposed algorithms converges faster than that of the basic combinatorial algorithm. In addition, to simulate the IoT network in a realistic context, the attack scenario was simulated using the Cooja simulator.
      • Open Access Article

        20 - Persian Stance Detection Based On Multi-Classifier Fusion
        Mojgan Farhoodi Abbas Toloie Eshlaghy
        Stance detection (also known as stance classification, stance prediction, and stance analysis) is a recent research topic that has emerged as an important paradigm in opinion mining. The purpose of stance detection is to identify the author's viewpoint toward a specific target, and it has become a key component of applications such as fake news detection, claim validation, and argument search. In this paper, we applied three approaches to Persian stance detection: machine learning, deep learning, and transfer learning. We then proposed a multi-classifier fusion framework for reaching the final decision on the output results, using a weighted majority voting method based on the accuracy of the classifiers to combine their results. The experimental results showed that the proposed multi-classifier fusion method performs better than the individual classifiers.
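        Accuracy-weighted majority voting, as described in the abstract above, can be sketched as follows. This is an illustrative sketch only: the function name, the stance labels, and the accuracy values are assumptions, and the paper may normalize or derive its weights differently.

```python
from collections import defaultdict


def weighted_majority_vote(predictions, accuracies):
    """Return the label with the largest total weight.

    predictions: one predicted label per classifier.
    accuracies:  one weight per classifier (here, its validation accuracy).
    """
    totals = defaultdict(float)
    for label, weight in zip(predictions, accuracies):
        totals[label] += weight
    # The label backed by the highest summed accuracy wins the vote.
    return max(totals, key=totals.get)


# Hypothetical example: one strong classifier outvotes two weaker ones.
votes = ["favor", "against", "against"]
weights = [0.9, 0.5, 0.3]
fused = weighted_majority_vote(votes, weights)
```

        In this example "favor" collects a weight of 0.9 against 0.8 for "against", so the fused decision follows the single most accurate classifier even though it is outnumbered.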
      • Open Access Article

        21 - Intrusion Detection Based on Cooperation on the Permissioned Blockchain Platform in the Internet of Things Using Machine Learning
        Mohammad Mahdi  Abdian majid ghayori Seyed Ahmad  Eftekhari
        Intrusion detection systems (IDS) pursue several objectives, such as increasing the true detection rate, reducing detection time, reducing computational load, and preserving the resulting logs so that they cannot be manipulated or deleted by unauthorized people. This study addresses these challenges by exploiting the advantages of blockchain technology, including its durability, and by relying on an IDS architecture based on multi-node cooperation. The proposed model is an intrusion detection engine based on the decision tree algorithm, implemented in the nodes of the architecture. The architecture consists of several connected nodes on the blockchain platform. The resulting model and logs are stored on the blockchain platform and cannot be manipulated. In addition to these benefits, using blockchain reduces memory usage and improves transaction speed and time. In this research, several evaluation models were designed for single-node and multi-node architectures on the blockchain platform. Finally, the proof of the architecture, possible threats to it, and defenses against them are explained. The most important advantages of the proposed scheme are the elimination of the single point of failure, the maintenance of trust between nodes, and the assurance of the integrity of the model and the discovered logs.
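        The claim that stored logs cannot be manipulated rests on hash chaining, which can be illustrated with a toy example. This sketch is an assumption-laden simplification: it is not a permissioned blockchain (there is no consensus, no peers, no access control), and the function names and log format are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first block


def _digest(record, prev_hash):
    """SHA-256 over the record together with the previous block's hash."""
    payload = json.dumps({"record": record, "prev_hash": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def add_block(chain, record):
    """Append a log record, linking it to the hash of the previous block."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    chain.append({"record": record, "prev_hash": prev_hash,
                  "hash": _digest(record, prev_hash)})


def is_valid(chain):
    """Recompute every hash; any edited record breaks the chain."""
    prev_hash = GENESIS_HASH
    for block in chain:
        if block["prev_hash"] != prev_hash:
            return False
        if block["hash"] != _digest(block["record"], prev_hash):
            return False
        prev_hash = block["hash"]
    return True


log_chain = []
add_block(log_chain, {"node": "n1", "alert": "port scan"})
add_block(log_chain, {"node": "n2", "alert": "brute force"})
```

        Changing any stored record without recomputing every later hash makes `is_valid` return `False`; on a real blockchain, the other nodes' copies are what prevent an attacker from simply recomputing the hashes.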
      • Open Access Article

        22 - Design and Collection of Speech Data as the First Step of Localization the Intelligent Diagnosis of Autism in Iranian Children
        Maryam Alizadeh Shima tabibian
        Autism Spectrum Disorder is a developmental disorder that manifests itself in symptoms such as an inability to communicate socially. Thus, the most apparent sign of autism is a speech disorder. The first part of this paper reviews research on automatically diagnosing autism using speech processing methods. According to our review, the main speech processing approaches for diagnosing autism can be divided into two groups. The first group detects individuals with autism by processing their answers or feelings in response to questions or stories. The second group distinguishes people with autism from healthy people based on the accuracy with which automatic speech recognition systems recognize their spoken utterances. Although much research has been conducted outside Iran, few studies have been conducted in Iran. The most important reason is the lack of rich data that meet the needs of speech-based autism diagnosis for people suspected of having autism. In the second part of the paper, we discuss the process of designing, collecting, and evaluating a speaker-independent dataset for autism diagnosis in Iranian children as the first step in localizing this field.
      • Open Access Article

        23 - Combination of Instance Selection and Data Augmentation Techniques for Imbalanced Data Classification
        Parastoo Mohaghegh Samira Noferesti Mehri Rajaei
        In the era of big data, automatic data analysis techniques such as data mining have been widely used for decision-making and have become very effective. Among data mining techniques, classification is a common method for decision-making and prediction. Classification algorithms usually work well on balanced datasets. However, one of their challenges is how to correctly predict the labels of new samples after learning from imbalanced datasets. In such datasets, the uneven distribution of data across classes causes examples of the minority class to be ignored in the learning process, even though this class is the more important one in some prediction problems. To deal with this issue, this paper presents an efficient method for balancing imbalanced datasets, which improves the accuracy of machine learning algorithms in correctly predicting the class labels of new samples. According to the evaluations, the proposed method performs better than other methods on two common criteria for evaluating classification on imbalanced datasets, namely "Balanced Accuracy" and "Specificity".
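        The two evaluation criteria named in the abstract above have standard definitions for binary classification, sketched here. The confusion-matrix counts in the example are invented for illustration and are not results from the paper.

```python
def sensitivity(tp, fn):
    """True positive rate: share of actual positives that were detected."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negative rate: share of actual negatives that were detected."""
    return tn / (tn + fp)


def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity and specificity, so each class counts equally."""
    return (sensitivity(tp, fn) + specificity(tn, fp)) / 2


# Hypothetical imbalanced test set: 10 positive and 100 negative samples.
tp, fn, tn, fp = 8, 2, 90, 10
plain_accuracy = (tp + tn) / (tp + fn + tn + fp)
bal_acc = balanced_accuracy(tp, fn, tn, fp)
```

        Plain accuracy is dominated by the majority class, whereas balanced accuracy weights the per-class rates equally, which is why it is preferred on imbalanced data.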
      • Open Access Article

        24 - Application identification through intelligent traffic classification
        Shaghayegh Naderi
        Traffic classification and analysis is one of the major challenges in data mining and machine learning, and it plays an important role in security, quality assurance, and network management. Today, a large share of network traffic is encrypted by secure communication protocols such as HTTPS. Encrypted traffic reduces the ability to monitor and detect suspicious and malicious traffic in communication infrastructures (in exchange for increased user security and privacy), and classifying it without decrypting network communications is difficult, because the payload information is lost and only the header information (which is also encrypted in newer versions of network communication protocols such as TLS 1.3) is accessible. Therefore, older traffic analysis approaches, such as port-based and payload-based methods, have lost their effectiveness, and new approaches based on artificial intelligence and machine learning are used in encrypted traffic analysis. In this article, after reviewing traffic analysis methods, an operational architectural framework for intelligent traffic analysis and classification is designed. Then, an intelligent model for Traffic Classification and Application Identification is presented and evaluated using machine learning methods on Kaggle141. The results show that the random forest model, in addition to being more interpretable than deep learning methods, achieves higher accuracy in traffic classification than the other machine learning methods. Finally, tips and suggestions for using machine learning methods in operational traffic classification are provided.
      • Open Access Article

        25 - Assessment of Spatial and temporal changes in land use using remote sensing (case study: Jayransoo rangeland, North Khorasan)
        Mohabat Nadaf Reza Omidipour Hossein  Sobhani
        Awareness of the process of change, together with proper land use management in natural ecosystems, is of great importance for conserving natural resources. In this regard, remote sensing has become a common approach because it provides extensive spatial and temporal information. In this research, to map land use, the accuracy of three common methods was first compared: pixel-based (maximum likelihood), machine learning (support vector machine), and object-oriented. Then, the spatial and temporal changes in land use over a period of 26 years (1997-2023) were assessed using six Landsat satellite images. The accuracy of the image classification methods was evaluated using the Kappa coefficient and overall accuracy, and the change trend was evaluated using crosstab and spatial evaluation methods. The support vector machine method had the highest Kappa coefficient (0.71 to 0.98) and overall accuracy (86 to 99%) for all studied periods. According to the results, poor rangeland had a decreasing trend, while very poor rangeland, bare soil, and rainfed agriculture had increasing trends. The area of poor rangeland decreased from 962 hectares (44.36%) in 1997 to 489 hectares (22.57%) in 2023, while very poor rangeland increased from 1138 hectares (52.48%) to 1606 hectares (74.05%) in the same period. These results indicate that land use change in Jayransoo rangeland is trending towards the destruction of rangelands and that this trend is intensifying over time. Based on these results, machine learning based classification is suggested for land use mapping in future research.
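        The two accuracy indices used in the abstract above, overall accuracy and the Kappa coefficient, can both be computed from a classification confusion matrix. The sketch below uses the standard Cohen's kappa formula; the 2x2 matrix is a made-up example, not data from the study.

```python
def overall_accuracy(matrix):
    """Share of samples on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def kappa(matrix):
    """Cohen's kappa: agreement beyond what chance alone would produce."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    p_observed = overall_accuracy(matrix)
    # Expected agreement if reference and predicted labels were independent.
    p_expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical confusion matrix: rows = reference class, columns = predicted.
confusion = [[45, 5],
             [10, 40]]
oa = overall_accuracy(confusion)
k = kappa(confusion)
```

        Here the overall accuracy is 0.85 but the kappa is only 0.70, because half of the agreement would be expected by chance for this class distribution.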
      • Open Access Article

        26 - Determining the Location of Lightning Strike Using Electromagnetic Time Reversal (EMTR) Method and Machine Learning
        Abbas Hamedooni Asli m.h. m.
        Determining the location of lightning strikes (LLS) is a current challenge in various fields, especially electricity and electronics. Classical methods were previously used to locate lightning strikes; however, the electromagnetic time reversal (EMTR) method has recently become popular. Because the EMTR method calculates the complete waveform of the field, the accuracy of locating the lightning strike has increased significantly compared to traditional methods. In the EMTR method, the transient electromagnetic field produced by the lightning channel is first calculated with the help of the finite difference time domain (FDTD) method. After time reversal, the wave is re-emitted from the sensor or sensors towards its source, and the re-emitted electromagnetic field in the environment is again calculated with FDTD. From this field, the location of the lightning strike is determined using criteria such as maximum field amplitude, maximum energy, and entropy. In traditional methods, it is quite difficult to establish the uniqueness of the final response in environments with different characteristics, and at least three sensors are required. In this paper, to overcome these limitations, a method based on the combination of machine learning and three-dimensional EMTR (3D-FDTD) is proposed to determine the lightning strike location. First, the three-dimensional FDTD method is used to calculate the electromagnetic field of the environment, and, using EMTR, the re-emitted electromagnetic field is calculated over the entire environment (again with 3D-FDTD). This produces the data needed to generate RGB image profiles. Then VGG19, a pre-trained convolutional neural network (CNN), is used to extract image features. Finally, a fitting layer is used to determine the location of the lightning strike. The proposed method is simulated and implemented in MATLAB and Python, and the results show that it can effectively locate lightning strikes in a three-dimensional environment without requiring at least three sensors.
      • Open Access Article

        27 - Liquidity Risk Prediction Using News Sentiment Analysis
        hamed mirashk albadvi albadvi mehrdad kargari Mohammad Ali Rastegar Mohammad Talebi
        One of the main problems of Iranian banks is the lack of a risk management process with a forward-looking approach, and one of the most important risks banks face is liquidity risk. Predicting liquidity risk has therefore become an important issue for banks. Conventional methods of measuring liquidity risk are complex, time-consuming, and expensive, which makes prediction impractical. Predicting liquidity risk at the right time can prevent serious problems or crises in a bank. This study attempts to provide an innovative solution for predicting bank liquidity risk and leading scenarios using news sentiment analysis. Sentiment analysis of news about one Iranian bank is used to identify dynamic and effective qualitative factors in liquidity risk, providing a simpler and more efficient method for predicting the liquidity risk trend. The proposed method provides practical scenarios for real-world banking risk decision makers. To verify the correctness and alignment of the predictions, the obtained liquidity risk scenarios are compared with the scenarios that occurred in the bank, according to the guidelines of the Basel Committee and the opinions of banking experts. Periodic evaluation of the studied scenarios indicates relatively high accuracy: 95.5% for scenarios derived from the Basel Committee and 75% for scenarios derived from experts' opinions.
      • Open Access Article

        28 - Presenting a web recommender system for user nose pages using DBSCAN clustering algorithm and machine learning SVM method.
        reza molaee fard Mohammad mosleh
        Recommender systems can predict future user requests and then generate a list of the user's favorite pages. In other words, recommender systems can build an accurate profile of users' behavior and predict the page that the user will choose next, which can solve the cold start problem and improve search quality. This research presents a new method for improving web recommender systems. It uses the DBSCAN clustering algorithm to cluster the data, and this algorithm obtained an efficiency score of 99%. Next, the user's favorite pages are weighted using the PageRank algorithm. Then, the data are classified using the SVM method, and the combined recommender system generates predictions and provides the user with a list of pages that may be of interest. The evaluation indicated that the proposed method achieves 95% recall and 99% accuracy, which shows that the recommender system correctly detects more than 90% of the user's intended pages and largely resolves the weaknesses of previous systems.
      • Open Access Article

        29 - Platform for manufacturing and intelligent production of polymers: genome engineering of polymer materials
        Zeinab Sadat Hosseini
        High-performance polymer materials are the foundation of high-level technology development and advanced manufacturing. Recently, polymeric material genome engineering (PMGE) has been proposed as a basic platform for the intelligent production of polymeric materials. PMGE is an emerging field that combines the principles of the Materials Genome Initiative with polymer science to accelerate the discovery and development of new polymeric materials. The concept of PMGE is to create a comprehensive database of polymer properties obtained from both computational and experimental methods, which can then be used to train machine learning models that predict the properties of new polymers. However, the development of PMGE is still in its infancy and many issues remain to be addressed. Overall, PMGE represents a significant step towards the intelligent manufacturing of polymeric materials, with the potential to revolutionize the field by enabling faster and more efficient development of new materials. This review presents the fundamental concepts of PMGE and a summary of recent research and achievements, and then examines the most important challenges and future prospects. Specifically, it focuses on property prediction approaches, including the proxy approach and machine learning, and discusses potential applications of PMGE, namely advanced composites, polymer materials used in communication systems, and the manufacturing of electrical integrated circuits.
      • Open Access Article

        30 - Estimating the shear sonic log using machine learning methods, and comparing it with the obtained data from the core
        Houshang Mehrabi Ebrahim Sfidari Seyedeh Sepideh  Mirrabie Sadegh  Barati Boldaji Seyed Mohammad Zamanzadeh
        Machine learning methods are widely used today to estimate petrophysical data. This study attempts to calculate the shear sonic log (DTS) from other petrophysical data using machine learning methods and to compare it with the sonic data obtained from the core. For this purpose, computational methods such as Standard Deviation, Isolation Forest, Min. Covariance, and Outlier Factors were applied to normalize the data and were compared. Given the amount of missing data and the box plots, the Standard Deviation method was selected for normalization. The machine learning methods used include Random Forest, Multiple Regression, Boosted Regression, Support Vector Regression, K-Nearest Neighbor, and MLP Regressor. Multiple regression had the lowest evaluation index (R² = 0.94), while Random Forest regression had the highest correlation between the estimated and the original shear sonic log, with an evaluation index of 0.98. Therefore, Random Forest regression was used for the final estimation, and, to prevent overfitting, the GridSearchCV function was used to find the optimal hyperparameters for the final estimation. The estimated sonic log showed very high similarity to the core data.
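        Two of the quantities mentioned in the abstract above can be made concrete with a short sketch: a standard-deviation filter for outlying log values and the R² evaluation index. The threshold of two standard deviations, the function names, and the sample values are assumptions for illustration; the abstract does not state the study's actual settings.

```python
import statistics


def std_filter(values, k=2.0):
    """Keep values within k population standard deviations of the mean."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    return [v for v in values if abs(v - mean) <= k * std]


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_actual = statistics.mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical log readings with one obvious spike.
readings = [10, 11, 9, 10, 12, 50]
cleaned = std_filter(readings)
```

        An R² of 1 means the estimated log reproduces the reference log exactly, while 0 means it does no better than predicting the mean value everywhere.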
      • Open Access Article

        31 - Intrusion Detection Based on Cooperation on the Permissioned Blockchain Platform in the Internet of Things Using Machine Learning
        Mohammad Mahdi  Abdian majid ghayori Seyed Ahmad  Eftekhari
        Intrusion detection systems (IDS) pursue several objectives, such as increasing the true detection rate, reducing detection time, reducing computational load, and preserving the resulting logs so that they cannot be manipulated or deleted by unauthorized people. This study addresses these challenges by exploiting the advantages of blockchain technology, including its durability, and by relying on an IDS architecture based on multi-node cooperation. The proposed model is an intrusion detection engine based on the decision tree algorithm, implemented in the nodes of the architecture. The architecture consists of several connected nodes on the blockchain platform. The resulting model and logs are stored on the blockchain platform and cannot be manipulated. In addition to these benefits, using blockchain reduces memory usage and improves transaction speed and time. In this research, several evaluation models were designed for single-node and multi-node architectures on the blockchain platform. Finally, the proof of the architecture, possible threats to it, and defenses against them are explained. The most important advantages of the proposed scheme are the elimination of the single point of failure, the maintenance of trust between nodes, and the assurance of the integrity of the model and the discovered logs.
      • Open Access Article

        32 - Application Identification Through Intelligent Traffic Classification
        Shaghayegh Naderi
        Traffic classification and analysis is one of the major challenges in data mining and machine learning, and it plays an important role in security, quality assurance, and network management. Today, a large share of network traffic is encrypted by secure communication protocols such as HTTPS. Encrypted traffic reduces the ability to monitor and detect suspicious and malicious traffic in communication infrastructures (in exchange for increased user security and privacy), and classifying it without decrypting network communications is difficult, because the payload information is lost and only the header information (which is also encrypted in newer versions of network communication protocols such as TLS 1.3) is accessible. Therefore, older traffic analysis approaches, such as port-based and payload-based methods, have lost their effectiveness, and new approaches based on artificial intelligence and machine learning are used in encrypted traffic analysis. In this article, after reviewing traffic analysis methods, an operational architectural framework for intelligent traffic analysis and classification is designed. Then, an intelligent model for Traffic Classification and Application Identification is presented and evaluated using machine learning methods on Kaggle141. The results show that the random forest model, in addition to being more interpretable than deep learning methods, achieves higher accuracy in traffic classification (95% and 97%) than the other machine learning methods. Finally, tips and suggestions for using machine learning methods in operational traffic classification are provided.