• List of Articles: reinforcement learning

      • Open Access Article

        1 - A Fast Machine Learning for 5G Beam Selection for Unmanned Aerial Vehicle Applications
        Wasswa Shafik, Mohammad Ghasemzadeh, S. Mojtaba Matinkhah
        Unmanned aerial vehicles (UAVs) have emerged as a promising research trend applied in several disciplines because of their benefits, which include efficient communication, timely search and rescue operations, and customer deliveries, among others. Current technologies use fixed base stations (BS) that operate on-site and off-site from a fixed position, with associated problems such as poor connectivity. This opens the door for UAV technology as a mobile alternative that increases accessibility in beam selection under fifth-generation (5G) connectivity, which focuses on increased availability and connectivity. This paper first presents a fast, semi-online, 3-dimensional machine learning algorithm suitable for proper selection of the beams emitted from UAVs. Second, it presents a detailed, step-by-step account of the multi-armed bandit approach used to solve the exploration-versus-exploitation dilemma in UAV beam selection. The obtained results show that a multi-armed bandit formulation can be applied to optimize the performance of any mobile networked device using standard bandit algorithms such as Thompson sampling, the Bayesian algorithm, and the ε-greedy algorithm. The results further illustrate that the 3-dimensional algorithm makes better use of technological resources than the existing single- and 2-dimensional algorithms, achieving close-to-optimal performance on average through machine learning over realistic UAV communication scenarios.
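        To make the bandit formulation concrete, here is a minimal Python sketch of an ε-greedy beam selector. It is an illustrative reading of the abstract, not the paper's algorithm: the beam count, the reward model, and the value of ε are all assumptions.

        # Minimal epsilon-greedy bandit for beam selection (illustrative
        # sketch; beam count, reward model, and epsilon are assumptions).
        import random

        NUM_BEAMS = 8      # hypothetical number of candidate beams
        EPSILON = 0.1      # exploration rate

        counts = [0] * NUM_BEAMS     # times each beam has been tried
        values = [0.0] * NUM_BEAMS   # running mean reward per beam

        def measured_reward(beam):
            # Stand-in for a real link-quality measurement: each beam has
            # a fixed mean quality plus Gaussian noise.
            true_quality = [0.2, 0.5, 0.9, 0.4, 0.6, 0.3, 0.7, 0.1]
            return true_quality[beam] + random.gauss(0, 0.05)

        def select_beam():
            # Explore with probability EPSILON; otherwise exploit the
            # beam with the best current reward estimate.
            if random.random() < EPSILON:
                return random.randrange(NUM_BEAMS)
            return max(range(NUM_BEAMS), key=lambda b: values[b])

        for step in range(1000):
            beam = select_beam()
            reward = measured_reward(beam)
            counts[beam] += 1
            # Incremental update of the running mean reward.
            values[beam] += (reward - values[beam]) / counts[beam]

        print("best beam estimate:", values.index(max(values)))

        Thompson sampling or a Bayesian variant, which the abstract also mentions, would replace select_beam with posterior sampling over each beam's reward; the update loop stays the same.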
      • Open Access Article

        2 - Extracting Bottlenecks Using Object Recognition in Reinforcement Learning
        B. Ghazanfari, N. Mozayani, M. R. Jahed Motlagh
        Extracting bottlenecks considerably improves the speed of learning and the ability to transfer knowledge in reinforcement learning. However, extracting bottlenecks is a challenge in reinforcement learning, and it typically requires prior knowledge and a designer's help. This paper proposes a new method that extracts bottlenecks for a reinforcement learning agent automatically. We were inspired by biological systems and behavioral analyses of navigating animals, and the agent works on the basis of its interaction with the environment. The agent finds landmarks based on clustering and hierarchical object recognition. If these landmarks are close to each other in the action space, bottlenecks are extracted from the states between them. Experimental results show a considerable improvement in the learning process in comparison to some key methods in the literature.
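        As a rough illustration of extracting the states that lie between nearby landmarks, here is a minimal Python sketch. The trajectory format, the precomputed cluster assignment, and the gap threshold are assumptions made for illustration; they are not the authors' exact procedure.

        from collections import Counter

        def extract_bottlenecks(trajectories, cluster_of, max_gap=3, top_k=3):
            # Count states that appear between visits to two different
            # landmark clusters within max_gap steps of each other.
            hits = Counter()
            for traj in trajectories:
                # Indices of trajectory states that belong to a landmark cluster.
                marks = [(i, cluster_of[s]) for i, s in enumerate(traj)
                         if s in cluster_of]
                for (i, ci), (j, cj) in zip(marks, marks[1:]):
                    if ci != cj and j - i <= max_gap:
                        for s in traj[i + 1:j]:
                            hits[s] += 1
            # The most frequent in-between states are bottleneck candidates.
            return [s for s, _ in hits.most_common(top_k)]

        # Toy example: two "rooms" (clusters A and B) joined by doorway state d.
        cluster_of = {"a1": "A", "a2": "A", "b1": "B", "b2": "B"}
        trajectories = [["a1", "a2", "d", "b1", "b2"],
                        ["a2", "d", "b2"],
                        ["b1", "d", "a1"]]
        print(extract_bottlenecks(trajectories, cluster_of))  # -> ['d']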
      • Open Access Article

        3 - Proposing a New Method for Acquiring Skills in Reinforcement Learning with the Help of Graph Clustering
        M. Davoodabadi Farahani, N. Mozayani
        Reinforcement learning is a type of machine learning in which the agent uses its interactions with the environment to recognize the environment and to improve its behavior. One of the main problems of standard reinforcement learning algorithms such as Q-learning is that they cannot solve large-scale problems in a reasonable time. Acquiring skills helps to decompose the problem into a set of sub-problems and to solve it with hierarchical methods. Despite the promising results of using skills in hierarchical reinforcement learning, previous studies have shown that, depending on the task at hand, the effect of skills on learning performance can be quite positive; on the contrary, if they are not properly selected, they can increase the complexity of problem solving. Hence, one weakness of previous methods proposed for automatically acquiring skills is the lack of a systematic method for evaluating each acquired skill. In this paper, we propose new methods based on graph clustering for subgoal extraction and skill acquisition. We also present new criteria for evaluating skills, with the help of which skills inappropriate for solving the problem are eliminated. Using these methods in a number of experimental environments shows a significant increase in learning speed.
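        The following Python sketch shows one way subgoal extraction via graph clustering can look: cluster a state-transition graph and take the states on cluster-crossing edges as subgoal candidates. The toy graph and the use of modularity-based community detection from networkx are illustrative choices, not necessarily the clustering or evaluation criteria the paper uses.

        import networkx as nx
        from networkx.algorithms.community import greedy_modularity_communities

        # Toy transition graph: two densely connected "rooms" joined
        # through node 4 (a corridor).
        G = nx.Graph()
        G.add_edges_from([(0, 1), (1, 2), (0, 2),    # room one
                          (5, 6), (6, 7), (5, 7),    # room two
                          (2, 4), (4, 5)])           # corridor via node 4

        communities = list(greedy_modularity_communities(G))

        def community_of(node):
            for idx, members in enumerate(communities):
                if node in members:
                    return idx

        # Subgoal candidates: endpoints of edges crossing community borders.
        subgoals = {n for u, v in G.edges()
                    if community_of(u) != community_of(v)
                    for n in (u, v)}
        print("subgoal candidates:", subgoals)

        A skill would then be an option that drives the agent toward one of these candidates; the paper's evaluation criteria would additionally score such skills and discard unhelpful ones.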
      • Open Access Article

        4 - Scheduling of IoT Application Tasks in Fog Computing Environment Using Deep Reinforcement Learning
        Pegah Gazori, Dadmehr Rahbari, Mohsen Nickray
        With the advent and development of IoT applications in recent years, the number of smart devices, and consequently the volume of data they collect, is rapidly increasing. On the other hand, most IoT applications require real-time data analysis and low-latency service delivery. Under these circumstances, sending the huge volume of varied data to cloud data centers for processing and analysis is impractical, and the fog computing paradigm seems a better choice. Because computational resources in fog nodes are limited, using them efficiently is of great importance. In this paper, the scheduling of IoT application tasks in the fog computing paradigm is considered. The main goal of this study is to reduce the latency of service delivery, and we use a deep reinforcement learning approach to meet it. The proposed method combines the Q-learning algorithm with deep learning and the experience replay and target network techniques. According to the experimental results, the DQLTS algorithm improves the ASD metric by 76% compared to QLTS and by 6.5% compared to the RS algorithm. Moreover, it reaches convergence faster than QLTS.
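        The combination the abstract names (Q-learning, a deep network, experience replay, and a target network) is essentially the DQN recipe. The PyTorch sketch below shows those four pieces together; the state and action dimensions, network shape, and hyperparameters are assumptions, and this is not the paper's DQLTS implementation.

        import random
        from collections import deque

        import torch
        import torch.nn as nn

        STATE_DIM, N_ACTIONS = 4, 3   # e.g., task features -> scheduling choices
        GAMMA, EPS, BATCH = 0.95, 0.1, 32

        def make_net():
            return nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(),
                                 nn.Linear(64, N_ACTIONS))

        policy_net = make_net()
        target_net = make_net()
        target_net.load_state_dict(policy_net.state_dict())
        optimizer = torch.optim.Adam(policy_net.parameters(), lr=1e-3)
        replay = deque(maxlen=10_000)   # experience replay buffer of
                                        # (state, action, reward, next_state)

        def act(state):
            # Epsilon-greedy action selection over predicted Q-values.
            if random.random() < EPS:
                return random.randrange(N_ACTIONS)
            with torch.no_grad():
                return policy_net(state).argmax().item()

        def learn():
            # One gradient step on a random minibatch from the replay buffer
            # (non-terminal transitions assumed, for brevity).
            if len(replay) < BATCH:
                return
            s, a, r, s2 = zip(*random.sample(replay, BATCH))
            s, s2 = torch.stack(s), torch.stack(s2)
            q = policy_net(s).gather(1, torch.tensor(a).unsqueeze(1)).squeeze(1)
            with torch.no_grad():   # bootstrap targets from the frozen network
                target = torch.tensor(r) + GAMMA * target_net(s2).max(1).values
            loss = nn.functional.mse_loss(q, target)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

        def sync_target():
            # Called every few hundred steps to refresh the target network.
            target_net.load_state_dict(policy_net.state_dict())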
      • Open Access Article

        5 - Load Balancing in Fog Nodes using Reinforcement Learning Algorithm
        Niloofar Tahmasebi Pouya, Mehdi-Agha Sarram
        Fog computing is an emerging research field that provides cloud computing services at the edges of the network. Fog nodes process data streams and user requests in real time. To optimize resource efficiency and response time and to increase speed and performance, tasks must be evenly distributed among the fog nodes. Therefore, this paper proposes a new method to improve load balancing in the fog computing environment. In the proposed algorithm, when a task is sent to a fog node by a mobile device, the fog node uses reinforcement learning to decide whether to process the task itself or to assign it to a neighboring fog node or to the cloud for processing. The evaluation shows that, by distributing tasks properly between nodes, the proposed algorithm incurs less task-processing delay than the other methods compared.
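        Below is a minimal tabular Q-learning sketch of the three-way offloading decision the abstract describes (process locally, send to a neighbor fog node, or send to the cloud). The state encoding (the node's own queue length) and the simulated delays are toy assumptions, not the paper's system model.

        import random
        from collections import defaultdict

        ACTIONS = ["local", "neighbor", "cloud"]
        ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1
        Q = defaultdict(float)          # (queue_level, action) -> value

        def simulated_delay(queue_level, action):
            # Toy model: local delay grows with the queue; offloading adds
            # transmission overhead but is insensitive to local load.
            if action == "local":
                return 1.0 + 0.5 * queue_level
            if action == "neighbor":
                return 2.0 + random.random()
            return 4.0 + random.random()          # cloud: longest round trip

        def choose(queue_level):
            if random.random() < EPS:
                return random.choice(ACTIONS)
            return max(ACTIONS, key=lambda a: Q[(queue_level, a)])

        for step in range(20_000):
            queue_level = random.randint(0, 9)    # observed load (toy state)
            action = choose(queue_level)
            reward = -simulated_delay(queue_level, action)  # less delay is better
            # Toy transition: the local queue grows only if we kept the task.
            next_level = max(0, queue_level + (1 if action == "local" else 0) - 1)
            best_next = max(Q[(next_level, a)] for a in ACTIONS)
            Q[(queue_level, action)] += ALPHA * (reward + GAMMA * best_next
                                                 - Q[(queue_level, action)])

        for level in range(10):
            print(level, max(ACTIONS, key=lambda a: Q[(level, a)]))

        Trained this way, the node learns to keep tasks while its queue is short and to offload as the queue grows, which mirrors the load-balancing behavior the abstract describes.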