Using limited memory to store the most recent action in XCS learning classifier systems in maze problems
Subject Areas: ICT
Ali Yousefi 1, Kambiz Badie 2, Mohamad Mehdi Ebadzade 3, Arash Sharifi 4
1 - .
2 - .
3 - .
4 - Science and Research Branch, Islamic Azad University, Tehran, Iran
Keywords: Learning classifier systems, XCS algorithm, limited memory, maze problems
Abstract:
Learning classifier systems have attracted attention in a range of robotics applications, such as sensory robots, humanoid robots, intelligent rescue systems, and the control of physical robots in discrete and continuous environments. These systems usually combine an evolutionary algorithm or a heuristic method with a learning process to search the space of candidate rules and assign an appropriate action to each category of input. In maze problems, an important challenge for increasing the speed and accuracy of reaching the goal is to choose actions that keep the agent on the right path instead of repeatedly hitting the surrounding obstacles. To this end, this article uses an accuracy-based learning classifier system (XCS) extended with a limited memory that stores the most recent action. Based on the inputs, the actions applied to the environment, and the agent's resulting response, suitable rules are identified and added as a new classifier set to the XCS algorithm in subsequent steps. Compared with the standard XCS algorithm, the method reduces the number of steps required and increases the speed with which the agent reaches the target.
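The abstract does not give implementation details, so the following Python sketch is only an illustration of the general idea, not the authors' algorithm. It assumes a grid maze enclosed by walls and a small bounded memory that records the most recent (state, action) pairs that ended in a collision, so that those actions can be excluded before action selection. All names (`RecentActionMemory`, `step`, `capacity`) are hypothetical.

```python
from collections import deque

# Hypothetical maze encoding: '#' is an obstacle, '.' is a free cell.
# The maze is assumed to be fully enclosed by walls.
MOVES = {"N": (-1, 0), "S": (1, 0), "E": (0, 1), "W": (0, -1)}


def step(maze, pos, action):
    """Apply an action and return (new_pos, moved).

    Hitting an obstacle leaves the position unchanged.
    """
    dr, dc = MOVES[action]
    r, c = pos[0] + dr, pos[1] + dc
    if maze[r][c] == "#":
        return pos, False
    return (r, c), True


class RecentActionMemory:
    """Bounded memory of recent (state, action) pairs that hit an obstacle."""

    def __init__(self, capacity=4):
        # deque with maxlen drops the oldest entry when full.
        self.failed = deque(maxlen=capacity)

    def record(self, state, action, moved):
        # Only remember actions that did not change the position.
        if not moved and (state, action) not in self.failed:
            self.failed.append((state, action))

    def filter_actions(self, state, actions):
        # Exclude remembered collisions; fall back to all actions
        # if every candidate has been excluded.
        allowed = [a for a in actions if (state, a) not in self.failed]
        return allowed or list(actions)
```

In an XCS-style loop, a filter like `filter_actions` could be applied to the candidate actions of the current match set before one is selected, and `record` could be called after each environment step. How the remembered experience is turned into new classifiers is not specified in the abstract and is left out of this sketch.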