Introducing a New Automatic Clustering Method Based on the Artificial Immune Algorithm
Subject areas: Electrical and Computer Engineering
1 - University of Birjand
Keywords: automatic clustering, artificial immune algorithm, fuzzy controller, soft computing
Abstract:
This study presents a new automatic clustering method based on the artificial immune algorithm. In the proposed method, the length of the defensive cells (antibodies) is dynamic and is set by a fuzzy controller according to the intra-cluster and inter-cluster distances. As a result, a suitable number of clusters is reached without repeated trials, which in turn yields effective and efficient clustering performed automatically. Manual setting of the number of clusters, as in other common clustering methods, is also supported so that users can obtain the results they want. The method was tested on several kinds of artificial data and on well-known pattern-processing data sets that vary in feature-space dimension and number of samples. The results show a fairly considerable advantage over the k-means method (taken as a conventional clustering method), and this advantage is more noticeable on large data volumes. The results also show that, compared with genetic clustering (taken as a recent clustering method), the proposed method performs similarly and in some cases better.
In this paper, a novel technique for automatic data clustering based on the artificial immune algorithm is proposed. The lengths of the antibodies are changed dynamically, based on the inter-cluster and intra-cluster distances, by a fuzzy controller that has been added to the immune algorithm; this also makes the method a soft computing approach to data clustering. The idea leads to a proper number of clusters and an effective clustering process without additional trial-and-error effort. The number of clusters can also be set manually in the proposed algorithm (as in other unsupervised clustering approaches) by removing the fuzzy controller from the clustering system. The method has been tested on different kinds of complex artificial data sets and on well-known benchmarks. The experimental results show that the proposed technique performs much better than the k-means clustering algorithm (a conventional method), especially on large data sets with high-dimensional feature vectors. Furthermore, the performance of the proposed approach is comparable to, and sometimes better than, that of the genetic-algorithm-based clustering technique (an evolutionary clustering algorithm).
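The abstract names intra-cluster and inter-cluster distances as the inputs to the fuzzy controller but does not define them. The sketch below shows one common way such quantities could be computed for a candidate set of cluster centers. The function names, the use of Euclidean distance, and the choice of "mean distance to the assigned center" and "smallest distance between centers" are all assumptions made for illustration, not the paper's definitions, and the fuzzy controller itself is not modeled.

```python
import math

def assign(points, centers):
    """Return, for each point, the index of its nearest center (Euclidean)."""
    return [
        min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
        for p in points
    ]

def intra_cluster_distance(points, centers, labels):
    """Assumed definition: mean distance from each point to its assigned center."""
    total = sum(math.dist(p, centers[k]) for p, k in zip(points, labels))
    return total / len(points)

def inter_cluster_distance(centers):
    """Assumed definition: smallest distance between any two centers."""
    return min(
        math.dist(a, b)
        for i, a in enumerate(centers)
        for b in centers[i + 1:]
    )

# Toy data: two well-separated groups of two points each.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]

# Two hypothetical candidates with different numbers of centers,
# mimicking antibodies of different lengths.
two_centers = [(0.0, 1.0), (10.0, 1.0)]
three_centers = [(0.0, 0.0), (0.0, 2.0), (10.0, 1.0)]

for centers in (two_centers, three_centers):
    labels = assign(points, centers)
    intra = intra_cluster_distance(points, centers, labels)
    inter = inter_cluster_distance(centers)
    print(len(centers), intra, inter, intra / inter)
```

On this toy data the two-center candidate has a lower intra-to-inter ratio than the three-center one, which is the kind of signal a controller could use when deciding whether to lengthen or shorten an antibody. How the paper's fuzzy controller actually combines the two distances is not stated in the abstract.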