دسته‌بندی داده‌های دو رده‌ای با ابرمستطیل موازی محورهای مختصات

محورهای موضوعی : عمومى

1 - دانشجو
2 - -

تاریخ دریافت : 1392/01/16 تاریخ پذیرش : 1392/01/16 تاریخ انتشار : 1395/02/13

کلید واژه: یادگیری ماشین, دسته‌بندی, درخت تصمیم, هندسه محاسباتی, تفکیک‌پذیری, مستطیل,

چکیده مقاله :

یکی از روشهای یادگیری در یادگیری ماشین و شناسایی الگو، یادگیری با ناظر است. در یادگیری با ناظر و در مسایل دو رده‌ای، برچسب دادههای آموزشی موجود و شامل دو رده مثبت و منفی میباشند. هدف الگوریتم یادگیری با ناظر، محاسبه فرضیه‌ای است که بتواند با کمترین مقدار خطا، داده‌های مثبت و منفی را از یکدیگر جدا کند. در این مقاله، از بین کلیه الگوریتمهای یادگیری با ناظر، بر عملکرد درخت‌های تصمیم متمرکز می‌شویم. دیدگاه هندسی درخت تصمیم ما را به مفهوم تفکیک‌پذیری در هندسه محاسباتی نزدیک می‌کند. از بین کلیه الگوریتم‌های تفکیک‌پذیری موجود و مرتبط با درخت تصمیم، مساله محاسبه مستطیل با حداکثر اختلاف دو رنگ را مطرح می‌کنیم و الگوریتم را در یک، دو، سه و m بعد پیادهسازی می‌کنیم که m تعداد ویژگیهای دادهها را نشان میدهد. نتیجه پیاده‌سازی نشان‌دهنده آن است که این الگوریتم، الگوریتمی قابل رقابت با الگوریتم شناخته شده C4.5 است.

چکیده انگلیسی:

One of the learning methods in machine learning and pattern recognition is supervised learning. In supervised learning and in two-category problems, the available educational data labels include positive and negative categories. The goal of the supervised learning algorithm is to calculate a hypothesis that can separate positive and negative data with the least amount of error. In this article, among all supervised learning algorithms, we focus on the performance of decision trees. The geometric view of the decision tree brings us closer to the concept of separability in computational geometry. Among all the available resolution algorithms related to the decision tree, we raise the problem of calculating the rectangle with the maximum difference of two colors and implement the algorithm in one, two, three and m dimensions, where m represents the number of data features. The implementation result shows that this algorithm is competitive with the well-known C4.5 algorithm.

منابع و مأخذ:

T. M. Mitchell, Machine learning, McGraw-Hill, 1997.
2.P.L. Hammer, A. Kogan, B. Simeone, and S. Szedmak, “Pareto-optimal patterns in logical analysis of data,” Discrete Applied Mathematics, vol.144, pp.79-102, 2004.
3.M. Kreveld, T. Lankveld, and R. Veltkamp, “Identifying well-covered minimal bounding rectangles in 2D point data,” in 25th European Workshop on Computational Geometry, EWCG, 2009, pp.277-280.
4.J. Eckstein, P. Hammer, Y. Liu, M. Nediak, and B. Simeone, “The maximum box problem and its application to data analysis,” Journal of Computational Optimization and Application, vol.23, pp. 85-98, 2002.
5.D. P. Dobkin, D. Gunopulos, and W. Maass, “Computing the maximum bichromatic discrepancy with applications to computer graphics and machine learning,” Journal of Computer and System Science, vol.52, pp. 453–470, 1996.
6.Y. Liu, and M. Nediak, “Planar case of the maximum box and related problems,” in Proceedings 15th Canadian Conferance of Computational Geometry, CCCG, 2003, pp.14-18.
7.C. Cortés, J. Díaz-Báòez, P. Pérez-Lantero, C. Seara, J. Urrutia, and I. Ventura, “Bichromatic separability with two boxes: A general approach,” Journal of Algorithms, vol.64, pp.79-88. 2009.
8.C. Cortés, J. Díaz-Báòez, and J. Urrutia, “Finding enclosing boxes with empty intersection,” in Proceedings 23rd European Workshop on Computational Geometry, EWCG, 2006, pp.185-188.
9.ز. مصلحی، " تفکیک‌پذیری نقاط با اشیای هندسی در فضای دوبعدی،" پایان نامه کارشناسی ارشد دانشکده مهندسی کامپیوتر و فناوری اطلاعات، دانشگاه صنعتی امیرکبیر، 1391.
10.ز. مصلحی، ع. باقری، " تفکیک پذیري سري نقاط دو رنگ با دو مستطیل مجزا و موازي محورهاي مختصات،" مجله علمی پژوهشی رایانش نرم و فناوري اطلاعات، جلد 1، شماره 2، صفحه 35-42، 1391.
11.S. Cabello, J. M. Díaz-Báñez, C. Seara, J. A. Sellarès, J. Urrutia, and I. Ventura, “Covering point sets with two convex objects,” in 21st European Workshop on Computational Geometry, EWCG, 2005, pp. 195-206.
12.S. Cabello, J. M. Díaz-Báñez, C. Seara, J. Urrutia, and I. Ventura, “Covering point sets with two disjoint disks or squares,” Computational Geometry: Theory and Application, vol.40, pp. 195-206, 2008.
13.D.P. Dobkin, and D. Gunopulos, “Geometric problems in machine learning, in Lecture Notes in Computer Science, LNCS, 1996, vol.1148, pp.121-132.
14.M. D. Berg, O. Cheong, M. V. Kreveld, and M. Overmars, Computational geometry: algorithms and applications, 3rd Edition, TELOS, Santa Clara, CA, USA, 2008.
15.K. Bache and M. Lichman. (2013). UCI Machine Learning Repository [Online]. Available: http://archive.ics.uci.edu/ml
16.I. H. Witten, E. Frank, and M. A. Hall, Data Mining: Practical machine learning tools and techniques, 3rd Edition, Morgan Kaufmann, San Francisco, 2011.

اشتراک گذاری

آدرس مقاله

دسته‌بندی داده‌های دو رده‌ای با ابرمستطیل موازی محورهای مختصات

رایمگ

پیوندهای سایت

مراکز مرتبط

پشتیبانی

صفحات رسمی