Spam Detection in the Twitter Social Network Using an Ensemble Learning Approach

Subject areas: Electrical and Computer Engineering

Maryam Fasihi ¹, Mohammad Javad Shayegan Fard ², Zahra Sadat Hosseini Moghadam ³, Zahra Sajdeh ⁴

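1 - Department of Computer Engineering, University of Science and Culture
2 - Department of Computer Engineering, University of Science and Culture
3 - Department of Computer Engineering, University of Science and Culture
4 - Department of Computer Engineering, University of Science and Culture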

Received: 1401/06/13 Accepted: 1402/06/06 Published: 1403/01/29

Keywords: Twitter, spam detection, neural network, Autoencoder, Softmax

Abstract:

Social networks now play a major role in spreading information worldwide. Twitter is one of the most popular of them, with 500 million tweets posted every day. This popularity has led spammers to use the network to distribute spam posts. This paper uses a combination of machine learning methods to detect spam at the tweet level. The proposed method is a feature-extraction framework that works in two stages. In the first stage, a Stacked Autoencoder extracts the features. In the second stage, the features taken from the last layer of the Stacked Autoencoder are fed to a softmax layer, which makes the prediction. The proposed method is compared with several well-known methods on the Twitter Spam Detection corpus using the Accuracy, Precision, Recall, and F1-score metrics. The results show that the proposed method reaches a detection accuracy of 78.1%. Overall, by applying hard majority voting in ensemble learning, the method detects spam tweets with higher accuracy than the CNN, LSTM, and SCCL methods.
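As an illustration of the hard majority voting and the four evaluation metrics named in the abstract, the following is a minimal sketch only. The abstract does not say which base models cast the votes or how ties are handled, so the three prediction lists (`model_a`, `model_b`, `model_c`), the helper names `hard_vote` and `binary_metrics`, and the toy labels are all assumptions made for this example, not the authors' implementation or data.

```python
from collections import Counter


def hard_vote(predictions_per_model):
    """Combine per-model label lists by majority vote (hard voting).

    predictions_per_model: list of equal-length label lists, one per model.
    With tied counts, the label encountered first among the models wins.
    """
    combined = []
    for labels in zip(*predictions_per_model):
        # most_common(1) returns the label with the highest vote count.
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined


def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for the positive (spam) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels (1 = spam, 0 = not spam). Made-up values, not data from the paper.
y_true = [1, 0, 1, 1, 0, 0]
model_a = [1, 0, 0, 1, 0, 1]  # hypothetical base classifier outputs
model_b = [1, 1, 1, 0, 0, 0]
model_c = [0, 0, 1, 0, 1, 0]

y_vote = hard_vote([model_a, model_b, model_c])
scores = binary_metrics(y_true, y_vote)
print(y_vote)
print(scores)
```

Hard voting counts predicted labels rather than averaging predicted probabilities, so each base model contributes exactly one vote per tweet regardless of how confident it is.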

