Social Networks Embedding Based on the Employment of Community Recognition and Latent Semantic Feature Extraction Approaches
Subject Areas : electrical and computer engineering
Mohadeseh Taherparvar 1 , Fateme Ahmadi Abkenari 2 , Peyman Bayat 3
1 - Department of Computer Engineering, Rasht Branch, Islamic Azad University, Rasht, Iran
2 - Department of Computer Engineering, Payame Nour University, Tehran, Iran
3 - Department of Computer Engineering, Rasht Branch, Islamic Azad University, Rasht, Iran
Keywords: BTM topic model, community detection, deep learning, network embedding, overlapping social networks, semantic features
Abstract :
Social network embedding, which has recently attracted a lot of attention, aims to learn a low-dimensional representation for each node in the network while preserving the structure and characteristics of the network. In this paper, we investigate how network embedding is affected by identifying communities at different stages, namely before or during the random walk process, and by the semantic textual information attached to each node. Two main frameworks are proposed: community and context aware network embedding, and community and semantic feature-oriented network embedding. In community and context aware network embedding, communities are detected before the random walk process, using the non-overlapping EdMot method and the overlapping EgoNetSplitter method. In community and semantic feature-oriented network embedding, by contrast, communities are recognized during the random walk using a Biterm topic model (BTM). All the proposed methods incorporate text analysis, and the final node representation is learned with the Skip-Gram model. Experiments show that the proposed methods outperform leading network embedding methods such as DeepWalk, CARE, CONE, and COANE, reaching an accuracy of nearly 0.9 and surpassing the other methods on edge (link) prediction in the network.
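The general idea of detecting communities before the random walk and then feeding the walks to Skip-Gram can be illustrated with a minimal sketch. It is not the authors' implementation: the toy graph, the hand-written `community` mapping (standing in for the output of a detector such as EdMot), the bias parameter `alpha`, and the function names are all assumptions made for illustration, and overlapping communities and text features are left out.

```python
import random
from typing import Dict, List, Tuple

Graph = Dict[int, List[int]]


def community_biased_walk(graph: Graph, community: Dict[int, int], start: int,
                          length: int, alpha: float, rng: random.Random) -> List[int]:
    """Random walk that, with probability alpha, moves to a neighbor in the
    current node's community; otherwise it picks any neighbor uniformly.
    alpha is a hypothetical bias knob, not a parameter taken from the paper."""
    walk = [start]
    while len(walk) < length:
        current = walk[-1]
        neighbors = graph[current]
        if not neighbors:
            break  # dead end: stop the walk early
        same = [n for n in neighbors if community[n] == community[current]]
        if same and rng.random() < alpha:
            walk.append(rng.choice(same))
        else:
            walk.append(rng.choice(neighbors))
    return walk


def skipgram_pairs(walk: List[int], window: int) -> List[Tuple[int, int]]:
    """(center, context) pairs that a Skip-Gram model would be trained on."""
    pairs = []
    for i, center in enumerate(walk):
        lo = max(0, i - window)
        hi = min(len(walk), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, walk[j]))
    return pairs


# Toy graph (assumed): two triangles joined by the edge 2-3.
graph: Graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
                3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
# Communities assumed to have been detected before walking.
community = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}

rng = random.Random(0)
walks = [community_biased_walk(graph, community, node, 8, 0.9, rng)
         for node in graph for _ in range(5)]
pairs = [p for w in walks for p in skipgram_pairs(w, 2)]
```

The resulting `pairs` (or the `walks` themselves, treated as sentences) would then be passed to a Skip-Gram trainer to obtain the node vectors. Raising `alpha` keeps walks inside a community for longer, so nodes of the same community co-occur more often in the training pairs.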