Overcoming the Link Prediction Limitation in Sparse Networks using Community Detection
محورهای موضوعی : Complex NetworksMohammad Pouya Salvati 1 , Jamshid Bagherzadeh Mohasefi 2 , Sadegh Sulaimany 3
1 - Faculty of Electrical and Computer Engineering, Urmia University, Iran
2 - Faculty of Electrical and Computer Engineering, Urmia University, Iran
3 - Faculty of Computer Engineering, University of Kurdistan, Iran
کلید واژه: link Prediction, Sparse Network, Clustering, Time Efficient,
چکیده مقاله :
Link prediction seeks to detect missing links and the ones that may be established in the future given the network structure or node features. Numerous methods have been presented for improving the basic unsupervised neighbourhood-based methods of link prediction. A major issue confronted by all these methods, is that many of the available networks are sparse. This results in high volume of computation, longer processing times, more memory requirements, and more poor results. This research has presented a new, distinct method for link prediction based on community detection in large-scale sparse networks. Here, the communities over the network are first identified, and the link prediction operations are then performed within each obtained community using neighbourhood-based methods. Next, a new method for link prediction has been carried out between the clusters with a specified manner for maximal utilization of the network capacity. Utilized community detection algorithms are Best partition, Link community, Info map and Girvan-Newman, and the datasets used in experiments are Email, HEP, REL, Wikivote, Word and PPI. For evaluation of the proposed method, three measures have been used: precision, computation time and AUC. The results obtained over different datasets demonstrate that extra calculations have been prevented, and precision has been increased. In this method, runtime has also been reduced considerably. Moreover, in many cases Best partition community detection method has good results compared to other community detection algorithms.
Link prediction seeks to detect missing links and the ones that may be established in the future given the network structure or node features. Numerous methods have been presented for improving the basic unsupervised neighbourhood-based methods of link prediction. A major issue confronted by all these methods, is that many of the available networks are sparse. This results in high volume of computation, longer processing times, more memory requirements, and more poor results. This research has presented a new, distinct method for link prediction based on community detection in large-scale sparse networks. Here, the communities over the network are first identified, and the link prediction operations are then performed within each obtained community using neighbourhood-based methods. Next, a new method for link prediction has been carried out between the clusters with a specified manner for maximal utilization of the network capacity. Utilized community detection algorithms are Best partition, Link community, Info map and Girvan-Newman, and the datasets used in experiments are Email, HEP, REL, Wikivote, Word and PPI. For evaluation of the proposed method, three measures have been used: precision, computation time and AUC. The results obtained over different datasets demonstrate that extra calculations have been prevented, and precision has been increased. In this method, runtime has also been reduced considerably. Moreover, in many cases Best partition community detection method has good results compared to other community detection algorithms.
[1] D. Caiyan, L. Chen, and B. Li, “Link prediction in complex network based on modularity,” Soft Comput., 2016, doi: 10.1007/s00500-016-2030-4.
[2] H. Yuan, Y. Ma, F. Zhang, M. Liu, and W. Shen, “A distributed link prediction algorithm based on clustering in dynamic social networks,” in 2015 IEEE International Conference on Systems, Man, and Cybernetics, 2015, pp. 1341–1345.
[3] P. Symeonidis and N. Mantas, “Spectral clustering for link prediction in social networks with positive and negative links,” Soc. Netw. Anal. Min., vol. 3, no. 4, pp. 1433–1447, 2013.
[4] D. Liben-Nowell and J. Kleinberg, “The link-prediction problem for social networks,” J. Am. Soc. Inf. Sci. Technol., vol. 58, no. 7, pp. 1019–1031, 2007.
[5] G. SALTON and M. J. MCGILL, “Introduction to Modern Information Retrieval (pp. paginas 400).” 1986.
[6] M. E. J. Newman, “Clustering and preferential attachment in growing networks,” Phys. Rev. E, vol. 64, no. 2, p. 25102, 2001.
[7] P. Wang, B. W. Xu, Y. R. Wu, and X. Y. Zhou, “Link prediction in social networks: the state-of-the-art,” Sci. China Inf. Sci., vol. 58, no. 1, pp. 1–38, 2014, doi: 10.1007/s11432-014-5237-y.
[8] M. K. Khouzani and S. Sulaimany, “Identification of the effects of the existing network properties on the performance of current community detection methods,” J. King Saud Univ. - Comput. Inf. Sci., Apr. 2020, doi: 10.1016/j.jksuci.2020.04.007.
[9] R. Guimerà, M. Sales-Pardo, and L. A. N. Amaral, “A network-based method for target selection in metabolic networks,” Bioinformatics, vol. 23, no. 13, pp. 1616–1622, 2007.
[10] N. Benchettara, R. Kanawati, and C. Rouveirol, “A supervised machine learning link prediction approach for academic collaboration recommendation,” in Proceedings of the fourth ACM conference on Recommender systems, 2010, pp. 253–256.
[11] G. W. Flake, S. Lawrence, C. L. Giles, and F. M. Coetzee, “Self-organization and identification of web communities,” Computer (Long. Beach. Calif)., vol. 35, no. 3, pp. 66–70, 2002.
[12] A. Clauset, C. Moore, and M. E. J. Newman, “Hierarchical structure and the prediction of missing links in networks,” Nature, vol. 453, no. 7191, pp. 98–101, 2008.
[13] Z. Liu, Q.-M. Zhang, L. Lü, and T. Zhou, “Link prediction in complex networks,” EPL (Europhysics Lett., vol. 96, no. 4, p. 48007, 2011.
[14] E. M. Airoldi, D. M. Blei, S. E. Fienberg, E. P. Xing, and T. Jaakkola, “Mixed membership stochastic block models for relational data with application to protein-protein interactions,” in Proceedings of the international biometrics society annual meeting, 2006, vol. 15.
[15] J. H. S Soundarajan, “Using community information to improve the precision of link prediction methods,” WWW (Companion Vol., vol. 2012, pp. 607–608, 2012.
[16] S. Yokoi, H. Kajino, and H. Kashima, “Link prediction in sparse networks by incidence matrix factorization,” J. Inf. Process., vol. 25, pp. 477–485, 2017.
[17] K. Shang, T. Li, M. Small, D. Burton, and Y. Wang, “Link prediction for tree-like networks,” Chaos An Interdiscip. J. Nonlinear Sci., vol. 29, no. 6, p. 61103, 2019.
[18] J. Zhang, J. Chen, S. Zhi, Y. Chang, S. Y. Philip, and J. Han, “Link prediction across aligned networks with sparse and low rank matrix estimation,” in 2017 IEEE 33rd International Conference on Data Engineering (ICDE), 2017, pp. 971–982.
[19] C. H. Nguyen and H. Mamitsuka, “Latent feature kernels for link prediction on sparse graphs,” IEEE Trans. neural networks Learn. Syst., vol. 23, no. 11, pp. 1793–1804, 2012.
[20] X. Feng, J. C. Zhao, and K. Xu, “Link prediction in complex networks: a clustering perspective,” Eur. Phys. J. B, vol. 85, no. 1, p. 3, 2012.
[21] H. Liu, Z. Hu, H. Haddadi, and H. Tian, “Hidden link prediction based on node centrality and weak ties,” EPL (Europhysics Lett., vol. 101, no. 1, p. 18004, Jan. 2013, doi: 10.1209/0295-5075/101/18004.
[22] V. D. Blondel, J.-L. Guillaume, R. Lambiotte, and E. Lefebvre, “Fast unfolding of communities in large networks,” J. Stat. Mech. theory Exp., vol. 2008, no. 10, p. P10008, 2008.
[23] Y.-Y. Ahn, J. P. Bagrow, and S. Lehmann, “Link communities reveal multiscale complexity in networks,” Nature, vol. 466, no. 7307, pp. 761–764, 2010.
[24] A. Lancichinetti and S. Fortunato, “Community detection algorithms: a comparative analysis,” Phys. Rev. E, vol. 80, no. 5, p. 56117, 2009.
[25] M. Rosvall and C. T. Bergstrom, “Maps of random walks on complex networks reveal community structure,” Proc. Natl. Acad. Sci., vol. 105, no. 4, pp. 1118–1123, 2008.
[26] K. Esders, “Link Prediction in Large-scale Complex Networks,” Bachelor’s Thesis Karlsruhe Inst. Technol., no. February, 2015, [Online]. Available: https://hackernoon.com/link-prediction-in-large-scale-networks-f836fcb05c88.
[27] M. Hajiabadi, H. Zare, and H. Bobarshad, “IEDC: An integrated approach for overlapping and non-overlapping community detection,” Knowledge-Based Syst., vol. 123, pp. 188–199, 2017.
[28] H. Zare, M. Hajiabadi, and M. Jalili, “Detection of community structures in networks with nodal features based on generative probabilistic approach,” IEEE Trans. Knowl. Data Eng., 2019.
[29] H. Zare, M. A. N. Pour, and P. Moradi, “Enhanced recommender system using predictive network approach,” Phys. A Stat. Mech. its Appl., vol. 520, pp. 322–337, 2019.