Survey on the Applications of the Graph Theory in the Information Retrieval
Subject Areas : ICTMaryam Piroozmand 1 , Amir Hosein Keyhanipour 2 , Ali Moeini 3
1 -
2 - University of Tehran
3 -
Keywords: Graph Theory, Information Retrieval, Learning to Rank, Knowledge Graph, Graph-based Dataset Representation,
Abstract :
Due to its power in modeling complex relations between entities, graph theory has been widely used in dealing with real-world problems. On the other hand, information retrieval has emerged as one of the major problems in the area of algorithms and computation. As graph-based information retrieval algorithms have shown to be efficient and effective, this paper aims to provide an analytical review of these algorithms and propose a categorization of them. Briefly speaking, graph-based information retrieval algorithms might be divided into three major classes: the first category includes those algorithms which use a graph representation of the corresponding dataset within the information retrieval process. The second category contains semantic retrieval algorithms which utilize the graph theory. The third category is associated with the application of the graph theory in the learning to rank problem. The set of reviewed research works is analyzed based on both the frequency as well as the publication time. As an interesting finding of this review is that the third category is a relatively hot research topic in which a limited number of recent research works are conducted.
[1] C. D. Manning, P. Raghavan, and H. Schütze, Introduction to Information Retrieval. Cambridge University Press, 2008. doi: 10.1017/cbo9780511809071.
[2] R. Baeza-Yates, Modern Information Retrieval: The Concepts and Technology Behind Search, Second Eit. Addison-Wesley Professional, 2011.
[3] J. Devezas and S. Nunes, “A Review of Graph-Based Models for Entity-Oriented Search,” SN Computer Science, vol. 2, no. 6. Springer, pp. 1–36, Nov. 01, 2021. doi: 10.1007/s42979-021-00828-w.
[4] C. Moreira, P. Calado, and B. Martins, “Learning to rank academic experts in the DBLP dataset,” Expert Syst., vol. 32, no. 4, pp. 477–493, Aug. 2015, doi: 10.1111/exsy.12062.
[5] Y. Zhang, D. Wang, and Y. Zhang, “Neural IR meets graph embedding: A ranking model for product search,” in The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019, May 2019, pp. 2390–2400. doi: 10.1145/3308558.3313468.
[6] T. Xu, H. Zhu, E. Chen, B. Huai, H. Xiong, and J. Tian, “Learning to annotate via social interaction analytics,” Knowl. Inf. Syst., vol. 41, no. 2, pp. 251–276, Oct. 2014, doi: 10.1007/s10115-013-0717-8.
[7] W. Yu and Z. Qin, “Spectrum-enhanced pairwise learning to rank,” in The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019, May 2019, pp. 2247–2257. doi: 10.1145/3308558.3313478.
[8] W. Wu, H. Li, and J. Xu, “Learning query and document similarities from click-through bipartite graph with metadata,” in WSDM 2013 - Proceedings of the 6th ACM International Conference on Web Search and Data Mining, 2013, pp. 687–696. doi: 10.1145/2433396.2433481.
[9] X. He, M. Gao, M. Y. Kan, and D. Wang, “BiRank: Towards Ranking on Bipartite Graphs,” IEEE Trans. Knowl. Data Eng., vol. 29, no. 1, pp. 57–71, Jan. 2017, doi: 10.1109/TKDE.2016.2611584.
[10] J. Sanz-Cruzado, P. Castells, C. Macdonald, and I. Ounis, “Effective contact recommendation in social networks by adaptation of information retrieval models,” Inf. Process. Manag., vol. 57, no. 5, p. 102285, Sep. 2020, doi: 10.1016/j.ipm.2020.102285.
[11] K. Yuan, L. Gao, Z. Jiang, and Z. Tang, “Formula Ranking within an Article,” in Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, May 2018, pp. 123–126. doi: 10.1145/3197026.3197061.
[12] H. Niu, I. Keivanloo, and Y. Zou, “Learning to rank code examples for code search engines,” Empir. Softw. Eng., vol. 22, no. 1, pp. 259–291, Feb. 2017, doi: 10.1007/s10664-015-9421-5.
[13] S. Jiang et al., “Learning query and document relevance from a web-scale click graph,” in SIGIR 2016 - Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval, Jul. 2016, pp. 185–194. doi: 10.1145/2911451.2911531.
[14] X. Yang and B. Wang, “Local ranking and global fusion for personalized recommendation,” Appl. Soft Comput. J., vol. 96, p. 106636, Nov. 2020, doi: 10.1016/j.asoc.2020.106636.
[15] A. Ferraro, L. Porcaro, and X. Serra, “Balancing Exposure and Relevance in Academic Search,” 2020. Accessed: Nov. 14, 2022. [Online]. Available: https://fair-trec.github.io/2020/doc/guidelines-2020.pdf
[16] C. J. C. Burges, “From ranknet to lambdarank to lambdamart: An overview,” 2010. Accessed: Nov. 14, 2022. [Online]. Available: http://www.ccs.neu.edu/home/vip/teach/MLcourse/4_boosting/materials/msr-tr-2010-82.pdf
[17] V. D. Blondel, J. L. Guillaume, R. Lambiotte, and E. Lefebvre, “Fast unfolding of communities in large networks,” J. Stat. Mech. Theory Exp., vol. 2008, no. 10, p. P10008, Oct. 2008, doi: 10.1088/1742-5468/2008/10/P10008.
[18] A. Bertolino, A. Guerriero, B. Miranda, R. Pietrantuono, and S. Russo, “Learning-to-rank vs ranking-to-learn: Strategies for regression testing in continuous integration,” in Proceedings - International Conference on Software Engineering, Jun. 2020, pp. 1261–1272. doi: 10.1145/3377811.3380369.
[19] S. Maqsood, M. A. Islam, M. T. Afzal, and N. Masood, “A comprehensive author ranking evaluation of network and bibliographic indices,” Malaysian J. Libr. Inf. Sci., vol. 25, no. 1, pp. 31–45, Apr. 2020, doi: 10.22452/mjlis.vol25no1.2.
[20] J. Shi and X.-Y. Tian, “Learning to Rank Sports Teams on a Graph,” Appl. Sci., vol. 10, no. 17, p. 5833, Aug. 2020, doi: 10.3390/app10175833.
[21] S. Agarwal, “Learning to rank on graphs,” Mach. Learn., vol. 81, no. 3, pp. 333–357, Dec. 2010, doi: 10.1007/s10994-010-5185-8.
[22] X. Zou, “A Survey on Application of Knowledge Graph,” in Journal of Physics: Conference Series, Apr. 2020, vol. 1487, no. 1, p. 012016. doi: 10.1088/1742-6596/1487/1/012016.
[23] S. Ji, S. Pan, E. Cambria, P. Marttinen, and P. S. Yu, “A Survey on Knowledge Graphs: Representation, Acquisition, and Applications,” IEEE Trans. Neural Networks Learn. Syst., vol. 33, no. 2, pp. 494–514, Feb. 2022, doi: 10.1109/TNNLS.2021.3070843.
[24] M. Fernandez et al., “Semantic search meets the Web,” in Proceedings - IEEE International Conference on Semantic Computing 2008, ICSC 2008, 2008, pp. 253–260. doi: 10.1109/ICSC.2008.52.
[25] V. Lopez, M. Sabou, and E. Motta, “PowerMap; Mapping the real semantic web on the fly,” in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2006, vol. 4273 LNCS, pp. 414–427. doi: 10.1007/11926078_30.
[26] K. Byrne, “Populating the Semantic Web-Combining Text and Relational Databases as RDF Graphs,” The University of Edinburgh, 2008.
[27] K. Balog et al., “SaHaRa: Discovering Entity-Topic Associations in Online News,” 2009.
[28] P. Oza, L. D.-C. workshop Proceedings, and U. 2021, “Which entities are relevant for the story?,” in Text2Story’21 Workshop, 2021, pp. 41–48. Accessed: Apr. 03, 2023. [Online]. Available: https://par.nsf.gov/biblio/10300275
[29] U. Rashid, K. Saleem, and A. Ahmed, “MIRRE approach: nonlinear and multimodal exploration of MIR aggregated search results,” Multimed. Tools Appl., vol. 80, no. 13, pp. 20217–20253, May 2021, doi: 10.1007/s11042-021-10603-x.
[30] H. Gao, L. Wu, P. Hu, Z. Wei, F. Xu, and B. Long, “Graph-augmented Learning to Rank for Querying Large-scale Knowledge Graph,” Nov. 2021, Accessed: Mar. 17, 2023. [Online]. Available: http://arxiv.org/abs/2111.10541
[31] H. Hosseini and E. Bagheri, “Learning to rank implicit entities on Twitter,” Inf. Process. Manag., vol. 58, no. 3, p. 102503, May 2021, doi: 10.1016/j.ipm.2021.102503.
[32] H. Wu and F. J. Meng, “Research on the Application of Personalized Course Recommendation of Learn to Rank Based on Knowledge Graph,” in Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, LNICST, 2020, vol. 331, pp. 19–30. doi: 10.1007/978-3-030-62205-3_2.
[33] G. Maheshwari, P. Trivedi, D. Lukovnikov, N. Chakraborty, A. Fischer, and J. Lehmann, “Learning to Rank Query Graphs for Complex Question Answering over Knowledge Graphs,” in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2019, vol. 11778 LNCS, pp. 487–504. doi: 10.1007/978-3-030-30793-6_28.
[34] Y. Su et al., “Reducing Bug Triaging Confusion by Learning from Mistakes with a Bug Tossing Knowledge Graph,” in Proceedings - 2021 36th IEEE/ACM International Conference on Automated Software Engineering, ASE 2021, 2021, pp. 191–202. doi: 10.1109/ASE51524.2021.9678574.
[35] P. Jafarzadeh, Z. Amirmahani, and F. Ensan, “Learning to Rank Knowledge Subgraph Nodes for Entity Retrieval,” in SIGIR 2022 - Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Jul. 2022, pp. 2519–2523. doi: 10.1145/3477495.3531888.
[36] Y. Ni et al., “Semantic documents relatedness using concept graph representation,” in WSDM 2016 - Proceedings of the 9th ACM International Conference on Web Search and Data Mining, Feb. 2016, pp. 635–644. doi: 10.1145/2835776.2835801.
[37] N. Zhiltsov and E. Agichtein, “Improving entity search over linked data by modeling latent semantics,” in International Conference on Information and Knowledge Management, Proceedings, 2013, pp. 1253–1256. doi: 10.1145/2505515.2507868.
[38] O. Irrera and G. Silvello, “Background Linking: Joining Entity Linking with Learning to Rank Models.,” ceur-ws.org, vol. 2816, 2021, Accessed: Aug. 24, 2022. [Online]. Available: http://ceur-ws.org/Vol-2816/paper6.pdf
[39] M. Bendersky and W. B. Croft, “Modeling higher-order term dependencies in information retrieval using query hypergraphs,” in SIGIR’12 - Proceedings of the International ACM SIGIR Conference on Research and Development in Information Retrieval, Aug. 2012, pp. 941–950. doi: 10.1145/2348283.2348408.
[40] T. Menezes and C. Roth, “Semantic Hypergraphs,” Aug. 2019, Accessed: Apr. 07, 2023. [Online]. Available: http://arxiv.org/abs/1908.10784
[41] L. Dietz, “ENT rank: Retrieving entities for topical information needs through entity-neighbor-text relations,” in SIGIR 2019 - Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, Jul. 2019, pp. 215–224. doi: 10.1145/3331184.3331257.
[42] B. A., “ObjectRank : Authority-Based Keyword Search in Databases,” VLDB 2004, 2004.
[43] S. Chakrabarti, “Dynamic personalized pagerank in entity-relation graphs,” in 16th International World Wide Web Conference, WWW2007, May 2007, pp. 571–580. doi: 10.1145/1242572.1242650.
[44] L. Espín-Noboa, F. Lemmerich, S. Walk, M. Strohmaier, and M. Musen, “HopRank: How semantic structure influences teleportation in PageRank (a case study on bioPortal),” in The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019, May 2019, pp. 2708–2714. doi: 10.1145/3308558.3313487.
[45] R. Delbru, N. Toupikov, M. Catasta, G. Tummarello, and S. Decker, “Hierarchical link analysis for ranking web data,” in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2010, vol. 6089 LNCS, no. PART 2, pp. 225–239. doi: 10.1007/978-3-642-13489-0_16.
[46] C. Musto, G. Semeraro, M. de Gemmis, and P. Lops, “Tuning personalized pagerank for semantics-aware recommendations based on linked open data,” in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2017, vol. 10249 LNCS, pp. 169–183. doi: 10.1007/978-3-319-58068-5_11.
[47] H. Li, “Learning to Rank for Information Retrieval and Natural Language Processing, Second Edition,” Synth. Lect. Hum. Lang. Technol., vol. 7, no. 3, pp. 1–123, Oct. 2015, doi: 10.2200/S00607ED2V01Y201410HLT026.
[48] I. C. Dourado, D. C. G. Pedronette, and R. da S. Torres, “Unsupervised graph-based rank aggregation for improved retrieval,” Inf. Process. Manag., vol. 56, no. 4, pp. 1260–1279, Jul. 2019, doi: 10.1016/j.ipm.2019.03.008.
[49] M. Bałchanowski and U. Boryczka, “Aggregation of Rankings Using Metaheuristics in Recommendation Systems,” Electronics, vol. 11, no. 3, p. 369, Jan. 2022, doi: 10.3390/electronics11030369.
[50] Y. Zhang, Y. Xiao, J. Wu, and X. Lu, “Comprehensive world university ranking based on ranking aggregation,” Comput. Stat., vol. 36, no. 2, pp. 1139–1152, Jun. 2021, doi: 10.1007/s00180-020-01033-8.
[51] L. P. Valem and D. C. G. Pedronette, “Graph-based selective rank fusion for unsupervised image retrieval,” Pattern Recognit. Lett., vol. 135, pp. 82–89, Jul. 2020, doi: 10.1016/j.patrec.2020.03.032.
[52] J. Y. Yeh and C. J. Tsai, “A Graph-based Feature Selection Method for Learning to Rank Using Spectral Clustering for Redundancy Minimization and Biased PageRank for Relevance Analysis,” Comput. Sci. Inf. Syst., vol. 19, no. 1, pp. 141–164, Jan. 2022, doi: 10.2298/CSIS201220042Y.
[53] J. Y. Yeh and C. J. Tsai, “Graph-based Feature Selection Method for Learning to Rank,” in ACM International Conference Proceeding Series, Nov. 2020, pp. 70–73. doi: 10.1145/3442555.3442567.
[54] B. Geng, L. Yang, and X.-S. Hua, “Learning to Rank with Graph Consistency,” 2009. [Online]. Available: https://www.microsoft.com/en-us/research/wp-content/uploads/2009/08/MSRA-TR-CAR.pdf
[55] J. Fan, H. Luo, Y. Gao, and R. Jain, “Incorporating concept ontology for hierarchical video classification, annotation, and visualization,” IEEE Trans. Multimed., vol. 9, no. 5, pp. 939–957, Aug. 2007, doi: 10.1109/TMM.2007.900143.
[56] Y. Li et al., “LtrGCN: Large-Scale Graph Convolutional Networks-Based Learning to Rank for Web Search,” Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics), vol. 14174 LNAI, pp. 635–651, 2023, doi: 10.1007/978-3-031-43427-3_38.
[57] A. H. Keyhanipour, “Graph-based comparative analysis of learning to rank datasets,” Int. J. Data Sci. Anal., vol. 17, no. 2, pp. 165–187, Mar. 2024, doi: 10.1007/S41060-023-00406-8/METRICS.
[58] M. Xu and J. Zhang, “MGL2Rank: Learning to rank the importance of nodes in road networks based on multi-graph fusion,” Inf. Sci. (Ny)., vol. 667, p. 120472, May 2024, doi: 10.1016/J.INS.2024.120472.