Information Retrieval Model with Query Expansion and User Preference Profile
Abstract
Understanding a user's search intention makes it possible to identify and extract the most relevant, personalized results from the available information. This paper proposes an information retrieval algorithm that combines a user preference profile with query expansion to return relevant and personalized search results. The retrieval process is validated with the Precision, Recall, and Mean Average Precision (MAP) metrics on a dataset containing standardized documents and preference profiles. The results show that the algorithm improves retrieval, finding documents of higher quality and greater relevance to users' needs.
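For readers unfamiliar with the three evaluation metrics named above, the following is a minimal sketch of their standard textbook definitions. It is not the paper's evaluation code: the function names, the document identifiers, and the assumption that each ranked list contains no duplicate documents are all illustrative.

```python
from typing import List, Sequence, Set, Tuple


def precision(retrieved: Sequence[str], relevant: Set[str]) -> float:
    """Fraction of retrieved documents that are relevant."""
    if not retrieved:
        return 0.0
    hits = sum(1 for doc in retrieved if doc in relevant)
    return hits / len(retrieved)


def recall(retrieved: Sequence[str], relevant: Set[str]) -> float:
    """Fraction of relevant documents that were retrieved."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved) & relevant)
    return hits / len(relevant)


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Mean of the precision values at each rank where a relevant document appears,
    normalized by the total number of relevant documents."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / len(relevant)


def mean_average_precision(runs: List[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average of average_precision over (ranked list, relevant set) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Hypothetical example: two queries with made-up document ids.
query_1 = (["d1", "d2", "d3", "d4"], {"d1", "d3", "d5"})
query_2 = (["d2", "d1"], {"d1"})

p = precision(*query_1)
r = recall(*query_1)
ap = average_precision(*query_1)
map_score = mean_average_precision([query_1, query_2])
```

Precision and Recall ignore ranking order, whereas MAP rewards systems that place relevant documents near the top of the list, which is why the three are commonly reported together when comparing a personalized ranking against a baseline.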
Keywords
Personalized information retrieval, query expansion, user profile, semantic annotation