Automatic Extractive Single Document Summarization: A Systematic Mapping
Abstract
Automatic Extractive Single Document Summarization (AESDS) is a research area that aims to create a condensed version of a document containing its most relevant information; its importance grows daily because users need to quickly obtain information from documents published on the Internet. In automatic document summarization, each element must be evaluated and ranked to generate a summary. Approaches fall into three groups according to the number of objectives they evaluate: single-objective, multi-objective, and many-objective. This systematic mapping aims to provide knowledge about the methods and techniques used for AESDS, analyzing the number of objectives and the characteristics evaluated, which can be helpful for future research. The mapping followed a generic process for conducting systematic reviews, in which a search string was built from a set of research questions. Inclusion and exclusion criteria were then applied to select the primary studies used in the analysis. Additionally, these studies were sorted according to the relevance of their content. The process is summarized in three main steps: planning, execution, and result analysis. At the end of the mapping, the following observations were made: (i) there is a preference for machine learning methods and clustering techniques, (ii) it is important to use both types of characteristics (statistical and semantic), and (iii) the many-objective approach needs to be explored.
Keywords
Automatic text summarization, Extractive, Systematic mapping