Modelado de tópicos aplicado al análisis del papel del aprendizaje automático en revisiones sistemáticas
DOI:
https://doi.org/10.19053/20278306.v12.n2.2022.15271Palabras clave:
modelado de tópicos;, aprendizaje automático;, revisiones sistemáticas;, Asignación Latente de DirichletResumen
El objetivo de la investigación fue analizar el papel del aprendizaje automático de datos en las revisiones sistemáticas de literatura. Se aplicó la técnica de Procesamiento de Lenguaje Natural denominada modelado de tópicos, a un conjunto de títulos y resúmenes recopilados de la base de datos Scopus. Especificamente se utilizó la técnica de Asignación Latente de Dirichlet (LDA), a partir de la cual se lograron descubrir y comprender las temáticas subyacentes en la colección de documentos. Los resultados mostraron la utilidad de la técnica utilizada en la revisión exploratoria de literatura, al permitir agrupar los resultados por temáticas. Igualmente, se pudo identificar las áreas y actividades específicas donde más se ha aplicado el aprendizaje automático, en lo referente a revisiones de literatura. Se concluye que la técnica LDA es una estrategia fácil de utilizar y cuyos resultados permiten abordar una amplia colección de documentos de manera sistemática y coherente, reduciendo notablemente el tiempo de la revisión.
Descargas
Referencias
Aria, M., & Cuccurullo, C. (2017). bibliometrix: An R-tool for comprehensive science mapping analysis. Journal of Informetrics, 11(4), 959-975. https://doi.org/10.1016/j.joi.2017.08.007 DOI: https://doi.org/10.1016/j.joi.2017.08.007
Alamri, A., & Stevensony, M. (2015). Automatic identification of potentially contradictory claims to support systematic reviews. 2015 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 930–937. https://doi.org/10.1109/BIBM.2015.7359808 DOI: https://doi.org/10.1109/BIBM.2015.7359808
Ambalavanan, A. K., & Devarakonda, M. V. (2020). Using the contextual language model BERT for multi-criteria classification of scientific articles. Journal of Biomedical Informatics, 112, 103578. https://doi.org/10.1016/j.jbi.2020.103578 DOI: https://doi.org/10.1016/j.jbi.2020.103578
Antons, D., Breidbach, C. F., Joshi, A. M., & Salge, T. O. (2021). Computational Literature Reviews: Method, Algorithms, and Roadmap. Organizational Research Methods, 1094428121991230. https://doi.org/10.1177/1094428121991230 DOI: https://doi.org/10.1177/1094428121991230
Arno, A., Elliott, J., Wallace, B., Turner, T., & Thomas, J. (2021). The views of health guideline developers on the use of automation in health evidence synthesis. Systematic Reviews, 10(1), 16. https://doi.org/10.1186/s13643-020-01569-2 DOI: https://doi.org/10.1186/s13643-020-01569-2
Asmussen, C. B., & Møller, C. (2019). Smart literature review: a practical topic modelling approach to exploratory literature review. Journal of Big Data, 6(1), 1-18.. https://doi.org/10.1186/s40537-019-0255-7 DOI: https://doi.org/10.1186/s40537-019-0255-7
Asuncion, A., Welling, M., Smyth, P., & Teh, Y. W. (2009). On smoothing and inference for topic models. Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence, UAI 2009. (pp. 25 - 36)
Bertolini, M., Mezzogori, D., Neroni, M., & Zammori, F. (2021). Machine Learning for industrial applications: a comprehensive literature review. Expert Systems with Applications, 175, 114820. https://doi.org/10.1016/j.eswa.2021.114820 DOI: https://doi.org/10.1016/j.eswa.2021.114820
Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3, (4–5). https://doi.org/10.1016/b978-0-12-411519-4.00006-9 DOI: https://doi.org/10.1016/B978-0-12-411519-4.00006-9
Chai, K. E. K., Lines, R. L. J., Gucciardi, D. F., & Ng, L. (2021). Research Screener: a machine learning tool to semi-automate abstract screening for systematic reviews. Systematic Reviews, 10(1), 93. https://doi.org/10.1186/s13643-021-01635-3 DOI: https://doi.org/10.1186/s13643-021-01635-3
Chishtie, J. A., Babineau, J., Bielska, I. A., Cepoiu-Martin, M., Irvine, M., Koval, A., Marchand, J.-S., Turcotte, L., Jeji, T., & Jaglal, S. (2019). Visual Analytic Tools and Techniques in Population Health and Health Services Research: Protocol for a Scoping Review. JMIR Research Protocols, 8(10), e14019. https://doi.org/10.2196/14019 DOI: https://doi.org/10.2196/14019
Cohen, A. M., Ambert, K., & McDonagh, M. (2009). Cross-topic learning for work prioritization in systematic review creation and update. Journal of the American Medical Informatics Association: JAMIA, 16(5), 690–704. https://doi.org/10.1197/jamia.M3162 DOI: https://doi.org/10.1197/jamia.M3162
Elliott, J. H., Synnot, A., Turner, T., Simmonds, M., Akl, E. A., McDonald, S., ... & Pearson, L. (2017). Living systematic review: 1. Introduction—the why, what, when, and how. Journal of Clinical Epidemiology, 91, 23-30. https://doi.org/10.1016/j.jclinepi.2017.08.010 DOI: https://doi.org/10.1016/j.jclinepi.2017.08.010
Gates, A., Guitard, S., Pillay, J., Elliott, S. A., Dyson, M. P., Newton, A. S., & Hartling, L. (2019). Performance and Usability of Machine Learning for Screening in Systematic Reviews: A Comparative Evaluation of Three Tools. https://doi.org/10.23970/ahrqepcmethmachineperformance DOI: https://doi.org/10.23970/AHRQEPCMETHMACHINEPERFORMANCE
Gates, A., Johnson, C., & Hartling, L. (2018). Technology-assisted title and abstract screening for systematic reviews: a retrospective evaluation of the Abstrackr machine learning tool. Systematic Reviews, 7(1), 45. https://doi.org/10.1186/s13643-018-0707-8 DOI: https://doi.org/10.1186/s13643-018-0707-8
Gates, A., Vandermeer, B., & Hartling, L. (2018). Technology-assisted risk of bias assessment in systematic reviews: a prospective cross-sectional evaluation of the RobotReviewer machine learning tool. Journal of Clinical Epidemiology, 96, 54–62). https://doi.org/10.1016/j.jclinepi.2017.12.015 DOI: https://doi.org/10.1016/j.jclinepi.2017.12.015
Genc, Y., Altuger-Genc, G., & Tatoglu, A. (2020). Systematic Review of ASEE Conference Proceedings (2007-2016) with A Machine Learning Approach. International Journal of Engineering Education, 36(5), 1722–1735.
Gorunescu, F. (2011). Data Mining: Concepts, models and techniques (Vol. 12). Springer Science & Business Media. https://doi.org/10.1007/978-3-642-19721-5 DOI: https://doi.org/10.1007/978-3-642-19721-5
Guler, S., Capkin, S., & Sezgin, E. A. (2021). The Evolution of Publications in the Field of Scoliosis: A Detailed Investigation of Global Scientific Output Using Bibliometric Approaches. Turkish Neurosurgery, 31(1). https://doi.org/10.5137/1019-5149.JTN.30216-20.2 DOI: https://doi.org/10.5137/1019-5149.JTN.30216-20.2
Hamel, C., Hersi, M., Kelly, S. E., Tricco, A. C., Straus, S., Wells, G., Pham, B., & Hutton, B. (2021). Guidance for using artificial intelligence for title and abstract screening while conducting knowledge syntheses. BMC Medical Research Methodology, 21(1), 285. https://doi.org/10.1186/s12874-021-01451-2 DOI: https://doi.org/10.1186/s12874-021-01451-2
Hamel, C., Kelly, S. E., Thavorn, K., Rice, D. B., Wells, G. A., & Hutton, B. (2020). An evaluation of DistillerSR’s machine learning-based prioritization tool for title/abstract screening - impact on reviewer-relevant outcomes. BMC Medical Research Methodology, 20(1), 256. https://doi.org/10.1186/s12874-020-01129-1 DOI: https://doi.org/10.1186/s12874-020-01129-1
Jelodar, H., Wang, Y., Yuan, C., Feng, X., Jiang, X., Li, Y., & Zhao, L. (2019). Latent Dirichlet allocation (LDA) and topic modeling: models, applications, a survey. Multimedia Tools and Applications, 78(11), 15169 - 14211. https://doi.org/10.1007/s11042-018-6894-4 DOI: https://doi.org/10.1007/s11042-018-6894-4
Jonnalagadda, S. R., Goyal, P., & Huffman, M. D. (2015). Automating data extraction in systematic reviews: A systematic review. Systematic Reviews, 4(1), 1 – 16. https://doi.org/10.1186/s13643-015-0066-7 DOI: https://doi.org/10.1186/s13643-015-0066-7
Jurafsky, D., & Martin, J. H. (2008). Speech and Language Processing: An introduction to speech recognition, computational linguistics and natural language processing. Upper Saddle River, NJ: Prentice Hall.
Kang, Z., Catal, C., & Tekinerdogan, B. (2020). Machine learning applications in production lines: A systematic literature review. Computers & Industrial Engineering, 149. https://doi.org/10.1016/j.cie.2020.106773 DOI: https://doi.org/10.1016/j.cie.2020.106773
Khamparia, A., & Singh, K. M. (2019). A systematic review on deep learning architectures and applications. Expert Systems, 36(3). https://doi.org/10.1111/exsy.12400 DOI: https://doi.org/10.1111/exsy.12400
Kherwa, P., & Bansal, P. (2018). Topic Modeling: A Comprehensive Review. ICST Transactions on Scalable Information Systems, 7(24). https://doi.org/10.4108/eai.13-7-2018.159623 DOI: https://doi.org/10.4108/eai.13-7-2018.159623
Klymenko, O., Braun, D., & Matthes, F. (2020). Automatic Text Summarization: A State-of-the-Art Review. En Proceedings of the 22nd International Conference on Enterprise Information Systems. https://doi.org/10.5220/0009723306480655 DOI: https://doi.org/10.5220/0009723306480655
Kowsari, K., Jafari-Meimandi, K., Heidarysafa, M., Mendu, S., Barnes, L., & Brown, D. (2019). Text classification algorithms: A survey. Information, 10(4). https://doi.org/10.3390/info10040150 DOI: https://doi.org/10.3390/info10040150
Kumeno, F. (2020). Sofware engneering challenges for machine learning applications: A literature review. Intelligent Decision Technologies, 13(4), 463 – 476. https://doi.org/10.3233/idt-190160 DOI: https://doi.org/10.3233/IDT-190160
Maier, D., Waldherr, A., Miltner, P., Wiedemann, G., Niekler, A., Keinert, A., Pfetsch, B., Heyer, G., Reber, U., Häussler, T., Schmid-Petri, H., & Adam, S. (2018). Applying LDA Topic Modeling in Communication Research: Toward a Valid and Reliable Methodology. Communication Methods and Measures, 12(2-3), 93–118. https://doi.org/10.1080/19312458.2018.1430754 DOI: https://doi.org/10.1080/19312458.2018.1430754
Marín-López, J., Robledo, S., & Duque-Méndez, N. (2017). Marketing Emprendedor: Una perspectiva cronológica utilizando Tree of Science. Revista Civilizar De Empresa Y Economía, 7(13), 113-123.
Marín-Velásquez, T. D., & Arrojas-Tocuyo, D. D. J. (2021). Revistas científicas de América Latina y el Caribe en SciELO, Scopus y Web of Science en el área de Ingeniería y Tecnología: su relación con variables socioeconómicas. Revista Española de Documentación Científica, 44(3). https://doi.org/10.3989/redc.2021.3.1812 DOI: https://doi.org/10.3989/redc.2021.3.1812
Marshall, I. J., Johnson, B. T., Wang, Z., Rajasekaran, S., & Wallace, B. C. (2020). Semi-Automated evidence synthesis in health psychology: current methods and future prospects. Health Psychology Review, 14(1), 145–158. https://doi.org/10.1080/17437199.2020.1716198 DOI: https://doi.org/10.1080/17437199.2020.1716198
Marshall, I. J., Kuiper, J., & Wallace, B. C. (2016). RobotReviewer: evaluation of a system for automatically assessing bias in clinical trials. Journal of the American Medical Informatics Association: JAMIA, 23(1), 193–201. https://doi.org/10.1093/jamia/ocv044 DOI: https://doi.org/10.1093/jamia/ocv044
Millard, T., Synnot, A., Elliott, J., Green, S., McDonald, S., & Turner, T. (2019). Feasibility and acceptability of living systematic reviews: results from a mixed-methods evaluation. Systematic Reviews, 8(1), 325. https://doi.org/10.1186/s13643-019-1248-5 DOI: https://doi.org/10.1186/s13643-019-1248-5
Millán, J. D., Polanco, F., Ossa, J. C., Béria, J. S., & Cudina, J. N. (2017). La cienciometría, su método y su filosofía: Reflexiones epistémicas de sus alcances en el siglo XXI. Revista Guillermo de Ockham, 15(2), 17-27. https://doi.org/10.21500/22563202.3492 DOI: https://doi.org/10.21500/22563202.3492
Moher, D., Liberati, A., Tetzlaff, J., Altman, D. G., & PRISMA, G. (2014). Ítems de referencia para publicar revisiones sistemáticas y metaanálisis: la Declaración PRISMA. Revista Española de Nutrición Humana y Dietética, 18(3), 172-181. https://doi.org/10.14306/renhyd.18.3.114 DOI: https://doi.org/10.14306/renhyd.18.3.114
O’Mara-Eves, A., Thomas, J., McNaught, J., Miwa, M., & Ananiadou, S. (2015). Using text mining for study identification in systematic reviews: a systematic review of current approaches. Systematic Reviews, 4(1), 1-22. https://doi.org/10.1186/2046-4053-4-5 DOI: https://doi.org/10.1186/2046-4053-4-5
Ouzzani, M., Hammady, H., Fedorowicz, Z., & Elmagarmid, A. (2016). Rayyan-a web and mobile app for systematic reviews. Systematic Reviews, 5(1), 210. https://doi.org/10.1186/s13643-016-0384-4 DOI: https://doi.org/10.1186/s13643-016-0384-4
Porteous, I., Newman, D., Ihler, A., Asuncion, A., Smyth, P., & Welling, M. (2008). Fast collapsed gibbs sampling for latent dirichlet allocation. In Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, 569-577. DOI: https://doi.org/10.1145/1401890.1401960
Prabhakaran, S. (2018). Topic Modeling with Gensim (Python). Machine Learning Plus.
Qiang, J., Qian, Z., Li, Y., Yuan, Y., & Wu, X. (2020). Short text topic modeling techniques, applications, and performance: a survey. IEEE Transactions on Knowledge and Data Engineering. https://doi.org/10.1109/tkde.2020.2992485 DOI: https://doi.org/10.1109/TKDE.2020.2992485
Ramírez-Carvajal, D., Toro-Cardona, A., & Grisales-Aguirre, A. (2021). Competencias en networking: perspectivas desde una revisión literaria. Revista de Ingenierías Interfaces, 4(1), 103 -127.
Ramos-Enríquez, V., Duque, P., & Salazar, J. A. V. (2021). Responsabilidad Social Corporativa y Emprendimiento: evolución y tendencias de investigación. Desarrollo Gerencial, 13(1), 1–34. https://doi.org/10.17081/dege.13.1.4210 DOI: https://doi.org/10.17081/dege.13.1.4210
Rethlefsen, M. L., Kirtley, S., Waffenschmidt, S., Ayala, A. P., Moher, D., Page, M. J., & Koffel, J. B. (2021). PRISMA-S: an extension to the PRISMA statement for reporting literature searches in systematic reviews. Systematic Reviews, 10(1), 1-19. https://doi.org/10.1186/s13643-020-01542-z DOI: https://doi.org/10.5195/jmla.2021.962
Robledo, S., Grisales-Aguirre, A. M., Hughes, M., & Eggers, F. (2021). “Hasta la vista, baby”–will machine learning terminate human literature reviews in entrepreneurship? Journal of Small Business Management, 1-30. https://doi.org/10.1080/00472778.2021.1955125 DOI: https://doi.org/10.1080/00472778.2021.1955125
Röder, M., Both, A., & Hinneburg, A. (2015). Exploring the space of topic coherence measures. In Proceedings of the eighth ACM international conference on Web search and data mining (pp. 399-408). https://doi.org/10.1145/2684822.2685324 DOI: https://doi.org/10.1145/2684822.2685324
Rodríguez-Jiménez, A., & Pérez-Jacinto, A. O. (2017). Métodos científicos de indagación y de construcción del conocimiento. Revista Escuela de Administración de negocios, (82), 175-195. https://doi.org/10.21158/01208160.n82.2017.1647 DOI: https://doi.org/10.21158/01208160.n82.2017.1647
Sangwan, N., & Bhatnagar, V. (2020). Comprehensive Contemplation of Probabilistic Aspects in Intelligent Analytics. International Journal of Service Science, Management, Engineering and Technology (IJSSMET), 11(1), 116–141. https://doi.org/10.4018/IJSSMET.2020010108 DOI: https://doi.org/10.4018/IJSSMET.2020010108
Sami, I. R. (2020). Automatic Contextual Storytelling in a Natural Language Corpus. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management (pp. 3249-3252). https://doi.org/10.1145/3340531.3418507 DOI: https://doi.org/10.1145/3340531.3418507
Sutton, A., & Marshall, C. (2017). Mapping The Systematic Review Toolbox. Value in Health, 20(9). https://doi.org/10.1016/j.jval.2017.08.2232 DOI: https://doi.org/10.1016/j.jval.2017.08.2232
Soboczenski, F., Trikalinos, T. A., Kuiper, J., Bias, R. G., Wallace, B. C., & Marshall, I. J. (2019). Machine learning to help researchers evaluate biases in clinical trials: a prospective, randomized user study. BMC Medical Informatics and Decision Making, 19(1), 96. https://doi.org/10.1186/s12911-019-0814-z DOI: https://doi.org/10.1186/s12911-019-0814-z
Tighe, P. J., Sannapaneni, B., Fillingim, R. B., Doyle, C., Kent, M., Shickel, B., & Rashidi, P. (2020). Forty-two Million Ways to Describe Pain: Topic Modeling of 200,000 PubMed Pain-Related Abstracts Using Natural Language Processing and Deep Learning-Based Text Generation. Pain Medicine , 21(11), 3133–3160. https://doi.org/10.1093/pm/pnaa061 DOI: https://doi.org/10.1093/pm/pnaa061
Tranfield, D., Denyer, D., & Smart, P. (2003). Towards a methodology for developing evidence‐informed management knowledge by means of systematic review. British Journal of Management, 14(3), 207-222. https://doi.org/10.1111/1467-8551.00375 DOI: https://doi.org/10.1111/1467-8551.00375
Tsou, A. Y., Treadwell, J. R., Erinoff, E., & Schoelles, K. (2020). Machine learning for screening prioritization in systematic reviews: comparative performance of Abstrackr and EPPI-Reviewer. Systematic Reviews, 9(1), 73. https://doi.org/10.1186/s13643-020-01324-7 DOI: https://doi.org/10.1186/s13643-020-01324-7
Urrútia, G., & Bonfill, X. (2010). Declaración PRISMA: una propuesta para mejorar la publicación de revisiones sistemáticas y metaanálisis. Medicina clínica, 135(11), 507-511. https://doi.org/10.1016/j.medcli.2010.01.015 DOI: https://doi.org/10.1016/j.medcli.2010.01.015
Valencia-Hernández, D. S., Robledo, S., Pinilla, R., Duque-Méndez, N. D., & Olivar-Tost, G. (2020). Sap algorithm for citation analysis: An improvement to tree of science. Ingeniería e Investigación, 40(1). https://doi.org/10.15446/ing.investig.v40n1.77718 DOI: https://doi.org/10.15446/ing.investig.v40n1.77718
Vinkers, C. H., Lamberink, H. J., Tijdink, J. K., Heus, P., Bouter, L., Glasziou, P., Moher, D., Damen, J. A., Hooft, L., & Otte, W. M. (2021). The methodological quality of 176,620 randomized controlled trials published between 1966 and 2018 reveals a positive trend but also an urgent need for improvement. PLoS Biology, 19(4), e3001162. https://doi.org/10.1371/journal.pbio.3001162 DOI: https://doi.org/10.1371/journal.pbio.3001162
Waffenschmidt, S., Hausner, E., Sieben, W., Jaschinski, T., Knelangen, M., & Overesch, I. (2018). Effective study selection using text mining or a single-screening approach: a study protocol. Systematic Reviews, 7(1), 166. https://doi.org/10.1186/s13643-018-0839-x DOI: https://doi.org/10.1186/s13643-018-0839-x
Walker, V. R., Schmitt, C. P., Wolfe, M. S., Nowak, A. J., Kulesza, K., Williams, A. R., Shin, R., Cohen, J., Burch, D., Stout, M. D., Shipkowski, K. A., & Rooney, A. A. (2022). Evaluation of a semi-automated data extraction tool for public health literature-based reviews: Dextr. Environment International, 159, 107025. https://doi.org/10.1016/j.envint.2021.107025 DOI: https://doi.org/10.1016/j.envint.2021.107025
Wallace, B. C. (2018). Automating biomedical evidence synthesis: Recent work and directions forward. BIRNDL@ SIGIR. https://openreview.net/pdf?id=Hkby3SWO-B
Wang, C., Paisley, J., & Blei, D. (2011). Online variational inference for the hierarchical Dirichlet process. In Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics 752-760. JMLR Workshop and Conference Proceedings.
Wei, J., Han, S., & Zou, L. (2020). Vision-kg: Topic-centric visualization system for summarizing knowledge graph. In Proceedings of the 13th International Conference on Web Search and Data Mining, 857-860. https://doi.org/10.1145/3336191.3371863 DOI: https://doi.org/10.1145/3336191.3371863
Weißer, T., Saßmannshausen, T., Ohrndorf, D., Burggräf, P., & Wagner, J. (2020). A clustering approach for topic filtering within systematic literature reviews. MethodsX, 7, 100831. https://doi.org/10.1016/j.mex.2020.100831 DOI: https://doi.org/10.1016/j.mex.2020.100831
Xie, T., Qin, P., & Zhu, L. (2018). Study on the Topic Mining and Dynamic Visualization in View of LDA Model. Modern Applied Science, 13(1), 204. https://doi.org/10.5539/mas.v13n1p204 DOI: https://doi.org/10.5539/mas.v13n1p204
Zhang, C., Li, Z., & Zhang, J. (2018). A survey on visualization for scientific literature topics. Journal of Visualization, 21(2), 321-335. https://doi.org/10.1007/s12650-017-0462-2 DOI: https://doi.org/10.1007/s12650-017-0462-2
Zhao, H., Phung, D., Huynh, V., Jin, Y., Du, L., & Buntine, W. (2021). Topic Modelling Meets Deep Neural Networks: A Survey. https://doi.org/10.24963/ijcai.2021/638 DOI: https://doi.org/10.24963/ijcai.2021/638
Zimmerman, J., Soler, R. E., Lavinder, J., Murphy, S., Atkins, C., Hulbert, L., Lusk, R., & Ng, B. P. (2021). Iterative guided machine learning-assisted systematic literature reviews: a diabetes case study. Systematic Reviews, 10(1), 97. https://doi.org/10.1186/s13643-021-01640-6 DOI: https://doi.org/10.1186/s13643-021-01640-6
Zuluaga, M., Robledo, S., Osorio-Zuluaga, G. A., Yathe, L., Gonzalez, D., & Taborda, G. (2016). Metabolomics and pesticides: systematic literature review using graph theory for analysis of references. Nova, 14(25), 121-138. https://doi.org/10.22490/24629448.1735 DOI: https://doi.org/10.22490/24629448.1735
Publicado
-
Resumen135
-
PDF152
-
XML2
Cómo citar
Número
Sección
Licencia
Los artículos aquí publicados están protegidos bajo una licencia Licencia Creative Commons Atribución 4.0 Internacional. El contenido de los artículos es responsabilidad de cada autor y no compromete, de ninguna manera, a la revista o a la institución. Se permite la divulgación y reproducción de títulos, resúmenes y contenido total, con fines académicos, científicos, culturales y/o comerciales, siempre y cuando se cite la respectiva fuente.