Topic Modeling from Bibliometric Research of the 2019–2023 Data through Scopus

Authors

  • Pornnisa Wattanasiri

Keywords:

Topic modeling, Bibliometric, Data analysis, Mapping Science

Abstract

This article analyzes the titles and indexed keywords of research articles in bibliometric research using topic modeling techniques. The analysis was based on articles collected from the Scopus database for 2019 to 2023 and was carried out in RStudio and Python. It focused on identifying the most frequently occurring words in the titles and indexed keywords in order to reveal the relationships among the topics studied in this field. The study identified 3,931 relevant articles and built topic models from the most frequently occurring words in their titles and indexed keywords. These topics primarily encompassed health sciences and medicine, with additional coverage of management, social sciences, education, and information technology. The study also found that the most common indexed keywords did not differ significantly from the frequently occurring words in the titles, while placing additional emphasis on environmental studies. The study is valuable for researchers, academics, and others interested in analyzing research articles using topic modeling or related techniques.
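The word-frequency step described in the abstract can be illustrated with a minimal sketch. It is not the study's actual pipeline: the titles are invented for illustration (they are not taken from the Scopus data), the stop-word list is a small assumption, and `top_words` is a hypothetical helper name.

```python
import re
from collections import Counter

# Hypothetical titles, invented for illustration only (not the study's data).
TITLES = [
    "A bibliometric analysis of telemedicine research",
    "Bibliometric mapping of health education research",
    "Trends in health informatics: a bibliometric review",
]

# Small illustrative stop-word list (an assumption); a real analysis
# would use a fuller list.
STOP_WORDS = {"a", "an", "and", "in", "of", "on", "the"}


def top_words(texts, n=3):
    """Return the n most frequent non-stop-word tokens across texts."""
    counts = Counter()
    for text in texts:
        # Lowercase the text and keep alphabetic runs as tokens.
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts.most_common(n)


print(top_words(TITLES))
```

Counts like these are only a starting point. A topic model such as latent Dirichlet allocation would normally be fitted to a document-term matrix built from the same tokens.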

Published

20-06-2024

Section

Research Articles