Evaluating the Performance of Topic Modeling Techniques for Bibliometric Analysis Research: An LDA-based Approach

Bibliometric; LDA; Topic Modeling; Topic Trends; Performance Evaluation.

Authors

  • Lan Thi Nguyen, Department of Information Science, Faculty of Humanities and Social Sciences, Khon Kaen University, Khon Kaen 40002, Thailand
  • Wirapong Chansanam
    wirach@kku.ac.th
    Department of Information Science, Faculty of Humanities and Social Sciences, Khon Kaen University, Khon Kaen 40002, Thailand http://orcid.org/0000-0001-5546-8485
  • Nalatpa Hunsapun, Department of Information Science, Faculty of Humanities and Social Sciences, Khon Kaen University, Khon Kaen 40002, Thailand
  • Vispat Chaichuay, Department of Information Science, Faculty of Humanities and Social Sciences, Khon Kaen University, Khon Kaen 40002, Thailand
  • Suparp Kanyacome, Faculty of Science and Engineering, Kasetsart University, Sakon Nakhon 47000, Thailand
  • Akkharawoot Takhom, Faculty of Engineering, Thammasat School of Engineering, Thammasat University, Pathum Thani 12120, Thailand
  • Yuttana Jaroenruen, Informatics Innovative Center of Excellence, Walailak University, Thai Buri, Nakhon Si Thammarat 80160, Thailand
  • Chunqiu Li, School of Government, Beijing Normal University, Beijing 100875, China
Vol. 5 No. 2 (2024): June
Research Articles

Digital technologies have been used in a large body of bibliometric analysis research. Although these technologies have made scientific investigation more accessible and efficient, scholars now face the daunting task of sifting through an overwhelming number of documents. This study aims to identify the primary topics, categories, and latent topics of bibliometric analysis research from a global perspective. It applied topic modeling to the abstracts of 16,039 eligible papers published between 1977 and 2023 and indexed in the Scopus database. Using Latent Dirichlet Allocation (LDA), the study identified four distinct research topics and traced how they have evolved over time. The research focus has shifted from individual concepts and words to relationships between nodes and to conceptual, intellectual, and social structures. The findings have significant implications for bibliometric analysis research, providing insight into trends and patterns in bibliometric content within large digital article archives. LDA proved to be an efficient tool for analyzing these trends and patterns quickly. The study's novel approach considers factors governing word-embedding usage and the optimal number of topics. It focuses on a full understanding of the LDA results and combines statistical analysis, domain knowledge, and temporal exploration to better understand how data structures work.
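To make the workflow in the abstract concrete, the following is a minimal sketch of LDA fitted by collapsed Gibbs sampling, followed by a per-year average of topic shares. It is not the study's pipeline: the abstract does not state the software, preprocessing, or hyperparameters used, so the toy corpus, the years, the function names (`fit_lda`, `topic_shares`, `mean_share_by_year`), and the values of `alpha`, `beta`, and `iterations` are all assumptions chosen for illustration.

```python
import random

def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Fit LDA with collapsed Gibbs sampling on tokenized documents."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]               # tokens per (document, topic)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # tokens per (topic, word)
    topic_total = [0] * num_topics                             # tokens per topic
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1
    return vocab, doc_topic, topic_word

def top_words(vocab, topic_word, n=3):
    """Return the n most frequent words assigned to each topic."""
    return [
        [vocab[v] for v in sorted(range(len(vocab)), key=lambda v: -row[v])[:n]]
        for row in topic_word
    ]

def topic_shares(doc_topic, alpha=0.1):
    """Convert per-document topic counts into smoothed proportions."""
    shares = []
    for row in doc_topic:
        total = sum(row) + alpha * len(row)
        shares.append([(count + alpha) / total for count in row])
    return shares

def mean_share_by_year(shares, years):
    """Average the topic proportions of all documents published in each year."""
    by_year = {}
    for share, year in zip(shares, years):
        by_year.setdefault(year, []).append(share)
    return {
        year: [sum(col) / len(rows) for col in zip(*rows)]
        for year, rows in sorted(by_year.items())
    }

# Invented toy corpus standing in for tokenized abstracts (not the study's data).
docs = [
    ["citation", "count", "journal", "impact", "citation"],
    ["journal", "impact", "citation", "index"],
    ["network", "node", "coauthor", "cluster", "network"],
    ["cluster", "node", "network", "structure"],
]
years = [1995, 1995, 2020, 2020]

vocab, doc_topic, topic_word = fit_lda(docs, num_topics=2)
shares = topic_shares(doc_topic)
trend = mean_share_by_year(shares, years)
print(top_words(vocab, topic_word))
print(trend)
```

On a real corpus, the number of topics would not be fixed in advance as it is here; one common approach is to fit several candidate values and compare them with a coherence or perplexity measure before interpreting the topics with domain knowledge.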

 

DOI: 10.28991/HIJ-2024-05-02-07
