The Frontiers of Society, Science and Technology, 2024, 6(6); doi: 10.25236/FSST.2024.060608.

Improving Bibliographic Data Retrieval through Large Language Models

Author(s)

Jingjing Qiao

Corresponding Author

Jingjing Qiao

Affiliation(s)

Library, University of Shanghai for Science and Technology, Shanghai, China

Abstract

The advent of large language models (LLMs) has revolutionized various domains of natural language processing, including bibliographic data retrieval. This paper explores the potential of LLMs to enhance the accuracy and efficiency of retrieving bibliographic data from vast digital repositories. By leveraging the deep learning capabilities of LLMs, we propose a novel approach that surpasses traditional keyword-based search methods. Our methodology involves fine-tuning pre-trained LLMs on a comprehensive dataset of bibliographic records, enabling the model to understand and interpret complex queries more effectively. Experimental results demonstrate that our approach significantly improves precision and recall metrics, thereby reducing the retrieval of irrelevant data and enhancing the overall user experience. Furthermore, we discuss the implications of these findings for academic research, library sciences, and digital archiving, highlighting the transformative potential of LLMs in organizing and accessing scholarly information. This study provides a foundation for future research into the integration of LLMs with bibliographic databases, aiming to develop smarter, more intuitive information retrieval systems.
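As a point of reference for the precision and recall metrics mentioned in the abstract, the following is a minimal sketch of how the two measures are conventionally computed for a single query. It is not the paper's evaluation code; the function name `precision_recall` and the record identifiers are hypothetical and chosen only for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant records retrieved / all records retrieved
    recall    = relevant records retrieved / all relevant records
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    # Guard against empty sets to avoid division by zero.
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical record identifiers, for illustration only.
retrieved = ["rec1", "rec2", "rec3", "rec4"]
relevant = ["rec2", "rec4", "rec5"]
precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Under these definitions, higher precision corresponds to fewer irrelevant records in the result list, while higher recall corresponds to fewer relevant records being missed.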

Keywords

Large Language Models (LLMs), Bibliographic Data Retrieval, Natural Language Processing (NLP)

Cite This Paper

Jingjing Qiao. Improving Bibliographic Data Retrieval through Large Language Models. The Frontiers of Society, Science and Technology (2024), Vol. 6, Issue 6: 49-55. https://doi.org/10.25236/FSST.2024.060608.
