Peiying Liu¹, Pin Wang², Haidong Liu¹
¹College of Science, Tianjin University of Commerce, Tianjin, China
²College of Big Data Statistics, Guizhou University of Finance and Economics, Guiyang, China
The rapid development of the Internet has made everyday life more convenient. Consumers increasingly obtain services online, and tourism has gradually become an important part of e-commerce. Most tourists learn about scenic spots online in advance and buy tickets through tourism websites. With so many consumers leaving feedback online, scenic spot managers should pay attention to that feedback in order to improve all aspects of their scenic spots. This paper takes the characteristic tourist attractions of Guizhou as an example. The Octopus collector is used to crawl review data from major tourism websites, and the reviews are processed with Python and R. Word clouds of positive and negative sentiment words are constructed, and the feature words that attract the most attention are identified. Topics are then extracted with the LDA model, and the reviews are further analyzed in combination with the positive and negative sentiment words. The paper focuses on exploring tourists' concerns and puts forward commercially valuable suggestions for Guizhou's characteristic tourist attractions.
Tourism, Python, R, LDA topic model, Management decision
Peiying Liu, Pin Wang, Haidong Liu. Tourism Review Text Mining in Guizhou Province Based on LDA Topic Model. Academic Journal of Business & Management (2023) Vol. 5, Issue 8: 141-150. https://doi.org/10.25236/AJBM.2023.050824.
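The abstract describes counting positive and negative sentiment words in the reviews before building word clouds. A minimal sketch of that counting step follows, under stated assumptions: the reviews and both lexicons are made-up English stand-ins (the paper works with scraped reviews of Guizhou attractions and does not specify its lexicon or tooling here), and the function names are hypothetical.

```python
from collections import Counter
import re

# Hypothetical toy lexicons; the paper's actual sentiment dictionary is not given here.
POSITIVE = {"beautiful", "friendly", "clean", "worth"}
NEGATIVE = {"crowded", "expensive", "dirty", "rude"}

# Made-up reviews standing in for the scraped comments.
REVIEWS = [
    "Beautiful waterfall and friendly staff, worth the trip",
    "Too crowded and the tickets are expensive",
    "Clean paths, beautiful views, but crowded at noon",
]


def tokenize(text):
    """Lowercase a review and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def sentiment_word_counts(reviews, positive, negative):
    """Count how often each lexicon word appears across all reviews."""
    pos_counts, neg_counts = Counter(), Counter()
    for review in reviews:
        for token in tokenize(review):
            if token in positive:
                pos_counts[token] += 1
            elif token in negative:
                neg_counts[token] += 1
    return pos_counts, neg_counts


pos_counts, neg_counts = sentiment_word_counts(REVIEWS, POSITIVE, NEGATIVE)
print(pos_counts.most_common(3))
print(neg_counts.most_common(3))
```

The two frequency tables are the kind of input a word cloud renderer would consume, with word size proportional to count. Chinese review text would additionally need a word segmenter in place of the regular-expression tokenizer used here.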