by Roman Egger & Enes Gokce
Salzburg University of Applied Sciences, Innovation and Management in Tourism
Pennsylvania State University
With the increase in internet usage, the amount of available textual data has continued to grow rapidly. In addition, the development of more powerful computers has made processing such data much easier. The tourism field has strong potential to utilise the data available on the internet; however, a high proportion of this data is unlabelled and unprocessed. To use it effectively, new methods and approaches are needed. In this regard, the area of Natural Language Processing (NLP) helps researchers to utilise textual data and develop an understanding of text analysis. With machine learning approaches, the potential of text mining can expand enormously, leading to deeper insights, a better understanding of social phenomena, and, thus, a better basis for decision making. As such, this chapter provides the reader with the basics of NLP and presents the text pre-processing procedure in detail.