by Andrei P. Kirilenko, Svetlana Stepchenkova & Luyu Wang
University of Florida
Sentiment analysis attempts to measure people's feelings, such as joy, sadness, and anger, usually as expressed in written text. Because decision making is largely based on knowledge of the opinions of others, sentiment analysis has become central to all areas of business that deal with customers and to many social sciences. Traditionally, sentiment analysis was conducted manually by trained raters, but the growth of big data, and especially of product reviews on social media, has prompted the development of automated sentiment analysis algorithms. The two main approaches to sentiment analysis are lexicon based and machine learning based. The lexicon-based approach relies on a pre-existing dictionary of words annotated with their sentiment orientation and strength; it is fast to implement but delivers only modest performance. In the machine learning approach, a sample of documents from the same domain is manually classified according to the sentiment expressed and then used to train and validate a machine learning algorithm. This approach requires a significant initial investment of human time and computing resources, but it typically yields a better-performing algorithm. Whichever approach is used, accurate validation of model performance using appropriate indicators is a must.
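
To make the lexicon-based approach and the validation step described above concrete, the following is a minimal sketch in Python, not the authors' implementation. The tiny lexicon, its weights, the helper names (lexicon_score, classify, precision_recall_f1), and the four hand-labeled sentences are all hypothetical and chosen only for illustration. The sketch assumes a two-class setting (positive vs. negative) and uses precision, recall, and F1 as example performance indicators.

```python
# Toy lexicon: word -> sentiment strength.
# Hypothetical values for illustration; real lexicons are far larger.
LEXICON = {
    "good": 1.0, "great": 2.0, "love": 2.0,
    "bad": -1.0, "terrible": -2.0, "hate": -2.0,
}

def lexicon_score(text):
    """Sum the lexicon weights of all words in the text (unknown words count as 0)."""
    words = (w.strip(".,!?") for w in text.lower().split())
    return sum(LEXICON.get(w, 0.0) for w in words)

def classify(text):
    """Label a text 'pos' if its total score is above zero, otherwise 'neg'."""
    return "pos" if lexicon_score(text) > 0 else "neg"

def precision_recall_f1(y_true, y_pred, positive="pos"):
    """Compute precision, recall, and F1 for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical hand-labeled validation sample (text, rater's label).
SAMPLE = [
    ("Great hotel, I love it!", "pos"),
    ("Terrible service.", "neg"),
    ("Not good at all.", "neg"),    # negation is ignored by this simple lexicon
    ("The room was fine.", "pos"),  # no lexicon word matches, so the score is 0
]

y_true = [label for _, label in SAMPLE]
y_pred = [classify(text) for text, _ in SAMPLE]
print(precision_recall_f1(y_true, y_pred))  # (0.5, 0.5, 0.5)
```

The two misclassified sentences show why a plain dictionary lookup tends to give modest performance: it handles neither negation nor words outside the lexicon. In the machine learning approach, the same kind of hand-labeled sample would instead be split into training and validation sets, and the validation indicators would be computed only on the held-out part.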