Date of Graduation

12-2019

Document Type

Dissertation

Degree Name

Doctor of Philosophy in Computer Science (PhD)

Degree Level

Graduate

Department

Computer Science & Computer Engineering

Advisor/Mentor

Gauch, Susan E.

Committee Member

Luu, Khoa

Second Committee Member

Robinson, Samantha E.

Third Committee Member

Wu, Xintao

Fourth Committee Member

Sodero, Annibal

Keywords

Opinion shift detection; Semantic orientation; Sentiment analysis; Sentiment lexicons; Sentiment quantification; Sentiment shift detection; Text mining; Twitter

Abstract

This dissertation focuses on event detection within streams of Tweets based on sentiment quantification. Sentiment quantification extends sentiment analysis, the analysis of the sentiment of individual documents, to analyze the sentiment of an aggregated collection of documents. Although sentiment analysis has been widely researched, sentiment quantification has drawn less attention, yet it offers greater potential to enhance current business intelligence systems. Indeed, knowing the proportion of positive and negative Tweets is much more valuable than knowing which individual Tweets are positive or negative. We also extend our sentiment quantification research to analyze the evolution of sentiment over time in order to automatically detect a shift in sentiment with respect to a topic or entity.

We introduce a probabilistic approach to create a paired sentiment lexicon that models the positivity and the negativity of words separately. We show that such a lexicon can be used to predict the sentiment features for a Tweet more accurately than univalued lexicons. In addition, we show that employing these features with a multivariate Support Vector Machine (SVM) that optimizes the Hellinger Distance improves sentiment quantification accuracy versus other distance metrics. Furthermore, we introduce a means of representing sentiment over time through sentiment signals built from the aforementioned sentiment quantifier and show that sentiment shift can be detected using geometric change-point detection algorithms. Finally, our evaluation shows that, of the methods implemented, a two-dimensional Euclidean distance measure, analyzed using the first- and second-order statistical moments, was the most accurate in detecting sentiment shift.
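
As an illustration of the quantification setting only, the sketch below computes the Hellinger Distance between a true and an estimated class-proportion vector for a collection of Tweets. It is not the multivariate SVM described in the abstract; the function names `prevalence` and `hellinger` and the toy labels are assumptions made for this example.

```python
import math

def prevalence(labels, classes=("positive", "negative")):
    """Return the proportion of each class in a collection of labels (hypothetical helper)."""
    if not labels:
        raise ValueError("labels must be non-empty")
    n = len(labels)
    return [sum(1 for label in labels if label == c) / n for c in classes]

def hellinger(p, q):
    """Hellinger distance between two discrete distributions over the same classes."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared / 2.0)

# Toy data (invented for illustration): true labels vs. a classifier's predictions.
true_labels = ["positive", "positive", "negative", "positive"]
predicted_labels = ["positive", "negative", "negative", "positive"]

true_prev = prevalence(true_labels)        # proportions of positive and negative
pred_prev = prevalence(predicted_labels)
error = hellinger(true_prev, pred_prev)    # 0 means the proportions match exactly
```

A quantifier is judged on how close the estimated proportions are to the true ones, not on which individual Tweets were labeled correctly; the distance is 0 for identical proportions and 1 for completely disjoint ones.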
