Date of Graduation
5-2021
Document Type
Thesis
Degree Name
Bachelor of Science
Degree Level
Undergraduate
Department
Computer Science and Computer Engineering
Advisor/Mentor
Zhan, Justin
Committee Member/Reader
Patitz, Matthew
Committee Member/Second Reader
Primack, Brian
Abstract
Twitter is a microblogging website where any user can publicly release a message, called a tweet, expressing their feelings about current events or their own lives. This candid, unfiltered feedback is valuable in healthcare and public health communications, where cancer patients may find it difficult to divulge personal information to healthcare teams, and randomly selected patients may decline to participate in surveys about their experiences. In this thesis, BERTweet, a state-of-the-art natural language processing (NLP) model, was used to predict sentiment and emotion labels for cancer-related tweets collected in 2019 and 2020. In longitudinal plots, trends in these emotion and sentiment values can be clearly linked to popular cancer awareness events, the beginning of stay-at-home mandates related to COVID-19, and the relative mortality rates of different cancer diagnoses. This thesis demonstrates the accuracy and viability of using state-of-the-art NLP techniques to advance the field of public health communications analysis.
Keywords
BERTweet, transformers, Twitter, cancer, NLP, sentiment
Citation
Baker, W. (2021). Using Large Pre-Trained Language Models to Track Emotions of Cancer Patients on Twitter. Computer Science and Computer Engineering Undergraduate Honors Theses. Retrieved from https://scholarworks.uark.edu/csceuht/92
Included in
Data Science Commons, Health Communication Commons, Numerical Analysis and Scientific Computing Commons