Date of Graduation

5-2022

Document Type

Thesis

Degree Name

Bachelor of Science

Degree Level

Undergraduate

Department

Computer Science and Computer Engineering

Advisor/Mentor

Zhan, Justin

Committee Member/Reader

Gauch, Susan

Committee Member/Second Reader

Streeter, Lora

Abstract

Over the past two decades, online discussion has skyrocketed in scope and scale. So, however, has the amount of toxic and offensive content posted on social media and other discussion sites. Despite this rise in prevalence, the ability to automatically moderate online discussion platforms has seen minimal development. Recently, as the capabilities of artificial intelligence (AI) have continued to improve, AI-based detection of harmful internet content has become a real possibility. In the past few years, performance on natural language processing tasks has surged, mainly due to the development of the Transformer architecture. BERT, a Transformer-based model developed by Google, is a core component of much current research on detecting toxic language. This paper proposes an ensemble of multiple BERT models trained on three classification tasks in order to improve the detection of abusive language in particular. The ensemble uses sub-models built on the BERT architecture and trained on datasets labeled for hate speech, offensive language, and abusive language. The approach presented in this paper outperforms both the standard BERT model and HateBERT, a re-trained variant of BERT used for detecting abusive language and similar tasks.
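
As a rough illustration of the ensembling idea described in the abstract, the sketch below combines three per-task scores into one abusive-language decision. The abstract does not state how the sub-model outputs are combined, so the weighted average (soft voting) used here is an assumption, as are the names `ensemble_score`, `is_abusive`, and `make_keyword_scorer`, the weights, and the threshold. The keyword scorers are hypothetical stand-ins for fine-tuned BERT classifiers, used only so the example runs without model weights.

```python
from typing import Callable, Dict, List

# A sub-model maps a text to the probability of the positive class for its task.
SubModel = Callable[[str], float]


def ensemble_score(text: str,
                   sub_models: Dict[str, SubModel],
                   weights: Dict[str, float]) -> float:
    """Weighted average of the sub-model probabilities (soft voting, assumed)."""
    total_weight = sum(weights[name] for name in sub_models)
    weighted_sum = sum(weights[name] * model(text)
                       for name, model in sub_models.items())
    return weighted_sum / total_weight


def is_abusive(text: str,
               sub_models: Dict[str, SubModel],
               weights: Dict[str, float],
               threshold: float = 0.5) -> bool:
    """Flag the text when the combined score reaches the (assumed) threshold."""
    return ensemble_score(text, sub_models, weights) >= threshold


def make_keyword_scorer(keywords: List[str]) -> SubModel:
    """Hypothetical stand-in for a fine-tuned BERT classifier."""
    def score(text: str) -> float:
        words = text.lower().split()
        return 1.0 if any(keyword in words for keyword in keywords) else 0.0
    return score


# One stand-in sub-model per task named in the abstract.
sub_models = {
    "hate_speech": make_keyword_scorer(["hateword"]),
    "offensive": make_keyword_scorer(["rudeword"]),
    "abusive": make_keyword_scorer(["abuseword"]),
}
# Illustrative weights only; the target task (abusive language) counts double.
weights = {"hate_speech": 1.0, "offensive": 1.0, "abusive": 2.0}

flagged = is_abusive("a rudeword abuseword post", sub_models, weights)
```

In practice, each stand-in scorer would be replaced by a function that runs a fine-tuned BERT classifier and returns its positive-class probability. A learned combiner over the three outputs is another plausible design.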

Keywords

Machine Learning, Natural Language Processing, BERT, Abusive Language
