Date of Graduation
5-2020
Document Type
Thesis
Degree Name
Master of Science in Computer Science (MS)
Degree Level
Graduate
Department
Computer Science & Computer Engineering
Advisor/Mentor
Gauch, Susan E.
Committee Member
Li, Qinghua
Second Committee Member
Luu, Khoa
Keywords
Edit Distance; Information Retrieval; Quote Identification; String Matching
Abstract
Quoting a borrowed excerpt of text within another literary work was infrequently done prior to the beginning of the eighteenth century. However, quoting other texts, particularly Shakespeare, became quite common after that. Our work develops automatic approaches to identify that trend. Initial work focuses on identifying exact and modified sections of text taken from the works of Shakespeare in novels spanning the eighteenth century. We then introduce a novel approach to identifying modified quotes by adapting the Edit Distance metric, which is character-based, to a word-based approach. This paper offers an introduction to previous uses of this metric in a multitude of fields and describes the implementation of the different methodologies used for quote identification. It then shows how a combination of both Edit Distance methods can achieve higher accuracy in quote identification than either method implemented alone, with an overall increase of 10%: from 0.638 and 0.609 to 0.737. Although we demonstrate our approach using Shakespeare quotes in eighteenth-century novels, the techniques can be generalized to locate exact and/or partial matches between any set of text targets in any corpus. This work would be of value to literary scholars who want to track quotations over time and could also be applied to other languages.
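The contrast between character-based and word-based Edit Distance described in the abstract can be illustrated with a short sketch. This is not the thesis implementation: it is a standard Levenshtein computation applied once to characters and once to whitespace-separated words, and the function names (edit_distance, char_distance, word_distance), the lowercasing step, and the example quote are assumptions made for illustration. How the thesis combines the two scores is not stated in the abstract, so no combination rule is shown.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences.

    Works on any indexable sequences whose items support ==,
    so the same routine serves strings (characters) and lists (words).
    """
    prev = list(range(len(b) + 1))  # distance from empty prefix of a to each prefix of b
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(
                prev[j] + 1,         # delete x
                curr[j - 1] + 1,     # insert y
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def char_distance(s, t):
    """Character-based Edit Distance (case-insensitive; an assumption)."""
    return edit_distance(s.lower(), t.lower())


def word_distance(s, t):
    """Word-based Edit Distance over whitespace tokens (tokenization is an assumption)."""
    return edit_distance(s.lower().split(), t.lower().split())


if __name__ == "__main__":
    # Hypothetical example: a quote and a lightly modified version of it.
    original = "to be or not to be that is the question"
    modified = "to be or not to be that was the question"
    print("character edits:", char_distance(original, modified))
    print("word edits:", word_distance(original, modified))
```

In this example a single word substitution counts as one edit at the word level but as two edits at the character level, which suggests why the two measures can rank candidate matches differently.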
Citation
Chiariglione, M. P. (2020). Shakespeare in the Eighteenth Century: Algorithm for Quotation Identification. Graduate Theses and Dissertations. Retrieved from https://scholarworks.uark.edu/etd/3580