Date of Graduation
5-2020
Document Type
Thesis
Degree Name
Bachelor of Science in Computer Engineering
Degree Level
Undergraduate
Department
Computer Science and Computer Engineering
Advisor/Mentor
Li, Qinghua
Committee Member/Reader
Li, Qinghua
Committee Member/Second Reader
Parkerson, James
Committee Member/Third Reader
Gauch, Susan
Abstract
Ever since technology (tech) companies realized that people's usage data, generated by their activities on everything from mobile applications to the internet, could be sold to advertisers for a profit, we have been in the Big Data era, in which tech companies collect as much data as possible from users. One benefit of this era is the creation of new types of jobs, such as data scientist and Big Data engineer. However, it has also made data privacy one of the most hotly debated topics. Numerous complaints have been raised about data privacy, such as how much access most mobile applications require to function correctly, ranging from a user's contact list to media files. Furthermore, tracking has reached new heights, covering mobile phone location, search engine activity, and even phone battery percentage. However much data is collected, tech companies are within their rights to collect it because they provide a privacy policy that informs the user of the types of data they collect, how they use that data, and how they share it. In addition, we find that all privacy policies used in this research state that by using the mobile application, the user agrees to its terms and conditions. Most alarmingly, prior research on privacy policies has found that only 9% of mobile app users read legal terms and conditions [2], because they are too long. Therefore, in this thesis, we present two summarization programs that take privacy policy text as input and produce a shorter, summarized version of the privacy policy. The results show that both implementations achieve averages of at least 50%, 90%, and 85% on the same sentence, clear sentence, and summary score grading metrics, respectively.
Keywords
Privacy Policy; Natural Language Processing; Summarization; Summarization Algorithms; Edmundson algorithm
Citation
Ishimwe, A. (2020). Identifying Privacy Policy in Service Terms Using Natural Language Processing. Computer Science and Computer Engineering Undergraduate Honors Theses. Retrieved from https://scholarworks.uark.edu/csceuht/83