Date of Graduation


Document Type


Degree Name

Bachelor of Science in Industrial Engineering

Degree Level



Industrial Engineering


Zhang, Shengfan

Committee Member/Reader

Cassady, Richard


According to the Centers for Disease Control and Prevention (CDC), nearly one in four people in the United States is currently infected with human papillomavirus (HPV). Although most people with HPV never experience symptoms, infection carries a risk of developing several types of HPV-related cancer. These cancers and other HPV-related diseases account for almost $8 billion in treatment costs annually. HPV vaccination is currently recommended for all boys and girls at age 11 or 12. Catch-up vaccination is recommended for males through age 21 and for females through age 26 if they were not vaccinated previously. However, uptake among young adult females in the United States remains low.

This research seeks to create a risk prediction model, focused on adult females, that helps these individuals estimate their risk of HPV infection based on demographic, sexual behavior, and lifestyle factors. This thesis focuses on the impact of diet and exercise on the risk of infection. Several predictive models were applied to the collected data to determine the best fit: logistic regression, lasso regression, ridge regression, elastic net regression, and the random forest algorithm.
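As a rough illustration of how the regression variants named above relate to one another, the following minimal sketch fits a logistic model with an elastic net penalty by proximal gradient descent. It is not the thesis's implementation: the data are synthetic, and every name in it (`make_data`, `fit_penalized_logistic`, `alpha`, `l1_ratio`, `SETTINGS`) is hypothetical. Setting `alpha` to zero gives plain logistic regression, `l1_ratio=0` gives ridge, `l1_ratio=1` gives lasso, and values in between give elastic net. The random forest algorithm is not covered here.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(value, t):
    # Proximal operator of the L1 penalty: shrink toward zero by t.
    if value > t:
        return value - t
    if value < -t:
        return value + t
    return 0.0


def make_data(n=150, seed=0):
    # Synthetic stand-in data (an assumption, not the thesis data):
    # two informative features and one pure-noise feature.
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        x = [rng.gauss(0, 1) for _ in range(3)]
        logit = 2.0 * x[0] - 1.5 * x[1]
        y.append(1 if rng.random() < sigmoid(logit) else 0)
        X.append(x)
    return X, y


def fit_penalized_logistic(X, y, alpha=0.0, l1_ratio=0.0, lr=0.5, n_iter=300):
    # Minimizes mean log-loss + alpha * (l1_ratio * |w|_1
    #                                    + (1 - l1_ratio) / 2 * |w|_2^2).
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(p):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n  # the intercept is left unpenalized
        for j in range(p):
            # Gradient step on the log-loss plus the ridge (L2) part ...
            step = w[j] - lr * (grad_w[j] / n + alpha * (1 - l1_ratio) * w[j])
            # ... followed by the proximal step for the lasso (L1) part.
            w[j] = soft_threshold(step, lr * alpha * l1_ratio)
    return w, b


def accuracy(X, y, w, b):
    # Share of observations classified correctly at a 0.5 threshold.
    hits = 0
    for xi, yi in zip(X, y):
        prob = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi)))
        hits += int((prob >= 0.5) == (yi == 1))
    return hits / len(y)


# Hypothetical penalty settings, one per regression variant.
SETTINGS = {
    "logistic": dict(alpha=0.0, l1_ratio=0.0),
    "ridge": dict(alpha=0.1, l1_ratio=0.0),
    "lasso": dict(alpha=0.1, l1_ratio=1.0),
    "elastic_net": dict(alpha=0.1, l1_ratio=0.5),
}

X, y = make_data()
fits = {name: fit_penalized_logistic(X, y, **kw) for name, kw in SETTINGS.items()}
for name, (w, b) in fits.items():
    print(name, [round(wj, 3) for wj in w], round(accuracy(X, y, w, b), 3))
```

In a sketch like this, the lasso and elastic net settings can shrink some coefficients to exactly zero, which is one reason such penalties are often used for variable selection. A real comparison would evaluate each model on held-out data rather than on the training set.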

Our results corroborate the findings of other studies. Similar factors are identified as significant, including sexual partners, age at first sexual activity, alcohol use, smoking habits, poverty level, and marital status. This study also found that daily nutrition and sedentary activity play a significant role in HPV infection, but it could not show the significance of daily exercise because of data constraints.