Date of Graduation

8-2017

Document Type

Thesis

Degree Name

Master of Science in Computer Science (MS)

Degree Level

Graduate

Department

Computer Science & Computer Engineering

Advisor/Mentor

Dale R. Thompson

Committee Member

Jia Di

Second Committee Member

Qinghua Li

Keywords

Decision Trees, Internet Protocol Version 6 (IPv6), Machine Learning, Neural Networks, Operating System Identification, Random Forests

Abstract

Operating system (OS) identification tools, sometimes called fingerprinting tools, are essential for the reconnaissance phase of penetration testing. While OS identification is traditionally performed by passive or active tools that use fingerprint databases, very little work has focused on using machine learning techniques. Moreover, significantly more work has focused on IPv4 than on IPv6. We introduce a collaborative neural network ensemble that uses a unique voting system, and a random forest ensemble, to deliver accurate predictions. This approach uses IPv6 features as well as packet metadata features for OS identification. Our experiment shows that the approach is valid: the neural network ensemble achieves an average accuracy of 85% over 100 sets of neural networks, with a highest accuracy of 96%. Furthermore, we explore the impact of additional training on neural networks with poor accuracy, and we show that our system can then achieve an average accuracy of 93%, an improvement of 8 percentage points over the 85% average obtained without it. A random forest of 30 decision trees attains an average accuracy of 93.6% and a best accuracy of 96% when given a dataset of Windows and Linux packets. Finally, when packets from Mac OS are introduced into the dataset, the random forest achieves an average accuracy of 89.6% and a best accuracy of 93.2%.
