Date of Graduation
5-2023
Document Type
Dissertation
Degree Name
Doctor of Philosophy in Engineering (PhD)
Degree Level
Graduate
Department
Computer Science & Computer Engineering
Advisor/Mentor
Gauch, John M.
Committee Member
Gauch, Susan E.
Second Committee Member
Le, Thi Hoang Ngan
Third Committee Member
Zhang, Shengfan
Keywords
Attention Mechanisms; Generative Adverserial Networks; Image Classification; Machine Learning; Neural Networks; Semi-Supervised Learning
Abstract
Image classification is a sub-field of computer vision that focuses on identifying objects within digital images. In order to improve image classification we must address the following areas of improvement: 1) Single and Multi-View data quality using data pre-processing techniques. 2) Enhancing deep feature learning to extract alternative representation of the data. 3) Improving decision or prediction of labels. This dissertation presents a series of four published papers that explore different improvements of image classification. In our first paper, we explore the Siamese network architecture to create a Convolution Neural Network based similarity metric. We learn the priority features that differentiate two given input images. The metric proposed achieves state-of-the-art Fβ measure. In our second paper, we explore multi-view data classification. We investigate the application of Generative Adversarial Networks GANs on Multi-view data image classification and few-shot learning. Experimental results show that our method outperforms state-of-the-art research. In our third paper, we take on the challenge of improving ResNet backbone model. For this task, we focus on improving channel attention mechanisms. We utilize Discrete Wavelet Transform compression to address the channel representation problem. Experimental results on ImageNet shows that our method outperforms baseline SENet-34 and SOTA FcaNet-34 at no extra computational cost. In our fourth paper, we investigate further the potential of orthogonalization of filters for extraction of diverse information for channel attention. We prove that using only random constant orthogonal filters is sufficient enough to achieve good channel attention. We test our proposed method using ImageNet, Places365, and Birds datasets for image classification, MS-COCO for object detection, and instance segmentation tasks. Our method outperforms FcaNet, and WaveNet and achieves the state-of-the-art results.
Citation
Salman, H. K. (2023). Improving Classification in Single and Multi-View Images. Graduate Theses and Dissertations Retrieved from https://scholarworks.uark.edu/etd/5089