Date of Graduation
12-2024
Document Type
Dissertation
Degree Name
Doctor of Philosophy in Engineering (PhD)
Degree Level
Graduate
Department
Electrical Engineering and Computer Science
Advisor/Mentor
Luu, Khoa
Committee Member
Gauch, Susan E.
Second Committee Member
Dowling, Ashley P.G.
Third Committee Member
Gauch, John M.
Fourth Committee Member
Raj, Bhiksha
Keywords
Continual Learning; Domain Generalization; Fairness Learning; Robust Learning; Unsupervised Domain Adaptation; Video Understanding
Abstract
The rapid growth of large-scale data and high-performance computational hardware has promoted the development of data-driven machine vision approaches. Advanced deep learning approaches have achieved remarkable performance in various vision problems and are closing the capability gap between artificial intelligence (AI) and humans. However, to reach the ultimate goal of AI, which is to replicate human ability in visual perception tasks, machine vision learning methods still need to address several ill-posed challenges. First, while current vision learning methods often rely on large-scale annotated data, data annotation is costly and time-consuming. Second, unfair predictions produced by vision models due to imbalanced data distributions, known as the fairness problem, pose a significant concern in practical deployment, especially in human-related applications. Third, since humans perceive the world in an open-vocabulary manner with diverse categories and concepts, vision learning frameworks should be capable of continually learning new concepts. Fourth, although current deep learning-based vision approaches have achieved impressive performance, their knowledge representations are often uninterpretable. These challenges motivate this dissertation to develop novel approaches toward fairness and robustness in vision learning.
To address these challenges, the dissertation presents four key contributions toward fairness and robustness in vision learning. First, to address the problem of large-scale data requirements, the dissertation presents a novel Fairness Domain Adaptation approach derived from two major research findings. In particular, the dissertation proposes a novel Bijective Maximum Likelihood approach to Unsupervised Domain Adaptation, followed by a novel Fairness Adaptation Learning Framework. Second, to enable open-world modeling in vision learning, the dissertation presents a novel Open-world Fairness Continual Learning Framework, which is the result of two research lines, i.e., Fairness Continual Learning and Open-world Continual Learning. Third, since visual data are often captured from multiple camera views, robust vision learning methods should be capable of modeling invariant features across views. To achieve this goal, the dissertation presents a novel Geometry-based Cross-view Adaptation framework to learn robust feature representations across views. Finally, with the recent increase in large-scale videos and multimodal data, understanding the feature representations and improving the robustness of large-scale visual foundation models is critical. Therefore, the dissertation presents novel Transformer-based approaches to improve the robustness of feature representations for multimodal and temporal data. By introducing new self-attention mechanisms and learning objectives to Transformer networks for multimodal and video understanding, this research line provides a more comprehensive understanding of multimodal and temporal feature representations. In addition, the dissertation presents a novel Domain Generalization Approach to improve the robustness of visual foundation models.
The theoretical analysis and experimental results of this research have shown the effectiveness of the proposed approaches, demonstrating their superior performance compared to prior studies. I am confident that the contributions in this dissertation have advanced the fairness and robustness of machine vision learning.
Citation
Truong, T. (2024). Towards Robust and Fair Vision Learning in Open-World Environments. Graduate Theses and Dissertations. Retrieved from https://scholarworks.uark.edu/etd/5544