Date of Graduation

12-2024

Document Type

Dissertation

Degree Name

Doctor of Philosophy in Engineering (PhD)

Degree Level

Graduate

Department

Electrical Engineering and Computer Science

Advisor/Mentor

Luu, Khoa

Committee Member

Gauch, Susan E.

Second Committee Member

Dowling, Ashley P.G.

Third Committee Member

Gauch, John M.

Fourth Committee Member

Raj, Bhiksha

Keywords

Continual Learning; Domain Generalization; Fairness Learning; Robust Learning; Unsupervised Domain Adaptation; Video Understanding

Abstract

The rapid growth of large-scale data and high-performance computing hardware has driven the development of data-driven machine vision approaches. Advanced deep learning methods have achieved remarkable performance on various vision problems and are closing the capability gap between artificial intelligence (AI) and humans. However, toward the ultimate goal of AI that replicates human ability in visual perception tasks, machine vision learning methods still need to address several open challenges. First, while current vision learning methods often rely on large-scale annotated data, data annotation is a costly and time-consuming process. Second, unfair predictions produced by vision models due to imbalanced data distributions, known as the fairness problem, pose a significant concern in practical deployment, especially in human-related applications. Third, since human perception interprets the world in an open-vocabulary manner with diverse categories and concepts, machine vision frameworks should be capable of continually learning new concepts. Fourth, although current deep learning-based vision approaches have achieved impressive performance, their knowledge representations are often uninterpretable. These challenges motivate this dissertation to develop novel approaches toward fairness and robustness in vision learning.

To address these challenges, this dissertation presents four key contributions toward fairness and robustness in vision learning. First, to address the problem of large-scale data requirements, the dissertation presents a novel Fairness Domain Adaptation approach derived from two major research findings: a novel Bijective Maximum Likelihood approach to Unsupervised Domain Adaptation, followed by a novel Fairness Adaptation Learning framework. Second, to enable open-world modeling in vision learning, this dissertation presents a novel Open-world Fairness Continual Learning framework, built on two research lines: Fairness Continual Learning and Open-world Continual Learning. Third, since visual data are often captured from multiple camera views, robust vision learning methods should be capable of modeling features that are invariant across views. To achieve this goal, this thesis presents a novel Geometry-based Cross-view Adaptation framework to learn robust feature representations across views. Finally, with the recent growth of large-scale video and multimodal data, understanding the feature representations and improving the robustness of large-scale visual foundation models is critical. Therefore, this thesis presents novel Transformer-based approaches to improve the robustness of feature representations for multimodal and temporal data. By introducing new self-attention mechanisms and learning objectives to Transformer networks for multimodal and video understanding, this research line provides a deeper and more comprehensive understanding of multimodal and temporal feature representations. In addition, I present a novel Domain Generalization approach to improve the robustness of visual foundation models.
My theoretical analyses and experimental results demonstrate the effectiveness of the proposed approaches and their superior performance compared to prior studies. I am confident that the contributions in this dissertation have advanced the fairness and robustness of machine vision learning.
