Date of Graduation


Document Type


Degree Name

Doctor of Philosophy in Engineering (PhD)

Degree Level



Computer Science & Computer Engineering


Xintao Wu

Committee Member

Brajendra Panda

Second Committee Member

Qinghua Li

Third Committee Member

Zhenghui Sha


machine learning model bias, fairness-aware machine learning, personalized learning, weighted learning, adversarial training


Machine learning algorithms are widely used in real-world applications. Their development has brought substantial benefits to many AI-related tasks, such as natural language processing, image classification, and video analysis. Traditional machine learning algorithms usually assume that the training data and test data are independent and identically distributed (i.i.d.), so that a model learned from the training data can be applied to the test data with good prediction performance. However, this assumption is restrictive, because in many scenarios the distribution shifts between the training data and the test data. In addition, the goal of a traditional machine learning model is to maximize prediction performance, e.g., accuracy, on historical training data, which may lead to unfair predictions for particular individuals or groups. In the literature, researchers focus either on building robust machine learning models under distribution shift or on achieving fairness, but not on addressing both simultaneously.

The goal of this dissertation is to address these challenges in fair machine learning under distribution shift. We start by building an agnostic fair framework for federated learning, where the data distribution is more diverse and distribution shift exists between the training data and the test data. Then we build a robust framework to address sample selection bias in fair classification. Next, we address sample selection bias in fair regression. Finally, we propose an adversarial framework to build a personalized model in the distributed setting, where distribution shift exists between different users.

In this dissertation, we conduct the following research on fair machine learning under distribution shift:

• We develop a fairness-aware agnostic federated learning framework (AgnosticFair) to deal with the challenge of an unknown testing distribution;
• We propose a framework for robust and fair learning under sample selection bias;
• We develop a framework for fair regression under sample selection bias when the dependent variable values of a set of training samples are missing as a result of another hidden process;
• We propose a learning framework that allows an individual user to build a personalized model in a distributed setting, where distribution shift exists among different users.
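
As a rough illustration of why sample selection bias matters for fairness, the sketch below measures a demographic parity gap on a biased sample, with and without inverse-probability weights. It is not the dissertation's method: the weighting scheme is a standard textbook correction that assumes the selection probabilities are known, and the function names, the toy data, and the selection probabilities are all hypothetical.

```python
def positive_rate(preds, weights):
    """Weighted fraction of positive (1) predictions."""
    return sum(w * p for p, w in zip(preds, weights)) / sum(weights)


def parity_gap(preds, groups, weights=None):
    """Absolute gap in (weighted) positive-prediction rate between groups 1 and 0."""
    if weights is None:
        weights = [1.0] * len(preds)
    rates = {}
    for g in (0, 1):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        rates[g] = positive_rate([preds[i] for i in idx],
                                 [weights[i] for i in idx])
    return abs(rates[1] - rates[0])


# Hypothetical biased sample: in group 1, individuals with a negative
# prediction were only one third as likely to be selected into the data.
groups = [0] * 8 + [1] * 4
preds = [1, 1, 1, 1, 0, 0, 0, 0] + [1, 1, 0, 0]
select_prob = [1.0] * 8 + [1.0, 1.0, 1 / 3, 1 / 3]  # assumed known

# Inverse-probability weights undo the selection mechanism in expectation.
weights = [1.0 / p for p in select_prob]

naive_gap = parity_gap(preds, groups)               # measured on the biased sample
corrected_gap = parity_gap(preds, groups, weights)  # reweighted toward the population
```

In this toy case the biased sample shows no disparity, while the reweighted estimate reveals a gap of 0.25, so a model judged fair on the selected data may not be fair on the underlying population.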