Causal Modeling-Based Discrimination Discovery and Removal: Criteria, Bounds, and Algorithms

Document Type


Publication Date



Mathematical model, Predictive models, Measurement, Law, Data models, Prediction algorithms, Task analysis, Discrimination discovery and removal, Direct and indirect discrimination, Causal modeling, Path-specific effect


Anti-discrimination is an increasingly important task in data science. In this paper, we investigate the problem of discovering both direct and indirect discrimination in historical data and removing the discriminatory effects before the data are used for predictive analysis (e.g., building classifiers). The main drawback of existing methods is that they cannot distinguish the part of the influence that is actually caused by discrimination from all correlated influences. In our approach, we use a causal graph to capture the causal structure of the data. We then model direct and indirect discrimination as path-specific effects, which accurately identify the two types of discrimination as the causal effects transmitted along different paths in the graph. For situations where indirect discrimination cannot be measured exactly because some path-specific effects are unidentifiable, we develop an upper bound and a lower bound on the effect of indirect discrimination. Based on these theoretical results, we propose effective algorithms for discovering direct and indirect discrimination, as well as algorithms for precisely removing both types of discrimination while retaining good data utility. Experiments on a real dataset show the effectiveness of our approaches.
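
To make the idea of path-specific effects concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a toy three-node causal graph with a binary protected attribute C, a binary mediator Z (for example, a redlining attribute), and a binary decision E, with edges C -> E, C -> Z, and Z -> E and no hidden confounders. All probabilities and names (P_Z1_GIVEN_C, P_E1_GIVEN_CZ, mixed_outcome) are hypothetical and chosen only for illustration. Under these assumptions, the effect along the direct path C -> E is obtained by switching C only in the decision mechanism, and the effect along the indirect path C -> Z -> E by switching C only in the mediator mechanism.

```python
# Toy illustration with hypothetical numbers (not taken from the paper).
# Assumed graph: C -> E (direct path), C -> Z -> E (indirect path),
# with no hidden confounders, so both path-specific effects are computable.

# P(Z = 1 | C = c), where c = 1 is the non-protected group and c = 0 the protected group
P_Z1_GIVEN_C = {1: 0.8, 0: 0.3}

# P(E = 1 | C = c, Z = z), keyed by (c, z)
P_E1_GIVEN_CZ = {(1, 0): 0.4, (1, 1): 0.7, (0, 0): 0.3, (0, 1): 0.6}


def p_z(z, c):
    """Return P(Z = z | C = c) for binary z."""
    p1 = P_Z1_GIVEN_C[c]
    return p1 if z == 1 else 1.0 - p1


def mixed_outcome(c_direct, c_indirect):
    """P(E = 1) when the decision mechanism sees C = c_direct on the direct
    path while the mediator Z is generated as if C = c_indirect."""
    return sum(P_E1_GIVEN_CZ[(c_direct, z)] * p_z(z, c_indirect) for z in (0, 1))


# Reference outcome: the protected value C = 0 is transmitted along every path.
baseline = mixed_outcome(0, 0)

# Effect transmitted only along the direct path C -> E.
direct = mixed_outcome(1, 0) - baseline

# Effect transmitted only along the indirect path C -> Z -> E.
indirect = mixed_outcome(0, 1) - baseline

print(f"direct path-specific effect:   {direct:.2f}")
print(f"indirect path-specific effect: {indirect:.2f}")
```

With these made-up numbers, the direct effect is about 0.10 and the indirect effect about 0.15. In a discovery setting, such quantities would be compared against a user-chosen threshold to decide whether each type of discrimination is present; the choice of threshold is outside this sketch.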


Principal Investigator: Xintao Wu

Acknowledgements: This paper is a significant extension of the 7-page IJCAI’17 paper [47]. This work was supported in part by NSF 1646654.