Title

DPWeka: Achieving Differential Privacy in WEKA

Document Type

Article

Publication Date

2017

Keywords

Data privacy, Privacy, Data analysis, Logistics, Genomics, Bioinformatics, data mining, regression analysis, interactive exploratory data analysis, DPWeka, genome wide association studies, differential privacy mechanisms, differentially private prototype, practical data analysis, computation blocks, test statistics calculation, data mining software, differential privacy, WEKA

Abstract

In this paper, we present DPWeka, a differentially private prototype based on the widely used data mining software WEKA, for practical data analysis. DPWeka includes a suite of differential-privacy-preserving computation blocks that support a variety of data analysis tasks, including test statistics calculation, regression analysis, and interactive exploratory data analysis. We illustrate the use of DPWeka on genome-wide association studies, including privately selecting significant SNPs and running logistic regression under various differential privacy mechanisms.
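
As a rough illustration of what a differential privacy mechanism of the kind mentioned in the abstract can look like, the sketch below applies the standard Laplace mechanism to a counting query. It is not taken from DPWeka and does not use WEKA's API: the function name `laplace_mechanism`, its parameters, and the example count are hypothetical, and the sensitivity of 1 assumes that adding or removing one individual changes the count by at most one.

```python
import random


def laplace_mechanism(true_value, sensitivity, epsilon, rng=None):
    """Return true_value plus Laplace noise with scale sensitivity / epsilon.

    Hypothetical helper for illustration only; not part of DPWeka or WEKA.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if sensitivity <= 0:
        raise ValueError("sensitivity must be positive")
    rng = rng or random.Random()
    rate = epsilon / sensitivity  # rate = 1 / scale
    # The difference of two i.i.d. exponential draws with this rate
    # follows a Laplace distribution with mean 0 and scale 1 / rate.
    noise = rng.expovariate(rate) - rng.expovariate(rate)
    return true_value + noise


# Assumed example: number of cases carrying a given allele (made-up value).
true_count = 100
noisy_count = laplace_mechanism(true_count, sensitivity=1.0, epsilon=0.5)
print(noisy_count)
```

A smaller epsilon gives a stronger privacy guarantee at the cost of noisier answers, so the released count varies from run to run by design.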

Comments

Principal Investigator: Xintao Wu

Acknowledgements: This work is supported in part by the U.S. National Institutes of Health (1R01GM103309) and the National Science Foundation (DGE-1523115 and IIS-1502273).
