Infringement of Individual Privacy via Mining Differentially Private GWAS Statistics
Bayesian Network, Genome Wide Association Study, Inference Attack, Differential Privacy, Private Statistic
Individual privacy in the genomic era is a growing concern as more individuals have their genomes sequenced or genotyped. Genetic privacy can be infringed even without access to raw genotypes or sequencing data. Studies have reported that summary statistics from Genome Wide Association Studies (GWAS) can be exploited to threaten individual privacy. In this study, we show that even differentially private GWAS statistics still carry a risk of leaking individual privacy. Specifically, we constructed a Bayesian network by mining public GWAS statistics and evaluated two attacks, a trait inference attack and an identity inference attack, that infringe the privacy not only of GWAS participants but also of regular individuals. We used both simulated data and real human genetic data from the 1000 Genomes Project to evaluate our methods. Our results demonstrate that unexpected privacy breaches can occur and that attackers can derive identity information and private information by using these algorithms. Hence, more methodological studies are needed to understand the infringement and protection of genetic privacy.
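To make the idea of a trait inference attack on differentially private statistics concrete, the following is a minimal sketch and not the authors' construction. It assumes the released statistics are per-SNP risk-allele frequencies perturbed by the Laplace mechanism (the abstract does not say which mechanism is used), assumes Hardy-Weinberg genotype probabilities, and replaces the mined Bayesian network with its simplest special case, a naive Bayes model in which SNPs are independent given the trait. All function names, counts, and parameter values are hypothetical.

```python
import math
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_allele_frequency(risk_allele_count, n_individuals, epsilon, rng):
    """Release a risk-allele frequency under the Laplace mechanism.

    Assumption: one diploid individual changes the allele count by at most 2,
    so the count has sensitivity 2. Clamping is post-processing and does not
    weaken the differential privacy guarantee.
    """
    noisy_count = risk_allele_count + laplace_noise(2.0 / epsilon, rng)
    freq = noisy_count / (2.0 * n_individuals)
    return min(max(freq, 1e-6), 1.0 - 1e-6)


def genotype_prob(g, p):
    """Hardy-Weinberg probability of carrying g (0, 1 or 2) risk alleles."""
    return (1, 2, 1)[g] * p ** g * (1.0 - p) ** (2 - g)


def trait_posterior(genotypes, case_freqs, control_freqs, prior):
    """Naive Bayes posterior P(trait | genotypes).

    Assumption: SNPs are conditionally independent given the trait, which is
    a simplification of a general Bayesian network.
    """
    log_case = math.log(prior)
    log_control = math.log(1.0 - prior)
    for g, p_case, p_control in zip(genotypes, case_freqs, control_freqs):
        log_case += math.log(genotype_prob(g, p_case))
        log_control += math.log(genotype_prob(g, p_control))
    return 1.0 / (1.0 + math.exp(log_control - log_case))


# Hypothetical release: two SNPs, 1000 cases and 1000 controls, epsilon = 1.
rng = random.Random(0)
case_freqs = [private_allele_frequency(c, 1000, 1.0, rng) for c in (900, 700)]
control_freqs = [private_allele_frequency(c, 1000, 1.0, rng) for c in (600, 650)]

# Attacker observes a target's genotypes at the two SNPs and updates a 10% prior.
print(trait_posterior([2, 1], case_freqs, control_freqs, prior=0.10))
```

The sketch illustrates why noise added to aggregate statistics does not by itself prevent inference about a person: the attacker only needs the released frequencies to remain informative about the case and control groups, and the target need not be a study participant.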
Wang Y., Wen J., Wu X., Shi X. (2016) Infringement of Individual Privacy via Mining Differentially Private GWAS Statistics. In: Wang Y., Yu G., Zhang Y., Han Z., Wang G. (eds) Big Data Computing and Communications. BigCom 2016. Lecture Notes in Computer Science, vol 9784. Springer, Cham