Date of Graduation

8-2019

Document Type

Thesis

Degree Name

Master of Science in Statistics and Analytics (MS)

Degree Level

Graduate

Department

Statistics and Analytics

Advisor/Mentor

Jyotishka Datta

Committee Member

John Tipton

Second Committee Member

Qingyang Zhang

Keywords

High-Dimensional Data, Multiple Testing, Statistics

Abstract

High-dimensional data with sparsity are routinely observed in many scientific disciplines. Separating the signals from the noise in which they are embedded is a canonical problem in such settings and calls for multiple testing. The Benjamini-Hochberg procedure, which controls the False Discovery Rate, is the gold standard in large-scale multiple testing. In Majumder et al. (2009), an internally cross-validated form of the procedure is used to avoid a costly replicate study and the complications that arise from population selection in such studies (i.e., extraneous variables). I implement this procedure, run extensive simulation studies under increasing levels of dependence among parameters and under different data-generating distributions, and compare the results with those of other common techniques. I illustrate that, compared with the usual Benjamini-Hochberg procedure, the internally cross-validated procedure yields a significantly reduced false discovery rate while maintaining a reasonable, though increased, false negative rate, and that it reduces inherent variability under strong dependence structures. In the discussion section, I describe some possibilities for relevant applications and future studies.
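
For reference, the usual Benjamini-Hochberg step-up rule named in the abstract can be sketched as follows. This is a minimal illustration only: it is not the implementation used in the thesis and it does not include the internally cross-validated variant. The function name `benjamini_hochberg` and the default level `alpha=0.05` are assumptions made for this sketch.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Standard Benjamini-Hochberg step-up rule (illustrative sketch).

    Returns a list of booleans, one per input p-value, marking which
    hypotheses are rejected at nominal FDR level `alpha`.
    """
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= alpha * k / m.
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            k = rank

    # Reject the k hypotheses with the smallest p-values.
    reject = [False] * m
    for idx in order[:k]:
        reject[idx] = True
    return reject
```

Because the rule is step-up, a hypothesis can be rejected even when its own p-value exceeds its rank threshold, provided some larger-ranked p-value falls below its threshold. For example, with p-values `[0.049, 0.03, 0.045, 0.04]` and `alpha=0.05`, only the largest p-value meets its threshold, yet all four hypotheses are rejected.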
