Date of Graduation

8-2019

Document Type

Thesis

Degree Name

Master of Science in Statistics and Analytics (MS)

Degree Level

Graduate

Department

Graduate School

Advisor

Jyotishka Datta

Committee Member

John Tipton

Second Committee Member

Qingyang Zhang

Keywords

High-Dimensional Data, Multiple Testing, Statistics

Abstract

High-dimensional data with sparsity are routinely observed in many scientific disciplines. Filtering out the signals embedded in noise is a canonical problem in such settings and requires multiple testing. The Benjamini-Hochberg procedure, which controls the False Discovery Rate, is the gold standard in large-scale multiple testing. Majumder et al. (2009) use an internally cross-validated form of the procedure to avoid a costly replicate study and the complications that arise from population selection in such studies (i.e., extraneous variables). I implement this procedure, run extensive simulation studies under increasing levels of dependence among parameters and under different data-generating distributions, and compare the results with other common techniques. I illustrate that, compared with the usual Benjamini-Hochberg procedure, the internally cross-validated Benjamini-Hochberg procedure yields a significantly reduced false discovery rate while maintaining a reasonable, though increased, false negative rate, and reduces the inherent variability under strong dependence structures. In the discussion section, I describe some possibilities for relevant applications and future studies.
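
For readers unfamiliar with the baseline method named in the abstract, the following is a minimal sketch of the usual (not the internally cross-validated) Benjamini-Hochberg step-up rule. It is an illustration only and is not taken from the thesis; the function name `benjamini_hochberg` and the default level `alpha=0.05` are assumptions chosen for the example.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Standard Benjamini-Hochberg step-up rule (illustrative sketch).

    Returns a list of booleans, in the input order, that is True
    wherever the corresponding null hypothesis is rejected.
    """
    m = len(p_values)
    if m == 0:
        return []

    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest 1-based rank k with p_(k) <= k * alpha / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            k_max = rank

    # Reject the hypotheses with the k_max smallest p-values.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Made-up p-values, used only to show how the function is called.
    example = [0.001, 0.008, 0.039, 0.041, 0.042,
               0.060, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(example, alpha=0.05))
```

Because the rule is step-up, a hypothesis can be rejected even when its own p-value exceeds its rank-specific threshold, provided some larger p-value meets its threshold.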
