Date of Graduation

5-2025

Document Type

Thesis

Degree Name

Master of Science in Statistics and Analytics (MS)

Degree Level

Graduate

Department

Statistics and Analytics

Advisor/Mentor

Plummer, Sean

Committee Member

Zhang, Qingyang

Second Committee Member

Petris, Giovanni G.

Keywords

Bayesian Statistics; Community Detection; Network Analysis; Nonparametric Bayesian Methods

Abstract

Network analysis is an increasingly popular interdisciplinary area of study, with emerging interest in fields such as sociology, biology, economics, and ecology. Within network analysis, capturing the community structure of a network is an important goal that many statisticians have worked toward over recent decades. The most popular modeling technique for latent community detection is the Stochastic Block Model (SBM), a latent variable model that serves as the baseline model throughout this thesis. The SBM is widely regarded as the most effective community detection method: it infers the latent community membership of each individual in a network, and the probability that two individuals have a relationship depends only on their community memberships. Though effective, the SBM has significant limitations. Like other traditional latent variable models, it requires a pre-specified number of communities, yet the community structure, which would reveal the number of groups, is often unknown in application. Furthermore, traditional SBMs are often insufficient for modeling networks whose communities are not well separated or whose within-group connection probabilities are low. Due to these limitations, recent developments have extended the traditional SBM to an infinite parameter space, in which the partition structure of the network is modeled using nonparametric Bayesian methods. In this thesis, I explore three nonparametric Bayesian methods for community detection in the context of the SBM: the Dirichlet Process, the Gnedin Process, and a Mixture of Finite Mixtures approach. I aim to provide supporting evidence through model comparison that nonparametric methods outperform traditional community detection methods on networks with complex structures and an unknown number of communities.
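
As a rough illustration of the two ideas in the abstract, the sketch below simulates a network from an SBM with fixed community labels, and then draws labels from a Chinese restaurant process, the partition prior induced by a Dirichlet Process, so that the number of communities is not fixed in advance. It is not the thesis's implementation: the function names `sample_sbm` and `sample_crp`, the block probability matrix `B`, the concentration `alpha`, and all numeric values are assumptions chosen for illustration.

```python
import numpy as np


def sample_sbm(z, B, rng):
    """Draw an undirected adjacency matrix from an SBM (illustrative sketch).

    z   : community label of each node (integers 0..K-1)
    B   : K x K symmetric matrix of between/within-community edge probabilities
    rng : numpy random Generator
    """
    z = np.asarray(z)
    n = z.size
    # The edge probability for the pair (i, j) depends only on (z[i], z[j]).
    P = B[z[:, None], z[None, :]]
    # Sample the strict upper triangle, then mirror it (no self-loops).
    upper = np.triu(rng.random((n, n)) < P, k=1)
    return (upper | upper.T).astype(int)


def sample_crp(n, alpha, rng):
    """Draw community labels from a Chinese restaurant process.

    A node joins an existing community with probability proportional to the
    community's current size, or opens a new one with probability
    proportional to alpha, so the number of communities is random.
    """
    z = np.zeros(n, dtype=int)
    counts = [1]  # the first node starts community 0
    for i in range(1, n):
        weights = np.array(counts + [alpha], dtype=float)
        k = rng.choice(len(weights), p=weights / weights.sum())
        if k == len(counts):
            counts.append(1)  # open a new community
        else:
            counts[k] += 1
        z[i] = k
    return z


rng = np.random.default_rng(0)

# Fixed-K case: three equally sized, well-separated communities (assumed values).
z_fixed = np.repeat([0, 1, 2], 20)
B = np.full((3, 3), 0.05)
np.fill_diagonal(B, 0.6)
A = sample_sbm(z_fixed, B, rng)

# Unknown-K case: the labels, and hence the number of communities, are random.
z_crp = sample_crp(60, alpha=1.0, rng=rng)
n_communities = int(z_crp.max()) + 1
```

In a full model the labels and block probabilities would be inferred from an observed network rather than simulated; the sketch only shows the generative side of the model.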
