Date of Graduation

5-2026

Document Type

Thesis

Degree Name

Bachelor of Science in Data Science

Degree Level

Undergraduate

Department

Data Science

Advisor/Mentor

Karl D Schubert

Committee Member

Karl D Schubert

Second Committee Member

Eric Specking

Third Committee Member

Kelly M Sullivan

Abstract

This thesis examines substitutability within Walmart apparel as a foundation for modular category optimization. Using large-scale item-level data, I develop an attribute-based framework that aggregates products to the fineline level, constructs a structured feature space, and identifies candidate substitute relationships through similarity-based matching within relevant merchandise groupings. The results show that Walmart item master data contains sufficient structure to support scalable substitute generation across a high-variety assortment. However, substitutability is not uniform: many item pairs exhibit high similarity but low observed demand transfer, indicating that structural similarity alone does not guarantee substitution. To address this, the framework is positioned within a two-stage approach that combines similarity-based candidate generation with sales-based validation. Empirical results show that a smaller subset of relationships demonstrates meaningful substitution, with up to 11% observed lift and high similarity among strong pairs.
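The two-stage approach described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the fineline records, attribute names, the choice of Jaccard similarity, the thresholds, and the function names `candidate_pairs` and `validate` are all assumptions made for illustration, since the abstract does not specify the similarity measure or how demand transfer is computed.

```python
from itertools import combinations

# Hypothetical fineline records: (fineline_id, merchandise_group, attribute set).
# Real item master data would have many more attributes; these are made up.
FINELINES = [
    ("F1", "mens_tops", {"tee", "cotton", "short_sleeve", "crew_neck"}),
    ("F2", "mens_tops", {"tee", "cotton", "short_sleeve", "v_neck"}),
    ("F3", "mens_tops", {"hoodie", "fleece", "long_sleeve"}),
    ("F4", "womens_bottoms", {"tee", "cotton", "short_sleeve", "crew_neck"}),
]


def jaccard(a, b):
    """Share of attributes two finelines have in common (assumed measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def candidate_pairs(finelines, min_similarity=0.5):
    """Stage 1: similarity-based candidates, compared only within a group."""
    by_group = {}
    for fineline_id, group, attrs in finelines:
        by_group.setdefault(group, []).append((fineline_id, attrs))

    pairs = []
    for members in by_group.values():
        for (id_a, attrs_a), (id_b, attrs_b) in combinations(members, 2):
            score = jaccard(attrs_a, attrs_b)
            if score >= min_similarity:
                pairs.append((id_a, id_b, score))
    # Most similar pairs first.
    return sorted(pairs, key=lambda p: -p[2])


def validate(pairs, observed_lift, min_lift):
    """Stage 2: keep candidates whose observed lift meets a threshold.

    `observed_lift` is a hypothetical mapping from (id_a, id_b) to a lift
    value computed elsewhere from sales data.
    """
    return [p for p in pairs if observed_lift.get((p[0], p[1]), 0.0) >= min_lift]


if __name__ == "__main__":
    candidates = candidate_pairs(FINELINES)
    print(candidates)
    # Made-up lift value, only to show a similar pair failing validation.
    print(validate(candidates, {("F1", "F2"): 0.02}, min_lift=0.05))
```

In this toy data, F1 and F4 share identical attributes but are never compared because they sit in different merchandise groupings, and the F1/F2 pair passes the similarity stage yet is dropped by the sales-based filter, mirroring the abstract's point that structural similarity alone does not guarantee substitution.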

Keywords

Substitutability Analysis; Apparel Analytics; Business Analytics
