Date of Graduation

5-2024

Document Type

Thesis

Degree Name

Bachelor of Science in Data Science

Degree Level

Undergraduate

Department

Data Science

Advisor/Mentor

Schubert, Karl

Committee Member

Buttle, Casey

Second Committee Member

Mitchell, Rachael

Abstract

Coca-Cola is a popular soft drink brand sold in every Walmart store worldwide; these sales generate large quantities of data and require a robust supply chain system. However, the company does not currently have a sophisticated, automated, or prescriptive system for detecting where, when, and why inventory outages occur and for applying preventative measures to avoid the revenue lost when inventory is absent from store shelves. This thesis proposes and applies a novel, prescriptive system for this purpose. An inventory outage can be seen as a ‘negative’ statistical outlier in a time series of inventory for an item at a Walmart location; to this end, an outlier-tagging method called Isolation Forest is used. To address the “where” and “when” (spatiotemporal) aspect of the problem, Dynamic Time Warping (DTW) clustering is performed to create groups of Coca-Cola brands that behave similarly with respect to inventory fluctuations in Walmart stores across the United States. To address the “why” aspect of the problem, a Random Forest model is trained for each cluster to predict the binary ‘negative’ outlier indicator, and its feature importances yield a ranked list of the variables associated with inventory outages (decomposition). The entire process is automated in Alteryx workflows using Coca-Cola's Databricks computing power, so that supply chain managers can take preventative action based on the identified variables with minimal upkeep, ensuring consistent product shelf presence in Walmart, and potentially other, retail stores.
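
As a rough illustration of two of the building blocks named in the abstract, the sketch below tags ‘negative’ outliers in a single inventory series with scikit-learn's IsolationForest and computes a plain DTW distance between two series. It is not the thesis's implementation: the function names, the contamination value, and the rule that a ‘negative’ outlier is a flagged point below the series median are all assumptions made for this example, and the per-cluster Random Forest step is omitted.

```python
import numpy as np
from sklearn.ensemble import IsolationForest


def tag_negative_outliers(inventory, contamination=0.05, random_state=0):
    """Return a boolean mask marking 'negative' outliers in one inventory series.

    Assumption: a 'negative' outlier is a point that Isolation Forest flags
    as anomalous AND that lies below the series median.
    """
    values = np.asarray(inventory, dtype=float).reshape(-1, 1)
    model = IsolationForest(contamination=contamination, random_state=random_state)
    labels = model.fit_predict(values)  # -1 marks an outlier, 1 an inlier
    is_outlier = labels == -1
    below_median = values.ravel() < np.median(values)
    return is_outlier & below_median


def dtw_distance(a, b):
    """Dynamic-programming DTW distance between two 1-D series.

    Uses absolute difference as the local cost and no warping window
    (both are assumptions for this sketch).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    cost = np.full((len(a) + 1, len(b) + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor paths.
            cost[i, j] = step + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[len(a), len(b)]
```

A pairwise matrix of such DTW distances could then be passed to a clustering routine, and the boolean mask could serve as the binary target for a per-cluster classifier, though the abstract does not specify how those steps were configured.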

Keywords

Machine Learning; Supply Chain; Inventory Management; Consumer-Packaged Goods; Spatiotemporal; Outlier

Available for download on Thursday, May 01, 2025
