Date of Graduation


Document Type


Degree Name

Bachelor of Science in Data Science

Degree Level



Data Science


Schubert, Karl

Committee Member/Reader

Buttle, Casey

Committee Member/Second Reader

Mitchell, Rachael


Coca-Cola is a popular soft drink brand sold in every Walmart store worldwide, which generates large quantities of data and requires a robust supply chain system. However, the company does not currently have a sophisticated, automated, or prescriptive system for detecting where, when, and why inventory outages occur and for applying preventative measures to avoid the revenue lost when products are absent from store shelves. This thesis proposes and applies a novel prescriptive system for this purpose. An inventory outage can be seen as a ‘negative’ statistical outlier in the inventory time series for an item at a Walmart location; to this end, an outlier-tagging method called Isolation Forest is used. To address the “where” and “when” (spatiotemporal) aspect of the problem, Dynamic Time Warping (DTW) clustering is performed to create groups of Coca-Cola brands that behave similarly with regard to inventory fluctuations in Walmart stores across the United States. To address the “why” aspect of the problem, a Random Forest model is trained for each cluster to predict the binary ‘negative’ outlier indicator, yielding a ranked list of variable importances that explains why inventory outages occur (decomposition). The entire process is automated in Alteryx workflows using Coca-Cola’s Databricks computing power, so that supply chain managers can, with minimal upkeep, take preventative action based on the identified variables to ensure consistent product shelf presence in Walmart and, potentially, other retail stores.
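As an illustration of the DTW step mentioned in the abstract, the following is a minimal sketch of the textbook dynamic time warping distance between two inventory series. It is not the thesis's implementation, which is not described on this page; the function name `dtw_distance`, the absolute-difference cost, and the sample inventory values are all assumptions made for illustration. A clustering step would typically be run on pairwise distances of this kind.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook DTW distance between two numeric sequences.

    Uses absolute difference as the local cost (an assumption for this
    sketch) and runs in O(len(a) * len(b)) time.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Hypothetical weekly inventory levels (illustrative values, not thesis data)
brand_a = [10, 10, 4, 0, 6, 10, 10]  # outage in week 4
brand_b = [10, 10, 10, 4, 0, 6, 10]  # same outage shape, one week later
brand_c = [10, 9, 10, 9, 10, 9, 10]  # no outage

d_ab = dtw_distance(brand_a, brand_b)
d_ac = dtw_distance(brand_a, brand_c)
print(d_ab, d_ac)
```

In this toy example, `brand_a` and `brand_b` have a DTW distance of 0 because warping absorbs the one-week shift, whereas `brand_c` is strictly farther from `brand_a`. This is the property that makes DTW a natural choice for grouping series with similar but time-shifted inventory fluctuations.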


Machine Learning, Supply Chain, Inventory Management, Consumer-Packaged Goods, Spatiotemporal, Outlier

Available for download on Thursday, May 01, 2025