Date of Graduation

5-2026

Document Type

Thesis

Degree Name

Bachelor of Science in Data Science

Degree Level

Undergraduate

Department

Data Science

Advisor/Mentor

Karl Schubert

Committee Member

Raj Rao

Abstract

Fresh-item forecasting at warehouse retailers must balance two competing risks: under-ordering, which produces empty shelves and lost sales, and over-ordering, which drives shrink and margin loss. Sam’s Club currently produces a daily demand forecast for every fresh stock-keeping unit (SKU) at every club, yet store managers see only a single number with no rationale. When that number is wrong, managers cannot easily diagnose why or decide how to react. This thesis presents the design, implementation, and evaluation of an explainable, conversational decision-support agent that augments the existing forecast with anomaly detection, retrieval-augmented generation (RAG), and a large language model (LLM) reasoning layer. Using a 388-row, 106-feature dataset built from Sam’s Club’s production forecast pipeline (Store 4109, item 53061055 “SANDWICH CROISSANTS 12CT,” December 2024 through March 2026), the system compares four candidate forecasts (Original, Naive, Prophet, and XGBoost) and selects the best one for each SKU-day. A Streamlit application presents this comparison to store managers through a five-tab interface, and a local Llama 3.2 model, accessed through Ollama and orchestrated with LangChain, provides natural-language explanations grounded in a FAISS vector store of historical anomaly summaries. On the modeled subset (n = 121 SKU-days), the system kept actual sales within the prediction interval on 97.5% of days and flagged an over- or under-forecast on 19.8% of days. Had managers followed the selected-model recommendation, the system would have converted an estimated $1,508 of original lost sales into a $4,949 surplus through Prophet-led downward adjustments. The deliverable is a working, end-to-end pipeline that demonstrates how prompt engineering, RAG, and LLM orchestration can translate opaque forecast numbers into auditable, actionable explanations for non-technical store operators.
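
The two quantitative ideas in the abstract, prediction-interval coverage and per-day selection among candidate forecasts, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not state the selection rule, so the trailing mean-absolute-error rule, the function names, the window length, and the toy numbers are all hypothetical and are not taken from the thesis.

```python
from statistics import mean

def interval_coverage(actuals, lowers, uppers):
    """Fraction of days whose actual sales fall inside [lower, upper]."""
    hits = [lo <= a <= hi for a, lo, hi in zip(actuals, lowers, uppers)]
    return sum(hits) / len(hits)

def select_model(actuals, forecasts, day, window=7):
    """Pick the candidate with the lowest mean absolute error over the
    `window` days before `day`. This rule is a hypothetical stand-in,
    not the selection method used in the thesis."""
    start = max(0, day - window)
    if start == day:
        # No history yet: fall back to the existing production forecast.
        return "Original"

    def trailing_mae(name):
        return mean(abs(forecasts[name][t] - actuals[t]) for t in range(start, day))

    return min(forecasts, key=trailing_mae)

# Toy data, invented for illustration only (not Sam's Club data).
actuals = [10, 12, 11, 13, 12]
forecasts = {
    "Original": [14, 15, 15, 16, 15],
    "Naive":    [10, 10, 12, 11, 13],
    "Prophet":  [11, 12, 11, 12, 12],
    "XGBoost":  [9, 13, 13, 12, 14],
}
lowers = [8, 9, 12, 10, 9]
uppers = [13, 14, 15, 15, 14]

coverage = interval_coverage(actuals, lowers, uppers)
best = select_model(actuals, forecasts, day=4)
print(f"coverage={coverage:.1%}, selected model for day 4: {best}")
```

Selecting on trailing error uses only information available before the day being forecast, which avoids choosing a model with knowledge of that day's actual sales; other rules, such as validation-set error per SKU, would fit the same interface.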

Keywords

AI; Time-Series; Retail; LLM; RAG

Available for download on Friday, May 04, 2029

Included in

Data Science Commons
