Date of Graduation
5-2025
Document Type
Thesis
Degree Name
Bachelor of Science in Data Science
Degree Level
Undergraduate
Department
Data Science
Advisor/Mentor
Dr. Karl D. Schubert
Committee Member
Dr. Elizabeth Keiffer
Second Committee Member
Dr. Carole Shook
Abstract
Traditional static pricing models in last-mile delivery platforms often fail to adapt to fluctuating market conditions, leading to driver shortages or inflated operational costs. To address this challenge, we propose a Multi-Objective, Multi-Armed Bandit (MOMAB) learning system for dynamic driver pricing that learns from real-time market feedback. Using a Multi-Armed Bandit (MAB) framework with an epsilon-greedy exploration strategy, the system iteratively adjusts driver incentive multipliers to balance cost efficiency against driver acceptance rates. The result is a responsive, regionally adaptive pricing structure aligned with current demand across diverse delivery environments, such as those seen on other crowd-sourced platforms like DoorDash or Uber.
Because access to proprietary data from Company A was limited, we generated a synthetic dataset to simulate realistic delivery scenarios. This dataset captures key trip-level attributes, including regional conditions, delivery complexity, and fluctuating demand, providing a controlled environment for model development and testing. The first phase of the study trained a baseline MAB agent with a multi-objective reward function optimized for cost and driver acceptance. Phase 2 integrated the model with our Synthetic Data, Environment, and Plotting classes. Performance was assessed through cumulative regret and pull counts, with emphasis on maintaining a minimum acceptance rate while minimizing cost.
Results demonstrate that the learned policy meets the required driver acceptance threshold while significantly reducing regret compared to a static pricing baseline. Overall, this research delivers a technically grounded, data-driven framework for dynamic pricing in last-mile logistics, offering a scalable solution for improving efficiency, enhancing driver engagement, and maintaining operational cost-effectiveness across on-demand delivery systems.
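To make the approach described in the abstract concrete, the following is a minimal sketch of an epsilon-greedy bandit over driver pay multipliers that tracks pull counts and cumulative regret. It is not the thesis implementation: the function name, the multiplier grid, the acceptance probabilities, and the weighted-sum reward (acceptance minus a cost penalty) are all illustrative assumptions, and the minimum-acceptance constraint mentioned above is not enforced here.

```python
import random


def run_epsilon_greedy(multipliers, accept_probs, epsilon=0.1,
                       cost_weight=0.5, n_rounds=5000, seed=0):
    """Simulate epsilon-greedy pricing; all inputs are hypothetical.

    multipliers  : candidate pay multipliers (one arm each)
    accept_probs : assumed driver acceptance probability per arm
    Returns (pull_counts, value_estimates, cumulative_regret_curve).
    """
    rng = random.Random(seed)
    n_arms = len(multipliers)
    max_mult = max(multipliers)

    def reward_if_accepted(arm):
        # Assumed scalarized reward: 1 for an accepted trip, minus a
        # cost penalty that grows with the multiplier.
        return 1.0 - cost_weight * multipliers[arm] / max_mult

    # True expected reward per arm, used only to measure regret.
    expected = [accept_probs[a] * reward_if_accepted(a) for a in range(n_arms)]
    best_expected = max(expected)

    counts = [0] * n_arms
    estimates = [0.0] * n_arms
    regret_curve = []
    cumulative_regret = 0.0

    for _ in range(n_rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: random multiplier
        else:
            # exploit: multiplier with the highest estimated reward
            arm = max(range(n_arms), key=lambda a: estimates[a])

        accepted = rng.random() < accept_probs[arm]
        reward = reward_if_accepted(arm) if accepted else 0.0

        # Incremental sample-mean update of the chosen arm's estimate.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

        cumulative_regret += best_expected - expected[arm]
        regret_curve.append(cumulative_regret)

    return counts, estimates, regret_curve


if __name__ == "__main__":
    mults = [1.0, 1.2, 1.4, 1.6, 1.8]        # hypothetical multipliers
    probs = [0.20, 0.45, 0.80, 0.85, 0.88]   # hypothetical acceptance rates
    counts, estimates, regret = run_epsilon_greedy(mults, probs)
    print("pull counts:", counts)
    print("final cumulative regret:", round(regret[-1], 2))
```

A static pricing baseline can be approximated in the same setup by always pulling one fixed arm, which gives a regret curve that grows linearly unless that arm happens to be the best one.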
Keywords
Multi-Objective; Multi-Armed; Bandit; Epsilon-Greedy; Pricing; Optimization
Citation
Boston, W. J. (2025). Dynamic Driver Pricing Optimization: A Multi-Objective, Multi-Armed Bandit Approach in Last-Mile Logistics. Data Science Undergraduate Honors Theses. Retrieved from https://scholarworks.uark.edu/dtscuht/44