Data Science Undergraduate Honors Theses

Dynamic Driver Pricing Optimization: A Multi-Objective, Multi-Armed Bandit Approach in Last-Mile Logistics

Winston J. Boston

Date of Graduation

5-2025

Document Type

Thesis

Degree Name

Bachelor of Science in Data Science

Degree Level

Undergraduate

Department

Data Science

Advisor/Mentor

Dr. Karl D. Schubert

Committee Member

Dr. Elizabeth Keiffer

Second Committee Member

Dr. Carole Shook

Abstract

Traditional static pricing models in last-mile delivery platforms often fail to adapt to fluctuating market conditions, leading to driver shortages or inflated operational costs. To address this challenge, we propose a Multi-Objective, Multi-Armed Learning System (MOMAB) for dynamic driver pricing that learns from real-time market feedback. Leveraging a Multi-Armed Bandit (MAB) framework with an epsilon-greedy exploration strategy, the system iteratively adjusts driver incentive multipliers to balance cost efficiency with driver acceptance rates. The result is a responsive, regionally adaptive pricing structure aligned with current demand across diverse delivery environments, such as those seen in other crowd-sourced platforms like DoorDash or Uber.
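As a rough illustration of the approach described above, the following sketch shows one way an epsilon-greedy agent could choose among incentive multipliers using a single scalarized reward. It is not the thesis implementation: the multiplier values, the cost weight, the acceptance-probability curve, and all names (MULTIPLIERS, scalarized_reward, EpsilonGreedyAgent, simulate) are assumptions made for this example.

```python
import random

# Hypothetical candidate incentive multipliers (the "arms"); not from the thesis.
MULTIPLIERS = [1.0, 1.1, 1.2, 1.3, 1.5]


def scalarized_reward(accepted, multiplier, cost_weight=0.5):
    """Fold both objectives into one number (an assumed weighting).

    Acceptance contributes 1.0; cost is penalized in proportion to how far
    the multiplier sits above the base rate of 1.0.
    """
    acceptance_term = 1.0 if accepted else 0.0
    cost_term = cost_weight * (multiplier - 1.0)
    return acceptance_term - cost_term


class EpsilonGreedyAgent:
    def __init__(self, n_arms, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = [0] * n_arms    # pull count per arm
        self.values = [0.0] * n_arms  # running mean reward per arm

    def select_arm(self):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.values))
        return max(range(len(self.values)), key=lambda i: self.values[i])

    def update(self, arm, reward):
        # Incremental update of the running mean for the pulled arm.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def simulate(n_rounds=5000, seed=0):
    """Toy environment: acceptance probability rises with the multiplier."""
    env_rng = random.Random(seed + 1)
    agent = EpsilonGreedyAgent(len(MULTIPLIERS), epsilon=0.1, seed=seed)
    for _ in range(n_rounds):
        arm = agent.select_arm()
        multiplier = MULTIPLIERS[arm]
        # Assumed acceptance curve: 0.4 at 1.0x, rising linearly, capped at 1.0.
        accept_prob = min(1.0, 0.4 + (multiplier - 1.0))
        accepted = env_rng.random() < accept_prob
        agent.update(arm, scalarized_reward(accepted, multiplier))
    return agent


if __name__ == "__main__":
    trained = simulate()
    for multiplier, count, value in zip(MULTIPLIERS, trained.counts, trained.values):
        print(f"{multiplier:.1f}x  pulls={count}  mean_reward={value:.3f}")
```

A weighted sum is only one way to combine cost and acceptance; a constraint on minimum acceptance rate, as mentioned in the abstract, could instead be enforced by filtering out arms whose estimated acceptance falls below the threshold.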

Due to limited access to proprietary data from Company A, we generated a synthetic dataset to simulate realistic delivery scenarios. This dataset captures key trip-level attributes, including regional conditions, delivery complexity, and fluctuating demand, providing a controlled environment for model development and testing. The study first trained a baseline MAB agent with a multi-objective reward function optimized for cost and driver acceptance. Phase 2 extended the model by integrating it with our Synthetic Data, Environment, and Plotting classes. Performance was assessed through cumulative regret and pull counts, with emphasis on maintaining a minimum acceptance rate while minimizing cost.
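The two evaluation measures named above can be sketched as follows. This is an illustrative example rather than the thesis code: it assumes the expected reward of each arm is known, which is plausible only in a synthetic environment, and the function names and sample values are hypothetical.

```python
def cumulative_regret(chosen_arms, expected_rewards):
    """Running sum of (best expected reward - expected reward of the chosen arm)."""
    best = max(expected_rewards)
    total = 0.0
    curve = []
    for arm in chosen_arms:
        total += best - expected_rewards[arm]
        curve.append(total)
    return curve


def pull_counts(chosen_arms, n_arms):
    """Number of times each arm was selected."""
    counts = [0] * n_arms
    for arm in chosen_arms:
        counts[arm] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical expected rewards for three arms and a short selection history.
    expected = [0.5, 0.75, 1.0]
    learned_history = [0, 2, 1, 2]
    static_history = [0, 0, 0, 0]  # a static baseline that never changes arm

    print(cumulative_regret(learned_history, expected))
    print(cumulative_regret(static_history, expected))
    print(pull_counts(learned_history, len(expected)))
```

A regret curve that flattens over time indicates the agent is concentrating its pulls on the best arm, whereas a static policy stuck on a suboptimal arm accumulates regret linearly.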

Results demonstrate that the learned policy meets the required driver acceptance threshold while significantly reducing regret compared to a static pricing baseline. Overall, this research delivers a technically grounded, data-driven framework for dynamic pricing in last-mile logistics, offering a scalable solution for improving efficiency, enhancing driver engagement, and maintaining operational cost-effectiveness across on-demand delivery systems.

Keywords

Multi-Objective; Multi-Armed; Bandit; Epsilon-Greedy; Pricing; Optimization

Citation

Boston, W. J. (2025). Dynamic Driver Pricing Optimization: A Multi-Objective, Multi-Armed Bandit Approach in Last-Mile Logistics. Data Science Undergraduate Honors Theses. Retrieved from https://scholarworks.uark.edu/dtscuht/44
