Date of Graduation


Document Type


Degree Name

Bachelor of Science

Degree Level



Computer Science and Computer Engineering


Andrews, David

Committee Member/Reader

Gauch, John

Committee Member/Second Reader

Streeter, Lora

Committee Member/Third Reader

Andrews, David


The focus of this project was to shorten the time it takes to train reinforcement learning agents to perform better than humans in a sparse reward environment. Finding a general purpose solution to this problem is essential to creating agents in the future capable of managing large systems or performing a series of tasks before receiving feedback. The goal of this project was to create a transition function between an imitation learning algorithm (also referred to as a behavioral cloning algorithm) and a reinforcement learning algorithm. The goal of this approach was to allow an agent to first learn to do a task by mimicking human actions through the imitation learning algorithm and then learn to do the task better or faster than humans by training with the reinforcement learning algorithm. This project utilizes Unity3D to model a sparse reward environment and allow use of the ml-agents toolkit provided by Unity3D. The toolkit provided by Unity3D is an open source project that does not maintain documentation for past versions of the software, so recent large changes to the use of the ml-agents tools in code caused significant delays in the achievement of the larger goal of this project. Therefore, this paper outlines the theoretical approach to the problem and some of its implementation in Unity3D. This will provide a comprehensive overview of the common tools used to train agents in sparse reward environments particularly in a video game environment.


game design, machine learning, reinforcement learning, sparse reward, Unity3D