Date of Graduation
5-2018
Document Type
Thesis
Degree Name
Master of Science in Computer Science (MS)
Degree Level
Graduate
Department
Computer Science & Computer Engineering
Advisor/Mentor
Gashler, Michael S.
Committee Member
Beavers, M. Gordon
Second Committee Member
Wu, Xintao
Keywords
Deep Learning; Exploration Strategy; Machine Learning; Reinforcement Learning
Abstract
We propose a simple and efficient modification to the Asynchronous Advantage Actor Critic (A3C)
algorithm that improves training. In 2016, Google’s DeepMind set a new standard for state-of-the-art
reinforcement learning performance with the introduction of the A3C algorithm. The goal of
this research is to show that A3C can be improved by a novel exploration strategy we
call “Follow then Forage Exploration” (FFE). FFE forces each agent to follow the best known path
at the beginning of a training episode; later in the episode, the agent is forced to “forage,”
exploring randomly. In tests against A3C implemented using OpenAI’s Universe-Starter-Agent,
FFE reached the maximum score faster on average.
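To make the schedule concrete, below is a minimal sketch of the follow-then-forage idea described above, assuming a classic Gym-style environment interface; the names run_ffe_episode, best_known_actions, and follow_fraction are hypothetical illustrations, and the thesis itself integrates FFE into A3C's training loop rather than a standalone episode runner.

# A minimal sketch of the Follow then Forage idea, assuming a Gym-style
# environment. run_ffe_episode, best_known_actions, and follow_fraction
# are hypothetical names chosen for illustration, not from the thesis.
def run_ffe_episode(env, best_known_actions, max_steps, follow_fraction=0.5):
    """Follow the best known action sequence early in the episode,
    then "forage" by sampling actions at random for the remainder."""
    switch_step = int(max_steps * follow_fraction)  # where following ends
    env.reset()
    total_reward = 0.0
    for t in range(max_steps):
        if t < switch_step and t < len(best_known_actions):
            action = best_known_actions[t]      # follow phase: replay best path
        else:
            action = env.action_space.sample()  # forage phase: explore randomly
        _, reward, done, _ = env.step(action)
        total_reward += reward
        if done:
            break
    return total_reward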
Citation
Holliday, J. B. (2018). Improving Asynchronous Advantage Actor Critic with a More Intelligent Exploration Strategy. Graduate Theses and Dissertations. Retrieved from https://scholarworks.uark.edu/etd/2689
Included in
Artificial Intelligence and Robotics Commons, Graphics and Human Computer Interfaces Commons