Date of Graduation
Master of Science in Computer Science (MS)
Computer Science & Computer Engineering
Second Committee Member
Deep Learning, Exploration Strategy, Machine Learning, Reinforcement Learning
We propose a simple and efficient modification to the Asynchronous Advantage Actor Critic (A3C)
algorithm that improves training. In 2016 Google’s DeepMind set a new standard for state-of-the-art
reinforcement learning performance with the introduction of the A3C algorithm. The goal of
this research is to show that A3C can be improved by a novel exploration strategy we
call “Follow then Forage Exploration” (FFE). FFE forces the agent to follow the best known path
at the beginning of a training episode; later in the episode, the agent is forced to “forage”
and explore randomly. In tests against A3C implemented with OpenAI’s Universe-Starter-Agent,
FFE reached the maximum score faster on average.
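The follow-then-forage idea described above can be sketched as a per-step action rule: exploit the best known action early in the episode, then switch to random exploration. This is a minimal illustrative sketch, not the thesis implementation; the function name `ffe_action` and the switch-point parameter `follow_steps` are assumptions introduced here.

```python
import random


def ffe_action(q_values, step, follow_steps):
    """Hypothetical sketch of Follow then Forage Exploration (FFE).

    Early in the episode (step < follow_steps) the agent "follows"
    the best known action; afterward it "forages" by choosing
    uniformly at random. `follow_steps` is an assumed parameter.
    """
    if step < follow_steps:
        # Follow phase: exploit the action with the highest value estimate.
        return max(range(len(q_values)), key=lambda a: q_values[a])
    # Forage phase: explore uniformly at random over all actions.
    return random.randrange(len(q_values))
```

In a full A3C setting the "best known path" would come from the learned policy rather than a static value table, but the episode-position-dependent switch between exploitation and exploration is the core of the strategy.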
Holliday, James B., "Improving Asynchronous Advantage Actor Critic with a More Intelligent Exploration Strategy" (2018). Theses and Dissertations. 2689.