Term of Award
Master of Science in Mathematics (M.S.)
Document Type and Release Option
Thesis (open access)
Department of Mathematical Sciences
In Markov decision processes, an operator exploits known data about the environment it inhabits. That information is learned through random exploration of the state-action space. This paper proposes to optimize exploration by using quasi-random sequences in both discrete and continuous state-action spaces. In the discrete case, a permutation is applied to the indices of the action space to avoid repetitive behavior. In the continuous case, low-discrepancy sequences, such as Halton sequences, are used to disperse the actions more uniformly.
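The two ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The function names, the choice of bases 2 and 3, and the use of a uniform random shuffle for the discrete case are all assumptions made for illustration only.

```python
import random


def radical_inverse(index: int, base: int) -> float:
    """Van der Corput radical inverse: mirror the base-`base` digits of
    `index` about the radix point, giving a value in [0, 1)."""
    result = 0.0
    fraction = 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton_point(index: int, bases=(2, 3)) -> tuple:
    """Point number `index` (starting at 1) of a Halton sequence.
    One pairwise-coprime base per dimension; bases 2 and 3 are assumed here."""
    return tuple(radical_inverse(index, base) for base in bases)


def scale_to_bounds(point, lows, highs) -> tuple:
    """Map a point in the unit cube onto a box-shaped continuous action space."""
    return tuple(lo + u * (hi - lo) for u, lo, hi in zip(point, lows, highs))


def permuted_action_order(num_actions: int, rng: random.Random) -> list:
    """Discrete case (assumed reading): visit every action index exactly once,
    in a shuffled order, before any index repeats."""
    order = list(range(num_actions))
    rng.shuffle(order)
    return order


if __name__ == "__main__":
    # Continuous case: five exploratory actions in the box [-1, 1] x [0, 10].
    for i in range(1, 6):
        print(scale_to_bounds(halton_point(i), lows=(-1.0, 0.0), highs=(1.0, 10.0)))

    # Discrete case: one exploration sweep over four action indices.
    print(permuted_action_order(4, random.Random(0)))
```

With bases 2 and 3, the first three points of the sequence are (1/2, 1/3), (1/4, 2/3), and (3/4, 1/9). Each new point falls into a gap left by the earlier ones, which is the sense in which a low-discrepancy sequence covers the space more evenly than independent uniform draws.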
Walker, Samuel. "Quasi-Random Action Selection In Markov Decision Processes". November, 2017.