Term of Award

Fall 2017

Degree Name

Master of Science in Mathematics (M.S.)

Document Type and Release Option

Thesis (open access)

Department of Mathematical Sciences

Committee Chair

Stephen Carden

Committee Member 1

Arpita Chatterjee

Committee Member 2

Scott Kersey

Committee Member 3

Ionut Iacob

Abstract

In Markov decision processes, an operator exploits known data regarding the environment it inhabits. The information exploited is learned through random exploration of the state-action space. This paper proposes optimizing exploration by implementing quasi-random sequences in both discrete and continuous state-action spaces. In the discrete case, a permutation is applied to the indices of the action space to avoid repetitive behavior. In the continuous case, low-discrepancy sequences, such as Halton sequences, are used to disperse the actions more uniformly.
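A minimal Python sketch of the two exploration ideas the abstract describes — this is illustrative only, not the thesis code: Halton points (built from the van der Corput radical inverse) spread exploratory actions over a continuous action space more uniformly than i.i.d. uniform draws, while a shuffled index permutation varies the order in which discrete actions are tried. The action-space bounds and dimensions are assumed for the example.

```python
import random

def radical_inverse(n, base):
    """Van der Corput radical inverse: reflect the base-b digits of n about the radix point."""
    inv, denom = 0.0, 1.0
    while n > 0:
        denom *= base
        n, rem = divmod(n, base)
        inv += rem / denom
    return inv

def halton(n_points, bases=(2, 3)):
    """First n_points of the Halton sequence in len(bases) dimensions, each in (0, 1)."""
    return [tuple(radical_inverse(i, b) for b in bases)
            for i in range(1, n_points + 1)]

# Continuous case: map low-discrepancy points onto a 2-D action space [lo, hi]^2
# (bounds are assumed here for illustration).
lo, hi = -1.0, 1.0
actions = [tuple(lo + (hi - lo) * x for x in pt) for pt in halton(8)]

# Discrete case: a random permutation of the action indices avoids cycling
# through the actions in the same repetitive order during exploration.
action_indices = list(range(5))
random.shuffle(action_indices)
```

Unlike pseudo-random uniform draws, consecutive Halton points deliberately avoid each other, so even a short exploration run covers the action space with low discrepancy.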

Research Data and Supplementary Material