Regression Tree Construction for Reinforcement Learning Problems With a General Action Space

Anthony S. Bush Jr, Georgia Southern University


Part of implementing reinforcement learning is constructing a regression of values against states and actions and then using that regression model to optimize over actions for a given state. One common regression technique is the decision tree or, for continuous input, the regression tree. In this setting we fix the state and optimize over actions; however, standard regression trees do not easily optimize over a subset of the input variables~\cite{Card1993}. The technique we propose in this thesis is a hybrid of regression trees and kernel regression. First, a regression tree splits over the state variables at a macro level; then kernel regression models the effects of actions with a smooth function at a micro level. Non-linear optimization is then applied to the kernel-regressed function to find the best action and obtain a precise prediction of its value for any given state. This ``best action'' is stored in the tree and can be retrieved instantly when making decisions. The approach is more appropriate not only for problems with continuous output but also for problems with discrete output, since it generalizes knowledge over actions as well as states, allowing for smarter decision-making. The capabilities of this technique are demonstrated on a time series constructed to realistically model a stock problem.
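
The pipeline described above can be illustrated with a minimal sketch. It is not the thesis implementation, and every name in it (`fit_state_split`, `kernel_value`, `best_action`, `build_policy`, `act`, `toy_value`) is hypothetical. It assumes a scalar state with at least two distinct values, a scalar action in a bounded interval, a tree reduced to a single split, a Gaussian Nadaraya-Watson estimator as the kernel regression, and a dense grid search standing in for the non-linear optimizer. The toy data are synthetic and chosen only so the result is easy to check.

```python
import math

def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def fit_state_split(data):
    """Macro level: pick the threshold on the scalar state that minimizes
    the summed squared error of observed values in the two leaves.
    Assumes `data` holds at least two distinct state values."""
    states = sorted({s for s, _, _ in data})
    best_t, best_cost = None, float("inf")
    for lo_s, hi_s in zip(states, states[1:]):
        t = 0.5 * (lo_s + hi_s)  # candidate threshold between neighbors
        cost = (sse([v for s, _, v in data if s < t])
                + sse([v for s, _, v in data if s >= t]))
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t

def kernel_value(action, samples, bandwidth):
    """Micro level: Nadaraya-Watson estimate of value at `action`,
    using a Gaussian kernel over (action, value) samples (an assumption)."""
    weights = [math.exp(-0.5 * ((action - a) / bandwidth) ** 2)
               for a, _ in samples]
    total = sum(weights)
    if total == 0.0:  # query is too far from every sample
        return float("-inf")
    return sum(w * v for w, (_, v) in zip(weights, samples)) / total

def best_action(samples, bandwidth, lo, hi, grid=201):
    """Maximize the smoothed value over [lo, hi]. A dense grid search is
    used here as a simple stand-in for a non-linear optimizer."""
    candidates = [lo + (hi - lo) * i / (grid - 1) for i in range(grid)]
    a_star = max(candidates, key=lambda a: kernel_value(a, samples, bandwidth))
    return a_star, kernel_value(a_star, samples, bandwidth)

def build_policy(data, bandwidth=0.1, lo=-1.0, hi=1.0):
    """Split on state, then store each leaf's (best action, predicted value)."""
    t = fit_state_split(data)
    left = [(a, v) for s, a, v in data if s < t]
    right = [(a, v) for s, a, v in data if s >= t]
    return {"threshold": t,
            "left": best_action(left, bandwidth, lo, hi),
            "right": best_action(right, bandwidth, lo, hi)}

def act(policy, state):
    """Decision time: look up the stored best action for the state's leaf."""
    leaf = "left" if state < policy["threshold"] else "right"
    return policy[leaf][0]

def toy_value(state, action):
    """Synthetic value function: the best action and the value level both
    depend on which half of the state range we are in."""
    target, offset = (-0.5, 0.0) if state < 0.5 else (0.5, 2.0)
    return offset - (action - target) ** 2

# (state, action, value) triples on a small regular grid
data = [(s, a, toy_value(s, a))
        for s in (0.0, 0.25, 0.75, 1.0)
        for a in (i / 10 - 1 for i in range(21))]
policy = build_policy(data)
```

In this sketch, the expensive optimization happens once per leaf inside `build_policy`, so `act(policy, state)` is only a threshold comparison and a lookup, which mirrors the idea of storing the best action in the tree. A full tree would apply the split recursively over several state variables, and a continuous optimizer could replace the grid search.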