26/10/2020

Solving K-MDPs

Jonathan Ferrer-Mestres, Thomas G. Dietterich, Olivier Buffet, Iadine Chadès

Keywords: state abstraction, interpretability, MDP, computational sustainability

Abstract: Markov Decision Processes (MDPs) are employed to model sequential decision-making problems under uncertainty. Traditionally, algorithms to solve MDPs have focused on solving large state or action spaces. With increasing applications of MDPs to human-operated domains such as conservation of biodiversity and health, developing easy-to-interpret solutions is of paramount importance to increase uptake of MDP policies. Here, we define the problem of solving $K$-MDPs, i.e., given an original MDP and a constraint on the number of states ($K$), generate a reduced state space MDP that minimizes the difference between the original optimal MDP value function and the reduced optimal $K$-MDP value function. Building on existing non-transitive and transitive approximate state abstraction functions, we propose a family of three algorithms based on binary search with sub-optimality bounded polynomially in a precision parameter: \KILP{}, \Qd{} and \ad{}. We compare these algorithms to a greedy algorithm (\greedy) and clustering approach (\kmeans). On randomly generated MDPs and two computational sustainability MDPs, \ad{} outperformed all algorithms when it could find a feasible solution. While numerous state abstraction problems have been proposed in the literature, we believe this is the first time that the general problem of solving $K$-MDPs is suggested. We hope that our work will generate future research aiming at increasing the interpretability of MDP policies in human-operated domains.

 0
 0
 0
 0
This is an embedded video. Talk and the respective paper are published at ICAPS 2020 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment
no comments yet
code of conduct: tbd Characters remaining: 140

Similar Papers