Markov Decision Processes (MDPs) are extensively used to encode sequences of decisions with probabilistic effects. Markov Decision Processes with Imprecise Probabilities (MDPIPs) encode sequences of decisions whose effects are modeled using sets of probability distributions. In this paper we examine the computation of Gamma-maximin policies for MDPIPs using multilinear and integer programming. We discuss the application of our algorithms to "factored" models and to a recent proposal, Markov Decision Processes with Set-valued Transitions (MDPSTs), that unifies the fields of probabilistic and "nondeterministic" planning in artificial intelligence research.
Keywords. Markov Decision Processes with Imprecise Probabilities, maximin criterion, multilinear and integer programming.
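To make the Gamma-maximin criterion concrete, the following is a minimal sketch, not the multilinear or integer programming formulation examined in the paper. It swaps in plain value iteration and assumes that each credal set is given by a finite list of vertex distributions, so the inner minimization (linear in the distribution) is attained at a vertex. The function name, the data layout, the toy two-state model, and the discount factor are all illustrative assumptions.

```python
def gamma_maximin_value_iteration(states, actions, reward, credal,
                                  discount=0.9, tol=1e-8, max_iter=10_000):
    """Worst-case (Gamma-maximin) value iteration for a small MDPIP.

    Assumption: credal[(s, a)] is a list of vertex distributions, each a
    dict mapping next state -> probability. This layout is hypothetical.
    """
    def q_value(s, a, V):
        # Nature picks the vertex that minimizes the expected next value.
        worst = min(sum(p[t] * V[t] for t in p) for p in credal[(s, a)])
        return reward[(s, a)] + discount * worst

    V = {s: 0.0 for s in states}
    for _ in range(max_iter):
        new_V = {s: max(q_value(s, a, V) for a in actions) for s in states}
        delta = max(abs(new_V[s] - V[s]) for s in states)
        V = new_V
        if delta < tol:
            break

    # Greedy policy with respect to the final worst-case values.
    policy = {s: max(actions, key=lambda a: q_value(s, a, V)) for s in states}
    return V, policy


# Toy model (illustrative only): state "b" is absorbing with zero reward.
states = ["a", "b"]
actions = ["stay", "risky"]
reward = {("a", "stay"): 1.0, ("a", "risky"): 5.0,
          ("b", "stay"): 0.0, ("b", "risky"): 0.0}
credal = {
    ("a", "stay"): [{"a": 1.0, "b": 0.0}],
    # Probability of remaining in "a" is only known to lie in [0.2, 0.8].
    ("a", "risky"): [{"a": 0.2, "b": 0.8}, {"a": 0.8, "b": 0.2}],
    ("b", "stay"): [{"a": 0.0, "b": 1.0}],
    ("b", "risky"): [{"a": 0.0, "b": 1.0}],
}

V, policy = gamma_maximin_value_iteration(states, actions, reward, credal)
print(V["a"], policy["a"])
```

In this toy model the worst-case value of "risky" in state "a" is below that of "stay", even though the most favorable distribution in the credal set would make "risky" the better choice; the Gamma-maximin policy therefore selects "stay".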
Paper Download
The paper is available in the following formats:
Authors' addresses:
Ricardo Shirota Filho
Laboratório de Tomada de Decisão
A/C Prof. Dr. Fabio G. Cozman
Escola Politécnica da USP
Av. Prof. Mello Moraes, 2231
CEP: 05356-000
São Paulo, SP, BRAZIL
Fabio Cozman
Av. Prof. Mello Moraes, 2231
Cidade Universitaria, CEP 05508-900
Sao Paulo, SP - BRAZIL
Felipe Trevizan
Despatx 398
Departamento de Tecnologia - UPF
Paseo de Circunvalacion, 8
08003 Barcelona, Spain
Cassio Campos
University of Sao Paulo
Sao Paulo, SP, Brazil
Leliane Barros
Departamento de Computacao
Instituto de Matematica e Estatistica
Cidade Universitaria, CEP 05508-900
Sao Paulo, SP
E-mail addresses:
Ricardo Shirota Filho | ricardo.shirota@poli.usp.br
Fabio Cozman | fgcozman@usp.br
Felipe Trevizan | felipe.trevizan@upf.edu
Cassio Campos | cassio@ime.usp.br
Leliane Barros | leliane@ime.usp.br