The variance-penalized stochastic shortest path problem

Publication: Contribution to book/conference proceedings/anthology/report > Contribution to conference proceedings > Contributed > Peer-reviewed

Abstract

The stochastic shortest path problem (SSPP) asks to resolve the non-deterministic choices in a Markov decision process (MDP) such that the expected accumulated weight before reaching a target state is maximized. This paper addresses the optimization of the variance-penalized expectation (VPE) of the accumulated weight, which is a variant of the SSPP in which a multiple of the variance of accumulated weights is incurred as a penalty. It is shown that the optimal VPE in MDPs with non-negative weights as well as an optimal deterministic finite-memory scheduler can be computed in exponential space. The threshold problem whether the maximal VPE exceeds a given rational is shown to be EXPTIME-hard and to lie in NEXPTIME. Furthermore, a result of interest in its own right obtained on the way is that a variance-minimal scheduler among all expectation-optimal schedulers can be computed in polynomial time.
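
To make the objective concrete, the following minimal sketch evaluates a variance-penalized expectation for a fixed, finite distribution of accumulated weights. It assumes the VPE takes the form E[X] - λ·Var[X] for a penalty factor λ > 0, which is one reading of "a multiple of the variance ... is incurred as a penalty". The function name, the two example schedulers, and all numbers are hypothetical illustrations and not taken from the paper; the sketch does not implement the paper's algorithms.

```python
from fractions import Fraction


def variance_penalized_expectation(outcomes, penalty):
    """Return E[X] - penalty * Var[X] for a finite distribution.

    `outcomes` is a list of (accumulated_weight, probability) pairs
    describing the weight collected before reaching the target under
    one fixed scheduler. All names here are illustrative assumptions.
    """
    total = sum(p for _, p in outcomes)
    if total != 1:
        raise ValueError("probabilities must sum to 1")
    expectation = sum(w * p for w, p in outcomes)
    second_moment = sum(w * w * p for w, p in outcomes)
    variance = second_moment - expectation ** 2
    return expectation - penalty * variance


# Two hypothetical schedulers in a toy model (made-up numbers):
# "safe" always collects weight 2; "risky" collects 0 or 5 with equal odds.
safe = [(Fraction(2), Fraction(1))]
risky = [(Fraction(0), Fraction(1, 2)), (Fraction(5), Fraction(1, 2))]

for penalty in (Fraction(1, 20), Fraction(1, 10)):
    print(
        penalty,
        variance_penalized_expectation(safe, penalty),
        variance_penalized_expectation(risky, penalty),
    )
```

In this toy instance the risky scheduler has the higher expectation (5/2 versus 2) but also a variance of 25/4. With penalty 1/20 it still wins (35/16 versus 2), while with penalty 1/10 the safe scheduler is better (2 versus 15/8), which illustrates how the penalty factor trades expectation against variance.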

Details

Original language: English
Title: 49th EATCS International Colloquium on Automata, Languages, and Programming, ICALP 2022
Editors: Mikołaj Bojańczyk, Emanuela Merelli, David P. Woodruff
Publisher: Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing
Pages: 129:1–129:19
Number of pages: 19
ISBN (electronic): 9783959772358
ISBN (print): 978-3-95977-235-8
Publication status: Published - 1 July 2022
Peer-review status: Yes

Publication series

Series: 49th EATCS International Colloquium on Automata, Languages, and Programming (ICALP 2022); Vol. 229
Volume: 229
ISSN: 1868-8969

Conference

Title: 49th EATCS International Colloquium on Automata, Languages and Programming
Short title: ICALP 2022
Duration: 4 - 8 July 2022
Degree of recognition: International event
Location: hybrid
City: Paris
Country: France

External IDs

Scopus 85133468318
dblp conf/icalp/PiribauerSB22
Mendeley 4b556c64-0cd5-3b56-a6aa-bf19263a8017
ORCID /0000-0002-5321-9343/work/142236696

Keywords

  • Markov decision process, variance, stochastic shortest path problem