Solving Robust Markov Decision Processes: Generic, Reliable, Efficient
Publication: Contribution to journal › Conference article › Contributed › Peer-reviewed
Contributors
Abstract
Markov decision processes (MDPs) are a well-established model for sequential decision-making in the presence of probabilities. In *robust* MDPs (RMDPs), every action is associated with an *uncertainty set* of probability distributions, modelling that transition probabilities are not known precisely. Based on the known theoretical connection to stochastic games, we provide a framework for solving RMDPs that is generic, reliable, and efficient. It is *generic* both with respect to the model, allowing for a wide range of uncertainty sets, including but not limited to intervals, L1- or L2-balls, and polytopes; and with respect to the objective, including long-run average reward, undiscounted total reward, and stochastic shortest path. It is *reliable*, as our approach not only converges in the limit, but provides precision guarantees at any time during the computation. It is *efficient* because, in contrast to state-of-the-art approaches, it avoids explicitly constructing the underlying stochastic game. Consequently, our prototype implementation outperforms existing tools by several orders of magnitude and can solve RMDPs with a million states in under a minute.
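To make the notion of an uncertainty set concrete, the following minimal Python sketch computes the worst-case expected value of a single action when each transition probability is only known to lie in an interval. It is an illustration of the standard greedy solution for interval uncertainty sets, not the framework described in the abstract; the function name `worst_case_expectation`, the state names, and all numbers are hypothetical.

```python
from typing import Dict, Tuple

Interval = Tuple[float, float]  # (lower, upper) bound on one transition probability


def worst_case_expectation(intervals: Dict[str, Interval],
                           values: Dict[str, float]) -> float:
    """Minimise sum_s p(s) * values[s] over all distributions p with
    lower <= p(s) <= upper for every successor s and sum_s p(s) = 1."""
    lower_total = sum(lo for lo, _ in intervals.values())
    upper_total = sum(hi for _, hi in intervals.values())
    if lower_total > 1.0 + 1e-12 or upper_total < 1.0 - 1e-12:
        raise ValueError("interval set contains no probability distribution")

    # Start from the lower bounds, then give the remaining probability
    # mass to the lowest-valued successors first (the adversary's choice).
    probs = {s: lo for s, (lo, _) in intervals.items()}
    remaining = 1.0 - lower_total
    for s in sorted(intervals, key=lambda t: values[t]):
        lo, hi = intervals[s]
        extra = min(hi - lo, remaining)
        probs[s] += extra
        remaining -= extra
    return sum(probs[s] * values[s] for s in intervals)


# Hypothetical action with two successors and imprecise probabilities.
intervals = {"goal": (0.6, 0.9), "sink": (0.1, 0.4)}
values = {"goal": 1.0, "sink": 0.0}
worst = worst_case_expectation(intervals, values)
```

In this toy instance the adversary pushes as much probability as the intervals allow towards `sink`, so the action is evaluated with probability 0.6 of reaching `goal`. A robust value iteration would repeat such an inner minimisation for every state-action pair; other uncertainty sets such as L1-balls or polytopes need a different inner optimisation.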
Details
| Original language | English |
|---|---|
| Pages (from - to) | 26631-26641 |
| Number of pages | 11 |
| Journal | Proceedings of the AAAI Conference on Artificial Intelligence |
| Volume | 39 |
| Issue number | 25 |
| Publication status | Published - 11 Apr 2025 |
| Peer-review status | Yes |
Conference
| Title | 39th AAAI Conference on Artificial Intelligence |
|---|---|
| Short title | AAAI-25 |
| Event number | 39 |
| Duration | 25 February - 4 March 2025 |
| Website | |
| Degree of recognition | International event |
| Venue | Pennsylvania Convention Center |
| City | Philadelphia |
| Country | USA/United States |
External IDs
| unpaywall | 10.1609/aaai.v39i25.34865 |
|---|---|
| Scopus | 105003911147 |