Learning to Plan from Raw Data in Grid-based Games

14 pages. Published: September 17, 2018

Abstract

An agent that autonomously learns to act in its environment must acquire a model of the domain dynamics. This can be a challenging task, especially in real-world domains, where observations are high-dimensional and noisy. Although in automated planning the dynamics are typically given, there are action schema learning approaches that learn symbolic rules (e.g., STRIPS or PDDL) to be used by traditional planners. However, these algorithms rely on logical descriptions of environment observations. In contrast, recent methods in deep reinforcement learning for games learn from pixel observations. However, they typically do not acquire an environment model, but a policy for one-step action selection. Even when a model is learned, it cannot generalize to unseen instances of the training domain. Here we propose a neural network-based method that learns from visual observations an approximate, compact, implicit representation of the domain dynamics, which can be used for planning with standard search algorithms, and generalizes to novel domain instances. The learned model is composed of submodules, each implicitly representing an action schema in the traditional sense. We evaluate our approach on visual versions of the standard domain Sokoban, and show that, by training on one single instance, it learns a transition model that can be successfully used to solve new levels of the game.
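
As a rough illustration of what "planning with standard search algorithms" over a transition model can look like, the following minimal sketch runs breadth-first search using only a function that maps a state and an action to a successor state. It is not the paper's method: `toy_transition` is a hand-written, hypothetical stand-in for a learned model, the 3x3 grid with two wall cells is invented for the example, and the names `ACTIONS` and `bfs_plan` are assumptions made here.

```python
from collections import deque

# Hypothetical action set for a grid world: name -> (row delta, column delta).
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def toy_transition(state, action, walls, size):
    """Hand-written stand-in for a learned transition model (assumption).

    Returns the successor cell, or the unchanged state if the move
    would hit a wall or leave the size x size grid.
    """
    row, col = state
    d_row, d_col = ACTIONS[action]
    nxt = (row + d_row, col + d_col)
    in_bounds = 0 <= nxt[0] < size and 0 <= nxt[1] < size
    if nxt in walls or not in_bounds:
        return state
    return nxt


def bfs_plan(start, is_goal, transition):
    """Breadth-first search that only queries `transition(state, action)`."""
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, plan = frontier.popleft()
        if is_goal(state):
            return plan
        for action in ACTIONS:
            nxt = transition(state, action)
            if nxt not in visited:
                visited.add(nxt)
                frontier.append((nxt, plan + [action]))
    return None  # no plan exists from `start`


# Toy instance: walls block the direct route from (0, 0) to (0, 2).
walls = {(0, 1), (1, 1)}
plan = bfs_plan(
    (0, 0),
    lambda s: s == (0, 2),
    lambda s, a: toy_transition(s, a, walls, 3),
)
print(plan)
```

The search never inspects how successors are computed, so in principle any model exposing the same state-and-action interface could be substituted for `toy_transition`, provided its states are hashable and its predictions are accurate enough for the plan to remain valid.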

Keyphrases: learning from raw data, neural networks, state space search, transition model learning

In: Daniel Lee, Alexander Steen and Toby Walsh (editors). GCAI-2018. 4th Global Conference on Artificial Intelligence, vol 55, pages 54--67

BibTeX entry
@inproceedings{GCAI-2018:Learning_to_Plan_from,
  author    = {Andrea Dittadi and Thomas Bolander and Ole Winther},
  title     = {Learning to Plan from Raw Data in Grid-based Games},
  booktitle = {GCAI-2018. 4th Global Conference on Artificial Intelligence},
  editor    = {Daniel Lee and Alexander Steen and Toby Walsh},
  series    = {EPiC Series in Computing},
  volume    = {55},
  pages     = {54--67},
  year      = {2018},
  publisher = {EasyChair},
  bibsource = {EasyChair, https://easychair.org},
  issn      = {2398-7340},
  url       = {https://easychair.org/publications/paper/QtZn},
  doi       = {10.29007/s8jk}}