Reactive Exploration to Cope with Non-Stationarity in Lifelong Reinforcement Learning

Christian Steinparz2, Thomas Schmied1, Fabian Paischer1, Marius-Constantin Dinu1,3, Vihang Patil1, Angela Bitto-Nemling1,4, Hamid Eghbal-Zadeh1, Sepp Hochreiter1,4

1ELLIS Unit Linz and LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Austria
2Visual Data Science Lab, Institute of Computer Graphics, Johannes Kepler University Linz, Austria
3Dynatrace Research, Linz, Austria
4Institute of Advanced Research in Artificial Intelligence (IARAI), Vienna, Austria

This repository contains the source code for our paper "Reactive Exploration to Cope with Non-Stationarity in Lifelong Reinforcement Learning" accepted at CoLLAs 2022.

Figure: overview of Reactive Exploration.

Installation

The code base uses Python 3.9 and PyTorch.

First, clone the repository and install the conda environment from the repository root (using either the Linux or the Windows config file):

conda env create -f environment_linux.yaml
conda activate reactive_exploration 
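
On Windows, use the Windows environment file instead. Assuming it follows the same naming scheme (the filename below is an assumption; check the repository root for the exact name):

conda env create -f environment_windows.yaml
conda activate reactive_exploration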

Then follow the Jelly-Bean-World installation instructions. We use the following version:

git clone https://github.com/eaplatanios/jelly-bean-world.git
cd jelly-bean-world
git checkout 9bb16780e72d9d871384f9bcefd3b4e029a7b0ef
git submodule update --init --recursive
cd api/python
python setup.py install
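
To verify the installation, the Python API can be imported. This assumes the package installs under the import name jbw, the name used by the Jelly-Bean-World Python API:

python -c "import jbw; print(jbw.__file__)"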

From the project root directory, initialize and update the submodule for ICM+PPO:

git submodule update --init --recursive

Running experiments

This codebase relies on Hydra, which configures experiments via .yaml files. Hydra automatically creates the log folder structure for a given run, as specified in the respective config.yaml file.
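
The output directory can also be overridden from the command line via Hydra's standard hydra.run.dir key (a generic Hydra feature; the directory name below is only an illustration):

python main.py hydra.run.dir=outputs/my_run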

Running single experiments

By default, Hydra uses the configuration in configs/config.yaml. This config file defines how Hydra generates the directory structure for executed experiments under the hydra block.

The config.yaml contains the default parameters and references the respective default parameter files under the defaults block.
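
To inspect the fully composed configuration without starting a training run, Hydra's built-in --cfg flag can be used (a standard Hydra command-line option, not specific to this code base):

python main.py --cfg job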

By default, main.py trains a PPO+ICM agent on the Colour-Swap task from the paper by referencing the environment configuration configs/env_params/colour_swap.yaml and the agent configuration configs/agent_params/icm.yaml. To execute this configuration, run:

python main.py

For other agent and environment configurations, see the files in configs/. For instance, to execute PPO+ICM on the Rotation task, run:

python main.py env_params=rotator
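
Agent and environment overrides can be combined. For example, to run PPO+RND on the Colour-Swap task:

python main.py agent_params=rnd env_params=colour_swap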

Experiment logs are synced to wandb. To execute experiments without wandb-logging, run:

python main.py use_wandb=False

Running multiple experiments

All hyperparameters specified in the .yaml configuration files can be overridden from the command line. For example, to execute ICM and RND on the Colour-Swap task and the Rotation task (configs/env_params/rotator.yaml) using 5 seeds, run:

python main.py --multirun agent_params=icm,rnd env_params=colour_swap,rotator seed=1,2,3,4,5
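
Individual hyperparameters inside a config group can be overridden with Hydra's dotted syntax as well; the parameter name below is hypothetical and only illustrates the pattern:

python main.py agent_params.learning_rate=0.0001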

Citation

This paper has been accepted at the Conference on Lifelong Learning Agents (CoLLAs) 2022. While the conference proceedings are not yet available, we recommend the following citation:

@misc{steinparz2022reactiveexp,
  title={Reactive Exploration to Cope with Non-Stationarity in Lifelong Reinforcement Learning},
  author={Steinparz, Christian and Schmied, Thomas and Paischer, Fabian and Dinu, Marius-Constantin and Patil, Vihang and 
          Bitto-Nemling, Angela and Eghbal-zadeh, Hamid and Hochreiter, Sepp},
  journal={arXiv preprint, accepted to Conference on Lifelong Learning Agents 2022},
  year={2022},
  eprint={X}
}
