Reactive Exploration to Cope with Non-Stationarity in Lifelong Reinforcement Learning

Christian Steinparz2, Thomas Schmied1, Fabian Paischer1, Marius-Constantin Dinu1,3, Vihang Patil1, Angela Bitto-Nemling1,4, Hamid Eghbal-Zadeh1, Sepp Hochreiter1,4

1ELLIS Unit Linz and LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Austria
2Visual Data Science Lab, Institute of Computer Graphics, Johannes Kepler University Linz, Austria
3Dynatrace Research, Linz, Austria
4Institute of Advanced Research in Artificial Intelligence (IARAI), Vienna, Austria

This repository contains the source code for our paper "Reactive Exploration to Cope with Non-Stationarity in Lifelong Reinforcement Learning" accepted at CoLLAs 2022.

Figure: overview of Reactive Exploration.

Installation

The code base uses Python 3.9 and PyTorch.

First, clone the repository and install the conda environment from the repository root (using either the Linux or the Windows config file):

conda env create -f environment_linux.yaml
conda activate reactive_exploration 
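
On Windows, use the Windows environment file instead. Assuming it follows the same naming scheme (the filename below is an assumption; check the repository root for the exact name):

conda env create -f environment_windows.yaml
conda activate reactive_exploration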

Then follow the Jelly-Bean-World installation instructions. We use the following version:

git clone https://github.com/eaplatanios/jelly-bean-world.git
cd jelly-bean-world
git checkout 9bb16780e72d9d871384f9bcefd3b4e029a7b0ef
git submodule update --init --recursive
cd api/python
python setup.py install
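
To verify the installation, the Python API can be imported. This assumes the package installs under the import name jbw, the name used by the Jelly-Bean-World Python API:

python -c "import jbw; print(jbw.__file__)"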

From the project root directory, initialize and update the submodule for ICM+PPO:

git submodule update --init --recursive

Running experiments

This codebase relies on Hydra, which configures experiments via .yaml files. Hydra automatically creates the log folder structure for a given run, as specified in the respective config.yaml file.
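
The output directory can also be overridden from the command line via Hydra's standard hydra.run.dir key (a generic Hydra feature; the directory name below is only an illustration):

python main.py hydra.run.dir=outputs/my_run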

Running single experiments

By default, Hydra uses the configuration in configs/config.yaml. This config file defines how Hydra generates the directory structure for executed experiments under the hydra block.

The config.yaml contains the default parameters and references the respective default parameter files under the defaults block.
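
To inspect the fully composed configuration without starting a training run, Hydra's built-in --cfg flag can be used (a standard Hydra command-line option, not specific to this code base):

python main.py --cfg job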

By default, main.py trains a PPO+ICM agent on the Colour-Swap task from the paper by referencing the environment configuration configs/env_params/colour_swap.yaml and the agent configuration configs/agent_params/icm.yaml. To execute this configuration, run:

python main.py

For other agent and environment configurations, see the files in configs/. For instance, to execute PPO+ICM on the Rotation task, run:

python main.py env_params=rotator
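
Agent and environment overrides can be combined. For example, to run PPO+RND on the Colour-Swap task:

python main.py agent_params=rnd env_params=colour_swap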

Experiment logs are synced to wandb. To execute experiments without wandb-logging, run:

python main.py use_wandb=False

Running multiple experiments

All hyperparameters specified in the .yaml configuration files can be overridden from the command line. For example, to execute ICM and RND on the Colour-Swap task and the Rotation task (configs/env_params/rotator.yaml) using 5 seeds, run:

python main.py --multirun agent_params=icm,rnd env_params=colour_swap,rotator seed=1,2,3,4,5
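
Individual hyperparameters inside a config group can be overridden with Hydra's dotted syntax as well; the parameter name below is hypothetical and only illustrates the pattern:

python main.py agent_params.learning_rate=0.0001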

Citation

This paper has been accepted at the Conference on Lifelong Learning Agents (CoLLAs) 2022. While the conference proceedings are not yet available, we recommend the following citation:

@misc{steinparz2022reactiveexp,
  title={Reactive Exploration to Cope with Non-Stationarity in Lifelong Reinforcement Learning},
  author={Steinparz, Christian and Schmied, Thomas and Paischer, Fabian and Dinu, Marius-Constantin and Patil, Vihang and 
          Bitto-Nemling, Angela and Eghbal-zadeh, Hamid and Hochreiter, Sepp},
  journal={arXiv preprint, accepted to Conference on Lifelong Learning Agents 2022},
  year={2022},
  eprint={X}
}
