Align-RUDDER: Learning From Few Demonstrations by Reward Redistribution

Vihang P. Patil¹, Markus Hofmarcher¹, Markus-Constantin Dinu¹, Matthias Dorfer³, Patrick M. Blies³, Johannes Brandstetter¹, Jose A. Arjona-Medina¹, Sepp Hochreiter¹,²

¹ ELLIS Unit Linz and LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Austria
² Institute of Advanced Research in Artificial Intelligence (IARAI)
³ enliteAI, Vienna, Austria

A detailed blog post on this paper is available at this link, and a video showcasing the Minecraft agent at this link.

The full paper is available at https://arxiv.org/abs/2009.14108

Implementation of Align-RUDDER

This package contains an implementation of Align-RUDDER together with code to reproduce the results of artificial tasks I and II from the paper. To keep run time manageable, the default settings use only 10 seeds per experiment instead of the 100 used for the results in the paper.

Dependencies

To reproduce all results, we provide an environment.yml file to set up a conda environment with the required packages. Run the following commands to create and activate the environment and install the package:

conda env create --file environment.yml
conda activate align-rudder
pip install -e .
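
After installation, a quick check that the package resolves in the active environment can catch a misconfigured setup before a long run. The following is a minimal sketch, not part of the repository: it assumes the editable install exposes a top-level package named `align_rudder` (inferred from the script paths in this README), and the helper `is_importable` is a hypothetical name introduced for illustration.

```python
import importlib.util


def is_importable(package_name: str) -> bool:
    """Return True if the import system can locate the named top-level package."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # Assumption: `pip install -e .` makes a package called `align_rudder` importable.
    name = "align_rudder"
    status = "found" if is_importable(name) else "NOT found"
    print(f"{name}: {status}")
```

If the package is reported as not found, the conda environment is most likely not activated or the `pip install -e .` step was skipped.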

Usage

To reproduce the results from the paper, run the included scripts for the FourRooms and EightRooms environments with the respective method.

Align-RUDDER

python align_rudder/run_four_alignrudder.py
python align_rudder/run_eight_alignrudder.py

Behavioral Cloning + Q-Learning

python align_rudder/run_four_bc.py
python align_rudder/run_eight_bc.py

DQFD (Deep Q-Learning from Demonstrations)

python align_rudder/run_four_dqfd.py
python align_rudder/run_eight_dqfd.py

RUDDER (LSTM)

python align_rudder/run_four_rudder_lstm.py
python align_rudder/run_eight_rudder_lstm.py
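
The eight scripts above follow one naming pattern, `run_<environment>_<method>.py`, so they can be launched in sequence from a small driver. The sketch below is an illustration only and is not shipped with the repository: `build_commands` and `run_all` are hypothetical helpers, and it assumes the commands are run from the repository root inside the activated conda environment.

```python
import subprocess
import sys

# Environment and method tokens as they appear in the script names listed above.
ENVS = ("four", "eight")
METHODS = ("alignrudder", "bc", "dqfd", "rudder_lstm")


def build_commands(envs=ENVS, methods=METHODS):
    """Build one command per (method, environment) pair, grouped by method."""
    return [
        [sys.executable, f"align_rudder/run_{env}_{method}.py"]
        for method in methods
        for env in envs
    ]


def run_all(commands):
    """Run each command in order and stop at the first failure."""
    for cmd in commands:
        print("Running:", " ".join(cmd))
        subprocess.run(cmd, check=True)  # raises CalledProcessError on a non-zero exit


# To launch everything from the repository root:
# run_all(build_commands())
```

Running the scripts sequentially keeps the runs from competing for resources; to run only a subset, pass a shorter tuple, for example `build_commands(methods=("alignrudder",))`.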

Results

Once you have run all the experiments you are interested in, run the following script to get a summary of the results. By default, plots for all available environments are generated.

python align_rudder/plot_results.py [--env "FourRooms"|"EightRooms"|"all"]

LICENSE

MIT LICENSE
