Module Code: ECS7002P
Module Leader: Diego Perez-Liebana
Semester: 1
Submission Date: 10th January 2022
Team Members:
The frozen lake environment has two main variants: the small frozen lake (4x4) and the big frozen lake (8x8). In both cases, each tile in a square grid corresponds to a state. There is also an additional absorbing state, which will be introduced soon. There are four types of tiles: start (grey), frozen lake (light blue), hole (dark blue), and goal (white). The agent has four actions, which correspond to moving one tile up, left, down, or right. However, with probability 0.1, the environment ignores the desired direction and the agent slips (moves one tile in a random direction, which may be the desired direction). An action that would cause the agent to move outside the grid leaves the state unchanged.
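The slip rule can be made concrete with a small sketch of the next-tile distribution. It assumes that a slip picks one of the four directions uniformly at random, and it ignores the transition to the absorbing state from holes and the goal. The names `ACTIONS` and `next_state_distribution` are illustrative and are not taken from `frozen_lake.py`.

```python
# Action order follows the text: up, left, down, right.
ACTIONS = [(-1, 0), (0, -1), (1, 0), (0, 1)]


def next_state_distribution(row, col, action, size, slip=0.1):
    """Return {(row, col): probability} for one move from a regular tile.

    Assumption: a slip chooses among the four directions uniformly.
    """
    distribution = {}
    for index, (d_row, d_col) in enumerate(ACTIONS):
        # Every direction receives an equal share of the slip probability.
        probability = slip / len(ACTIONS)
        if index == action:
            # The desired direction also receives the no-slip probability.
            probability += 1.0 - slip
        new_row, new_col = row + d_row, col + d_col
        # A move that would leave the grid keeps the agent where it is.
        if not (0 <= new_row < size and 0 <= new_col < size):
            new_row, new_col = row, col
        key = (new_row, new_col)
        distribution[key] = distribution.get(key, 0.0) + probability
    return distribution
```

Under this assumption the desired direction is taken with probability 0.9 + 0.1/4 = 0.925 and each other direction with probability 0.025. In the top-left corner of the small lake, moving up leaves the agent in place with probability 0.95, because both up and left are blocked by the grid boundary.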
The agent receives reward 1 upon taking an action at the goal. In every other case, the agent receives zero reward. Note that the agent does not receive a reward upon moving into the goal (nor a negative reward upon moving into a hole). Upon taking an action at the goal or in a hole, the agent moves into the absorbing state. Every action taken at the absorbing state leads to the absorbing state, which also does not provide rewards. Assume a discount factor of γ = 0.9.
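To illustrate how the reward scheme interacts with the discount factor, the following minimal sketch computes the discounted return of a reward sequence. The helper name `discounted_return` is hypothetical and only serves this example.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum the rewards, weighting the reward at step t by gamma ** t."""
    total = 0.0
    for step, reward in enumerate(rewards):
        total += (gamma ** step) * reward
    return total


# Three moves to reach the goal (reward 0 each), then one action
# taken at the goal (reward 1) before entering the absorbing state.
episode_rewards = [0, 0, 0, 1]
episode_return = discounted_return(episode_rewards)
```

Here the single reward is collected on the fourth action, so the return is 0.9^3 = 0.729. Every additional step taken before reaching the goal multiplies the return by another factor of γ, which is why shorter paths are preferred even though all non-goal rewards are zero.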
For the purposes of model-free reinforcement learning (or interactive testing), the agent is able to interact with the frozen lake for a number of time steps that is equal to the number of tiles.
+ Implement the environment in Python
- frozen_lake.py
+ Implement policy evaluation
+ Implement policy improvement
+ Implement policy iteration
+ Implement value iteration
- model_based_rl.py
+ Implement Sarsa control
+ Implement Q-learning control
- tabular_model_free_rl.py
+ Implement Sarsa control using linear function approximation
+ Implement Q-learning control using linear function approximation
- non_tabular_model_free_rl.py
+ Implement a main function to execute all tasks
- flake.py
The man page for the flake implementation.
flake - the rl game
python ./flake.py [-T <task_number>] [-s] [-l] [-v]
flake is a small CLI game for reinforcement learning based agents.
Multiple RL methods can be used for the agents.
-T <task_number>
Run flake and execute the given task, identified by a number from 2 to 6.
By default, task 5 is executed.
2: Model-based RL
3: Tabular model-free RL
4: Non-tabular model-free RL
5: All
6: Iteration count to find the optimal policy for tabular model-free RL
-s
Run the program with a small lake (4x4).
-l
Run the program with a big lake (8x8).
-v
Output the results as a .png file.
The Ghostscript library is required (instructions below).
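As an illustration of the options above, this is a minimal sketch of how such a command line could be parsed with `argparse`. It is an assumption about one possible implementation, not the actual code in `flake.py`; the function name `build_parser` and the destination names `task`, `small`, `big`, and `png` are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented flags (illustrative only)."""
    parser = argparse.ArgumentParser(prog="flake", description="flake - the RL game")
    # -T selects the task; the documented range is 2 to 6 with default 5.
    parser.add_argument("-T", dest="task", type=int, choices=range(2, 7),
                        default=5, metavar="<task_number>",
                        help="task to execute (2 to 6)")
    # The remaining flags are simple on/off switches.
    parser.add_argument("-s", dest="small", action="store_true",
                        help="run with the small lake (4x4)")
    parser.add_argument("-l", dest="big", action="store_true",
                        help="run with the big lake (8x8)")
    parser.add_argument("-v", dest="png", action="store_true",
                        help="output the results as a .png file")
    return parser


# Parse an example command line: python ./flake.py -T 3 -s
args = build_parser().parse_args(["-T", "3", "-s"])
```

The sketch does not decide what happens when both `-s` and `-l` are given, since the usage text does not specify it.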
Tile representation:
Tile | Icon |
---|---|
start | & |
frozen | . |
hole | # |
goal | $ |
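The icon mapping above can be sketched as a small text renderer. The names `ICONS` and `render` and the 2x2 layout are hypothetical and chosen only for illustration; they do not reproduce the lake layouts used by the project.

```python
# Icons taken from the tile representation table.
ICONS = {"start": "&", "frozen": ".", "hole": "#", "goal": "$"}


def render(lake):
    """Turn a grid of tile names into one text line per grid row."""
    return "\n".join("".join(ICONS[tile] for tile in row) for row in lake)


# Hypothetical 2x2 layout, used only to demonstrate the mapping.
example_lake = [
    ["start", "frozen"],
    ["hole", "goal"],
]
rendered = render(example_lake)
```

For this layout, `rendered` consists of the two lines `&.` and `#$`.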
- For Mac:
  brew install ghostscript
- For Windows:
  Download the installer from the official Ghostscript website.