update README.md
simerplaha committed May 9, 2020
1 parent 81ebb1f commit 711fbfb
Showing 6 changed files with 7 additions and 7 deletions.
6 changes: 3 additions & 3 deletions README.md
@@ -1,6 +1,6 @@
# Reinforcement learning

Material referred
Referred material

- Book by Sutton & Barto - [Reinforcement Learning: An Introduction](http://incompleteideas.net/book/the-book-2nd.html)
- Lectures by David Silver - [Introduction to reinforcement learning](https://www.youtube.com/playlist?list=PLqYmG7hTraZDM-OYHWgPebj2MfCFzFObQ)
@@ -20,7 +20,7 @@ Last lever has the highest probability (`0.90`) therefore has more chance of get

## Student Markov Chain

Implements the `Student MDP` at David Silver's lecture 2 at [this (24:56)](https://youtu.be/lfHX2hHRMVQ?list=PLqYmG7hTraZDM-OYHWgPebj2MfCFzFObQ&t=1496) timestamp.
Implements the `Student MDP` from David Silver's lecture 2 at [this (24:56)](https://youtu.be/lfHX2hHRMVQ?list=PLqYmG7hTraZDM-OYHWgPebj2MfCFzFObQ&t=1496) timestamp.
The tests in [StudentSpec](/src/test/scala/lecture/StudentSpec.scala) use Bellman's equation to check that no other state
returns the same optimal value as the optimal state.
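
For reference, the Bellman optimality equation for state values can be written as follows. This uses the notation of David Silver's lecture 2; the README does not spell the equation out, so the exact form checked in `StudentSpec` is an assumption.

$$
v_*(s) = \max_a \Big( \mathcal{R}_s^a + \gamma \sum_{s'} \mathcal{P}_{ss'}^a \, v_*(s') \Big)
$$

Here $\mathcal{R}_s^a$ is the expected reward for taking action $a$ in state $s$, $\mathcal{P}_{ss'}^a$ is the probability of moving from $s$ to $s'$ under $a$, and $\gamma$ is the discount factor.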

@@ -38,7 +38,7 @@ Implements bellman's equation to find the quickest path to targets within a grid
The following shows the results for an 11x11 grid with 3 goal targets - ⌂ (circled green). The arrows indicate the optimal direction
to take at each grid cell to reach the nearest target.

![direction](doc/img/grid_direction_green.png "direction")
![direction](doc/img/grid_direction.png "direction")

Value function after 100 iterations of value iteration.
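
A minimal sketch of how value iteration on such a grid could look. It is not the repository's `GridWorld` code: the object name, method names, the reward of `-1` per step, the absence of discounting, and the rule that moving off the grid leaves the agent in place are all assumptions made for illustration.

```scala
// Hypothetical sketch; names and reward scheme are assumptions, not the repository's API.
object GridValueIterationSketch {
  type Cell = (Int, Int)

  // The four deterministic moves: up, down, left, right.
  val moves: Seq[Cell] = Seq((-1, 0), (1, 0), (0, -1), (0, 1))

  // Moving off the grid leaves the agent where it is.
  def step(cell: Cell, move: Cell, rows: Int, cols: Int): Cell = {
    val (r, c) = (cell._1 + move._1, cell._2 + move._2)
    if (r < 0 || r >= rows || c < 0 || c >= cols) cell else (r, c)
  }

  // Assumed setup: reward -1 per step, no discounting, terminal values fixed at 0.
  // Each sweep applies the Bellman optimality backup to every cell using the previous values.
  def valueIteration(rows: Int, cols: Int, terminals: Set[Cell], iterations: Int): Map[Cell, Int] = {
    val cells = for (r <- 0 until rows; c <- 0 until cols) yield (r, c)
    val initial: Map[Cell, Int] = cells.map(cell => cell -> 0).toMap
    (1 to iterations).foldLeft(initial) { (values, _) =>
      cells.map { cell =>
        val updated =
          if (terminals.contains(cell)) 0
          else moves.map(m => -1 + values(step(cell, m, rows, cols))).max
        cell -> updated
      }.toMap
    }
  }

  // Greedy direction: the move whose successor cell has the highest value.
  def bestMove(cell: Cell, values: Map[Cell, Int], rows: Int, cols: Int): Cell =
    moves.maxBy(m => values(step(cell, m, rows, cols)))
}
```

Under these assumptions, calling `GridValueIterationSketch.valueIteration(11, 11, Set((0, 0), (5, 5), (10, 10)), 100)` converges well before the 100th sweep, and each cell's value is the negative of its Manhattan distance to the nearest terminal. `bestMove` then gives the kind of arrow drawn in the direction image.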

Binary file modified doc/img/grid_direction.png
Binary file removed doc/img/grid_direction_green.png
Binary file not shown.
Binary file modified doc/img/grid_values.png
2 changes: 1 addition & 1 deletion src/main/scala/grid/GridWorld.scala
@@ -12,7 +12,7 @@ object GridWorld extends App {
* Default terminals are top left, middle and bottom right.
*/
val terminals: Seq[(Int, Int)] =
Seq((0, 0), ((gridRows - 1) / 2, (gridCols - 1) / 2), (gridRows - 1, gridCols - 1))
Seq((0, 0), (gridRows - 1, gridCols - 1))

sealed trait Grid
object Grid {
6 changes: 3 additions & 3 deletions src/main/scala/tictactoe/Game.scala
@@ -36,9 +36,9 @@ object Game {

println(
"""
|************
|Game started
|************
|*********************************************
|Game started - TODO - needs better prediction
|*********************************************
|""".stripMargin)

println("You are X and the bot is O")
