multi-armed-bandits

An implementation of an example algorithm for the multi-armed bandit problem with Gaussian rewards.

The algorithm assumes that the reward variance of every bandit is known and equal to 1.

Initially, each bandit is drawn once. Each subsequent draw goes to the bandit with the highest value of mean reward + 3 * its standard deviation.
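The selection rule above can be sketched as follows. This is a minimal illustration, not the code in core.clj: the names `ucb-score` and `select-bandit` and the `[mean n]` representation of each bandit's state are assumptions made for this example. It also assumes that "standard deviation" means the standard deviation of the estimated mean, which for unit reward variance and n draws is 1/sqrt(n).

```clojure
(defn ucb-score
  "Score for one bandit: estimated mean plus 3 standard deviations of that
  estimate. With unit reward variance, the mean of n draws has standard
  deviation 1/sqrt(n). (Hypothetical helper, not taken from core.clj.)"
  [[mean n]]
  (+ mean (* 3 (/ 1.0 (Math/sqrt n)))))

(defn select-bandit
  "Index of the bandit with the highest score. `stats` is a vector of
  [mean n] pairs, one per bandit, with n >= 1 because every bandit has
  already been drawn once. (Hypothetical helper, not taken from core.clj.)"
  [stats]
  (first (apply max-key second
                (map-indexed (fn [i s] [i (ucb-score s)]) stats))))

;; The second bandit has a lower mean but only 2 draws, so its
;; uncertainty bonus is the largest and it is chosen next.
(select-bandit [[0.5 10] [0.4 2] [0.9 50]])
;; => 1
```

Under these assumptions the bonus term shrinks as a bandit is drawn more often, so rarely drawn bandits keep getting tried until their estimates are tight enough to rule them out.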

Usage

(simulate bandits-distribution-standard-deviation number-of-bandits number-of-steps)

Returns a vector consisting of:

a collection of [mean variance] pairs defining each bandit,
a vector of [mean variance-of-mean] pairs describing our knowledge of each bandit based on the numbers drawn,
a vector of integers giving the total number of draws from each bandit.
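One way to read such a result is to destructure the three parts and compare them. The sketch below is illustrative only: `summarize` is a hypothetical helper that is not part of this repository, and the input is written by hand to match the shape described above rather than taken from a real run.

```clojure
(defn summarize
  "Takes a result vector shaped like the one described above and returns a
  map with the index of the most-drawn bandit and the index of the bandit
  with the highest true mean. (Hypothetical helper for illustration.)"
  [[bandits _estimates draws]]
  (let [argmax (fn [xs]
                 (first (apply max-key second (map-indexed vector xs))))]
    {:most-drawn     (argmax draws)
     :best-true-mean (argmax (map first bandits))}))

;; Hand-written input for illustration only, not real simulation output.
(summarize [[[0.2 1] [1.5 1]]          ; [mean variance] of each bandit
            [[0.1 0.5] [1.4 0.05]]     ; [mean variance-of-mean] estimates
            [2 20]])                   ; draws per bandit
;; => {:most-drawn 1, :best-true-mean 1}
```

With the project loaded, the same helper could be applied to the output of `simulate`. When the two indices agree, most draws went to the bandit that is actually best.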
