Exp3.T

This project contains the code used for the simulations in the paper "Trend Detection based Regret Minimization for Bandit Problems" by Nakhe and Reiffenhäuser.

The code implements four algorithms:

Standard Exp3
Exp3.S
Exp3.R
Exp3D (algorithm proposed in the paper).
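
For orientation, here is a minimal sketch of the standard Exp3 baseline, written from the textbook description (exponential weights with uniform exploration and importance-weighted reward estimates). It is not taken from this repository's `src` folder. The function names, the `gamma` default, the `reward_fn` callback signature, and the weight rescaling step are all assumptions made for illustration. Rewards are assumed to lie in [0, 1].

```python
import math
import random


def exp3_probabilities(weights, gamma):
    """Mix the normalized weights with uniform exploration of rate gamma."""
    total = sum(weights)
    k = len(weights)
    return [(1.0 - gamma) * w / total + gamma / k for w in weights]


def exp3_update(weights, probs, arm, reward, gamma):
    """Return new weights after observing `reward` (in [0, 1]) for `arm`."""
    k = len(weights)
    # Importance-weighted estimate: unbiased because only the pulled arm is seen.
    estimate = reward / probs[arm]
    new_weights = list(weights)
    new_weights[arm] *= math.exp(gamma * estimate / k)
    return new_weights


def run_exp3(reward_fn, n_arms, horizon, gamma=0.1, seed=0):
    """Play `horizon` rounds; reward_fn(t, arm, rng) is a hypothetical callback."""
    rng = random.Random(seed)
    weights = [1.0] * n_arms
    pulls = [0] * n_arms
    total_reward = 0.0
    for t in range(horizon):
        probs = exp3_probabilities(weights, gamma)
        arm = rng.choices(range(n_arms), weights=probs)[0]
        reward = reward_fn(t, arm, rng)
        weights = exp3_update(weights, probs, arm, reward, gamma)
        # Rescale so the largest weight is 1; this leaves the probabilities
        # unchanged and only guards against floating-point overflow.
        largest = max(weights)
        weights = [w / largest for w in weights]
        pulls[arm] += 1
        total_reward += reward
    return total_reward, pulls
```

A call such as `run_exp3(lambda t, arm, rng: float(rng.random() < (0.7 if arm == 0 else 0.4)), n_arms=2, horizon=5000)` would simulate two Bernoulli arms with made-up means. The other three algorithms in the list build on this scheme in ways described in the respective papers and are not sketched here.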

The performance of these algorithms is compared under two reward models: (a) a dynamic stochastic regime and (b) an adversarial regime with gap.

These models generalize the conventional bandit reward models.
