Backtest/Plot are not thread safe ? #125

Open
wesleywilian opened this issue Aug 4, 2020 · 5 comments
Labels
bug Something isn't working Hacktoberfest https://hacktoberfest.digitalocean.com

Comments

@wesleywilian

wesleywilian commented Aug 4, 2020

Expected Behavior

Run backtests and plots with multiple threads without concurrency problems.

Actual Behavior

Multithreaded execution results in data inconsistency when plotting (and maybe when backtesting?).

Steps to Reproduce

  1. Run this code.
import os
import threading
from uuid import uuid4

from backtesting import Backtest, Strategy
from backtesting.lib import crossover
from backtesting.test import GOOG, SMA
from flask import Flask
from waitress import serve

app = Flask(__name__)


class SmaCross(Strategy):
    n1 = 15
    n2 = 30

    def init(self):
        setattr(self, "sma1", self.I(SMA, self.data.Close, self.n1))
        setattr(self, "sma2", self.I(SMA, self.data.Close, self.n2))

    def next(self):
        if crossover(getattr(self, "sma1"), getattr(self, "sma2")):
            self.buy()
        elif crossover(getattr(self, "sma2"), getattr(self, "sma1")):
            self.sell()


class Backtesting:
    def compute(self):
        bt = Backtest(GOOG, SmaCross, cash=10000, commission=.002, exclusive_orders=True)
        bt.run()
        file_uuid = str(uuid4())
        filename = "/tmp/backtest_plot_" + file_uuid + ".html"
        print("thread:", threading.get_ident(), "start creating file:", filename)
        bt.plot(open_browser=False, filename=filename)
        try:
            f = open(filename)
            some_raw_data = f.read()
            f.close()
            os.remove(filename)
        except FileNotFoundError:
            some_raw_data = ""
            print("thread:", threading.get_ident(), "file:", filename, "not found!")
        print("thread:", threading.get_ident(), "end creating file:", filename)
        return ""


@app.route('/', methods=['GET'])
def index():
    return Backtesting().compute()


if __name__ == '__main__':
    serve(app, host='0.0.0.0', port=9090, threads=10)
  2. Open two terminals and run the following in both (or use JMeter or a similar tool):
    while true ; do curl localhost:9090 ; done

  3. Check the logs

thread: 140514310731520 start creating file: /tmp/backtest_plot_bd84f04f-6c98-4180-9d4a-e353d917b4c3.html
thread: 140514302338816 start creating file: /tmp/backtest_plot_46abb441-d70a-472b-b67e-de446dd4d8c1.html
thread: 140514310731520 file: /tmp/backtest_plot_bd84f04f-6c98-4180-9d4a-e353d917b4c3.html not found!
thread: 140514310731520 end creating file: /tmp/backtest_plot_bd84f04f-6c98-4180-9d4a-e353d917b4c3.html
thread: 140514302338816 end creating file: /tmp/backtest_plot_46abb441-d70a-472b-b67e-de446dd4d8c1.html

simplifying...

thread: 1 start creating file:  A
thread: 2 start creating file:  B
thread: 1 file:                 A not found!
thread: 1 end creating file:    A
thread: 2 end creating file:    B

Thread 1 starts creating file "A", then thread 2 starts creating file "B". Thread 1 then fails to find file "A", even though its plot() call has already returned.
As we can see, there is an inconsistency caused by the use of threads (sorry for the Flask example).

I suspect the problem is in the "SmaCross" and "Strategy" classes.

@kernc, do you have any suggestions for how we can fix this?

Additional info

  • Backtesting version: 0.2.1
@kernc
Owner

kernc commented Aug 4, 2020

Quite likely. The first thing I'd look at is the way we use Bokeh's global state:

def plot(*, results: pd.Series,
         df: pd.DataFrame,
         indicators: List[_Indicator],
         filename='', plot_width=None,
         plot_equity=True, plot_pl=True,
         plot_volume=True, plot_drawdown=False,
         smooth_equity=False, relative_equity=True,
         superimpose=True, resample=True,
         reverse_indicators=True,
         show_legend=True, open_browser=True):
    """
    Like much of GUI code everywhere, this is a mess.
    """
    # We need to reset global Bokeh state, otherwise subsequent runs of
    # plot() contain some previous run's cruft data (was noticed when
    # TestPlot.test_file_size() test was failing).
    if not filename and not IS_JUPYTER_NOTEBOOK:
        filename = _windos_safe_filename(str(results._strategy))
    _bokeh_reset(filename)

def _bokeh_reset(filename=None):
    curstate().reset()
    if filename:
        if not filename.endswith('.html'):
            filename += '.html'
        output_file(filename, title=filename)
    elif IS_JUPYTER_NOTEBOOK:
        curstate().output_notebook()

PR welcome!

@kernc kernc added the bug Something isn't working label Aug 4, 2020
@wesleywilian
Author

I see

So... basically we need to reset Bokeh's global state?

@wesleywilian
Author

I was thinking.

Isn't the problem because we're not inheriting a class "SmaCross"?

Even if we reset the global state, it will still not be thread safe.

(Correct me if I'm wrong)

@kernc
Owner

kernc commented Aug 5, 2020

The state is already reset each time. I think we need to replace the use of Bokeh's global state (curstate()) with a new State object for each plot() invocation.
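
One way to picture the per-invocation approach is the sketch below. It assumes nothing about Bokeh's internals: `OutputState`, `render_to_file`, `plot_once`, and `run_concurrently` are hypothetical names, and the "plot" is just a small HTML string. The point is only that each call builds its own state object and hands it to the writer, so no thread can change another thread's output target.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from uuid import uuid4


@dataclass
class OutputState:
    """Hypothetical per-call output settings (stand-in for a per-plot state)."""
    filename: str
    title: str


def render_to_file(state: OutputState, body: str) -> None:
    # Everything the writer needs arrives via `state`; no module-level lookup.
    with open(state.filename, "w") as f:
        f.write(f"<title>{state.title}</title>{body}")


def plot_once(tmpdir: str) -> bool:
    """Write one file using a fresh state object; report whether it exists."""
    filename = os.path.join(tmpdir, f"plot_{uuid4()}.html")
    state = OutputState(filename=filename, title=filename)  # new for each call
    render_to_file(state, "<p>chart</p>")
    found = os.path.exists(filename)
    if found:
        os.remove(filename)
    return found


def run_concurrently(calls: int = 50, threads: int = 10) -> list:
    """Run plot_once from several threads; one bool per call."""
    with tempfile.TemporaryDirectory() as tmpdir:
        with ThreadPoolExecutor(max_workers=threads) as pool:
            return list(pool.map(lambda _: plot_once(tmpdir), range(calls)))
```

Calling `run_concurrently()` mirrors the reproduction script's pattern of one uniquely named file per request, but with the output target carried by a local object rather than shared state.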

I can't say for sure that that's the only critical section, but it's the obvious one and the plot-file-not-found error points to it as well.
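
Until the global state is replaced, a possible stopgap on the caller's side is to serialize the plotting step with a lock. The sketch assumes that `bt.plot()` is the only critical section, which is not confirmed in this thread. `serialized` and `plot_and_read` are hypothetical helpers, and `bt` is expected to be a `Backtest` on which `run()` has already been called.

```python
import threading
from functools import wraps

# One process-wide lock shared by every function decorated with @serialized.
_plot_lock = threading.Lock()


def serialized(func):
    """Allow only one thread at a time to execute `func`."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        with _plot_lock:
            return func(*args, **kwargs)
    return wrapper


@serialized
def plot_and_read(bt, filename):
    """Plot to `filename` and return the file contents, one caller at a time.

    `bt` is assumed to provide plot(open_browser=..., filename=...), as the
    Backtest object in the reproduction script does.
    """
    bt.plot(open_browser=False, filename=filename)
    with open(filename) as f:
        return f.read()
```

The trade-off is that plots are produced one at a time, so request threads queue up behind the lock; the backtest itself (`bt.run()`) can stay outside it.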

We use SmaCross only to instantiate new objects; we never refer to the class directly, at least not in a way that writes to it:

self._strategy = strategy

strategy = self._strategy(broker, data, kwargs) # type: Strategy
strategy.init()

@wesleywilian
Author

wesleywilian commented Aug 10, 2020

Hi @kernc

I ran a test to check that the backtest itself is not affected:

import os
import threading
from uuid import uuid4

from backtesting import Backtest, Strategy
from backtesting.lib import crossover
from backtesting.test import GOOG, SMA
from flask import Flask
from waitress import serve

app = Flask(__name__)


class SmaCross(Strategy):
    n1 = 15
    n2 = 30

    def init(self):
        setattr(self, "sma1", self.I(SMA, self.data.Close, self.n1))
        setattr(self, "sma2", self.I(SMA, self.data.Close, self.n2))

    def next(self):
        if crossover(getattr(self, "sma1"), getattr(self, "sma2")):
            self.buy()
        elif crossover(getattr(self, "sma2"), getattr(self, "sma1")):
            self.sell()


class SmaCrossBankrupt(Strategy):
    # n1 and n2 inverted on bankrupt class...
    n1 = 30
    n2 = 15

    def init(self):
        setattr(self, "sma1", self.I(SMA, self.data.Close, self.n1))
        setattr(self, "sma2", self.I(SMA, self.data.Close, self.n2))

    def next(self):
        if crossover(getattr(self, "sma1"), getattr(self, "sma2")):
            self.buy()
        elif crossover(getattr(self, "sma2"), getattr(self, "sma1")):
            self.sell()


class Backtesting:

    def __init__(self, strategy_id):
        self.strategy_id = strategy_id

    def compute(self):
        if self.strategy_id == 1:
            strategy_class = SmaCross
        else:
            strategy_class = SmaCrossBankrupt

        bt = Backtest(GOOG, strategy_class, cash=10000, commission=.002, exclusive_orders=True)
        x = bt.run()
        file_uuid = str(uuid4())
        filename = "/tmp/backtest_plot_" + file_uuid + ".html"
        print("thread:", threading.get_ident(), "start creating file:", filename)
        bt.plot(open_browser=False, filename=filename)
        try:
            f = open(filename)
            some_raw_data = f.read()
            f.close()
            os.remove(filename)
        except FileNotFoundError:
            some_raw_data = ""
            print("thread:", threading.get_ident(), "file:", filename, "not found!")
        print("thread:", threading.get_ident(), "end creating file:", filename)

        ret = "Return [%] {}\n".format(x.get("Return [%]"))

        return ret


@app.route('/<int:strategy_id>', methods=['GET'])
def index(strategy_id):
    return Backtesting(strategy_id).compute()


if __name__ == '__main__':
    serve(app, host='0.0.0.0', port=9090, threads=10)

Open multiple terminals and run:
while true ; do curl localhost:9090/1 ; done
Expected result on every request:
Return [%] 194.62947480000028

At the same time, open several more terminals that request the other strategy:

while true ; do curl localhost:9090/2 ; done
Expected result on every request:
Return [%] -90.43158260000001

So this confirms that the backtest is not affected.

I'll check your suggestions and internally analyze how the plotting feature works...

Thanks @kernc
