This repository has been archived by the owner on Feb 16, 2020. It is now read-only.

Prevent dataset scanning from depleting memory #1970

Merged
merged 1 commit into from
Feb 26, 2018

Conversation

cmroche
Contributor

@cmroche cmroche commented Feb 25, 2018

  • What kind of change does this PR introduce? (Bug fix, feature, docs update, ...)

Bugfix

  • What is the current behavior? (You can also link to an open issue here)

If you have large datasets with many imported markets, the dataset scanner forks a process for every exchange and market combination all at once. This depletes memory very quickly and provides no significant gain in processing performance (it can often make performance worse).

AWS and Docker machines often crash because scanning requires several gigabytes of memory.

  • What is the new behavior (if this is a feature change)?

The dataset scanner now queues the work and runs only as many forks at a time as there are CPU cores on the system. This greatly reduces the memory impact, to a few tens of MB.

  • Other information:
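
For readers who want a picture of the queueing idea described above, here is a minimal sketch. It is not the code from this PR: `runLimited`, `fakeScan`, and the sample market list are hypothetical names made up for illustration. It assumes each scan is wrapped in a function that returns a Promise, which in the real scanner would presumably resolve when a forked child process finishes.

```javascript
const os = require('os');

// Run worker(task) for every task, keeping at most `limit` running at once.
// `worker` must return a Promise. By default the limit is the CPU core count.
function runLimited(tasks, worker, limit = os.cpus().length) {
  // os.cpus() can return an empty array on some systems, so never go below 1.
  const maxActive = Math.max(1, limit);

  return new Promise((resolve, reject) => {
    const results = new Array(tasks.length);
    let next = 0;       // index of the next task to start
    let active = 0;     // number of workers currently running
    let failed = false; // set once any worker rejects

    if (tasks.length === 0) return resolve(results);

    const pump = () => {
      // Start queued tasks until the limit is hit or the queue is empty.
      while (!failed && active < maxActive && next < tasks.length) {
        const index = next++;
        active++;
        Promise.resolve()
          .then(() => worker(tasks[index]))
          .then(result => {
            results[index] = result;
            active--;
            if (next === tasks.length && active === 0) resolve(results);
            else pump(); // a slot freed up, start the next queued task
          })
          .catch(err => {
            failed = true; // stop starting new work after the first failure
            reject(err);
          });
      }
    };

    pump();
  });
}

// Usage with a stand-in worker (hypothetical; a real one would fork a scanner).
const sampleMarkets = ['exchangeA:USD/BTC', 'exchangeA:USD/ETH', 'exchangeB:EUR/BTC'];
const fakeScan = market =>
  new Promise(done => setTimeout(() => done({ market, ranges: [] }), 10));

runLimited(sampleMarkets, fakeScan).then(scanned => {
  console.log(`scanned ${scanned.length} markets`);
});
```

With this shape, memory use is bounded by the number of concurrent workers rather than by the number of exchange and market combinations, which is the effect the PR describes.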

@askmike
Owner

askmike commented Feb 26, 2018

super slick!

@askmike askmike merged commit 7cf20f6 into askmike:develop Feb 26, 2018
@cmroche cmroche deleted the scanning branch March 6, 2018 01:22
This was referenced Mar 21, 2018