Rebalance swarm when necessary #34
@@ -29,76 +30,13 @@
class Server(threading.Thread):
    """Serves one or more bloom layers for inference, forward and backward; announces oneself to the DHT"""
    """
    Runs ModuleContainer, periodically checks that the network is balanced,
[nit] ModuleContainer sounds somewhat too generic for a class that announces modules to DHT, accepts requests from p2pd, etc
Maybe
- Server -> LoadBalancedServer
- ModuleContainer -> Server
?
I agree that ModuleContainer is not a perfect name but I didn't come up with better options. If we find one, we can rename this class later.
class ModuleContainer(threading.Thread):
    """Serves a set of specific Bloom layers for inference, forward, and backward. Announces itself over the DHT."""
ServedLayers?
src/server/server.py
Outdated
max_block_selection_delay: float = 1,
mean_block_selection_delay: float = 0.5,
mean_balance_check_period: float = 150,
min_balance_quality: float = 0.8,
Suggested change:
- min_balance_quality: float = 0.8,
+ min_balance_quality: float = 0.0,
TODO: Disable rebalancing by default, unless we solve issues with shmem
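For readers following the thread, here is a minimal sketch of how a randomized check period and a quality threshold could interact. It is not the code from this PR: the helper names `next_check_delay` and `should_rebalance` are hypothetical, the exponential distribution for the delay is an assumption, and so is the strict `<` comparison against a quality score in [0, 1].

```python
import random


def next_check_delay(mean_period: float, rng: random.Random) -> float:
    """Draw a randomized delay whose mean is `mean_period` seconds.

    Assumption: an exponential distribution; the PR may use something else.
    """
    return rng.expovariate(1.0 / mean_period)


def should_rebalance(quality: float, min_balance_quality: float) -> bool:
    """Return True when the measured balance quality is below the threshold.

    Assumption: quality lies in [0, 1] and the comparison is strict.
    """
    return quality < min_balance_quality


rng = random.Random(0)
delay = next_check_delay(150, rng)      # seconds until the next balance check
print(should_rebalance(0.5, 0.8))       # quality below threshold
print(should_rebalance(0.5, 0.0))       # threshold 0.0 never triggers
```

Under these assumptions, a default of 0.0 means no quality value can fall below the threshold, which would be one way the suggested change disables rebalancing by default.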
Status: I've tested it on simple cases, and it works.
Future work:
- Server should follow ModuleContainer status and restart it if it crashes.
- [...] block_selection.py.
- [...] block_selection.py.
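The first future-work item (restarting a crashed container) could look roughly like the sketch below. This is only an illustration, not the planned implementation: `Supervisor`, `make_worker`, `poll_interval`, and `max_restarts` are hypothetical names, and the worker stands in for ModuleContainer. Note that `is_alive()` cannot tell a crash from a normal exit, so real code would need a separate status signal.

```python
import threading


class Supervisor(threading.Thread):
    """Polls a worker thread and starts a fresh one whenever it has died."""

    def __init__(self, make_worker, poll_interval=0.05, max_restarts=3):
        super().__init__(daemon=True)
        self.make_worker = make_worker      # factory returning an unstarted Thread
        self.poll_interval = poll_interval  # seconds between liveness checks
        self.max_restarts = max_restarts    # give up after this many restarts
        self.restarts = 0
        self.worker = None
        self._halt = threading.Event()

    def run(self):
        self.worker = self.make_worker()
        self.worker.start()
        # Event.wait returns False on timeout, True once shutdown() is called.
        while not self._halt.wait(self.poll_interval):
            if self.worker.is_alive():
                continue
            if self.restarts >= self.max_restarts:
                break
            self.restarts += 1
            self.worker = self.make_worker()
            self.worker.start()

    def shutdown(self):
        self._halt.set()
```

Threads cannot be restarted once they have finished, which is why the sketch takes a factory and builds a new worker each time.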