Test Llama, rebalancing, throughput eval, and all CLI scripts #452

Merged 32 commits on Aug 8, 2023

@borzunov (Collaborator) commented Aug 8, 2023

This PR extends CI to:

  1. Test Llama code using TinyLlama-v0.
  2. Test rebalancing (sets up a situation where the 1st server needs to change its original position).
  3. Check that the benchmark scripts run (in case someone breaks their code). Note that the benchmark results are meaningless here, since they're measured on a tiny swarm of CPU servers with a low --n_steps.
  4. Test petals.cli.run_dht.
  5. Increase swap space and watch free RAM (a common issue is that actions are cancelled without explanation when there's not enough RAM, so this serves as both a useful reminder and a debugging tool).
  6. Fix flapping tests for bloom-560m by increasing tolerance.
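
To illustrate item 6, here is a minimal sketch of how an absolute tolerance decides whether a numerical test passes. It uses only the standard library as a stand-in for whatever tensor comparison the real tests use; the `all_close` helper, the sample values, and both tolerance values are hypothetical and are not taken from the Petals test suite.

```python
import math
from typing import Sequence


def all_close(actual: Sequence[float], expected: Sequence[float], atol: float) -> bool:
    """Return True if every pair of values differs by at most `atol`."""
    if len(actual) != len(expected):
        return False
    return all(
        math.isclose(a, e, rel_tol=0.0, abs_tol=atol)
        for a, e in zip(actual, expected)
    )


# Hypothetical outputs: a local reference model vs. the same model served by a swarm.
reference = [0.1250, -1.5000, 3.0000]
distributed = [0.1253, -1.5004, 2.9996]

# With a tight tolerance the small numerical drift fails the check;
# with a looser (still small) tolerance the same outputs pass.
strict_ok = all_close(distributed, reference, atol=1e-4)
relaxed_ok = all_close(distributed, reference, atol=1e-3)
print(strict_ok, relaxed_ok)
```

A test whose drift sits near the tolerance passes on some runs and fails on others, which is the kind of intermittent failure that raising the tolerance is meant to remove.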

Other minor changes: fix --help messages to show defaults, fix docs, tune rebalancing constants.
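
One common way to make `--help` show defaults is the standard library's `argparse.ArgumentDefaultsHelpFormatter`. The sketch below assumes that approach; the parser name, the help string, and the default value of `100` are hypothetical, and only the `--n_steps` flag name comes from the description above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # ArgumentDefaultsHelpFormatter appends "(default: ...)" to each argument's help text.
    parser = argparse.ArgumentParser(
        prog="example_benchmark",  # hypothetical script name
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    parser.add_argument(
        "--n_steps",
        type=int,
        default=100,  # hypothetical default, for illustration only
        help="Number of benchmark steps",
    )
    return parser


help_text = build_parser().format_help()
print(help_text)
```

Note that this formatter only annotates arguments that have a `help` string, so an argument without one still shows no default.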

@@ -78,7 +78,7 @@ def __init__(
 sender_threads: int = 1,
 balance_quality: float = 0.75,
 mean_balance_check_period: float = 120,
-mean_block_selection_delay: float = 2.5,
+mean_block_selection_delay: float = 5,
Review comment from @borzunov (Collaborator, Author):

The previous delay was too short: servers often chose the same blocks because they didn't have time to write their DHT "JOINING" messages.
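
A rough way to see why a longer mean delay helps is to simulate two servers that each wait a random time before choosing blocks. This sketch assumes the delay is drawn uniformly from `[0, 2 * mean_delay]` and that a "JOINING" announcement takes about 1 second to become visible. Both are assumptions made for illustration, not measurements from Petals, and the function names are hypothetical.

```python
import random


def selection_delay(mean_delay: float, rng: random.Random) -> float:
    """Random wait before choosing blocks (assumed uniform, with the given mean)."""
    return rng.uniform(0.0, 2.0 * mean_delay)


def estimate_collision_rate(
    mean_delay: float, announce_time: float, trials: int = 20000, seed: int = 0
) -> float:
    """Estimate how often two servers choose blocks within `announce_time` of each other.

    If the second server chooses before the first one's announcement is visible,
    it sees the same swarm state and may pick the same blocks.
    """
    rng = random.Random(seed)
    collisions = 0
    for _ in range(trials):
        first = selection_delay(mean_delay, rng)
        second = selection_delay(mean_delay, rng)
        if abs(first - second) < announce_time:
            collisions += 1
    return collisions / trials


old_rate = estimate_collision_rate(mean_delay=2.5, announce_time=1.0)
new_rate = estimate_collision_rate(mean_delay=5.0, announce_time=1.0)
print(f"mean delay 2.5 s: {old_rate:.2f}, mean delay 5 s: {new_rate:.2f}")
```

Under these assumptions the overlap probability is `1 - (1 - a / (2m))**2` for announce time `a` and mean delay `m`, which gives about 0.36 for `m = 2.5` and about 0.19 for `m = 5`. Doubling the mean delay roughly halves the chance of a race, at the cost of a slower server start.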

@borzunov changed the title from "Show argparse defaults, test petals.cli.run_dht" to "Test rebalancing and petals.cli.run_dht, show argparse defaults" on Aug 8, 2023
@borzunov changed the title from "Test rebalancing and petals.cli.run_dht, show argparse defaults" to "Test rebalancing, throughput eval, and petals.cli.run_dht, show argparse defaults" on Aug 8, 2023
@borzunov force-pushed the fix-argparse branch 3 times, most recently from 0a6178b to dde6a1a, on August 8, 2023 03:32
@borzunov changed the title from "Test rebalancing, throughput eval, and petals.cli.run_dht, show argparse defaults" to "Test Llama rebalancing, throughput eval, and petals.cli.run_dht" on Aug 8, 2023
@borzunov changed the title from "Test Llama rebalancing, throughput eval, and petals.cli.run_dht" to "Test Llama, rebalancing, throughput eval, and petals.cli.run_dht" on Aug 8, 2023
@borzunov force-pushed the fix-argparse branch 2 times, most recently from a550200 to a255bd3, on August 8, 2023 03:49
@borzunov force-pushed the fix-argparse branch 3 times, most recently from ef828cc to 9acc7f1, on August 8, 2023 05:25
@borzunov changed the title from "Test Llama, rebalancing, throughput eval, and petals.cli.run_dht" to "Test Llama, rebalancing, throughput eval, and all CLI scripts" on Aug 8, 2023
@borzunov force-pushed the fix-argparse branch 2 times, most recently from 5208101 to a065dce, on August 8, 2023 14:15
@borzunov merged commit 8c546d9 into main on Aug 8, 2023
9 checks passed
@borzunov deleted the fix-argparse branch on August 8, 2023 15:10