WIP: Composer Jenkinsfile #82

Merged
merged 202 commits into from
Jan 20, 2022
Changes from 1 commit
c1ebaf2
Composer Jenkinsfile
ravi-mosaicml Nov 12, 2021
92e8491
testing
ravi-mosaicml Nov 12, 2021
1dae58f
testing
ravi-mosaicml Nov 12, 2021
af23c9f
testing
ravi-mosaicml Nov 12, 2021
3f0eda2
testing
ravi-mosaicml Nov 12, 2021
f4961e9
testing
ravi-mosaicml Nov 12, 2021
d409d09
testing
ravi-mosaicml Nov 12, 2021
e86f0b2
testing
ravi-mosaicml Nov 12, 2021
d612a81
testing
ravi-mosaicml Nov 12, 2021
aa23fc6
testing
ravi-mosaicml Nov 12, 2021
20764b0
testing
ravi-mosaicml Nov 12, 2021
948d562
testing
ravi-mosaicml Nov 12, 2021
026b2ba
testing
ravi-mosaicml Nov 12, 2021
144d033
testing
ravi-mosaicml Nov 12, 2021
9244b60
testing
ravi-mosaicml Nov 12, 2021
78b3ba3
testing
ravi-mosaicml Nov 12, 2021
44be644
testing
ravi-mosaicml Nov 12, 2021
8ea4eef
testing
ravi-mosaicml Nov 12, 2021
fffe97a
testing
ravi-mosaicml Nov 12, 2021
8bbf1d6
testing
ravi-mosaicml Nov 12, 2021
a35c545
Fixed exit
ravi-mosaicml Nov 12, 2021
c739a1d
testing
ravi-mosaicml Nov 12, 2021
71f5dc0
testing
ravi-mosaicml Nov 12, 2021
a47ae49
testing
ravi-mosaicml Nov 12, 2021
eb3f95c
testing
ravi-mosaicml Nov 12, 2021
50e172d
testing
ravi-mosaicml Nov 12, 2021
2ebbb14
testing
ravi-mosaicml Nov 12, 2021
1a17e43
testing
ravi-mosaicml Nov 12, 2021
fed09c5
testing
ravi-mosaicml Nov 12, 2021
99e18d3
testing
ravi-mosaicml Nov 12, 2021
b9b9a45
testing
ravi-mosaicml Nov 12, 2021
d8ae8e7
testing
ravi-mosaicml Nov 12, 2021
0bb6ec0
testing
ravi-mosaicml Nov 12, 2021
377c9b8
testing
ravi-mosaicml Nov 12, 2021
2755499
testing
ravi-mosaicml Nov 12, 2021
e6d6a7b
testing
ravi-mosaicml Nov 12, 2021
a22f84b
testing
ravi-mosaicml Nov 12, 2021
343bf59
testing
ravi-mosaicml Nov 12, 2021
90f98f7
testing
ravi-mosaicml Nov 13, 2021
ece96a8
testing
ravi-mosaicml Nov 13, 2021
df08e90
testing
ravi-mosaicml Nov 15, 2021
3a759a7
Update README.md
ravi-mosaicml Nov 16, 2021
689fb78
Merge branch 'dev' into ravi/jenkinsfile
ravi-mosaicml Nov 16, 2021
d19e741
Merge branch 'ravi/jenkinsfile' of github.com:mosaicml/composer into …
ravi-mosaicml Nov 16, 2021
e879969
testing
ravi-mosaicml Nov 16, 2021
37e3853
testing
ravi-mosaicml Nov 16, 2021
52286c1
testing
ravi-mosaicml Nov 16, 2021
5a8faef
testing
ravi-mosaicml Nov 16, 2021
5514abe
testing
ravi-mosaicml Nov 16, 2021
a148f47
testing
ravi-mosaicml Nov 16, 2021
4fe7ceb
testing
ravi-mosaicml Nov 16, 2021
ebcc834
testing
ravi-mosaicml Nov 16, 2021
193fddc
testing
ravi-mosaicml Nov 16, 2021
7dfeab8
Update Jenkinsfile
ravi-mosaicml Nov 16, 2021
94b29a5
Update Jenkinsfile
ravi-mosaicml Nov 16, 2021
0578367
Update Jenkinsfile
ravi-mosaicml Nov 16, 2021
aac7bda
Removing bad symlink
ravi-mosaicml Nov 19, 2021
ef5fac9
DDP Port Auto Selection; Removed spawning in tests
ravi-mosaicml Nov 19, 2021
c23ac72
Merge branch 'ravi/remove_spawn' into ravi/jenkinsfile
ravi-mosaicml Nov 19, 2021
73cab9d
Fixed jenkinsfile
ravi-mosaicml Nov 19, 2021
1b8d731
Fixed missing tests
ravi-mosaicml Nov 19, 2021
a4a5d0d
Merge branch 'ravi/remove_spawn' into ravi/jenkinsfile
ravi-mosaicml Nov 19, 2021
9c7829a
Merge branch 'dev' into ravi/jenkinsfile
ravi-mosaicml Dec 3, 2021
06c6744
Docker builds
ravi-mosaicml Dec 3, 2021
33c819a
Smaller build matrix
ravi-mosaicml Dec 3, 2021
5bf4fad
Not running dev checks when building images
ravi-mosaicml Dec 3, 2021
10016d6
testing
ravi-mosaicml Dec 3, 2021
6eeeff1
testing
ravi-mosaicml Dec 3, 2021
14bdf33
testing
ravi-mosaicml Dec 3, 2021
555624b
Merge branch 'dev' into ravi/jenkinsfile
ravi-mosaicml Dec 21, 2021
55c37bb
Gpu tests
ravi-mosaicml Dec 21, 2021
fa3a08b
Typo fix
ravi-mosaicml Dec 21, 2021
05aff79
Testing
ravi-mosaicml Dec 21, 2021
95499c2
Testing
ravi-mosaicml Dec 21, 2021
3e7d8e1
Increased cpu limit
ravi-mosaicml Dec 21, 2021
99c4cba
Added log warning
ravi-mosaicml Dec 21, 2021
01a331b
Ensuring that the launch script raises on sigkilled processes
ravi-mosaicml Dec 21, 2021
7c15566
Upped the memory limit
ravi-mosaicml Dec 21, 2021
c1d8e93
Configure a default virtualenv in the dockerfile
ravi-mosaicml Dec 22, 2021
4bdd3af
Merge branch 'dev' into ravi/jenkinsfile
ravi-mosaicml Dec 22, 2021
e7a548a
Fix the run directory uploader
ravi-mosaicml Dec 22, 2021
d67e55b
Add ninja for deepspeed test
ravi-mosaicml Dec 22, 2021
0cd6e6e
Testing
ravi-mosaicml Dec 22, 2021
a9109fd
Fixed pytorch version in jenkinsfile
ravi-mosaicml Dec 22, 2021
361351a
Merge branch 'ravi/composer_in_virtualenv' into ravi/jenkinsfile
ravi-mosaicml Dec 22, 2021
5b384c5
Adding git to the jenkinsfile
ravi-mosaicml Dec 22, 2021
9507112
Update Dockerfile
ravi-mosaicml Dec 22, 2021
e8ee994
Update Dockerfile
ravi-mosaicml Dec 22, 2021
0a31ed1
Fixed Dockerfile virtualenv
ravi-mosaicml Dec 22, 2021
2217b52
Fixed python virtualenv in the dockerfile
ravi-mosaicml Dec 22, 2021
50e292e
testing
ravi-mosaicml Dec 22, 2021
256f964
testing
ravi-mosaicml Dec 22, 2021
eab9e9f
Update Dockerfile
ravi-mosaicml Dec 22, 2021
506a61c
Restore setting the NCCL version
ravi-mosaicml Dec 22, 2021
66bc16f
Fixed pip
ravi-mosaicml Dec 22, 2021
4a69138
Update Dockerfile
ravi-mosaicml Dec 22, 2021
94c3db5
More docker changes
ravi-mosaicml Dec 22, 2021
2ad19c3
Use the bash shell
ravi-mosaicml Dec 22, 2021
18348af
Update the default path; allow downgrades
ravi-mosaicml Dec 22, 2021
c9e5d05
testing
ravi-mosaicml Dec 22, 2021
69a7ad8
Testing
ravi-mosaicml Dec 22, 2021
929ecfc
Fixed ubuntu version
ravi-mosaicml Dec 22, 2021
a3029b0
testing
ravi-mosaicml Dec 23, 2021
0fdf3d3
Added virtualenv arg
ravi-mosaicml Dec 23, 2021
cef0204
null node selector cpu
ravi-mosaicml Dec 23, 2021
208acfc
Global virtualenv
ravi-mosaicml Dec 27, 2021
c125228
Global virtualenv
ravi-mosaicml Dec 27, 2021
272b254
Run on colo; fix docker for noninteractive shells
ravi-mosaicml Dec 27, 2021
0809ae8
Fix for non-interactive shells
ravi-mosaicml Dec 27, 2021
e09b6a1
Update Dockerfile
ravi-mosaicml Dec 27, 2021
6b3e3ce
A yapf update broke some formatting...re-running the linter
ravi-mosaicml Dec 27, 2021
e4011f4
Merge branch 'ravi/fix_yapf' into ravi/composer_in_virtualenv
ravi-mosaicml Dec 27, 2021
b4f1406
Merge branch 'ravi/composer_in_virtualenv' of github.com:mosaicml/com…
ravi-mosaicml Dec 27, 2021
fef4aed
Merge branch 'ravi/composer_in_virtualenv' into ravi/jenkinsfile
ravi-mosaicml Dec 27, 2021
17f710f
testing
ravi-mosaicml Dec 27, 2021
a770f35
testing
ravi-mosaicml Jan 3, 2022
6f632b2
Merge branch 'dev' into ravi/jenkinsfile
ravi-mosaicml Jan 5, 2022
93fedab
Enabled dockerfile matrix build; switched to 3080s
ravi-mosaicml Jan 5, 2022
e834201
Increase timeout for test_blurmaxpool_shapes
ravi-mosaicml Jan 5, 2022
7db76a4
Use deterministic mode
ravi-mosaicml Jan 6, 2022
e65a5d8
Deterministic mode for test_checkpoint
ravi-mosaicml Jan 6, 2022
a7f0110
Fix determinsitc mode
ravi-mosaicml Jan 6, 2022
86c1998
Early check check for CUBLAS_WORKSPACE_CONFIG when using deterministi…
ravi-mosaicml Jan 6, 2022
b48492b
Using colo to run all pytest
ravi-mosaicml Jan 6, 2022
703eb46
auto setting CUBLAS_WORKSPACE_CONFIG
ravi-mosaicml Jan 6, 2022
db04e35
Increase limits
ravi-mosaicml Jan 6, 2022
d26c907
Fix nit
ravi-mosaicml Jan 6, 2022
28d0e8b
Merge branch 'dev' into ravi/jenkinsfile
ravi-mosaicml Jan 7, 2022
b83d15a
Merge branch 'dev' into ravi/composer_in_virtualenv
ravi-mosaicml Jan 10, 2022
f3ba650
Removed change
ravi-mosaicml Jan 10, 2022
e2e06a2
Address PR feedback; fix zsh
ravi-mosaicml Jan 10, 2022
276c61c
Fixes
ravi-mosaicml Jan 10, 2022
69d20aa
Added --no-cache-dir
ravi-mosaicml Jan 10, 2022
5916a9a
Merge branch 'dev' into ravi/jenkinsfile
ravi-mosaicml Jan 10, 2022
f97d85a
Merge branch 'ravi/composer_in_virtualenv' into ravi/jenkinsfile
ravi-mosaicml Jan 10, 2022
3b60d70
Switched to 3090s
ravi-mosaicml Jan 10, 2022
52a903a
Running deepspeed tests via jenkins
ravi-mosaicml Jan 10, 2022
51d3b4e
Node without label
ravi-mosaicml Jan 10, 2022
50d4c02
Swithced cloud to colo-research-01
ravi-mosaicml Jan 10, 2022
7c6b2a1
Fixes
ravi-mosaicml Jan 10, 2022
942f766
Simplifying PR
ravi-mosaicml Jan 10, 2022
a5f1238
Make the run directory rank-local; fix checkpoints saving and restoring
ravi-mosaicml Jan 11, 2022
5aaceae
Fixed checkpointing tests
ravi-mosaicml Jan 11, 2022
ac62650
Merge branch 'dev' into ravi/rank_local_run_directory
ravi-mosaicml Jan 11, 2022
916b982
Merge branch 'ravi/rank_local_run_directory' of github.com:mosaicml/c…
ravi-mosaicml Jan 11, 2022
9bbcdd6
Merge branch 'ravi/rank_local_run_directory' into ravi/jenkinsfile
ravi-mosaicml Jan 11, 2022
5e0da6a
Fixed the node selector; only running deepspeed tests for the time being
ravi-mosaicml Jan 11, 2022
9475eb4
Added build system to pyproject.toml
ravi-mosaicml Jan 11, 2022
db82ee3
Testing
ravi-mosaicml Jan 11, 2022
12059c0
Merge branch 'ravi/composer_in_virtualenv' into ravi/jenkinsfile
ravi-mosaicml Jan 11, 2022
12898c3
Fixed isort
ravi-mosaicml Jan 11, 2022
8d76958
Merge branch 'ravi/composer_in_virtualenv' into ravi/jenkinsfile
ravi-mosaicml Jan 11, 2022
ab4fbc4
Re-enable python tests
ravi-mosaicml Jan 11, 2022
e50d658
testing
ravi-mosaicml Jan 11, 2022
6e83e3e
testing
ravi-mosaicml Jan 11, 2022
2757b2e
Fixed isort
ravi-mosaicml Jan 11, 2022
cbb8b06
Merge branch 'dev' into ravi/composer_in_virtualenv
ravi-mosaicml Jan 11, 2022
d348c35
Merge branch 'ravi/composer_in_virtualenv' into ravi/jenkinsfile
ravi-mosaicml Jan 11, 2022
2a41594
Fixing deepspeed conditional import
ravi-mosaicml Jan 11, 2022
a87877d
Speeding up logger test
ravi-mosaicml Jan 12, 2022
875470c
Adjusted k8s limits
ravi-mosaicml Jan 12, 2022
af75be1
Fixed jenkinsfile
ravi-mosaicml Jan 12, 2022
31a335b
Fixed missing values
ravi-mosaicml Jan 12, 2022
efd0d44
testing
ravi-mosaicml Jan 12, 2022
df56872
Fixed typos
ravi-mosaicml Jan 12, 2022
d4fc5d6
Update Jenkinsfile
ravi-mosaicml Jan 12, 2022
e854a99
Update Jenkinsfile
ravi-mosaicml Jan 12, 2022
0301fbc
Fixing some of the slow tests
ravi-mosaicml Jan 12, 2022
0e2332a
Making tests faster
ravi-mosaicml Jan 12, 2022
70f1059
Fixed broken tests
ravi-mosaicml Jan 12, 2022
a42fb7d
Merge branch 'dev' into ravi/jenkinsfile
ravi-mosaicml Jan 12, 2022
5513ab0
Merge branch 'dev' into ravi/rank_local_run_directory
ravi-mosaicml Jan 12, 2022
309c97a
Addressed PR feedback
ravi-mosaicml Jan 12, 2022
a270167
Formatting
ravi-mosaicml Jan 12, 2022
f9f8c71
Added docstrings
ravi-mosaicml Jan 12, 2022
ba2dee5
Merge branch 'ravi/rank_local_run_directory' into ravi/jenkinsfile
ravi-mosaicml Jan 12, 2022
bae17df
Lowered the CPU limit
ravi-mosaicml Jan 12, 2022
63dc6f8
Fixed tests
ravi-mosaicml Jan 12, 2022
ebca03b
Merge branch 'ravi/rank_local_run_directory' into ravi/jenkinsfile
ravi-mosaicml Jan 12, 2022
5bf7c77
Pinning yapf to 0.31.0 to see if that fixes a concurrency bug
ravi-mosaicml Jan 12, 2022
1cc3742
Bump yapf version
ravi-mosaicml Jan 12, 2022
cbcad03
Fix github status check names
ravi-mosaicml Jan 12, 2022
f42153d
Merge branch 'dev' into ravi/rank_local_run_directory
ravi-mosaicml Jan 14, 2022
884acac
Merge branch 'ravi/rank_local_run_directory' into ravi/jenkinsfile
ravi-mosaicml Jan 14, 2022
b08f7e3
Merge branch 'dev' into ravi/jenkinsfile
ravi-mosaicml Jan 14, 2022
49eea8c
Updated the README
ravi-mosaicml Jan 14, 2022
dd0443f
Merge branch 'dev' into ravi/jenkinsfile
ravi-mosaicml Jan 18, 2022
bf0bd3b
Fixed tests
ravi-mosaicml Jan 18, 2022
629b956
Merge branch 'dev' into ravi/jenkinsfile
ravi-mosaicml Jan 19, 2022
7963a1b
Addressed PR feedback
ravi-mosaicml Jan 19, 2022
2a79b9e
Added lint script to repo; using new Jenkins scratch/command
ravi-mosaicml Jan 19, 2022
bd51a4c
Fix typo
ravi-mosaicml Jan 19, 2022
c4ac824
Fixed closure
ravi-mosaicml Jan 19, 2022
2e99b79
Added missing commas
ravi-mosaicml Jan 19, 2022
7165546
fix typo
ravi-mosaicml Jan 19, 2022
f76975b
Added debugging
ravi-mosaicml Jan 19, 2022
b4de4e1
Fix the script
ravi-mosaicml Jan 19, 2022
bc860ba
Remove echo
ravi-mosaicml Jan 19, 2022
d320f55
Dockerfile fix
ravi-mosaicml Jan 19, 2022
0a720c4
Fix jenkinsfile
ravi-mosaicml Jan 19, 2022
03502d2
Merge branch 'dev' into ravi/jenkinsfile
ravi-mosaicml Jan 20, 2022
553da65
Updated shebangs
ravi-mosaicml Jan 20, 2022
A yapf update broke some formatting...re-running the linter
ravi-mosaicml committed Dec 27, 2021
commit 6b3e3ceb9c816430ce8e58f0d4578f60be9a6b15
2 changes: 1 addition & 1 deletion composer/algorithms/alibi/alibi.py
@@ -90,7 +90,7 @@ def apply_alibi(model: torch.nn.Module, heads_per_layer: int, max_sequence_lengt
zero_and_freeze_expand_position_embeddings(model=model,
attribute=position_embedding_attribute,
new_embedding_length=max_sequence_length)
- log.info(f" Position embedding expanded to sequence " f"length {max_sequence_length}, zeroed, and frozen")
+ log.info(f" Position embedding expanded to sequence length {max_sequence_length}, zeroed, and frozen")

def convert_attention(module: torch.nn.Module, module_index: int = None):
module = register_alibi(module=module, n_heads=heads_per_layer, max_token_length=max_sequence_length)
2 changes: 1 addition & 1 deletion composer/datasets/hparams.py
@@ -71,7 +71,7 @@ class SyntheticHparamsMixin(hp.Hparams, abc.ABC):
Ignored if :attr:`use_synthetic` is False. (Default: ``CONTIGUOUS_FORMAT``)
"""

- use_synthetic: bool = hp.optional("Whether to use synthetic data. Defaults to False." "", default=False)
+ use_synthetic: bool = hp.optional("Whether to use synthetic data. Defaults to False.", default=False)
synthetic_num_unique_samples: int = hp.optional("The number of unique samples to allocate memory for.", default=100)
synthetic_device: str = hp.optional("Device to store the sample pool. Should be `cuda` or `cpu`. Defauls to `cpu`.",
default="cpu")
5 changes: 3 additions & 2 deletions composer/optim/pytorch_future.py
@@ -69,7 +69,7 @@ def __init__(self,
verbose=False,
interval='step'):
if warmup_method not in ("constant", "linear"):
- raise ValueError("Only 'constant' or 'linear' warmup_method accepted, but " "got {}".format(warmup_method))
+ raise ValueError("Only 'constant' or 'linear' warmup_method accepted, but got {}".format(warmup_method))
self.warmup_factor = warmup_factor
self.warmup_iters = warmup_iters
self.warmup_method = warmup_method
@@ -84,7 +84,8 @@ def get_lr(self):
"""

if not self._get_lr_called_within_step:
- warnings.warn("To get the last learning rate computed by the scheduler, " "please use `get_last_lr()`.")
+ warnings.warn("To get the last learning rate computed by the scheduler, "
+ "please use `get_last_lr()`.")

if self.last_epoch == 0:
return [group['lr'] * self.warmup_factor for group in self.optimizer.param_groups]
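
Reviewer note: every hunk in this commit only merges or re-wraps adjacent string literals. Python joins adjacent literals at compile time, so the resulting messages should be identical before and after the reformat. A minimal sketch to check this, using the strings from the diff (the variable names and the sample value 512 are illustrative and not part of the PR):

```python
max_sequence_length = 512  # illustrative value, not from the PR

# Before: two adjacent f-string literals, implicitly concatenated.
before = f" Position embedding expanded to sequence " f"length {max_sequence_length}, zeroed, and frozen"
# After: a single f-string literal.
after = f" Position embedding expanded to sequence length {max_sequence_length}, zeroed, and frozen"
assert before == after

# An adjacent empty literal ("") contributes nothing to the result.
with_empty = "Whether to use synthetic data. Defaults to False." ""
without_empty = "Whether to use synthetic data. Defaults to False."
assert with_empty == without_empty

# Re-wrapping across lines inside parentheses keeps the same string.
wrapped = ("To get the last learning rate computed by the scheduler, "
           "please use `get_last_lr()`.")
unwrapped = "To get the last learning rate computed by the scheduler, " "please use `get_last_lr()`."
assert wrapped == unwrapped
```

If all three assertions pass, the change is formatting-only, which matches the commit title.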