Cleanup unused Jenkins files and scripts #1829
Conversation
NVIDIA#1568: Move Databricks scripts to GitLab so we can use the common scripts for the nightly build job and the integration tests job. Remove unused Dockerfiles.
Signed-off-by: Tim Liu <timl@nvidia.com>
MR123 to move scripts to GitLab
build
build
jenkins/databricks/build.sh
@@ -1,133 +0,0 @@
#!/bin/bash
I'd much rather see as many of the scripts as possible remain in this repository and have any other repository leverage the scripts here. This repository is where developers will look first for things, so this should be the place we store them unless there's a compelling reason we cannot. I don't see why we can't have the GitLab repository leverage any scripts in the GitHub repository.
It's good for me; in fact, I'd prefer to leave the DB nightly scripts in the GitHub spark-rapids repo.
So I misunderstood the issue: #1568
I thought you wanted to move all the DB nightly scripts together with the DB IT ones.
Now all the cloud environments use the same script, https://github.com/NVIDIA/spark-rapids/blob/branch-0.5/integration_tests/run_pyspark_from_build.sh, to run the pytest IT.
Let's only remove the unused Dockerfiles.
I thought you want to move all the DB nightly scripts together with DB IT ones.
We do want them moved together but ideally here. 😄
The rule of thumb should be to place files here unless for some reason they cannot be (e.g., files containing secrets or other items not appropriate for public consumption). In those cases, only the parts of the scripts that need to be isolated should be isolated, and the common, public portions of the scripts should be factored out and placed in this repository.
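The factoring described here could be sketched as a thin private wrapper that sources only the secret-bearing pieces and then delegates to the public script. This is a hedged illustration, not code from either repo; the function and file names are assumptions.

```shell
#!/bin/bash
# Hedged sketch of the split described above: the private repo keeps only the
# secret-bearing wrapper, which sources credentials and then delegates to the
# common script kept in the public spark-rapids repo. Names are assumptions.

run_private_then_common() {
    local secrets_file=$1     # private: tokens, internal hostnames, etc.
    local common_script=$2    # public: everything safe for open source
    # Isolate only the parts that must stay private...
    . "$secrets_file"
    # ...and run the factored-out public portion from the GitHub repo.
    bash "$common_script"
}
```

With this pattern, only `secrets_file` lives in the private GitLab repo; everything else can stay in spark-rapids.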
Sorry if I wasn't clear; I stated this:
If we can split and commonize them such that the scripts for running tests are in spark-rapids and other Jenkinsfiles and CSP setup scripts are in private, I'm good with that.
I probably shouldn't have updated the description.
Got it.
The DB-IT-only scripts are written in shell, while the DB nightly build scripts are Python-based, so I'd prefer to keep them where they are and as they are. The key point is that they use the same IT script here: https://github.com/NVIDIA/spark-rapids/blob/branch-0.5/integration_tests/run_pyspark_from_build.sh
In common, you only need to change run_pyspark_from_build.sh and it will work for both the DB nightly and DB-IT-only pipelines.
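The sharing pattern being described could be sketched as both pipelines funnelling into the one shared entry point, so a change to run_pyspark_from_build.sh propagates to both. The wrapper function below is an illustrative assumption, not an existing script.

```shell
#!/bin/bash
# Hedged sketch: both Databricks pipelines delegate to the single shared
# script, so editing run_pyspark_from_build.sh affects nightly and IT-only
# runs alike. The wrapper and its pipeline label are illustrative assumptions.

run_databricks_tests() {
    local pipeline=$1   # "nightly" or "it-only"; only the label differs here
    echo "[$pipeline] delegating to integration_tests/run_pyspark_from_build.sh"
    # In a real job this line would follow:
    #   bash integration_tests/run_pyspark_from_build.sh
}
```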
I guess I don't understand why we have hardly anything in the GitLab repo. There is still duplication between those shell scripts and the scripts in spark-rapids. The create_cluster.sh script and the setup for pytests on Databricks could all be common.
The reason for Python versus shell was that there was a nice API for getting things back, versus having to parse a bunch of stdout from the shell commands, as well as to have some small functions that could be reused by other scripts. I guess it doesn't matter too much at this point because they are written in both.
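The API-versus-stdout trade-off mentioned here can be illustrated with a small sketch: a structured JSON response lets you ask for a field by name, while plain CLI stdout has to be scraped column-by-column. The sample payload shape is made up and is not the real Databricks response format.

```shell
#!/bin/bash
# Hedged illustration of the trade-off above. The JSON shape and the
# assumption that state is the last stdout column are both made up.

state_from_json() {
    # Structured: pull the field out by name (python3 json as a stand-in
    # for a proper API client).
    python3 -c 'import json, sys; print(json.load(sys.stdin)["state"])'
}

state_from_stdout() {
    # Unstructured: scrape stdout, assuming state is the last column.
    awk '{print $NF}'
}
```

The structured version keeps working if column order changes; the scraper silently breaks, which is the point being made about preferring an API.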
@tgravescs OK, could you please help check whether I missed anything? It seems the changes will be:
- Move the DB IT scripts into GitHub
- Make it common to create the cluster, set up pytests ... with a Python API
Signed-off-by: Tim Liu <timl@nvidia.com>
build
Now that this is just removing a couple of unreferenced Jenkins Dockerfiles, +1 LGTM.
* Cleanup unused Jenkins files and scripts (NVIDIA#1568): move Databricks scripts to GitLab so we can use the common scripts for the nightly build job and the integration tests job; remove unused Dockerfiles. Signed-off-by: Tim Liu <timl@nvidia.com>
* rm Dockerfile.integration.ubuntu16
* Restore Databricks nightly scripts. Signed-off-by: Tim Liu <timl@nvidia.com>