Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleanup unused Jenkins files and scripts #1829

Merged
merged 3 commits into from
Mar 2, 2021

Conversation

NvTimLiu
Copy link
Collaborator

@NvTimLiu NvTimLiu commented Mar 1, 2021

#1568

Move Databricks scripts to GitLab so we can use the common scripts for the nightly build job and integration tests job

Remove unused Dockerfiles

NVIDIA#1568

Move Databricks scripts to GitLab so we can use the common scripts for the nightly build job and integration tests job

Remove unused Dockerfiles

Signed-off-by: Tim Liu <timl@nvidia.com>
@NvTimLiu NvTimLiu self-assigned this Mar 1, 2021
@NvTimLiu
Copy link
Collaborator Author

NvTimLiu commented Mar 1, 2021

MR123 to move scripts to GitLab

@NvTimLiu
Copy link
Collaborator Author

NvTimLiu commented Mar 1, 2021

build

@NvTimLiu
Copy link
Collaborator Author

NvTimLiu commented Mar 1, 2021

build

@@ -1,133 +0,0 @@
#!/bin/bash
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd much rather see as many of the scripts remain in this repository and have any other repository leverage the scripts here. This repository is where developers will be looking first for things, so this should be the place we store things unless there's a compelling reason we cannot. I don't see why we can't have the Gitlab repository leverage any scripts in the github repository.

Copy link
Collaborator Author

@NvTimLiu NvTimLiu Mar 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tgravescs @jlowe

It's Good for me , in fact I'd prefer to leave DB nightly scripts in the GitHub/spark-rapids repo.

So I misunderstood the issue: #1568

I thought you want to move all the DB nightly scripts together with DB IT ones.

Now all the Cloud Environments are using the same scripts https://github.com/NVIDIA/spark-rapids/blob/branch-0.5/integration_tests/run_pyspark_from_build.sh to run pytest IT

Let's only remove the unused Dockerfiles.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought you want to move all the DB nightly scripts together with DB IT ones.

We do want them moved together but ideally here. 😄

The rule of thumb should be to place files here unless for some reason they cannot (e.g.: files containing secrets or other items not appropriate for public consumption). In those cases, only the parts of the scripts that need to be isolated should be isolated, and common, public portions of the script should be factored out and placed in this repository.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry if I wasn't clear, I stated this:

If we can split and commonize them such that the scripts for running tests are in spark-rapids and other Jenkinsfiles and CSP setup scripts are in private, I'm good with that.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I probably shouldn't have updated description

Copy link
Collaborator Author

@NvTimLiu NvTimLiu Mar 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it.

As the DB IT only scripts are written by Shell CLI, by contrast the DB nightly build is python based scripts .

So I'd prefer keep where they are and what they are. The key point is they using the same IT scripts here: https://github.com/NVIDIA/spark-rapids/blob/branch-0.5/integration_tests/run_pyspark_from_build.sh

So in common, you only need to change the run_pyspark_from_build.sh, it will work for both DB nightly and DB IT only pipelines.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I don't understand why we have hardly anything in the gitlab repo. There is still duplication between those shell scripts the scripts in spark-rapids. The create_cluster.sh script, the setup for pytests on Databricks could all be common.
The reason for python versus shell was because there was a nice API for getting things back vs having to parse a bunch of stdout back from the shell commands, as well as to have some small functions that would be reused by other scripts. I guess it doesn't matter to much at this point because they are written in both.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tgravescs OK, Could you please help check if I miss anything? Seems the changes will be:

  1. To move DB IT scripts into GitHub
  2. Make it common to create cluster, setup pytests ... with python API

Signed-off-by: Tim Liu <timl@nvidia.com>
@NvTimLiu
Copy link
Collaborator Author

NvTimLiu commented Mar 1, 2021

build

Copy link
Member

@jlowe jlowe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that this is just removing a couple of unreferenced jenkins Dockerfiles, +1 lgtm.

@sameerz sameerz added the build Related to CI / CD or cleanly building label Mar 2, 2021
@pxLi pxLi merged commit 28b00a7 into NVIDIA:branch-0.5 Mar 2, 2021
nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021
* Cleanup unused Jenkins files and scripts

NVIDIA#1568

Move Databricks scripts to GitLab so we can use the common scripts for the nightly build job and integration tests job

Remove unused Dockerfiles

Signed-off-by: Tim Liu <timl@nvidia.com>

* rm Dockerfile.integration.ubuntu16

* Restore Databricks nightly scripts

Signed-off-by: Tim Liu <timl@nvidia.com>
nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021
* Cleanup unused Jenkins files and scripts

NVIDIA#1568

Move Databricks scripts to GitLab so we can use the common scripts for the nightly build job and integration tests job

Remove unused Dockerfiles

Signed-off-by: Tim Liu <timl@nvidia.com>

* rm Dockerfile.integration.ubuntu16

* Restore Databricks nightly scripts

Signed-off-by: Tim Liu <timl@nvidia.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
build Related to CI / CD or cleanly building
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants