
Change databricks build to dynamically create a cluster #981

Merged: 46 commits into NVIDIA:branch-0.3 from dbcirebase, Oct 21, 2020

Conversation

@tgravescs (Collaborator) commented Oct 19, 2020:

This changes it so we dynamically create a new Databricks cluster every time we kick off a nightly build, and delete the cluster at the end of the run. This is the basic functionality we have now; we can continue to enhance it later.
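
As a rough illustration of that create-then-delete cycle (not the actual build scripts), the sketch below assumes the Databricks REST API 2.0 clusters endpoints; the workspace URL, token, and cluster settings are placeholders, not the values the real nightly build uses.

```python
# Rough sketch of the nightly create-then-delete cycle, assuming the Databricks
# REST API 2.0 clusters endpoints; host, token, and cluster settings are placeholders.
import requests

HOST = "https://example.cloud.databricks.com"  # placeholder workspace URL
TOKEN = "<api-token>"                          # placeholder personal access token
HEADERS = {"Authorization": f"Bearer {TOKEN}"}


def create_cluster() -> str:
    """Create a fresh cluster for this nightly run and return its cluster id."""
    resp = requests.post(
        f"{HOST}/api/2.0/clusters/create",
        headers=HEADERS,
        json={
            "cluster_name": "ci-nightly",              # illustrative settings only
            "spark_version": "7.0.x-gpu-ml-scala2.12",
            "node_type_id": "g4dn.xlarge",
            "num_workers": 1,
        },
    )
    resp.raise_for_status()
    return resp.json()["cluster_id"]


def delete_cluster(cluster_id: str) -> None:
    """Tear the cluster down at the end of the run."""
    resp = requests.post(
        f"{HOST}/api/2.0/clusters/delete",
        headers=HEADERS,
        json={"cluster_id": cluster_id},
    )
    resp.raise_for_status()


if __name__ == "__main__":
    cid = create_cluster()
    try:
        pass  # the nightly build and tests would run against `cid` here
    finally:
        delete_cluster(cid)
```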

@tgravescs (Collaborator, Author) commented:

Added a create.py script and split cluster creation off from run-tests.py. Added a number of options to make it more configurable. The create script passes the cluster id back on stdout, and the Jenkinsfile then passes it along via an environment variable.
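
A minimal sketch of that stdout hand-off (not the actual create.py): diagnostics go to stderr so the cluster id is the only thing on stdout and can be captured cleanly by the pipeline; the function and variable names here are illustrative.

```python
# Sketch of the stdout contract (not the actual create.py): the cluster id is the only
# line written to stdout, so a Jenkins pipeline step could capture it, for example:
#   env.CLUSTERID = sh(returnStdout: true, script: 'python create.py ...').trim()
# All diagnostic output goes to stderr so it cannot pollute the captured value.
import sys


def report_cluster_id(cluster_id: str) -> None:
    print(f"created cluster {cluster_id}", file=sys.stderr)  # log line on stderr only
    print(cluster_id)                                        # the lone stdout line


if __name__ == "__main__":
    # In the real script the id would come from the cluster-create call; placeholder here.
    report_cluster_id("0000-000000-example0")
```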

@tgravescs (Collaborator, Author) commented Oct 20, 2020:

build

revans2 previously approved these changes Oct 20, 2020

Review comments on jenkins/databricks/create.py (outdated) and jenkins/databricks/run-tests.py were resolved.
@tgravescs (Collaborator, Author) commented:

build

@tgravescs (Collaborator, Author) commented:

The 3.1.0 failure is one we already have a PR up for.

@tgravescs (Collaborator, Author) commented:

build

@tgravescs (Collaborator, Author) commented:

Same 3.1.0 ParquetRowConverter failure that should already have been fixed; this is the second time I've built. Will try again.

@tgravescs (Collaborator, Author) commented:

build

@jlowe (Member) commented Oct 21, 2020:

> Same 3.1.0 ParquetRowConverter failure that should already have been fixed; this is the second time I've built. Will try again.

The PR needs to be upmerged with the latest on branch-0.3 to pick up the fix.

@tgravescs (Collaborator, Author) commented:

build

@tgravescs tgravescs merged commit e05b3f4 into NVIDIA:branch-0.3 Oct 21, 2020
@tgravescs tgravescs deleted the dbcirebase branch October 21, 2020 16:08
sperlingxx pushed a commit to sperlingxx/spark-rapids that referenced this pull request Nov 20, 2020
* Add some more checks to databricks build scripts

Signed-off-by: Thomas Graves <tgraves@nvidia.com>

* remove extra newline

* use the right -gt for bash

* Add new python file for databricks cluster utils

* Fix up scripts

* databricks scripts working

Signed-off-by: Thomas Graves <tgraves@nvidia.com>

* Pass in sshkey

Signed-off-by: Thomas Graves <tgraves@nvidia.com>

* cluster creation script mods

* fix

* fix pub key

* fix missing quote

* fix $

* update public key to be param

Signed-off-by: Thomas Graves <tgraves@nvidia.com>

* Add public key value

* clenaup

Signed-off-by: Thomas Graves <tgraves@nvidia.com>

* modify permissions

Signed-off-by: Thomas Graves <tgraves@nvidia.com>

* change loc cluster id file

* fix extra /

* quote public key

* try different setting cluster id

* debug

* try again

* try readfile

* try again

* try quotes

* cleanup

* Add option to control number of partitions when converting from CSV to Parquet (NVIDIA#915)

* Add command-line arguments for applying coalesce and repartition on a per-table basis

Signed-off-by: Andy Grove <andygrove@nvidia.com>

* Move command-line validation logic and address other feedback

Signed-off-by: Andy Grove <andygrove@nvidia.com>

* Update copyright years and fix import order

Signed-off-by: Andy Grove <andygrove@nvidia.com>

* Update docs/benchmarks.md

Co-authored-by: Jason Lowe <jlowe@nvidia.com>

* Remove withPartitioning option from TPC-H and TPC-xBB file conversion

Signed-off-by: Andy Grove <andygrove@nvidia.com>

Co-authored-by: Jason Lowe <jlowe@nvidia.com>

* Benchmark runner script (NVIDIA#918)

* Benchmark runner script

Signed-off-by: Andy Grove <andygrove@nvidia.com>

* Add argument for number of iterations

Signed-off-by: Andy Grove <andygrove@nvidia.com>

* Fix docs

Signed-off-by: Andy Grove <andygrove@nvidia.com>

* add license

Signed-off-by: Andy Grove <andygrove@nvidia.com>

* improve documentation for the configuration files

Signed-off-by: Andy Grove <andygrove@nvidia.com>

* Add missing line-continuation symbol in example

Signed-off-by: Andy Grove <andygrove@nvidia.com>

* Remove hard-coded spark-submit-template.txt and add --template argument. Also make all arguments required.

Signed-off-by: Andy Grove <andygrove@nvidia.com>

* Update benchmarking guide to link to the benchmark python script

Signed-off-by: Andy Grove <andygrove@nvidia.com>

* Add --template to example and fix markdown header

Signed-off-by: Andy Grove <andygrove@nvidia.com>

* Add legacy config to clear active Spark 3.1.0 session in tests (NVIDIA#970)

Signed-off-by: Jason Lowe <jlowe@nvidia.com>

* XFail tests until final fix can be put in (NVIDIA#968)

Signed-off-by: Robert (Bobby) Evans <bobby@apache.org>

* Stop reporting totalTime metric for GpuShuffleExchangeExec (NVIDIA#973)

Signed-off-by: Andy Grove <andygrove@nvidia.com>

* Add some more checks to databricks build scripts

Signed-off-by: Thomas Graves <tgraves@nvidia.com>

* Pass in sshkey

* Add create script, add more parameters, etc

Signed-off-by: Thomas Graves <tgraves@nvidia.com>

* add create script

* rework some scripts

Signed-off-by: Thomas Graves <tgraves@nvidia.com>

* fix is_cluster_running

Signed-off-by: Thomas Graves <tgraves@nvidia.com>

* put slack back in

* update text

* cleanup

Signed-off-by: Thomas Graves <tgraves@nvidia.com>

* remove datetime

* send output to stderr

Signed-off-by: Thomas Graves <tgraves@nvidia.com>

Co-authored-by: Andy Grove <andygrove@users.noreply.github.com>
Co-authored-by: Jason Lowe <jlowe@nvidia.com>
Co-authored-by: Robert (Bobby) Evans <bobby@apache.org>
nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021
Labels: build (Related to CI / CD or cleanly building)
Projects: None yet
4 participants