Merge pull request #1835 from jlowe/fix-merge
Fix merge conflict with branch-0.4
jlowe authored Mar 1, 2021
2 parents bb03535 + c40ec37 commit 50fd165
Showing 32 changed files with 9 additions and 10,818 deletions.
10 changes: 0 additions & 10 deletions README.md
@@ -5,18 +5,8 @@ The RAPIDS Accelerator for Apache Spark provides a set of plugins for
[Apache Spark](https://spark.apache.org) that leverage GPUs to accelerate processing
via the [RAPIDS](https://rapids.ai) libraries and [UCX](https://www.openucx.org/).

![TPCxBB Like query results](./docs/img/tpcxbb-like-results.png "TPCxBB Like Query Results")

The chart above shows results from running ETL queries based on the
[TPCxBB benchmark](http://www.tpc.org/tpcx-bb/default.asp). These are **not** official results in
any way. The queries ran against a 10 TB dataset (scale factor 10,000) stored in Parquet, on a
two-node DGX-2 cluster. Each node has 96 CPU cores, 1.5 TB of host memory, 16 V100 GPUs, and 512 GB
of GPU memory.

To get started and try the plugin out use the [getting started guide](./docs/get-started/getting-started.md).

For more information about these benchmarks, see the [benchmark guide](./docs/benchmarks.md).

## Compatibility

The SQL plugin tries to produce results that are bit-for-bit identical with Apache Spark.
212 changes: 0 additions & 212 deletions docs/benchmarks.md

This file was deleted.

1 change: 0 additions & 1 deletion docs/get-started/Dockerfile.cuda
@@ -35,7 +35,6 @@ RUN set -ex && \
    ln -s /lib /lib64 && \
    mkdir -p /opt/spark && \
    mkdir -p /opt/spark/jars && \
    mkdir -p /opt/tpch && \
    mkdir -p /opt/spark/examples && \
    mkdir -p /opt/spark/work-dir && \
    mkdir -p /opt/sparkRapidsPlugin && \
20 changes: 0 additions & 20 deletions integration_tests/README.md
@@ -171,26 +171,6 @@ any GPU resources on the cluster. For standalone, Mesos, and Kubernetes you can control the number
of executors you want to use per application. The extra core is for the driver. Dynamic allocation
can interfere with these settings under YARN, and even though it is off by default you probably want
to be sure it is disabled (`spark.dynamicAllocation.enabled=false`).
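
For illustration only, a minimal PySpark sketch of pinning these settings might look like the
following; the application name and counts are placeholders, not values used by this framework:

```python
from pyspark.sql import SparkSession

# Sketch: fix the executor count and disable dynamic allocation so the
# one-GPU-per-executor assumption above holds for the whole test run.
spark = (
    SparkSession.builder
    .appName("rapids-integration-tests")                 # placeholder name
    .config("spark.dynamicAllocation.enabled", "false")  # keep YARN from resizing the app
    .config("spark.executor.instances", "4")             # fixed executor count on YARN/Kubernetes
    .config("spark.cores.max", "4")                      # standalone/Mesos: cap total executor cores
    .getOrCreate()
)
```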

### Enabling TPCxBB/TPCH/TPCDS/Mortgage Tests

The TPCxBB, TPCH, TPCDS, and Mortgage tests in this framework can be enabled by providing a couple of options:

* TPCxBB `tpcxbb_format` (optional, defaults to "parquet"), and `tpcxbb_path` (required, path to the TPCxBB data).
* TPCH `tpch_format` (optional, defaults to "parquet"), and `tpch_path` (required, path to the TPCH data).
* TPCDS `tpcds_format` (optional, defaults to "parquet"), and `tpcds_path` (required, path to the TPCDS data).
* Mortgage `mortgage_format` (optional, defaults to "parquet"), and `mortgage_path` (required, path to the Mortgage data).

As an example, here is the `spark-submit` command with the TPCxBB parameters on CUDA 10.1:

```shell
$SPARK_HOME/bin/spark-submit --jars "rapids-4-spark_2.12-0.5.0-SNAPSHOT.jar,rapids-4-spark-udf-examples_2.12-0.5.0-SNAPSHOT.jar,cudf-0.19-SNAPSHOT-cuda10-1.jar,rapids-4-spark-tests_2.12-0.5.0-SNAPSHOT.jar" ./runtests.py --tpcxbb_format="csv" --tpcxbb_path="/path/to/tpcxbb/csv"
```

Be aware that running these tests with real data requires at least an entire GPU, and preferably several GPUs/executors
in your cluster, so please be careful when enabling these tests. Also, some of these tests produce non-deterministic
results when run in a real cluster. If you see failures when running these tests, please contact us so we can investigate
them and possibly tag the tests appropriately for runs on an actual cluster.

### Enabling cudf_udf Tests

The cudf_udf tests in this framework exercise Pandas UDFs (user-defined functions) with cuDF. They are disabled by default, not only because of the complicated environment setup, but also because GPU resource scheduling for Pandas UDFs is still an experimental feature and performance may not always be better.
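
For context, here is a minimal sketch of the kind of Pandas UDF these tests exercise; the function
itself is illustrative, not one of the actual test UDFs:

```python
import pandas as pd
from pyspark.sql.functions import pandas_udf

# Illustrative Pandas UDF: it receives whole pandas Series batches, which is
# the columnar exchange that cuDF can accelerate on the GPU.
@pandas_udf("long")
def plus_one(v: pd.Series) -> pd.Series:
    return v + 1

# Usage sketch: df.select(plus_one(df["value"]))
```
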
19 changes: 0 additions & 19 deletions integration_tests/conftest.py
@@ -14,24 +14,6 @@

def pytest_addoption(parser):
    """Pytest hook to define command line options for pytest"""
    parser.addoption(
        "--tpcxbb_format", action="store", default="parquet", help="format of TPCxBB data"
    )
    parser.addoption(
        "--tpcxbb_path", action="store", default=None, help="path to TPCxBB data"
    )
    parser.addoption(
        "--tpcds_format", action="store", default="parquet", help="format of TPC-DS data"
    )
    parser.addoption(
        "--tpcds_path", action="store", default=None, help="path to TPC-DS data"
    )
    parser.addoption(
        "--tpch_format", action="store", default="parquet", help="format of TPCH data"
    )
    parser.addoption(
        "--tpch_path", action="store", default=None, help="path to TPCH data"
    )
    parser.addoption(
        "--mortgage_format", action="store", default="parquet", help="format of Mortgage data"
    )
@@ -61,4 +43,3 @@ def pytest_addoption(parser):
"--test_type", action='store', default="developer",
help="the type of tests that are being run to help check all the correct tests are run - developer, pre-commit, or nightly"
)

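As a usage sketch, a test could consume one of the options registered above like this; the fixture
name and skip message are hypothetical, not part of this repository:

```python
import pytest

# Hypothetical fixture: reads the mortgage option registered in
# pytest_addoption and skips the test when no path was provided.
@pytest.fixture
def mortgage_path(request):
    path = request.config.getoption("--mortgage_path")
    if path is None:
        pytest.skip("Mortgage tests are disabled; pass --mortgage_path to enable them")
    return path
```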
