
Avoid requiring single batch when using out-of-core sort #5903

Merged
merged 5 commits into NVIDIA:branch-22.08 on Jul 18, 2022

Conversation

@res-life (Collaborator) commented Jun 24, 2022

Closes #5448

Problem

The root cause is that the input to the out-of-core sort is a single large batch, and building that single large batch causes the OOM.

The out-of-core sort should not require a single batch; it can pull all of the input batches and then sort them. Before the out-of-core sort feature was added, a single batch was genuinely needed for the in-memory sort.

Solution

When executing partitioned writes that require sorting, use the out-of-core sort and drop the single-batch requirement.
Note: if the stable sort configuration is enabled, a single batch is still required, as before.
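As a rough sketch (not the exact patch), the selection between the two modes looks like the snippet below; the FullSortSingleBatch and OutOfCoreSort sort types and the stable-sort flag are the ones quoted in the review discussion further down:

// Stable sort needs to see the whole input at once, so it keeps the old
// single-batch requirement; otherwise we sort out of core across batches.
val sortType = if (useStableSort) {
  FullSortSingleBatch // requires the entire input as one batch
} else {
  OutOfCoreSort       // pulls input batches incrementally and can spill
}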

Signed-off-by: Chong Gao <res_life@163.com>
@res-life (Collaborator, Author) commented Jul 1, 2022

build

@sameerz added the "task" label (Work required that improves the product but is not user facing) on Jul 1, 2022
@res-life (Collaborator, Author) commented Jul 8, 2022

build

@res-life (Collaborator, Author) commented Jul 8, 2022

Verified with a large data frame: OOM did not occur after disabling the single-batch requirement.

I will file a follow-on issue to explore support for something like DynamicPartitionDataConcurrentWriter.

@res-life res-life marked this pull request as ready for review July 8, 2022 10:42
@res-life res-life requested a review from revans2 July 8, 2022 10:43
@res-life (Collaborator, Author):
Thanks! @wjxiz1992 verified this PR against the corresponding NV bug: NDS 2.0 converting CSV to Parquet failed with OOM.

@res-life (Collaborator, Author):

Filed a follow-on issue for DynamicPartitionDataConcurrentWriter: #5999

@res-life (Collaborator, Author):

@revans2 Could you help review?

@@ -465,3 +465,14 @@ def test_write_daytime_interval(spark_tmp_path):
lambda spark, path: spark.read.parquet(path),
data_path,
conf=writer_confs)

# TODO need to test large DF
Collaborator:

We simulate a large DF by setting the batch size to be very small. This lets us send multiple batches.
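For illustration only (the actual integration tests are Python and are not shown here in full), here is a hedged Scala sketch of the same trick; the DataFrame and output path are hypothetical, and spark.rapids.sql.batchSizeBytes is the plugin setting that controls the target GPU batch size:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().getOrCreate()
// Shrink the target GPU batch size so even a modest DataFrame is split into
// many batches, exercising the multi-batch out-of-core sort path.
spark.conf.set("spark.rapids.sql.batchSizeBytes", "1024")
// Hypothetical DataFrame and output path, for illustration only.
val df = spark.range(0, 1000000L).selectExpr("id % 10 as key", "id as value")
// A partitioned write triggers the pre-write sort that this PR changes.
df.write.partitionBy("key").parquet("/tmp/partitioned_out")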

@@ -36,7 +36,8 @@ case class GpuCreateDataSourceTableAsSelectCommand(
     query: LogicalPlan,
     outputColumnNames: Seq[String],
     origProvider: Class[_],
-    gpuFileFormat: ColumnarFileFormat)
+    gpuFileFormat: ColumnarFileFormat,
+    useStableSort: Boolean)
Collaborator:

My only nit is that we pass useStableSort around this code a lot, but in the final part when we do the sort we get it from a different location.

val sortType = if (RapidsConf.STABLE_SORT.get(plan.conf)) {
FullSortSingleBatch
} else {
OutOfCoreSort
}

Could we please make it consistent? Either we pass it all the way down all the time, or we go off of the plan.conf all the time.
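A minimal sketch of one consistent option, assuming the flag added to the command in the diff above is threaded through to the sort site:

// Resolve the stable-sort config once, where the plan is converted...
val useStableSort = RapidsConf.STABLE_SORT.get(plan.conf)

// ...then every downstream decision relies on the threaded-through flag
// instead of re-reading the conf, giving a single source of truth.
val sortType = if (useStableSort) FullSortSingleBatch else OutOfCoreSort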

@res-life (Collaborator, Author):

Updated; see the comment below.

@res-life (Collaborator, Author):

Premerge is blocked by #6003

@res-life (Collaborator, Author):

build

@revans2 revans2 merged commit 6286d05 into NVIDIA:branch-22.08 Jul 18, 2022
Labels
task Work required that improves the product but is not user facing
Development

Successfully merging this pull request may close these issues:

[BUG] partitioned writes require single batches and sorting, causing gpu OOM in some cases