
Support int96RebaseModeInWrite and int96RebaseModeInRead #3330

Merged 26 commits into NVIDIA:branch-21.10 on Sep 23, 2021

Conversation

@razajafri (Collaborator)

When writing or reading a Parquet file, a user can specify a separate config for rebasing INT96 values versus datetime values in general.

This PR honors the values users set for int96RebaseModeInWrite and int96RebaseModeInRead, matching what Apache Spark does.
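For context, Spark 3.1.1+ exposes the INT96-specific rebase settings separately from the general datetime ones. A user might pass them as ordinary Spark confs, e.g. (valid values are EXCEPTION, LEGACY, and CORRECTED; the keys below are the legacy-namespace names as of Spark 3.1):

```
--conf spark.sql.legacy.parquet.int96RebaseModeInWrite=CORRECTED
--conf spark.sql.legacy.parquet.int96RebaseModeInRead=CORRECTED
```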


Signed-off-by: Raza Jafri <rjafri@nvidia.com>
@razajafri (Collaborator, Author)

build

@sameerz sameerz added the task Work required that improves the product but is not user facing label Aug 28, 2021
@sameerz sameerz added this to the Aug 30 - Sept 10 milestone Aug 28, 2021
@sameerz sameerz linked an issue Aug 28, 2021 that may be closed by this pull request
@jlowe jlowe changed the title Support int96RebaseModeInWrite and int96RebaeModeInRead Support int96RebaseModeInWrite and int96RebaseModeInRead Aug 30, 2021
@jlowe (Member) left a comment

I was expecting a shim function that would check the INT96 timestamp rebase mode: in Spark < 3.1.1 it would check the old datetime config, and in Spark >= 3.1.1 it would check the new INT96-specific configs.

@razajafri (Collaborator, Author)

> I was expecting a shim function that would check the INT96 timestamp rebase mode: in Spark < 3.1.1 it would check the old datetime config, and in Spark >= 3.1.1 it would check the new INT96-specific configs.

Ironically, that was my first implementation. The reason I didn't like it is that we would be returning the value of the dateTimeRebase mode instead of int96RebaseMode when asked for it; it's even worse when asking for int96RebaseModeWrite.key. But I can see that it's still better than running into an RTE when/if we ever support another 3.0.x version. I will make the change.

@jlowe (Member) commented Aug 30, 2021

> The reason why I didn't like that was because we will be returning the value of dateTimeRebase mode instead of int96RebaseMode when asked for it

But that is exactly what Spark < 3.1.1 does today. The shim function name would indicate we're trying to get the INT96 rebase mode; on Spark >= 3.1.1 it would check the INT96-specific config, and on Spark < 3.1.1 it would check the same config the Spark code checks in that situation.
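The dispatch jlowe describes can be sketched as follows. This is a hypothetical illustration, not the actual spark-rapids shim API: one entry point resolves "the INT96 rebase mode for reads", consulting the INT96-specific config on Spark >= 3.1.1 and falling back to the general datetime config on older versions, exactly as Spark itself does there. The function name, tuple-based version check, and dict-backed conf are all stand-ins.

```python
# Hypothetical sketch of the shim dispatch (names are illustrative, not the
# real spark-rapids API). On Spark >= 3.1.1 the dedicated INT96 config exists;
# on older versions Spark itself consults the general datetime config.

INT96_READ_KEY = "spark.sql.legacy.parquet.int96RebaseModeInRead"
DATETIME_READ_KEY = "spark.sql.legacy.parquet.datetimeRebaseModeInRead"

def int96_rebase_read_mode(spark_version, conf, default="EXCEPTION"):
    """Return the effective INT96 rebase mode for reads.

    spark_version is a (major, minor, patch) tuple; conf is a plain dict
    standing in for SQLConf in this sketch.
    """
    if spark_version >= (3, 1, 1):
        key = INT96_READ_KEY       # dedicated INT96 config exists
    else:
        key = DATETIME_READ_KEY    # pre-3.1.1: same config Spark checks
    return conf.get(key, default)
```

On Spark 3.0.x this deliberately returns whatever the datetime config says, which is what Spark checks in that situation, so the shim's name stays honest about intent without hard-coding a RuntimeException for unexpected versions.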

Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
@razajafri (Collaborator, Author)

I have made some unnecessary changes to the SparkBaseShim in every version of 311. I will revert in a bit.

Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
@razajafri (Collaborator, Author)

build

@jlowe (Member) left a comment

The Databricks shims will also need to be updated accordingly.

Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
@razajafri (Collaborator, Author)

build

@pxLi (Collaborator) commented Aug 31, 2021

NFS issue, reported to SRE

@pxLi (Collaborator) commented Aug 31, 2021

build

Signed-off-by: Raza Jafri <rjafri@nvidia.com>
@razajafri (Collaborator, Author)

build

Signed-off-by: Raza Jafri <rjafri@nvidia.com>
@razajafri (Collaborator, Author)

build

(1 similar comment)
@razajafri (Collaborator, Author)

build

@pxLi (Collaborator) commented Sep 1, 2021

@razajafri hi, I saw you triggered premerge twice without aborting the previous build. May I ask why? Each build takes 2 GPUs.

@jlowe (Member) commented Sep 22, 2021

Need to resolve merge conflicts, but otherwise LGTM.

Signed-off-by: Raza Jafri <rjafri@nvidia.com>
@razajafri (Collaborator, Author)

build

jlowe previously approved these changes Sep 22, 2021
@razajafri (Collaborator, Author)

Looks like a failure in Databricks unrelated to my change; root-causing it.

Signed-off-by: Raza Jafri <rjafri@nvidia.com>
@razajafri (Collaborator, Author)

Skipping test_no_fallback_when_ansi_enabled until #3611 is resolved.

@razajafri (Collaborator, Author)

build

Signed-off-by: Raza Jafri <rjafri@nvidia.com>
This reverts commit 14658b4.

Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
@razajafri (Collaborator, Author)

build

Co-authored-by: Jason Lowe <jlowe@nvidia.com>
@razajafri (Collaborator, Author)

build

@razajafri (Collaborator, Author)

@revans2 or @jlowe can you please +1 this and merge before we see any more merge conflicts?

@jlowe jlowe changed the title Support int96RebaseModeInWrite and int96RebaseModeInRead [databricks] Support int96RebaseModeInWrite and int96RebaseModeInRead Sep 23, 2021
@jlowe jlowe merged commit fc40c00 into NVIDIA:branch-21.10 Sep 23, 2021
tgravescs added a commit to tgravescs/spark-rapids that referenced this pull request Sep 23, 2021
@tgravescs (Collaborator)

This PR is not up to date with the latest moves of shim files; reverted #3627

Successfully merging this pull request may close these issues.

[FEA] Spark3.1.0 test spark.sql.legacy.parquet.int96RebaseModeInRead