Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix the time zone check issue #5767

Merged
merged 2 commits into from
Jun 14, 2022
Merged

Conversation

firestarman
Copy link
Collaborator

Fixes #5678

There are two settings for time zone, the Spark session local time zone and JVM's default time zone. And they may be different from each other according to issue #5678.

With this PR, plugin now requires both of the two settings being equal to UTC to support timestamp type.

Signed-off-by: Firestarman firestarmanllc@gmail.com

Signed-off-by: Firestarman <firestarmanllc@gmail.com>
@firestarman
Copy link
Collaborator Author

For early review

@tgravescs
Copy link
Collaborator

tgravescs commented Jun 7, 2022

did you mean this to be against 22.06? Unless specifically requested this should go against 22.08

@sameerz sameerz added the bug Something isn't working label Jun 7, 2022
@sameerz sameerz added this to the Jun 6 - Jun 17 milestone Jun 7, 2022
@firestarman
Copy link
Collaborator Author

firestarman commented Jun 8, 2022

did you mean this to be against 22.06? Unless specifically requested this should go against 22.08

It is a bug fix, so I thought it should go into 22.06. Rebased to 22.08

@jlowe
Copy link
Member

jlowe commented Jun 8, 2022

build

@gerashegalov
Copy link
Collaborator

The change looks good as a narrow fix.

However, the logic for timezone error handling in the plugin initialization is not quire clear

  • first we stash away the initial value of the default timezone on the driver
    conf.set(RapidsConf.DRIVER_TIMEZONE.key, ZoneId.systemDefault().normalized().toString)
    .
  • We don't throw an exception on the driver, if it's not a supported time zone, which I think is OK because it can still be changed for the JVM using java.util.TimeZone.setDefault
  • Then on the executor side plugin init we throw an exception on discrepancy between the driver and executor default timezones but only if the driver timezone is UTC
    throw new RuntimeException(s" Driver and executor timezone mismatch. " +
    This does not make sense to me. Either always throw when timezones are different. Or adjust the executor timezone with TimeZone.setDefault(TimeZone.getTimeZone(driverTimeZone)) unless it is explicitly overridden with user.timezone in executor Java options

We need to file the issue to clarify this if it's beyond the current PR.

Copy link
Collaborator

@gerashegalov gerashegalov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with caveats regarding Plugin init checks

@firestarman
Copy link
Collaborator Author

firestarman commented Jun 13, 2022

We need to file the issue to clarify this if it's beyond the current PR.

Personally the current time zone checks in Plugin init are OK. (Please correct me if I am wrong)

At the init stage on the driver, Plugin does not know whether the time zone will be used. So it is not necessary to check the time zone here. And this check will be executed during the time related operators' overriding process later.

On executors, the time zone is required to be equal to the driver's only for the time related GPU operators, I think. If time zone is not supported, no time related GPU operators exist (all fall back to CPU), then no need to check the difference of the time zones.

@firestarman
Copy link
Collaborator Author

I am going to merge this, if any issue, we can follow up in new issues.

@firestarman firestarman merged commit 04991c6 into NVIDIA:branch-22.08 Jun 14, 2022
@firestarman firestarman deleted the tz_check branch June 14, 2022 01:33
@gerashegalov
Copy link
Collaborator

gerashegalov commented Jun 14, 2022

We need to file the issue to clarify this if it's beyond the current PR.

Personally the current time zone checks in Plugin init are OK. (Please correct me if I am wrong)

At the init stage on the driver, Plugin does not know whether the time zone will be used. So it is not necessary to check the time zone here. And this check will be executed during the time related operators' overriding process later.

On executors, the time zone is required to be equal to the driver's only for the time related GPU operators, I think. If time zone is not supported, no time related GPU operators exist (all fall back to CPU), then no need to check the difference of the time zones.

So looks like you are arguing for dropping the check

throw new RuntimeException(s" Driver and executor timezone mismatch. " +

because it's an init-time not a query-time check.

Also again the default timezone of the driver as of init time may be different at the query time. I'll file an issue to clarify these limitations.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
5 participants