Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-32068][WEBUI] Correct task lauchtime show issue due to timezone in stage tab #28918

Closed

Conversation

TJX2014
Copy link
Contributor

@TJX2014 TJX2014 commented Jun 24, 2020

What changes were proposed in this pull request?

formatDate in utils.js org/apache/spark/ui/static/utils.js is partly refactored.

Why are the changes needed?

In branch-2.4,task launch time is returned as html string from driver,
while in branch-3.x,this is returned in JSON Object asDatetype from org.apache.spark.status.api.v1.TaskData
Due to:
LaunchTime from jersey server in spark driver is correct, which will be converted to date string like 2020-06-28T02:57:42.605GMT in json object, then the formatDate in utils.js treat it as date.split(".")[0].replace("T", " ").
So 2020-06-28T02:57:42.605GMT will be converted to 2020-06-28 02:57:42, but correct is 2020-06-28 10:57:42 in GMT+8 timezone.
选区_071
选区_070

Does this PR introduce any user-facing change?

No

How was this patch tested?

Manual test.

@TJX2014
Copy link
Contributor Author

TJX2014 commented Jun 24, 2020

@HyukjinKwon Could you please help me check this PR. :-)

@HyukjinKwon
Copy link
Member

In branch-2.4,task launch time is returned as html string from driver,
while in branch-3.x,this is returned in JSON Object asDatetype from org.apache.spark.status.api.v1.TaskData

Can you point out what changes or which PR caused this?

@TJX2014
Copy link
Contributor Author

TJX2014 commented Jun 25, 2020

In branch-2.4,task launch time is returned as html string from driver,
while in branch-3.x,this is returned in JSON Object asDatetype from org.apache.spark.status.api.v1.TaskData

Can you point out what changes or which PR caused this?

Sure, @HyukjinKwon PR link
Jira link are related.

@HyukjinKwon
Copy link
Member

cc @tgravescs, @gengliangwang and @sarutak FYI

@tgravescs
Copy link
Contributor

@TJX2014 thanks for providing a patch, in general it would be great if you can provide more detailed description of the problem as well as the fix to help the reviewers easily understand what is going on. You spent time debugging so just sharing your analysis is great.
In this case it would be nice to point to exactly where in 2.4 it was using string and then in 3.0 where it was changed and perhaps an example of the format change. It also is useful to say how/why your change fixes this issue. In this case it seems like its formatting it differently. For instance it doesn't looks like the launchTime in the TaskData structure changed at all, it is a Date in both 2.4 and 3.0.

2.4 calls UIUtils.formatDate which returns a string: https://github.com/apache/spark/blob/branch-2.4/core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala#L850

3.0 uses the formatDate js function on the same taskData.launchTime. so I think what you are saying is that formatting done there is not correct for the timezone handling? It is also putting it in UTC versus putting it in the local timezone?

@tgravescs
Copy link
Contributor

tgravescs commented Jun 25, 2020

Note, we updated to use local timezone (https://github.com/apache/spark/pull/19640/files#diff-77d1ca904f4bd8645c31f148047d9270L42). the utils.js has functions similar to patch here - formatTimeMillis. The only difference is date vs timeMillis which we might be able to use with just Date.getTime which is what is done by ApplicationAttemptInfo.
I'm wondering if we shouldn't do a similar thing here to keep things consistent. Let me investigate a bit more.

@TJX2014
Copy link
Contributor Author

TJX2014 commented Jun 25, 2020

@TJX2014 thanks for providing a patch, in general it would be great if you can provide more detailed description of the problem as well as the fix to help the reviewers easily understand what is going on. You spent time debugging so just sharing your analysis is great.
In this case it would be nice to point to exactly where in 2.4 it was using string and then in 3.0 where it was changed and perhaps an example of the format change. It also is useful to say how/why your change fixes this issue. In this case it seems like its formatting it differently. For instance it doesn't looks like the launchTime in the TaskData structure changed at all, it is a Date in both 2.4 and 3.0.

2.4 calls UIUtils.formatDate which returns a string: https://github.com/apache/spark/blob/branch-2.4/core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala#L850

3.0 uses the formatDate js function on the same taskData.launchTime. so I think what you are saying is that formatting done there is not correct for the timezone handling? It is also putting it in UTC versus putting it in the local timezone?

Thanks, @tgravescs We can see in branch-3,launchTime from jersey server in spark driver is correct, which will be converted to date string like 2020-06-25T14:50:49.779GMT in json object, then the formatDate in utils.js treat it as date.split(".")[0].replace("T", " ");, so 2020-06-25T14:50:49.779GMT will be converted to 2020-06-25 14:50:49.779, but correct is 2020-06-25 22:50:49.779 in +8 timezone.

In branch-2,this date format is acted in driver side, and return the static content html to browser, while branch-3, static html model and json object are separated return to browser.

else return date.split(".")[0].replace("T", " ");
else {
var d = new Date(date.replace("GMT", "Z"))
return d.getFullYear() + '-' +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this pad with zeroes properly - see formatTimeMillis? From your screen shot it looks like it does but I'm wondering if that is browser specific.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done,due to timeMillis is int orlong type and formatDate is string param, I extract a common method formatDateString to format DateType in js.

@tgravescs
Copy link
Contributor

ok to test

@SparkQA
Copy link

SparkQA commented Jun 25, 2020

Test build #124515 has finished for PR 28918 at commit 698cae6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@sarutak
Copy link
Member

sarutak commented Jun 25, 2020

According to the following documents, this change seems work with recent browsers.
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Date/parse
https://tc39.es/ecma262/#sec-date-time-string-format

@TJX2014
Copy link
Contributor Author

TJX2014 commented Jun 25, 2020

According to the following documents, this change seems work with recent browsers.
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Date/parse
https://tc39.es/ecma262/#sec-date-time-string-format

Thanks, @sarutak I find this change also work with ES6 . https://www.tutorialspoint.com/es6/es6_date.htm

@dongjoon-hyun
Copy link
Member

Retest this please.

@dongjoon-hyun
Copy link
Member

Hi, @TJX2014 . So, this is not a correctness issue. This is a timezone issue, isn't it?
It would be great if we can have more specific PR title which describes your PR's contribution here.

@dongjoon-hyun
Copy link
Member

BTW, please update the image by using full name instead of CST in the PR description. It seems that you wanted to say China Standard Time, but in USA, it can be Central Standard Time (CST). For now, the given image looks ambiguous to me.

@TJX2014 TJX2014 changed the title [SPARK-32068][WEBUI] Task lauchtime in stage tab not correct [SPARK-32068][WEBUI] Correct task lauchtime show issue due to timezone in stage tab Jun 28, 2020
@TJX2014
Copy link
Contributor Author

TJX2014 commented Jun 28, 2020

@dongjoon-hyun Thanks, I have done. :-)

@SparkQA
Copy link

SparkQA commented Jun 28, 2020

Test build #124577 has finished for PR 28918 at commit 698cae6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Contributor

@tgravescs tgravescs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me, I'll leave open for a bit to see if anyone else has comments

@TJX2014
Copy link
Contributor Author

TJX2014 commented Jun 29, 2020

Thanks all for your suggestion and attention very much :-)

@asfgit asfgit closed this in 165c948 Jun 30, 2020
asfgit pushed a commit that referenced this pull request Jun 30, 2020
…e in stage tab

### What changes were proposed in this pull request?
`formatDate` in utils.js `org/apache/spark/ui/static/utils.js` is partly refactored.

### Why are the changes needed?
In branch-2.4,task launch time is returned as html string from driver,
while in branch-3.x,this is returned in JSON Object as`Date`type  from `org.apache.spark.status.api.v1.TaskData`
Due to:
LaunchTime from jersey server in spark driver is correct, which will be converted to date string like `2020-06-28T02:57:42.605GMT` in json object, then the formatDate in utils.js treat it as date.split(".")[0].replace("T", " ").
So `2020-06-28T02:57:42.605GMT` will be converted to `2020-06-28 02:57:42`, but correct is `2020-06-28 10:57:42` in GMT+8 timezone.
![选区_071](https://user-images.githubusercontent.com/7149304/85937186-b6d36780-b933-11ea-8382-80a3891f1c2a.png)
![选区_070](https://user-images.githubusercontent.com/7149304/85937190-bcc94880-b933-11ea-8860-2083c97ea269.png)

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Manual test.

Closes #28918 from TJX2014/master-SPARK-32068-ui-task-lauch-time-tz.

Authored-by: TJX2014 <xiaoxingstack@gmail.com>
Signed-off-by: Thomas Graves <tgraves@apache.org>
(cherry picked from commit 165c948)
Signed-off-by: Thomas Graves <tgraves@apache.org>
@tgravescs
Copy link
Contributor

thanks @TJX2014 merged to master and branch-3.0

@dongjoon-hyun
Copy link
Member

Thank you all!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants