[BUG] Cast String to TimeStamp issues #716
Comments
Note: I have also seen other string formats, such as 2020-09-07T01:05:57.840+0000 |
This is related to issue #987 |
This is also related to #1117 which I am currently working on |
@tgravescs The main issue raised in this issue (using the wrong cuDF timestamp formats) was resolved by #1718 and there is now a separate issue for reducing the regex overhead (#1738). We do not support the format |
Yes, I think we should file one to ask for support for more formats |
I filed #1748 for supporting additional formats. |
…IDIA#716) Signed-off-by: spark-rapids automation <70000568+nvauto@users.noreply.github.com>
Describe the bug
We recently discovered that the config to turn off the string to timestamp cast was not being used. We fixed that with #705.
While investigating that, though, a few other things looked off. The format we are using, which is also the one used in cuDF, is "%Y-%m-%dT%H:%M:%SZ%f". Based on what we support, it seems like the ms section should be %SZ.%f. The pattern that we didn't match on, but that worked on the CPU, was simply: 2017-11-29 20:00:35. Ideally we would support this.
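To make the mismatch concrete, here is a minimal sketch using Python's `datetime.strptime` as a stand-in. It is not the cuDF parser, and cuDF's format specifiers do not necessarily behave like Python's. The list `CANDIDATE_FORMATS` and the helper `parse_timestamp` are hypothetical names for illustration only. The idea is that a single `T`-separated pattern rejects `2017-11-29 20:00:35`, while trying a small set of patterns in order accepts it.

```python
from datetime import datetime

# Hypothetical candidate formats, tried in order. This is an illustration,
# not the list of formats the plugin or cuDF actually supports.
CANDIDATE_FORMATS = [
    "%Y-%m-%dT%H:%M:%S.%f",  # e.g. 2017-11-29T20:00:35.840
    "%Y-%m-%dT%H:%M:%S",     # e.g. 2017-11-29T20:00:35
    "%Y-%m-%d %H:%M:%S.%f",  # e.g. 2017-11-29 20:00:35.840
    "%Y-%m-%d %H:%M:%S",     # e.g. 2017-11-29 20:00:35
]


def parse_timestamp(text):
    """Return a datetime for the first format that matches, else None.

    Returning None mimics a cast that yields null for unparseable input.
    """
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    return None


# A single 'T'-separated pattern does not match the space-separated string.
try:
    datetime.strptime("2017-11-29 20:00:35", "%Y-%m-%dT%H:%M:%S")
    single_pattern_matched = True
except ValueError:
    single_pattern_matched = False
```

With the fallback list, `parse_timestamp("2017-11-29 20:00:35")` succeeds, whereas the single-pattern attempt leaves `single_pattern_matched` set to `False`.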
I think the regexes in this case add a fairly high performance overhead as well, so perhaps we should figure out a different way to handle this.
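One possible direction, sketched here in plain Python purely as an illustration, is to replace the regex pre-check with fixed-position character checks. This is an assumption about what a "different way" could look like, not what the plugin does, and the helper name `looks_like_timestamp` is hypothetical. On the GPU the same idea would have to be expressed as columnar string operations rather than a per-row function.

```python
def looks_like_timestamp(text):
    """Cheap structural check for a 'YYYY-MM-DD?HH:MM:SS' prefix, without regex.

    The '?' position accepts either a space or 'T'. Anything after the
    seconds field (fractions, zone offsets) is not inspected here.
    """
    if len(text) < 19:
        return False
    # Separators must sit at fixed offsets.
    if text[4] != "-" or text[7] != "-":
        return False
    if text[10] not in (" ", "T"):
        return False
    if text[13] != ":" or text[16] != ":":
        return False
    # Every remaining position in the prefix must be an ASCII digit.
    digits = (text[0:4] + text[5:7] + text[8:10]
              + text[11:13] + text[14:16] + text[17:19])
    return digits.isascii() and digits.isdigit()
```

This only validates the shape of the string, not the ranges of the fields, so values such as month 13 would still need to be rejected by the actual parse step.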