-
Notifications
You must be signed in to change notification settings - Fork 232
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for POSIX predefined character classes #5692
Add support for POSIX predefined character classes #5692
Conversation
Signed-off-by: Anthony Chang <antchang@nvidia.com>
build |
@anthony-chang @andygrove We are in burndown for 22.06. Is this meant to target 22.06, or should it be retargeted towards 22.08? |
Good point. @anthony-chang can you retarget for 22.08 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please retarget for 22.08
Signed-off-by: Anthony Chang <antchang@nvidia.com>
This is blocked until the fix in #5610 is merged |
@anthony-chang this has now been merged |
…six-character-classes
…six-character-classes Signed-off-by: Anthony Chang <antchang@nvidia.com>
…six-character-classes Signed-off-by: Anthony Chang <antchang@nvidia.com>
6be0c4a
to
7a2a5a1
Compare
build |
…six-character-classes Signed-off-by: Anthony Chang <antchang@nvidia.com>
tests/src/test/scala/com/nvidia/spark/rapids/RegularExpressionTranspilerSuite.scala
Outdated
Show resolved
Hide resolved
doTranspileTest("\\p{Alnum}", "[a-zA-Z0-9]") | ||
doTranspileTest("\\p{Punct}", "[!\"#$%&'()*+,\\-./:;<=>?@\\^_`{|}~\\[\\]]") | ||
doTranspileTest("\\p{Print}", "[a-zA-Z0-9!\"#$%&'()*+,\\-./:;<=>?@\\^_`{|}~\\[\\]\u0020]") | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it intentional that only a subset are tested here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this one is mostly a sanity check since the \p{Print}
transpilation recursively uses the definition of the 4 \p{}
classes above it.
I tested the full set in the integration tests.
Signed-off-by: Anthony Chang <antchang@nvidia.com>
build |
Signed-off-by: Anthony Chang <antchang@nvidia.com>
build |
I've removed |
Signed-off-by: Anthony Chang <antchang@nvidia.com>
build |
Closes #4413
This PR adds support for
\p{...}
for these character classes defined here under "POSIX character classes (US-ASCII only)"Signed-off-by: Anthony Chang antchang@nvidia.com