Filter nulls for left semi and left anti join to work around cudf #1664
performance issues
Signed-off-by: Alessandro Bellina <abellina@nvidia.com>
Closes #1643
This only targets `leftSemi` and `leftAnti` because these are based on `cudf::left_semi_anti_join`, and this path was not sped up in 0.18 (specifically, building the hash table for the join is slow with many nulls).

This passes the tests locally, and I fixed some leaks I had, but I need to run performance tests at scale, so I am posting it as a draft for now.
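
For reviewers, here is a rough sketch of why dropping null keys before the join should be safe. It is plain Python rather than the plugin or cudf code, and every name in it is hypothetical. It assumes single-column integer keys, standard SQL equality (a null key never matches anything), and that the filter is applied to the right (build) side, which is the side the hash table is built from.

```python
from typing import List, Optional, Sequence

Key = Optional[int]  # None stands in for a SQL NULL join key


def sql_eq(a: Key, b: Key) -> bool:
    """SQL-style equality: a comparison involving NULL is never true."""
    return a is not None and b is not None and a == b


def left_semi_reference(left: Sequence[Key], right: Sequence[Key]) -> List[Key]:
    """Nested-loop left semi join: keep left rows with at least one match."""
    return [l for l in left if any(sql_eq(l, r) for r in right)]


def left_anti_reference(left: Sequence[Key], right: Sequence[Key]) -> List[Key]:
    """Nested-loop left anti join: keep left rows with no match at all."""
    return [l for l in left if not any(sql_eq(l, r) for r in right)]


def drop_null_keys(keys: Sequence[Key]) -> List[int]:
    """Hypothetical pre-filter: remove rows whose join key is NULL."""
    return [k for k in keys if k is not None]


def left_semi_filtered(left: Sequence[Key], right: Sequence[Key]) -> List[Key]:
    """Hash-based left semi join over a null-free build side."""
    build = set(drop_null_keys(right))  # stand-in for the join hash table
    return [l for l in left if l in build]


def left_anti_filtered(left: Sequence[Key], right: Sequence[Key]) -> List[Key]:
    """Hash-based left anti join over a null-free build side.

    Left rows with a NULL key are kept, since they can never match.
    """
    build = set(drop_null_keys(right))
    return [l for l in left if l not in build]


if __name__ == "__main__":
    left = [1, None, 2, 3, None]
    right = [None, None, 2, None, 4]  # mostly-null build side
    assert left_semi_filtered(left, right) == left_semi_reference(left, right)
    assert left_anti_filtered(left, right) == left_anti_reference(left, right)
```

Because a null build-side key can never satisfy the join condition, removing those rows only shrinks the hash table and does not change which left rows are kept or dropped in either join type.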