Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEA] Improve organization of AST code #8145

Closed
10 tasks done
vyasr opened this issue May 3, 2021 · 5 comments
Closed
10 tasks done

[FEA] Improve organization of AST code #8145

vyasr opened this issue May 3, 2021 · 5 comments
Assignees
Labels
feature request New feature or request libcudf Affects libcudf (C++/CUDA) code.

Comments

@vyasr
Copy link
Contributor

vyasr commented May 3, 2021

Now that conditional joins have been merged, I'll use this PR as a checklist for the various improvements that we'd like to make to the code base:

This issue was originally created in from #7418 (comment), but has since been heavily edited to reflect the updated task list.

@harrism harrism added libcudf Affects libcudf (C++/CUDA) code. feature request New feature or request labels May 3, 2021
@harrism
Copy link
Member

harrism commented May 3, 2021

@vyasr how did you create an issue without choosing the type? Usually the system auto-populates some labels...

@harrism harrism changed the title Improved organization of AST code [FEA] Improve organization of AST code May 3, 2021
@vyasr
Copy link
Contributor Author

vyasr commented May 3, 2021

You can create an issue from a comment on another PR or Issue by clicking on the three dots on the right and selecting "Reference in a new issue" (see the picture below). Issues created this way get the "Originally posted by..." note at the end of the issue. I'm surprised that it bypasses the auto-labeling though...

image

@beckernick beckernick added this to the Conditional Joins milestone Jul 23, 2021
@vyasr
Copy link
Contributor Author

vyasr commented Jul 26, 2021

Now that conditional joins have been merged, I'll use this issue as a checklist for the various improvements that we'd like to make to the code base: the checklist is now in the issue description.

@harrism
Copy link
Member

harrism commented Jul 27, 2021

I'll use this PR as a checklist

This is an issue. :) But in general checklists should go into the opening comment (the issue description). This way the checklist counter shows up in github's other interfaces (e.g. project boards).

@vyasr
Copy link
Contributor Author

vyasr commented Jul 27, 2021

All good points :) I'll move the checklist.

@vyasr vyasr self-assigned this Aug 4, 2021
rapids-bot bot pushed a commit that referenced this issue Aug 13, 2021
Resolves #8918 by providing a new API for getting the output size for conditional joins (except full joins). This PR removes the unnecessary `conditional_join.cuh` header and inlines the logic into the `conditional_join.cu` file where it is used, and adds the new logic into that file as well. The public APIs are now exposed in `conditional_join.hpp`.

Adding the join size calculation also revealed a couple of bugs in the conditional join tests that were hiding a real bug in the conditional join implementation. The main test bug was the use of `std::equal` with the actual result as the first iterator, so if the actual result was empty it was never compared against the expected result (even if it was nonempty). This bug masked a couple of minor errors in the expected outputs encoded in the test. These are now fixed. This bug was also hiding a deeper issue where the AST device code was always using the left row index to pull data for the left hand operand to binary operations, even when the lhs was actually a column from the right table. That bug is now fixed as well.

Contributes to #8145.

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Jason Lowe (https://github.com/jlowe)
  - MithunR (https://github.com/mythrocks)

URL: #8928
rapids-bot bot pushed a commit that referenced this issue Sep 9, 2021
Contributes to #8980 and #8145.

This PR makes all `operator_functors` templated on the potential nullability of the inputs, allowing per-operator customization of null handling with minimal performance penalties. The default specialization in the nullable case is a null-propagating fall-through to the non-nullable case, which matches standard libcudf semantics, but any specific operator's null handling behavior can be customized by explicit specialization of the `operator_functor` template corresponding to that operator.

This PR implements `NULL_EQUAL`, `NULL_LOGICAL_AND`, and `NULL_LOGICAL_OR` operators that behave like their non-nullable counterparts when no nulls are present but can return non-null results from null inputs. The `NULL_EQUAL` operator replaces the `compare_nulls` parameter for conditional joins.

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Jason Lowe (https://github.com/jlowe)
  - Robert (Bobby) Evans (https://github.com/revans2)
  - Bradley Dice (https://github.com/bdice)
  - https://github.com/nvdbaranec

URL: #9096
@vyasr vyasr closed this as completed Sep 28, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request New feature or request libcudf Affects libcudf (C++/CUDA) code.
Projects
None yet
Development

No branches or pull requests

4 participants