Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEA] Support self comparisons in cudf::experimental::row::lexicographic::two_table_comparator #13371

Open
divyegala opened this issue May 17, 2023 · 2 comments
Labels
1 - On Deck To be worked on next feature request New feature or request libcudf Affects libcudf (C++/CUDA) code.

Comments

@divyegala
Copy link
Member

divyegala commented May 17, 2023

As showcased in #13347, there's a need for {lhs, lhs} and {rhs, rhs} comparisons in an instance of two_table_comparator.

This can't simply be achieved by adding more overloads because left and right terminology is baked into the comparator when it's constructed at the host-side. In a device function, the strongly typed indices now work with the assumption that a comp(i, j) that is called in a device function operates on {lhs, rhs} or {rhs, lhs}.

We need to settle on a design that lets us refactor the row operators such that the assumption of working on two different tables can be removed.

Do we strongly type device_row_comparator::operator() over here such that we can decide which columns of which tables to pass along to the element_comparator over here?

I see the design looking something like this:

class device_row_comparator {
  class element_comparator {
    operator() (size_type lhs_index, size_type rhs_index);
  };
  dispatch_element_operator(lhs_col, rhs_col, lhs_index, rhs_index);

  // these call dispatch_element_operator with the correct columns and indices
  operator() (lhs_index_type lhs_index, rhs_index_type rhs_index);
  operator() (lhs_index_type lhs_index, lhs_index_type rhs_index);
  operator() (rhs_index_type lhs_index, rhs_index_type rhs_index);
};

// the template `Comparator` here and below will be an instance of `device_row_comparator`,
// such that the strongly type indices can be passed along directly
template <typename Comparator, weak_ordering... values>
class single_table_ordering {
  operator() (size_type lhs_index, size_type rhs_index) {
    return comparator(lhs_index_type{lhs_index}, lhs_index_type{rhs_index});
};

template <typename Comparator, weak_ordering... values>
class two_table_ordering {
// same as current version of strong_index_comparator with added overloads for {lhs, lhs} and {rhs, rhs}
};

class self_comparator {
  auto less() {
    return less_comparator{single_table_ordering{device_row_comparator{...}}};
  }
};

class two_table_comparator {
  auto less() {
    return less_comparator{two_table_ordering{device_row_comparator{...}}};
  }
};

Note: In this example, weak_ordering_comparator_impl will be removed and it's functionality will instead be baked into single_table_ordering and two_table_ordering. less_comparator will then be reworked with CRTP such that:

template <typename Comparator>
class less_comparator : Comparator<weak_ordering::LESS>
@divyegala divyegala added feature request New feature or request Needs Triage Need team to review and classify labels May 17, 2023
@ttnghia
Copy link
Contributor

ttnghia commented May 17, 2023

I don't like the idea of having the ability to do self-comparison inside two-table-comparision. Instead, the ability to do comparison either between lhs-lhs, lhs-rhs and rhs-rhs should be a "mixed comparison" thus we should have a new comparator alongside with self_comparator and two_table_comparator. Something may be called mixed_table_compartor. Maybe with this comparator, we can remove two_table_comparator completely, but we should not use the name "two table comparator" for mixed comparison like that.

@divyegala
Copy link
Member Author

@ttnghia that's acceptable. We'll still need many of the changes from the description to support mixed_table_comparator.

@GregoryKimball GregoryKimball added 1 - On Deck To be worked on next libcudf Affects libcudf (C++/CUDA) code. and removed Needs Triage Need team to review and classify labels Jun 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
1 - On Deck To be worked on next feature request New feature or request libcudf Affects libcudf (C++/CUDA) code.
Projects
None yet
Development

No branches or pull requests

3 participants