-
Notifications
You must be signed in to change notification settings - Fork 232
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BUG] count reductions are failing on databricks because lack for Complete support #898
Comments
I know that Databricks was doing something different with inserting |
@abellina also had a look at it a little, but yes @kuhushukla if you want to take lead on fixing it that would be great. @tgravescs should we mark the tests as xfail for now? The aggregations should work fine if we don't coalesce the data to a single partition. I am doing that in the test so first and last get a deterministic result, but it should not be that common in the real world. Because of that I really want us to fix this correctly and not just patch it. We should have mode be checked completely everywhere so we know 100% if we support that mode or not for reductions as well as aggregations. |
Thanks, will look into it and update in the coming days. +1 on xfailing until then. |
yes I'm fine with xfailing since this is marked p1 for 0.3 |
I put up a patch to mark them as xfail here #899 I have not tested it on databricks yet, but I will soon. |
Signed-off-by: Peixin Li <pxli@nyu.edu>
Describe the bug
In databricks the tests for
test_generic_reductions
are failing. I traced this down to databricks using theComplete
aggregation type, which our code does not properly support.Steps/Code to reproduce bug
Run
test_generic_reductions
on databricks with the latest branch-0.3 code.Expected behavior
Test should pass
This looks like it should be somewhat simple to fix in a stop gap, but I am not 100% sure that we are doing the right thing in all cases with this. Ideally we check all possible aggregation types instead of just checking for a single one, but that may be difficult.
The reason this works for
count(*)
is because it is turned intocount(1)
and we do a SUM reduction on it, so it comes out the same.The text was updated successfully, but these errors were encountered: