Unsatisfiable WHERE clause leads to exception instead of empty result #13152
Comments
I tried this locally and what happens here is:
We can fix it by allowing the
Yeah. I was thinking we could just special-case empty inline data, since we would be returning an empty sequence anyway. @599166320 was also making some changes in this area recently that could be relevant.
@abhishekagarwal87 I recently improved the sorting of common fields, which should also solve the sorting problem for inline data. However, I closed that PR recently for various reasons. If you are interested, I will reopen the PR so you can review it.
@paul-rogers - since you were reviewing that PR, can you take that PR to completion?
In the DB community, there is often a need to get the schema of a query without running the query. Typically, a UI wants to work out the table or chart structure it will use to process data, or some bit of code generation wants to know how to create code to handle this particular query structure. The fancy DBs provide a

Each of these is a message to the planner that the client only needs the result set schema (signature), not the data. Given this, the proper approach is to plan the query, gather the signature, then go down a "no data" path. In Druid, that just means returning an empty

Given this, the need to handle sorting is moot on this code path. Because the "no data" path is a well-known idiom, I'd guess that Calcite has a way to tell us that the result set will always be empty. We can use that to trigger our no-data path.
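To make the "plan, gather the signature, then take the no-data path" idea concrete, here is a minimal sketch. It is not Druid or Calcite code: `Signature`, `PlannedQuery`, `Executor`, and the `provablyEmpty` flag are all hypothetical names standing in for whatever the planner actually exposes. The only point is that once the planner says the result is always empty, we return the schema with zero rows and never reach the code that sorts.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative only: every type here is hypothetical, not a Druid or Calcite class.
class NoDataPathSketch {

    /** Result-set schema; only column names, to keep the sketch small. */
    static final class Signature {
        final List<String> columnNames;

        Signature(List<String> columnNames) {
            this.columnNames = columnNames;
        }
    }

    /** A planned query plus the planner's verdict on whether it can ever return rows. */
    static final class PlannedQuery {
        final Signature signature;
        final boolean provablyEmpty; // assumed to be set by the planner, e.g. for WHERE false

        PlannedQuery(Signature signature, boolean provablyEmpty) {
            this.signature = signature;
            this.provablyEmpty = provablyEmpty;
        }
    }

    /** What the client gets back: the schema and the rows. */
    static final class Result {
        final Signature signature;
        final List<Object[]> rows;

        Result(Signature signature, List<Object[]> rows) {
            this.signature = signature;
            this.rows = rows;
        }
    }

    /** Stand-in for the normal execution path, sorting included. */
    interface Executor {
        List<Object[]> execute(PlannedQuery query);
    }

    static Result run(PlannedQuery query, Executor executor) {
        if (query.provablyEmpty) {
            // No-data path: the signature is known from planning, so skip execution entirely.
            return new Result(query.signature, Collections.<Object[]>emptyList());
        }
        return new Result(query.signature, executor.execute(query));
    }

    public static void main(String[] args) {
        Signature signature = new Signature(Arrays.asList("__time", "page"));
        Result result = run(new PlannedQuery(signature, true), q -> {
            throw new IllegalStateException("the no-data path must not execute the query");
        });
        System.out.println(result.signature.columnNames + " rows=" + result.rows.size());
    }
}
```

With this shape, the time-ordering check that throws today would never be reached for an unsatisfiable WHERE clause, because it lives on the execution side.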
I sketched out an idea for this issue. Here it is: master...gianm:druid:scan-inline-sort. I haven't made a PR yet, since I'd need to add some tests. The idea in the patch goes beyond the empty-data case. It also handles the case where we really do have inline data, and want to actually sort it. By skipping
@gianm - but what piece of code does the sorting of inline data itself by the time column? I was thinking of limiting the size of inline data and then feeding it to
Good question. That's another reason that little sketch isn't ready to be a PR 🙂 I think the answer is, if there's a limit then it's done by stableLimitingSort. If there's no limit then I don't see that it actually does get sorted. That'd need to change. Your idea sounds good. What would we do if there's no limit? Ideally, since it's inline data and it's not likely to be very big, we sort it anyway. It could be done either in the mergeRunners call, or in the runner created by createRunner (since there's only one runner for inline data). Wondering what you think. |
Thinking about it a bit more. I think it makes the most sense to do the sorting in the runner from createRunner. I actually think that is already happening: check out the makeCursors call in ScanQueryEngine. It is either ascending or descending based on the time order of the query. So mergeRunners just needs to merge runners that are already sorted. In the inline case, there is only one runner, so there's nothing to do. We can return it as-is. What do you think? |
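Roughly what I have in mind, as a standalone sketch rather than the actual Druid code: each runner's output is modeled as a plain `List<Long>` of timestamps that is already in query order, and the class and `Head` helper are made up for illustration. With a single runner (the inline case) the input is returned untouched; with several, a heap-based merge keeps the combined output ordered.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Illustrative only: a simplified model, not Druid's mergeRunners implementation.
class MergeSortedRunnersSketch {

    /** Current head of one runner's output, plus the rest of that output. */
    static final class Head {
        final Long value;
        final Iterator<Long> rest;

        Head(Long value, Iterator<Long> rest) {
            this.value = value;
            this.rest = rest;
        }
    }

    /**
     * Merges runner outputs, each assumed to be already sorted in the requested order.
     * No sorting happens here; the method only interleaves.
     */
    static List<Long> mergeRunners(List<List<Long>> runnerOutputs, boolean descending) {
        if (runnerOutputs.isEmpty()) {
            return new ArrayList<>();
        }
        if (runnerOutputs.size() == 1) {
            // Inline case: one runner, already sorted, so return it as-is.
            return runnerOutputs.get(0);
        }

        Comparator<Long> order =
            descending ? Comparator.<Long>reverseOrder() : Comparator.<Long>naturalOrder();
        PriorityQueue<Head> heap = new PriorityQueue<>((a, b) -> order.compare(a.value, b.value));

        // Seed the heap with the first element of every non-empty runner output.
        for (List<Long> output : runnerOutputs) {
            Iterator<Long> it = output.iterator();
            if (it.hasNext()) {
                heap.add(new Head(it.next(), it));
            }
        }

        // Repeatedly take the smallest (or largest) head and refill from the same runner.
        List<Long> merged = new ArrayList<>();
        while (!heap.isEmpty()) {
            Head head = heap.poll();
            merged.add(head.value);
            if (head.rest.hasNext()) {
                heap.add(new Head(head.rest.next(), head.rest));
            }
        }
        return merged;
    }
}
```

This only holds if the runner from createRunner really does emit rows in time order for inline data, which is the part that would need a test.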
I can't spot the
Affected Version
0.23.0
Description
The following query
leads to the following error:
druid error: Unsupported operation (org.apache.druid.java.util.common.UOE): Time-ordering on scan queries is only supported for queries with segment specs of type MultipleSpecificSegmentSpec or SpecificSegmentSpec...a [MultipleIntervalSegmentSpec] was received instead.
false can be an arbitrary unsatisfiable boolean expression. I came across it when trying to filter for a specific exact timestamp in Superset, which is translated into the unsatisfiable expression
Since this seems to hold for any unsatisfiable WHERE clause, I'm omitting details about my specific dataset.