-
Notifications
You must be signed in to change notification settings - Fork 24.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SQL: Disallow non-collapsable subselects with ORDER BY #72991
Conversation
Ordering an already pre-ordered and limited subselect is not allowed, as such queries cannot be collapsed and translated into query DSL, but the require an extra ordering step on top of the results returned internally by the search/agg query. Fixes: elastic#71158
Pinging @elastic/es-ql (Team:QL) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
import static org.elasticsearch.xpack.sql.SqlTestUtils.TEST_CFG; | ||
import static org.elasticsearch.xpack.sql.types.SqlTypesTests.loadMapping; | ||
|
||
public class VerifierErrorTests extends ESTestCase { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently errors are handled also inside VerifierTests
through the error
method (iirc); the reason being the test checks both things that pass and those that do not and it's easier to see both in one place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
VerifierTests is in eql
, should I just rename it VerifierTests
also for the sql class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say so yes, easier to find it since the difference is just in the packages.
TEST_CFG, | ||
new SqlFunctionRegistry(), | ||
indexResolution, | ||
new org.elasticsearch.xpack.sql.analysis.analyzer.Verifier(new Metrics()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's with the fqn?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was automatic from IDE, since we are in the sql.planner
package, to avoid mixing with that Verifier
(which we actually test here), but will change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm.
if (hasLimit.get() == false && p instanceof LimitExec) { | ||
hasLimit.set(Boolean.TRUE); | ||
return; | ||
} | ||
if (orderBy.get() == null && p instanceof OrderExec) { | ||
orderBy.set(((OrderExec) p).order()); | ||
return; | ||
} | ||
if (hasLimit.get() && orderBy.get() != null && p instanceof OrderExec) { | ||
if (((OrderExec) p).order().equals(orderBy.get()) == false) { | ||
failures.add(fail(p, "Cannot use ORDER BY on top of a subquery with ORDER BY and LIMIT")); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optional compacting.
if (hasLimit.get() == false && p instanceof LimitExec) { | |
hasLimit.set(Boolean.TRUE); | |
return; | |
} | |
if (orderBy.get() == null && p instanceof OrderExec) { | |
orderBy.set(((OrderExec) p).order()); | |
return; | |
} | |
if (hasLimit.get() && orderBy.get() != null && p instanceof OrderExec) { | |
if (((OrderExec) p).order().equals(orderBy.get()) == false) { | |
failures.add(fail(p, "Cannot use ORDER BY on top of a subquery with ORDER BY and LIMIT")); | |
} | |
} | |
if (p instanceof LimitExec) { | |
if (hasLimit.get() == false) { | |
hasLimit.set(true); // I'd use either objects or primitives for this and above statement | |
} | |
} else if (p instanceof OrderExec) { | |
if (orderBy.get() == null) { | |
orderBy.set(((OrderExec) p).order()); | |
} else if (hasLimit.get()) { | |
if (((OrderExec) p).order().equals(orderBy.get()) == false) { | |
failures.add(fail(p, "Cannot use ORDER BY on top of a subquery with ORDER BY and LIMIT")); | |
} | |
} | |
} |
assertEquals( | ||
"1:96: Cannot use ORDER BY on top of a subquery with ORDER BY and LIMIT", | ||
error( | ||
"SELECT * FROM (" + "SELECT max(int) AS max, bool FROM test GROUP BY bool ORDER BY max ASC LIMIT 10) " + "ORDER BY max DESC" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the string split meant to emphasise the subselect?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, but the formatter made it an one-liner, will fix.
} | ||
if (hasLimit.get() && orderBy.get() != null && p instanceof OrderExec) { | ||
if (((OrderExec) p).order().equals(orderBy.get()) == false) { | ||
failures.add(fail(p, "Cannot use ORDER BY on top of a subquery with ORDER BY and LIMIT")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the error message descriptive enough, tho? That is actually possible[*], just not in any (i.e. "conflicting") configuration, right? Subselects are a gray area anyways, so I guess this could also remain like this, but alternatively we could add some notes in the limitations docs or somehow rephrase this message -- up to you.
[*] Didn't run it on the new code, but I guess this should work: SELECT * FROM (SELECT * FROM test_emp ORDER BY emp_no LIMIT 10) ORDER BY emp_no LIMIT 5
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep it passes, since it's the exact same order By so it can be executed once and then merge the limits to 5: https://github.com/elastic/elasticsearch/pull/72991/files#diff-147ac28e10bf6b2342a276203e3b250ccd593163c3aa087ee6d72ed19572c163R147
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding the error message, imho we shouldn't add more details, about not supporting an ORDER BY on top of the result set returned internally, I think it will just complicate things, and regarding the limitations we already state about supporting queries that can be "flattened".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure that I fully understand the scenario that we are avoiding here...
You compare the ORDER BY
s basically from two different plans that contain each other using if (((OrderExec) p).order().equals(orderBy.get()) == false)
. The equals
method of Order covers the direction (ASC/DESC), on what the ordering is applied (the expression itself) and the nulls position.
Two comments:
- NULLS are not covered by the tests. There is only one passing test using
NULLS FIRST
. - I don't understand why this test passes
SELECT * FROM (SELECT * FROM (SELECT * FROM (SELECT * FROM test_emp ORDER BY emp_no DESC) ORDER BY emp_no ASC NULLS LAST) ORDER BY emp_no DESC NULLS LAST LIMIT 12) ORDER BY emp_no DESC NULLS FIRST
if thenulls
position is different in the outermost query and the first inner query. I must be missing the exact scenario that we are forbidding.
@astefan Thx a lot for catching this extra case, in fact when everything inside the hierarchy is flattened the actual ORDER BY applied, is the last one (top of hierarchy) before encountering the next ORDER BY that will break things (because a LIMIT is also already present). Added more tests for verifying those un-supported queries and a couple more integ tests that also include nulls. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Ordering an already pre-ordered and limited subselect is not allowed, as such queries cannot be collapsed and translated into query DSL, but the require an extra ordering step on top of the results returned internally by the search/agg query. Fixes: elastic#71158 (cherry picked from commit a5a20ae)
Ordering an already pre-ordered and limited subselect is not allowed, as such queries cannot be collapsed and translated into query DSL, but the require an extra ordering step on top of the results returned internally by the search/agg query. Fixes: elastic#71158 (cherry picked from commit a5a20ae)
Ordering an already pre-ordered and limited subselect is not allowed,
as such queries cannot be collapsed and translated into query DSL, but
the require an extra ordering step on top of the results returned
internally by the search/agg query.
Fixes: #71158