feat(optimizer, storage): pushdown range-filter to storage #786
Signed-off-by: Runji Wang <wangrunji0408@163.com>
LGTM and thanks for the PR! Seems that SQLLogicTest needs to be updated with `make apply_planner_test`. Also it seems that we pass `Option<KeyRange>` to the filter but the default value for that is `true`? Shall we convert it to `None` somewhere so that we can avoid unnecessary expression evaluation?
let end = match (&ra.end, &rb.end) {
    (Bound::Unbounded, s) | (s, Bound::Unbounded) => s.clone(),
    _ => return None,
};
What if both ranges are bounded on the same side (e.g., `a < 3 && a < 5`)? In this case, shouldn't we avoid returning `None` here if we don't have a filter executor above the scan-filter?
Yes, it will return `None`. I didn't merge the bounds here because I thought it would be a bit complicated. For these cases I expected them to be handled in rewrite rules later. For example:
(and (< ?a ?v1) (< ?a ?v2)) => (< ?a (min ?v1 ?v2))
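To make the trade-off in this thread concrete, here is a minimal sketch. It assumes a simplified `KeyRange` over `i64` keys, and the names `and_conservative` and `min_end` are hypothetical, not taken from the PR. The first function mirrors the behavior under review (give up when both operands bound the same side); the second shows what the proposed `min` rewrite would compute for two upper bounds.

```rust
use std::ops::Bound;

/// Hypothetical stand-in for the PR's `KeyRange`; the real type may differ.
#[derive(Debug, Clone, PartialEq)]
struct KeyRange {
    start: Bound<i64>,
    end: Bound<i64>,
}

/// Conservative intersection, following the snippet under review:
/// each side may be bounded by at most one operand, otherwise give up.
fn and_conservative(ra: &KeyRange, rb: &KeyRange) -> Option<KeyRange> {
    let start = match (&ra.start, &rb.start) {
        (Bound::Unbounded, s) | (s, Bound::Unbounded) => s.clone(),
        _ => return None,
    };
    let end = match (&ra.end, &rb.end) {
        (Bound::Unbounded, s) | (s, Bound::Unbounded) => s.clone(),
        _ => return None,
    };
    Some(KeyRange { start, end })
}

/// Tighter of two upper bounds, i.e. the effect of
/// `(and (< ?a ?v1) (< ?a ?v2)) => (< ?a (min ?v1 ?v2))`,
/// extended here (as an assumption) to mixed `<` / `<=` bounds.
fn min_end(a: Bound<i64>, b: Bound<i64>) -> Bound<i64> {
    use std::ops::Bound::*;
    match (a, b) {
        (Unbounded, s) | (s, Unbounded) => s,
        (Excluded(x), Excluded(y)) => Excluded(x.min(y)),
        (Included(x), Included(y)) => Included(x.min(y)),
        // Mixed case: on a tie the exclusive bound is the stricter one.
        (Included(x), Excluded(y)) | (Excluded(y), Included(x)) => {
            if x < y { Included(x) } else { Excluded(y) }
        }
    }
}
```

With these definitions, `a < 3 && a < 5` yields `None` from `and_conservative`, so the predicate would have to stay in a filter above the scan, whereas `min_end(Excluded(3), Excluded(5))` gives `Excluded(3)`, which is the single bound a rewrite rule could hand to the scan.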
- if min.is_some() {
-     min.unwrap().min(ROWSET_MAX_OUTPUT)
+ if let Some(min) = min {
+     min.min(ROWSET_MAX_OUTPUT)
Should this be `if let Some(ref mut min) = min` and `*min = min.min(xxx)`?
The whole `if` expression will be assigned to `fetch_size`. I think it is fine.
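As a small illustration of why no `ref mut` is needed, the sketch below treats the `if let` as a value-producing expression. The constant's value and the `else` branch are assumptions made for the example, since neither appears in the diff, and the wrapper function `fetch_size` is hypothetical.

```rust
/// Assumed cap for illustration; the real `ROWSET_MAX_OUTPUT` value may differ.
const ROWSET_MAX_OUTPUT: usize = 2048;

/// The inner `min` shadows the `Option` by value; nothing is mutated in place.
/// The result of the whole `if let ... else` expression is the return value.
fn fetch_size(min: Option<usize>) -> usize {
    if let Some(min) = min {
        min.min(ROWSET_MAX_OUTPUT)
    } else {
        // Assumed fallback; the else branch is not shown in the diff.
        ROWSET_MAX_OUTPUT
    }
}
```

Because the clamped value flows out of the expression, the original `Option` does not need to be updated, which is the point made in the reply above.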
// required by range-filter scan rule
record_first_key: true,
Looks like this feature is now required by range-filter scan. Should we make it mandatory, or will we support range-filter scan without it? @skyzh
);

let snapshot = if is_sorted {
    assert!(opts.filter.is_none(), "MemTxn doesn't support filter scan");
Do we plan to support range-filter scan (store in order by key) in memory storage?
I've changed the default value to |
Range-filter scan has been supported in storage (#589), but the query engine doesn't make use of it. This PR adds an optimization rule to push range predicates from filters down to scans. To identify range predicates on primary keys (e.g. `id > 1`), we introduced range analysis. It extracts a `KeyRange` from `=`, `>`, `>=`, `<`, `<=`, and `and` nodes, which can be passed into storage later.

Another goal of this PR is to decouple storage from the query engine v1. Currently storage depends on `BoundExpr` from v1 to filter data inside storage. But in fact we don't expect storage to support general filters other than range filters on keys. So this PR removes the general filter from storage and adds a `KeyRange` for range filters. After that, the v1 engine can be removed completely.

A simple test on TPC-H dataset: