sql: Implement Summer, a smarter typing system #6358

nvanbenschoten · 2016-04-27T19:47:25Z

See #6189 for the updated RFC design doc on this typing system.

This change implements the new typing system in a few stages:

Add downward typing preference to type checking
Define algorithms for overload resolution and expression homogeneity resolution
Create TypedExpr interface to split pre and post-type checked expressions
Clarify untyped constant literals (numeric and string-like) and their purpose in the system

The commits should be read in order to gain insight into the design process and goals.
However, a good amount of polishing/refactoring is done in a few later commits, so it's
probably better to critique code with all commits considered.

This change is

knz · 2016-04-27T22:54:16Z

Reviewed the first commit.

Reviewed 39 of 39 files at r1.
Review status: 10 of 55 files reviewed at latest revision, 32 unresolved discussions, all commit checks successful.

sql/group.go, line 98 [r1] (raw file):
Could we use proper names for Types here and in other places? Like parser.TypeBool or something.
(please ignore if a subsequent commit deals with this already)

sql/index_selection.go, line 781 [r1] (raw file):
I think you want to desire dummyLeft when typing right.

sql/pgwire_test.go, line 222 [r1] (raw file):
Why is this failing?
Don't we have the preference for homogenous types to take care of this? I'd expect $1 typed as timestamp.

sql/pgwire_test.go, line 226 [r1] (raw file):
I'd like to improve the error message. IT is not clear what "= " refers to. Is that a hint about the resolved type of the placeholder, or is that a hint about the resolved type for the entire + expression?

sql/select_qvalue.go, line 111 [r1] (raw file):
why is desired not propagated?

sql/select_qvalue.go, line 268 [r1] (raw file):
same question as above

sql/sort.go, line 122 [r1] (raw file):
What about desiring an int here?

sql/subquery.go, line 62 [r1] (raw file):
Please extend this comment with:
TODO(knz) the instantiation of the subquery's select node should be moved to the TypeCheck() method once the prepare and execute phase are separated for select nodes.

sql/table.go, line 198 [r1] (raw file):
IS the 3rd argument to TypeCheck already the desired type in this commit? (it's difficult for me to see because Reviewable makes it hard to navigate between commits)
If not, probably you want to desire the column type when typing the default expr.

sql/parser/datum.go, line 69 [r1] (raw file):
What's going on here? Are you keeping the code in comments because a subsequent commit rewrites it and you want to keep the diff simple? Otherwise I'd suggest deleting.

sql/parser/datum.go, line 1154 [r1] (raw file):
Either delete or update this comment.

sql/parser/eval.go, line 990 [r1] (raw file):
Why the special casing of "dummyTuple"?

sql/parser/eval.go, line 1200 [r1] (raw file):
Woath. Why are we type checking again during Eval()? Shouldn't be all types resolved already?

If we need a call to TypeCheck to ensure that the functions are memoized then I would recommend specifying so in Expr's interface: that the caller of Eval() should always ensure TypeCheck() has been called first.

sql/parser/eval.go, line 1510 [r1] (raw file):
See above.

sql/parser/eval.go, line 1782 [r1] (raw file):
See above.

sql/parser/eval.go, line 1820 [r1] (raw file):
Please move the block of code below to a separate function.

sql/parser/eval_test.go, line 93 [r1] (raw file):
Please explain in the comment what doesn't seem correct about them.
(They look like valid SQL and expected results to me!)

sql/parser/eval_test.go, line 377 [r1] (raw file):
Woath.
Is there a representation for NaN already in our types?
If so, I assume it's float and then this test result makes sense.
if not, then I'd recommend typing all constants with a denominator of 0 with preferred type "int". It seems wrong to me that IF would promote the type of an operand because of another operand which, albeit non-evaluable, only has int operands.

sql/parser/expr.go, line 312 [r1] (raw file):
I am worried about the multiple calls to asInt(). Perhaps asInt() should memoize the result.

sql/parser/normalize.go, line 453 [r1] (raw file):
I am not sure we need a call to typecheck here since all users of normalization will have called TypeCheck prior to normalization anyway,

sql/parser/normalize.go, line 543 [r1] (raw file):
Woah, you must clarify this comment. What are the cases where a panic can be safely ignored?

sql/parser/type_check.go, line 378 [r1] (raw file):
I'd recommend using "desiring" instead of "=" here.

sql/parser/type_check.go, line 444 [r1] (raw file):
Please add a commented test with a TODO to check this.

sql/parser/type_check.go, line 507 [r1] (raw file):
I was expecting a check that we are not assigning a conflicting type to the parameter.
(please ignore this comment if SetInferredType does this already)

sql/parser/type_check.go, line 581 [r1] (raw file):
Suggestion: move the cast out of the loop.

sql/parser/type_check.go, line 614 [r1] (raw file):
Is there no way to share more of this code with overload resolution for function calls?

sql/parser/type_check.go, line 739 [r1] (raw file):
Woah. Is there a guarantee that sort.Sort preserves the order of elements that compare equal? Otherwise users will experience unpleasant surprises.
(And please add a comment that you expect the sub-orders to be preserved.)

sql/parser/type_check.go, line 758 [r1] (raw file):
(This comment is out of place, it actually applies to the whole PR)
The tests containing examples that justify each of these typing rules really ought to stand out as comments next to the rule they justify. Otherwise it is just too easy to get lost and lose track of things while reading this code.
In other words we should see these tests defined primarily right here in comments. We then should have a tool that extracts example-comments from this file (and a few others) and generate a test source from them.
Would you mind filing an issue to do just that? We can look at it after this PR goes in.

sql/parser/type_check_test.go, line 116 [r1] (raw file):
I don't see how you can get a COALESCE error when looking at a IFNULL expression.
Also shouldnt' IFNULL accept any type for its 1st parameter?

sql/parser/type_check_test.go, line 117 [r1] (raw file):
Same question as above re. the expected argument types of NULLIF

sql/testdata/select_index, line 245 [r1] (raw file):
Does this change really belong to this PR?

sql/testdata/select_index, line 279 [r1] (raw file):
Same question as above.

Comments from Reviewable

RaduBerinde · 2016-04-27T23:32:25Z

Looked at r1 so far, LGTM (though I don't pretend to understand all the details). Feel free to ignore nits and handle any comments in separate commits or PRs or however it's easiest.

Review status: 10 of 55 files reviewed at latest revision, 43 unresolved discussions, all commit checks successful.

sql/analyze_test.go, line 73 [r1] (raw file):
[nit] consider defining a parser.NoTypePreference nil Datum constant to use instead of nil /* no preference */

sql/group.go, line 69 [r1] (raw file):
line too long

sql/insert.go, line 53 [r1] (raw file):
line too long

sql/parser/builtins.go, line 57 [r1] (raw file):
a description of what the interface represents would help

sql/parser/builtins.go, line 58 [r1] (raw file):
a short description of each function would help too

sql/parser/builtins.go, line 150 [r1] (raw file):
What's the point of the loop here? I assume it's a copy-paste. If the function can have a single universal implementation that just calls the other functions, I would pull it out of the interface (e.g. make it func matchTypeList(t typeList, types ArgTypes) bool)

sql/parser/expr.go, line 291 [r1] (raw file):
checks

sql/parser/parse.go, line 87 [r1] (raw file):
Should mention something about the returned Expr

sql/parser/type_check.go, line 104 [r1] (raw file):
line too long

sql/parser/type_check.go, line 849 [r1] (raw file):
precedence

sql/parser/type_check.go, line 868 [r1] (raw file):
by checking

sql/parser/type_check.go, line 871 [r1] (raw file):
if it is type?

sql/parser/type_check_test.go, line 135 [r1] (raw file):
nice!f

sql/testdata/select_index, line 279 [r1] (raw file):
I am guessing that now b < 5.0 actually does something, and before it didn't

Comments from Reviewable

knz · 2016-04-27T23:42:39Z

Reviewed r2.

Reviewed 38 of 38 files at r2.
Review status: 31 of 55 files reviewed at latest revision, 46 unresolved discussions, all commit checks successful.

sql/executor.go, line 582 [r2] (raw file):
I recommend sending an email to eng@ to request an explanationa bout this testing knob. Depending on the answer, you may as well remove this check entirely.

sql/insert.go, line 486 [r2] (raw file):
Just to clarify, why does normalization disappear here?

sql/limit.go, line 48 [r2] (raw file):
As indicated ina previous comment I wonder if we need to call TypeCheck here since the entire eval recursion should always take place after type checking.

sql/pgwire_test.go, line 228 [r2] (raw file):
I am surprised that the homogeneity rule is not forcing $1 to become a timestamp.

sql/parser/eval.go, line 1200 [r1] (raw file):
NEver mind, I should have looked at the next commit.

sql/parser/eval.go, line 1820 [r1] (raw file):
NEver mind, resolved by next commit.

sql/parser/normalize.go, line 453 [r1] (raw file):
NEver mind, fixed in next commit.

sql/parser/normalize.go, line 334 [r2] (raw file):
Unless a subsequent commit does so already, I recommend deleting the entire block.

In any case the comment above must be updated.

sql/parser/normalize.go, line 403 [r2] (raw file):
Extend this comment with: TODO(knz) This should happen when the prepare and execute phases are separated for SelectClause.

sql/parser/type_check.go, line 507 [r1] (raw file):
NEver mind, next commit fixes that.

Comments from Reviewable

knz · 2016-04-27T23:54:13Z

Reviewed r3.

Reviewed 20 of 20 files at r3.
Review status: 9 of 55 files reviewed at latest revision, 48 unresolved discussions, some commit checks pending.

sql/parser/normalize_test.go, line 184 [r3] (raw file):
Woah this seems wrong. Is that what we want? To me conversion to float should only be permissible if there is no information loss. Here you truncate 2 digits.

sql/parser/set.go, line 68 [r3] (raw file):
Please add this comment here: TODO(knz) clean up this when refactoring the pretty-printer.

Comments from Reviewable

knz · 2016-04-27T23:57:46Z

Reviewed r4.

Reviewed 2 of 2 files at r4.
Review status: 9 of 55 files reviewed at latest revision, 48 unresolved discussions, some commit checks pending.

Comments from Reviewable

knz · 2016-04-28T00:05:27Z

Reviewed r5.

Reviewed 2 of 2 files at r5.
Review status: 9 of 55 files reviewed at latest revision, 49 unresolved discussions, some commit checks pending.

sql/parser/normalize_test.go, line 98 [r5] (raw file):
Why is there still one level of parentheses remaining here?

Comments from Reviewable

nvanbenschoten · 2016-04-28T00:19:39Z

Review status: 9 of 55 files reviewed at latest revision, 52 unresolved discussions, all commit checks successful.

sql/analyze_test.go, line 73 [r1] (raw file):
I like this idea a lot. Added to my list of follow up refactoring.

sql/executor.go, line 582 [r2] (raw file):
I think @RaduBerinde added it, so I was hoping to get input from him.

sql/group.go, line 69 [r1] (raw file):
Done.

sql/group.go, line 98 [r1] (raw file):
There's a TODO in a later commit to switch all DummyT to TypeT. It's not worth doing any extra refactoring in this PR.

sql/index_selection.go, line 781 [r1] (raw file):
This is definitely not the final form of this function. See later commits if you want to compare against the algorithm from the RFC.

sql/insert.go, line 53 [r1] (raw file):
Done.

sql/insert.go, line 486 [r2] (raw file):
Because it's added above after the expressions returned here are type checked.

sql/pgwire_test.go, line 222 [r1] (raw file):
It's not failing, that's why it's commented out of the TestPGPrepareFail test. It's added into the passing test in a later commit.

sql/pgwire_test.go, line 226 [r1] (raw file):
It's a hint about the resolved type for the entire + expression. Any suggestions?

sql/pgwire_test.go, line 228 [r2] (raw file):
There is no $1, I think that's what this test is trying to show. It's caught in pgwire code.

sql/select_qvalue.go, line 111 [r1] (raw file):
Because it's a datum so desiring a value doesnt do anything. This also will be ripped out in the next commit.

sql/select_qvalue.go, line 268 [r1] (raw file):
Same answer :)

sql/sort.go, line 122 [r1] (raw file):
Done.

sql/subquery.go, line 62 [r1] (raw file):
Done.

sql/table.go, line 198 [r1] (raw file):
Yes it is.

sql/parser/builtins.go, line 57 [r1] (raw file):
This is in one of the later commits in this PR.

sql/parser/builtins.go, line 150 [r1] (raw file):
Yeah, it's fixed in the sql: Clean up code within the new typing system commit.

sql/parser/datum.go, line 69 [r1] (raw file):
Done.

sql/parser/datum.go, line 1154 [r1] (raw file):
Done.

sql/parser/eval.go, line 990 [r1] (raw file):
Im not sure why we decided on that, but it's not new? Feel free to assign an issue to me if you think it should be fixed, but I foresee us refactoring much more than this is a future PR dealing with types (and their nullability status) anyway.

sql/parser/eval.go, line 1510 [r1] (raw file):
See above.

sql/parser/eval.go, line 1782 [r1] (raw file):
See above.

sql/parser/eval_test.go, line 93 [r1] (raw file):
This is fixed in a later commit. The problem was that we were type checking the qualified name a without a having a type.

sql/parser/eval_test.go, line 377 [r1] (raw file):
A few things are going on here. Primarily what you're seeing is that we define int/int as a float, so the 1 constant is becoming a float as well. Interestingly, this is the exact same thing that Postgres will give.

We also do have NaN exclusively defined for float, but that's not what's happening here.

sql/parser/expr.go, line 291 [r1] (raw file):
Done.

sql/parser/expr.go, line 312 [r1] (raw file):
Where are you seeing multiple calls to asInt()? I think it's only used in the parser and in folding.

Regardless, a lot of this code is ripe for performance gains, but that should almost certainly come in a later PR.

sql/parser/normalize.go, line 453 [r1] (raw file):
This is addressed in a later commit. See TypedExpr.

sql/parser/normalize.go, line 543 [r1] (raw file):
Division by zero, any incorrect types being used in operations, etc. And go/constant doesn't make these easy to handle. It will just return an Unknown value type or panic, neither of which are ideal. We're not ignoring the errors, just failing to fold (which is best effort) so that evaluation (during run-time or during normalization) can throw errors. It would be a pretty bad idea to try to keep all error handling consistent across folding and evaluation.

sql/parser/normalize.go, line 334 [r2] (raw file):
It's all gone in a few commits.

sql/parser/normalize.go, line 403 [r2] (raw file):
Done.

sql/parser/normalize_test.go, line 98 [r5] (raw file):
Because we dont unfold all parentheses, only those with direct numeric constant children.

sql/parser/parse.go, line 87 [r1] (raw file):
The next commit redoes this function.

sql/parser/type_check.go, line 378 [r1] (raw file):
Let have this discussion in the previous thread.

sql/parser/type_check.go, line 444 [r1] (raw file):
This goes away later.

sql/parser/type_check.go, line 507 [r1] (raw file):
SetInferredType already does that. The semantics of this function are changed a lot in future commits though.

sql/parser/type_check.go, line 581 [r1] (raw file):
It's not a cast, just a type assertion.

sql/parser/type_check.go, line 614 [r1] (raw file):
Share what code? We call typeCheckOverloadedExprs in here.

sql/parser/type_check.go, line 739 [r1] (raw file):
This is removed in a future commit.

sql/parser/type_check.go, line 758 [r1] (raw file):
Filed #6365.

sql/parser/type_check.go, line 868 [r1] (raw file):
Done (in later commit because this code moves).

sql/parser/type_check.go, line 871 [r1] (raw file):
This comment is changed later.

sql/parser/type_check_test.go, line 116 [r1] (raw file):
Good catch. IFNULL is implemented as a two argument COALESCE statement. I made a change to fix the error here. And the reversed error is an artifact of the string being resolvable and the constant being typed later. We can look into improving the errors for cases like that in the future.

sql/testdata/select_index, line 279 [r1] (raw file):
Right, 5 must be getting typed as an Int now.

Comments from Reviewable

knz · 2016-04-28T00:20:54Z

Reviewed r6-r11.

Reviewed 1 of 7 files at r6, 1 of 46 files at r9, 24 of 38 files at r10, 17 of 20 files at r11, 2 of 2 files at r12, 2 of 2 files at r13.
Review status: 35 of 55 files reviewed at latest revision, 41 unresolved discussions, all commit checks successful.

Comments from Reviewable

nvanbenschoten · 2016-04-28T00:25:55Z

Review status: 35 of 55 files reviewed at latest revision, 39 unresolved discussions, all commit checks successful.

sql/limit.go, line 48 [r2] (raw file):
Is the *parser.Limit ever previously type checked though?

sql/parser/normalize_test.go, line 184 [r3] (raw file):
I agree, this is kind of what I was getting at with that float overflow TODO you brought up. Since it's not a regression, is it ok with you if we keep that TODO and address this after this PR?

sql/parser/set.go, line 68 [r3] (raw file):
Done.

Comments from Reviewable

nvanbenschoten · 2016-04-28T00:29:36Z

Review status: 35 of 55 files reviewed at latest revision, 39 unresolved discussions, all commit checks successful.

sql/pgwire_test.go, line 226 [r1] (raw file):
You suggested <int> + <parameter> desiring <string> below.

Does unsupported binary operator: <int> + <parameter> when desiring <string> sound better or worse to you?

Comments from Reviewable

nvanbenschoten · 2016-04-28T00:37:29Z

Review status: 35 of 55 files reviewed at latest revision, 35 unresolved discussions, all commit checks successful.

sql/parser/builtins.go, line 58 [r1] (raw file):
Done (in a later commit).

Comments from Reviewable

RaduBerinde · 2016-04-28T00:40:46Z

Looked over r2. I love the distinction between Expr and TypedExpr

Review status: 35 of 55 files reviewed at latest revision, 44 unresolved discussions, all commit checks successful.

sql/executor.go, line 582 [r2] (raw file):
This check verifies that we don't leak changes to expressions into the expression we are given (which can be problematic, e.g. for retries). We shouldn't be changing expressions in place. If something needs to change, a new expression should be generated, which can share unchanged nodes with the old one. The Walk infrastructure works like this.

sql/expr_filter.go, line 243 [r2] (raw file):
long line

sql/group.go, line 104 [r2] (raw file):
very long lines

sql/limit.go, line 58 [r2] (raw file):
long line
btw if you use vim you can have it highlight them automatically

autocmd FileType c,cpp,java,sh,python,make,go highlight OverLength ctermbg=darkred ctermfg=white
autocmd FileType c,cpp,java,sh,python,make,go match OverLength /\%101v.\+/

sql/select.go, line 480 [r2] (raw file):
line too long

sql/update.go, line 429 [r2] (raw file):
line too long

sql/parser/eval.go, line 1989 [r2] (raw file):
i would give some names to the output values (at least the flags)

sql/parser/expr.go, line 38 [r2] (raw file):
maps ValArg names to

sql/parser/overload.go, line 160 [r2] (raw file):
a bunch of long lines in this file

sql/parser/walk.go, line 61 [r2] (raw file):
maybe these (AndExpr, NotExpr, OrExpr) need a full copy as well since they have annotations?

Comments from Reviewable

knz · 2016-04-28T00:46:02Z

Reviewed 1 of 38 files at r10, 1 of 7 files at r14, 1 of 7 files at r15, 17 of 17 files at r16.
Review status: all files reviewed at latest revision, 45 unresolved discussions, all commit checks successful.

sql/pgwire_test.go, line 226 [r1] (raw file):
YEah let's say " (desired )" instead of " = "

sql/parser/constant_test.go, line 59 [r16] (raw file):
That's where I had an earlier comment: I'm not sure the presence of "eE" should be sufficient to upgrade a literal to "natural float". At best you could make it decimal.

sql/parser/constant_test.go, line 92 [r16] (raw file):
No risk to lose precision here?

sql/parser/constant_test.go, line 196 [r16] (raw file):
Not happy about that one. To me this should be decimal with an exact representation for the result.

sql/parser/constant_test.go, line 254 [r16] (raw file):
Still surprised by the extra parentheses in the result.

sql/parser/expr.go, line 312 [r1] (raw file):
Ok I'll defer to your judgement about possible issues to file as a todo for later.

sql/parser/normalize.go, line 543 [r1] (raw file):
Ok but then the special cases should be documented in comments.

sql/parser/normalize_test.go, line 184 [r3] (raw file):
Ok but then please file an issue and cross-reference both.

sql/parser/normalize_test.go, line 98 [r5] (raw file):
This does not make sense to me. Why not unfold everything?

sql/parser/overload.go, line 313 [r16] (raw file):
This comment is too obscure. Please add an example or more to clarify.

sql/parser/type_check.go, line 581 [r1] (raw file):
Well I'm still thinking the type assertion (and accompanying runtime check) can be factored out.

sql/parser/type_check.go, line 614 [r1] (raw file):
What I mean is that I still think comparisons / binops should be transformed to function calls during parsing and typed as such. Perhaps this deserves a separate PR/issue though.

Comments from Reviewable

knz · 2016-04-28T00:46:40Z

Reviewed to r16.

Review status: all files reviewed at latest revision, 43 unresolved discussions, all commit checks successful.

Comments from Reviewable

RaduBerinde · 2016-04-28T00:52:23Z

Review status: all files reviewed at latest revision, 42 unresolved discussions, all commit checks successful.

sql/parser/constant.go, line 1 [r3] (raw file):
2016

sql/parser/constant.go, line 32 [r3] (raw file):
maybe explain that the first type in the list is the "natural" type

sql/parser/constant.go, line 41 [r3] (raw file):
can just return ok with no if

sql/parser/constant.go, line 74 [r3] (raw file):
this needs a comment explaining the distinction from canConstantBecome

sql/parser/constant.go, line 173 [r3] (raw file):
long line (and below)

Comments from Reviewable

RaduBerinde · 2016-04-28T00:54:28Z

Review status: all files reviewed at latest revision, 41 unresolved discussions, all commit checks successful.

sql/parser/type_check.go, line 59 [r12] (raw file):
long line

Comments from Reviewable

knz · 2016-04-28T00:55:12Z

Review status: all files reviewed at latest revision, 41 unresolved discussions, all commit checks successful.

sql/executor.go, line 582 [r2] (raw file):
So Radu what would be the correct approach now here?

sql/limit.go, line 48 [r2] (raw file):
Huh if it is not then it should be. Doesn't the SelectClause do this?

sql/testdata/select_index, line 279 [r1] (raw file):
Woah, nice.

Comments from Reviewable

RaduBerinde · 2016-04-28T01:03:29Z

sql/executor.go, line 582 [r2] (raw file):
Not changing things in place.. Or if we are changing them in a way that we know for sure is ok if the same expression is retried (e.g. like we already do for QualifiedName normalization), making sure it prints the same.. If we remove this check, changes that can potentially screw up retries will go undetected.

Comments from Reviewable

RaduBerinde · 2016-04-28T01:08:02Z

Review status: all files reviewed at latest revision, 44 unresolved discussions, all commit checks successful.

sql/parser/constant.go, line 136 [r14] (raw file):
"with a flag indicating whether the conversion was possible". BTW do we really need this flag, isn't the nil result enough?

sql/parser/eval.go, line 1983 [r15] (raw file):
long line

sql/parser/type_check.go, line 545 [r15] (raw file):
long line

Comments from Reviewable

RaduBerinde · 2016-04-28T01:21:16Z

Reviewed 9 of 39 files at r1, 1 of 46 files at r9, 21 of 38 files at r10, 5 of 20 files at r11, 1 of 7 files at r14, 1 of 7 files at r15, 17 of 17 files at r16.
Review status: all files reviewed at latest revision, 44 unresolved discussions, all commit checks successful.

sql/limit.go, line 48 [r2] (raw file):
I think the issue is that SelectCause isn't an expression (it could be part of an expression under a subquery, but itself it isn't an expression). So the "eval recursion" starts at the filters, limit etc expressions not above.

sql/parser/constant.go, line 84 [r16] (raw file):
comments should wrap at 80 cols

sql/parser/constant_test.go, line 59 [r16] (raw file):
All's fair in love, war, and testing code :)

sql/parser/constant_test.go, line 77 [r16] (raw file):
superlong lines (and a couple more a bit below)

Comments from Reviewable

nvanbenschoten · 2016-04-28T03:56:02Z

Thanks for the reviews @RaduBerinde and @knz!

Review status: 32 of 55 files reviewed at latest revision, 35 unresolved discussions, some commit checks pending.

sql/executor.go, line 582 [r2] (raw file):
Radu, thanks for the explanation. That's kind of what I figured, but I don't think either approach is going to work 100% here.

Correct me if I'm wrong, but the deep copying that the Walk infrastructure does will make a copy of the entire parent expression tree above the changed node. If that's true, then I fear the overhead of typing constants would be far too high, because you could all but guarantee that every statement tree would need to be copied in its entirety. At that point, it might just be better to preserve an "original" copy of the tree for each retry to start with. On the other hand, making sure constant literals, once folded and resolved to concrete datum types, print to the same string sounds like a fraught effort.

It seems to me that a folded and type checked expression tree (with constants resolved in-place) would be completely safe for a retry. These resolved constant expressions would never change between retries so I can't see a reason why keeping them folded would mess anything up. That said, this of course would mess up the string comparison, so I'm wondering if we might want to explore other ways of making this safety equality check to give us a bit more flexibility.

(@knz This also made me realize though that I think we should always populate parameters after type checking, to guarantee that the type checking during prepare and execute always resolve constants the same way. What do you think? I could just be overthinking things, because having the populated parameters with the previously inferred types might guarantee that we always type check the same.)

sql/expr_filter.go, line 243 [r2] (raw file):
Done.

sql/group.go, line 104 [r2] (raw file):
Done.

sql/pgwire_test.go, line 226 [r1] (raw file):
SGTM. Done.

sql/select.go, line 480 [r2] (raw file):
With this pattern, I can bring it down to 104 chars by changing to the variable to typ. I would argue anything else (including wrapping) falls under the "less legible" clause of https://github.com/cockroachdb/cockroach/blob/master/CONTRIBUTING.md#style-guide.

sql/update.go, line 429 [r2] (raw file):
Done.

sql/parser/constant.go, line 1 [r3] (raw file):
Heh, nice catch. Done for all.

sql/parser/constant.go, line 32 [r3] (raw file):
I'm not sure which revision you were looking at, but does the final comment do a good enough job expressing this?

sql/parser/constant.go, line 41 [r3] (raw file):
Yep. Done.

sql/parser/constant.go, line 173 [r3] (raw file):
Most of these would get a lot less legible if we start wrapping the error string, and I don't think they're far enough past the suggested character limit to warrant that.

sql/parser/constant.go, line 136 [r14] (raw file):
Done. And I guess nil could be used, but I prefer out-of-band values where possible for their clarity and extensibility.

sql/parser/constant_test.go, line 59 [r16] (raw file):
Yeah these are just tests. That said, we have always upgraded the literal to a "natural float" in our parser.

sql/parser/constant_test.go, line 77 [r16] (raw file):
Done.

sql/parser/constant_test.go, line 92 [r16] (raw file):
There probably is with the right input data, but we dont have any test cases which lose precision so I'm not too worried about it.

sql/parser/constant_test.go, line 196 [r16] (raw file):
Agreed. See the float overflow thread.

sql/parser/constant_test.go, line 254 [r16] (raw file):
Constant folding != normalization. The current Expr tree is Paren{Paren{BinExpr{..., ...}}} so getting rid of the parentheses doesnt change anything for type checking.

sql/parser/eval.go, line 1989 [r2] (raw file):
Done.

sql/parser/eval.go, line 1983 [r15] (raw file):
Done.

sql/parser/expr.go, line 38 [r2] (raw file):
Done.

sql/parser/normalize.go, line 543 [r1] (raw file):
Done.

sql/parser/normalize_test.go, line 184 [r3] (raw file):
Done.

sql/parser/normalize_test.go, line 98 [r5] (raw file):
See above. We only unfold parens at this stage if it could promote more constant folding. If the child expression is not a constant, why unfold it? We're not trying to add another normalization stage here, we're trying to help out the type checker, so unfolding all parentheses is diluting the stage's purpose when it should be very focused.

sql/parser/overload.go, line 160 [r2] (raw file):
Done.

sql/parser/overload.go, line 313 [r16] (raw file):
Could you explain? It seems pretty straightforward. Anything after this point is a "preference" filter because "all remaining candidate overload implementations are valid" from this point on.

sql/parser/type_check.go, line 545 [r15] (raw file):
Done.

sql/parser/walk.go, line 61 [r2] (raw file):
The annotations do not affect their string representations, but lets have this discussion in one place above.

Comments from Reviewable

RaduBerinde · 2016-04-28T13:11:35Z

Review status: 32 of 55 files reviewed at latest revision, 35 unresolved discussions, all commit checks successful.

sql/executor.go, line 582 [r2] (raw file):

I fear the overhead of typing constants would be far too high, because you could all but guarantee that every statement tree would need to be copied in its entirety. At that point, it might just be better to preserve an "original" copy of the tree for each retry to start with

I don't get the overhead argument, the former is still cheaper than the latter (at least you don't copy all the unrelated leaves etc). In any case, there is no facility for deep copying an entire expression.. I tried to do that a while back and it was a can of worms (see #4418). Also people seemed to think a full deep copy was a bit much (#4418 (comment))

The problem is not just retries - given that the visitor/walk infrastructure is based on sharing unchanged nodes between the input and the output expression, any time you make an in-place modification, you have to make sure it doesn't screw up some other expression that is using that node. It's hard to track and reason about all the ways in which this could happen.

Can you explain exactly what we are modifying in place and/or point me to the code that actually modifies things in place? AFAICT the constant folding code does not modify anything in-place (it generates new nodes, and the walk infrastructure will automatically make copies of the changed parents). If we are just setting type annotations in nodes that is probably fine (but in that case wouldn't it be easy to write a visitor that clears all that stuff from the tree before the string check?).

Even if everything we are doing is ok, I am very hesitant to remove the entire check (which could allow future subtle bugs).. I'm ok with finding a way to relax the check though (maybe a visitor that generates a string representation with less info?)

Comments from Reviewable

knz · 2016-04-28T15:48:15Z

Reviewed 1 of 20 files at r17, 1 of 7 files at r20, 1 of 7 files at r21, 20 of 20 files at r22.
Review status: all files reviewed at latest revision, 28 unresolved discussions, all commit checks successful.

sql/executor.go, line 582 [r2] (raw file):
@nvanbenschoten: what do you mean with "we should always populate parameters after type checking"?

sql/limit.go, line 48 [r2] (raw file):
Well that's unfortunate. The call for typechecking really ought to be in there since we need the types during evaluation. If we can't do this in this PR, then please file an issue about it.

sql/parser/overload.go, line 313 [r16] (raw file):

// At this point, all remaining overload candidates accept the argument list. 
// In case there is more than one candidate remaining, the following code 
// uses heuristics to find a most preferable candidate. 

// The first heuristic is to prefer candidates that return the desired type.

(then before each code blocks below, add a comment summarizing the rule that the block implements.)

Comments from Reviewable

RaduBerinde · 2016-04-28T16:34:13Z

Review status: all files reviewed at latest revision, 28 unresolved discussions, all commit checks successful.

sql/limit.go, line 48 [r2] (raw file):
When you say "in there", where exactly do you mean?

Comments from Reviewable

petermattis · 2016-05-02T18:13:27Z

This PR is very large, which makes it difficult to review, even when reviewed commit-by-commit. Are there particularly areas that you think deserve special attention?

Seeing all of the Dummy* values makes me want to s/Dummy{Int,String,...}/Type{Int,String,...}.

Review status: 10 of 55 files reviewed at latest revision, 34 unresolved discussions, all commit checks successful.

sql/sort.go, line 75 [r30] (raw file):
Did you want to leave this commented out? Seems like the comment above isn't correct without this code as you're no longer normalizing here.

sql/parser/constant.go, line 273 [r30] (raw file):
Would be clearer to use []Datum{DummyBytes} here. The memory savings aren't worth the effort and I hope sharing the underlying data isn't necessary.

sql/parser/eval.go, line 156 [r30] (raw file):
Does this cause an allocation? I'm not sure how the compiler would be able to tell op is on the heap, which means op.types[:] will force op onto the heap.

sql/parser/eval.go, line 160 [r30] (raw file):
Why are you going throw the effort of creating params() when you're just indexing into it? This would be clearer as op.LeftType.match(l) && op.RightType.match(r). Grr, review is too big for reviewable to show all at once so I can't easily find the definition of matchAt.

sql/parser/eval.go, line 172 [r30] (raw file):
Returning the first match makes the ordering of the overloads significant. Does that actually matter at the moment?

sql/parser/eval.go, line 1868 [r30] (raw file):
Did you intend to keep around this commented out code?

sql/parser/eval.go, line 1879 [r30] (raw file):
Any reason you're panicing here? Returning an error seems preferable as it doesn't kill the process.

sql/parser/expr.go, line 44 [r30] (raw file):
I'm not clear what safety you're getting by splitting Expr and TypedExpr given that all of the expression nodes implement the TypedExpr interface and that you frequently type-assert Expr to TypedExpr without first making sure the expr is typed. On the other hand, I prefer not having to have a separate set of expression nodes that are typed (e.g. AndTypedExpr).

sql/parser/overload_test.go, line 76 [r30] (raw file):
Nice testing!

sql/parser/sql.y, line 1206 [r30] (raw file):
Shouldn't you include err in the message? Or perhaps: sqllex.Error(err.Error()).

sql/parser/type_check.go, line 59 [r12] (raw file):
Reviewing lines longer than 100 characters in side-by-side code review tools is irritating. I haven't looked at all of the cases here, but most of the ones I've seen look easy to wrap.

Comments from Reviewable

nvanbenschoten · 2016-05-02T18:43:36Z

Thanks for the review Peter, and I apologize for the size of the PR. It grew as a result of me not feeling 100% confident about the design until I had iterated a few times on it and was able to see the full picture unroll. For future changes I'll be more conscious about rebasing as I go to avoid the need for a huge PR with entangled commits. I will also look into building up new systems beside their current counterpart over time before switching everything over.

I agree with your second point, and I think there's actually a TODO at the top of type_check.go that says exactly that. @knz and I had even discussed backing away from using dummy variables directly as types. I have a number of polishing commits staged to come after this which address that issue, as well as

unexporting the Expr.typeCheck method (your suggestion)
creating a parser.TypeCheckAndRequire function
Creating an IsUnresolvedArgument method on MapArgs
Reducing allocations in the type checker
Defining a NoTypePreference sentinel value (Radu's suggestion)
Adding a number of LogicTest and PGWire tests to demonstrate/validate new functionality
Cleaning up aggregate functions
Propagating TypedExpr through analysis phases in /sql
Adding sanity checks to TypedExpr methods
And creating a placeholderAnnotationVisitor to begin looking at implicit and explicit (potentially) annotations

In terms of places to focus on, I think the biggest area that I would appreciate your input is on the Expr vs TypedExpr distinction. Your previous comments on it led me away from creating distinct underlying types (to avoid allocations), and I think the superset interface approach does a good job providing a "soft" layer of separation. Still, any further input would be appreciated.

Review status: 10 of 55 files reviewed at latest revision, 34 unresolved discussions, all commit checks successful.

Comments from Reviewable

RaduBerinde · 2016-05-02T19:23:08Z

Review status: 10 of 55 files reviewed at latest revision, 34 unresolved discussions, all commit checks successful.

sql/parser/type_check.go, line 59 [r12] (raw file):
We have a code style and this is part of it.. There should be some limit so that everyone can configure their editors etc accordingly, and IMO the value 100 is reasonable. Regardless, this discussion doesn't belong here. If you think we should change this, open a PR against the coding conventions document.

The "unless it makes it less readable" exception in the code style applies to things like string literals, test case definitions, etc where wrapping would make things significantly less readable. This is not the case here, and it's not the case for function definitions (the code style document explicitly describes how those should be wrapped).

The fact that there are many existing violations should be an incentive for us to be more careful, not less. These are brand new lines you are adding, not existing lines which happened to be long already..

Finally, your 1899 count I'm includes generated files (like *.pb.go) and various test files (where it's less important).

Comments from Reviewable

nvanbenschoten · 2016-05-02T20:04:32Z

Review status: 10 of 55 files reviewed at latest revision, 34 unresolved discussions, some commit checks pending.

sql/sort.go, line 75 [r30] (raw file):
Done.

sql/parser/constant.go, line 273 [r30] (raw file):
Done. And above.

sql/parser/eval.go, line 156 [r30] (raw file):
Yeah it does and this is an issue. I have a follow up commit that fixes this, but I'll pull it back to here if you think its necessary.

sql/parser/eval.go, line 160 [r30] (raw file):
RightType doesnt have a match method defined on it. We could call TypeEqual, but that would require pulling in some of the Null and Tuple handling present in typeList.matchAt. What do you think?

sql/parser/eval.go, line 172 [r30] (raw file):
The first match? This is defined on a single BinOp, not on the binOpOverload slice. I might be misunderstanding your question.

sql/parser/eval.go, line 1868 [r30] (raw file):
Removed.

sql/parser/eval.go, line 1879 [r30] (raw file):
Done.

sql/parser/expr.go, line 44 [r30] (raw file):
More safety is possible when future changes fully propagate the TypedExpr interface through sql analysis code. After that change, there are very few places where a type-assertion should be necessary (outside of the sql/parser package), so they should stick out and be carefully considered. Ideally, the understanding that the same expression nodes that implement Expr also implement TypedExpr should be uniquely known by the parser.

There are also a number of expression types that do not implement TypedExpr, such as the constant expressions and unresolved placeholder epressions, so type asserting them would fail and visibly demonstrate a failure to type check. Finally, I'm planning on adding in a few sanity checks which should remove most of the concern about falsely type asserting and using expression nodes which do implement TypedExpr but havent been type-checked yet.

sql/parser/sql.y, line 1206 [r30] (raw file):
Done.

sql/parser/type_check.go, line 59 [r12] (raw file):
Fair enough. Have either of you ever been able to increase Reviewable's line width? I've only been able to get it to wrap at 80 characters so even lines that fit our style guide get wrapped, which is part of the reason that a 100 character hard limit seemed arbitrary.

Comments from Reviewable

RaduBerinde · 2016-05-02T20:36:32Z

Review status: 10 of 55 files reviewed at latest revision, 34 unresolved discussions, all commit checks successful.

sql/parser/type_check.go, line 59 [r12] (raw file):
Yes - if you look at the box where a file diff starts, in the top right corner there is a triangle/arrow pointing down. You can drag that to the right. It will remember the value across reviews.

Comments from Reviewable

RaduBerinde · 2016-05-02T20:39:07Z

Review status: 10 of 55 files reviewed at latest revision, 33 unresolved discussions, all commit checks successful.

sql/parser/walk.go, line 61 [r2] (raw file):
It's not about the string representations - let's say I have a typed expression and then I use a visitor on it (e.g. splitFilter) that modifies the child of an AndExpr. The walk code will copy the AndExpr parent but as it stand the type annotation will be lost in the process AndExpr

Comments from Reviewable

RaduBerinde · 2016-05-02T20:40:47Z

Review status: 10 of 55 files reviewed at latest revision, 19 unresolved discussions, all commit checks successful.

sql/parser/constant.go, line 32 [r3] (raw file):
Definitely, it was an earlier revision and I forgot to remove the comment.

Comments from Reviewable

nvanbenschoten · 2016-05-02T20:44:40Z

Review status: 10 of 55 files reviewed at latest revision, 18 unresolved discussions, all commit checks successful.

sql/parser/type_check.go, line 59 [r12] (raw file):
Ah awesome, thanks!

sql/parser/walk.go, line 61 [r2] (raw file):
I think I replied to the wrong comment here because what I said doesnt make sense. You're right though, I'll make the change.

Comments from Reviewable

knz · 2016-05-03T14:03:39Z

Reviewed 3 of 45 files at r23, 16 of 38 files at r24, 3 of 20 files at r25, 14 of 20 files at r30, 9 of 9 files at r31.
Review status: 52 of 55 files reviewed at latest revision, 17 unresolved discussions, all commit checks successful.

sql/parser/eval.go, line 1879 [r30] (raw file):
I'd like to have this error logged on the error log as well, since it occurrence would be a clear server bug.

Comments from Reviewable

This is one of the cornerstones of the Summer typing algorithm. Type checking now takes a "desired" type, which is a hint to `Expr`s to resolve themselves as the type, if possible. It it up to the caller to assert that the type returned if correct, if it is actually required.

This commit adds a `TypedExpr` interface which is a superset of Expr but can also perform expression evaluation and can return type annotations.

This change creates the types `NumVal` and `StrVal`, which both implement the new `Constant` interface. The interface is a super-set of `Expr`, but includes extra methods for literal values.

This commit creates a new error type to hold commonly seen errors. It then uses this error in a few previously ambiguous situations.

Previously `(1) + 2` would not have folded to `3`, now it does.

This change pulls the constant related visitors into constant.go and cleans up a few tests.

nvanbenschoten · 2016-05-03T19:38:43Z

Review status: 52 of 55 files reviewed at latest revision, 17 unresolved discussions, all commit checks successful.

sql/parser/eval.go, line 1879 [r30] (raw file):
Done. Here and wherever Eval is unhandled.

Comments from Reviewable

This change creates the types `unaryOpOverload`, `binOpOverload`, and `cmpOpOverload`, each with a lookup method for an overload implementation for a given set of argument types. It also changes `ComparisonOp` to `ComparisonOperator` to stay consistent.

This commit cleans up a number of uses of the new `TypedExpr` interface, improves the type checking system, added new test cases, and adds in a few TODOs for future work.

nvanbenschoten assigned petermattis Apr 27, 2016

nvanbenschoten added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-sql-semantics labels Apr 27, 2016

nvanbenschoten mentioned this pull request Apr 27, 2016

sql: Pull column to parser.Datum conversion into getTypeForColumn #6337

Merged

nvanbenschoten force-pushed the nvanbenschoten/typeCheck branch from e8da1a6 to c3eb7d2 Compare April 27, 2016 23:52

nvanbenschoten force-pushed the nvanbenschoten/typeCheck branch from c3eb7d2 to 1a2d043 Compare April 28, 2016 03:53

knz mentioned this pull request Apr 28, 2016

[WIP] sql: refactor the statement pretty-printer and implement EXPLAIN(TYPES) #6406

Closed

jseldess added the docs-todo label May 2, 2016

nvanbenschoten force-pushed the nvanbenschoten/typeCheck branch from f868aa0 to 80449ec Compare May 2, 2016 20:04

nvanbenschoten force-pushed the nvanbenschoten/typeCheck branch from 80449ec to 30708f2 Compare May 2, 2016 21:04

tbg mentioned this pull request May 3, 2016

sql: unsupported unary operator: - <parameter> #6460

Closed

nvanbenschoten added 6 commits May 3, 2016 15:06

sql: Split notion of pre and post-type checked expression trees

a5522a3

This commit adds a `TypedExpr` interface which is a superset of Expr but can also perform expression evaluation and can return type annotations.

sql: Pull out Constant literal interface and add string-like constants

92a6bbb

This change creates the types `NumVal` and `StrVal`, which both implement the new `Constant` interface. The interface is a super-set of `Expr`, but includes extra methods for literal values.

sql: Create unexpectedTypeError and define a few error situations

516e200

This commit creates a new error type to hold commonly seen errors. It then uses this error in a few previously ambiguous situations.

sql: Numeric constant folding now folds parentheses

1993a61

Previously `(1) + 2` would not have folded to `3`, now it does.

sql: Improve testing and documentation for constantFolderVisitor

b072a61

This change pulls the constant related visitors into constant.go and cleans up a few tests.

nvanbenschoten force-pushed the nvanbenschoten/typeCheck branch from 30708f2 to 6c01a3a Compare May 3, 2016 19:44

sql: Clean up code within the new typing system

ad3d609

This commit cleans up a number of uses of the new `TypedExpr` interface, improves the type checking system, added new test cases, and adds in a few TODOs for future work.

nvanbenschoten force-pushed the nvanbenschoten/typeCheck branch from 6c01a3a to ad3d609 Compare May 3, 2016 19:57

nvanbenschoten merged commit b8f272e into cockroachdb:master May 3, 2016

nvanbenschoten deleted the nvanbenschoten/typeCheck branch May 3, 2016 20:24

RaduBerinde mentioned this pull request May 4, 2016

sql: don't use qname visitor for check constraints #6498

Merged

This was referenced May 16, 2016

sql: investigate using arbitrary precision constants #1795

Closed

sql: type error when specifying decimal precision #4650

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sql: Implement Summer, a smarter typing system #6358

sql: Implement Summer, a smarter typing system #6358

nvanbenschoten commented Apr 27, 2016 •

edited by petermattis

Loading

knz commented Apr 27, 2016

RaduBerinde commented Apr 27, 2016

knz commented Apr 27, 2016

knz commented Apr 27, 2016

knz commented Apr 27, 2016

knz commented Apr 28, 2016

nvanbenschoten commented Apr 28, 2016

knz commented Apr 28, 2016

nvanbenschoten commented Apr 28, 2016

nvanbenschoten commented Apr 28, 2016

nvanbenschoten commented Apr 28, 2016

RaduBerinde commented Apr 28, 2016

knz commented Apr 28, 2016

knz commented Apr 28, 2016

RaduBerinde commented Apr 28, 2016

RaduBerinde commented Apr 28, 2016

knz commented Apr 28, 2016

RaduBerinde commented Apr 28, 2016

RaduBerinde commented Apr 28, 2016

RaduBerinde commented Apr 28, 2016

nvanbenschoten commented Apr 28, 2016 •

edited

Loading

RaduBerinde commented Apr 28, 2016

knz commented Apr 28, 2016

RaduBerinde commented Apr 28, 2016

petermattis commented May 2, 2016

nvanbenschoten commented May 2, 2016

RaduBerinde commented May 2, 2016

nvanbenschoten commented May 2, 2016

RaduBerinde commented May 2, 2016

RaduBerinde commented May 2, 2016

RaduBerinde commented May 2, 2016

nvanbenschoten commented May 2, 2016

knz commented May 3, 2016

nvanbenschoten commented May 3, 2016

sql: Implement Summer, a smarter typing system #6358

sql: Implement Summer, a smarter typing system #6358

Conversation

nvanbenschoten commented Apr 27, 2016 • edited by petermattis Loading

knz commented Apr 27, 2016

RaduBerinde commented Apr 27, 2016

knz commented Apr 27, 2016

knz commented Apr 27, 2016

knz commented Apr 27, 2016

knz commented Apr 28, 2016

nvanbenschoten commented Apr 28, 2016

knz commented Apr 28, 2016

nvanbenschoten commented Apr 28, 2016

nvanbenschoten commented Apr 28, 2016

nvanbenschoten commented Apr 28, 2016

RaduBerinde commented Apr 28, 2016

knz commented Apr 28, 2016

knz commented Apr 28, 2016

RaduBerinde commented Apr 28, 2016

RaduBerinde commented Apr 28, 2016

knz commented Apr 28, 2016

RaduBerinde commented Apr 28, 2016

RaduBerinde commented Apr 28, 2016

RaduBerinde commented Apr 28, 2016

nvanbenschoten commented Apr 28, 2016 • edited Loading

RaduBerinde commented Apr 28, 2016

knz commented Apr 28, 2016

RaduBerinde commented Apr 28, 2016

petermattis commented May 2, 2016

nvanbenschoten commented May 2, 2016

RaduBerinde commented May 2, 2016

nvanbenschoten commented May 2, 2016

RaduBerinde commented May 2, 2016

RaduBerinde commented May 2, 2016

RaduBerinde commented May 2, 2016

nvanbenschoten commented May 2, 2016

knz commented May 3, 2016

nvanbenschoten commented May 3, 2016

nvanbenschoten commented Apr 27, 2016 •

edited by petermattis

Loading

nvanbenschoten commented Apr 28, 2016 •

edited

Loading