Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: extract k/v operations from insert/update/delete #6211

Merged
merged 1 commit into from
Apr 25, 2016

Conversation

danhhz
Copy link
Contributor

@danhhz danhhz commented Apr 21, 2016

Separates the mapping between table rows/indexes and key/values for the
mutative operations. This will be useful immediately for implementing UPSERT and
eventually for TableWriter.

I tried to keep this as pure a refactor as I could, but it was a lot simpler
to model an UPDATE with a PK change as a delete followed by an update (instead
of migrating the mess I committed before).


This change is Reviewable

@danhhz
Copy link
Contributor Author

danhhz commented Apr 21, 2016

cc @knz @RaduBerinde

@knz
Copy link
Contributor

knz commented Apr 21, 2016

I like this refactoring very much! Agreed on modeling update on PK change as delete+insert.


Reviewed 4 of 4 files at r1.
Review status: all files reviewed at latest revision, 1 unresolved discussion, some commit checks failed.


sql/delete.go, line 161 [r1] (raw file):
I'd rather see a more verbose method name here. eg. HasFastPath / FastPathAvailable


Comments from Reviewable

@RaduBerinde
Copy link
Member

Awesome, this is great! I have various comments and I want to make another pass in more detail but the overall structure looks great.

I definitely agree on modeling a PK UPDATE as a delete + insert!


Review status: all files reviewed at latest revision, 14 unresolved discussions, some commit checks failed.


sql/delete.go, line 161 [r1] (raw file):
+1, except it should be unexported, fastPathAvailable


sql/rowwriter.go, line 32 [r1] (raw file):
I think this name is a misnomer. rowWriter doesn't actually write anything, it just helps with creating keys etc. I would name it maybe rowEncoder or rowWriteHelper.


sql/rowwriter.go, line 35 [r1] (raw file):
Which columns are in this map? Only the columns we are providing a value for? Including default values? This should be mentioned in a field comment, and in a comment for makeRowWriter. I am assuming this doesn't contain all the columns in the table, because then I don't see why the require.. functions below are useful.


sql/rowwriter.go, line 94 [r1] (raw file):
why is this exported? would a better name be columnInPK? Maybe add a TODO to move this functionality to TableDescriptor (I filed #6233, you can reference that)


sql/rowwriter.go, line 96 [r1] (raw file):
This doesn't work as you expect with a non-pointer receiver.. The "new" rw is gone when the function ends


sql/rowwriter.go, line 107 [r1] (raw file):
I am not convinced composing makes the most sense here. rowWriter just provides some helper functionality, I would make it a regular field.


sql/rowwriter.go, line 114 [r1] (raw file):
needs a description of what the map contains (in the context of an insert, and w.r.t default values)


sql/rowwriter.go, line 142 [r1] (raw file):
insertRow. The comment should be more accurate about what it's actually doing // insertRow adds to the batch the CPut operations necessary to insert a table row with the given values


sql/rowwriter.go, line 224 [r1] (raw file):
same - map needs a description in the context of update, and the relation between the columns in the map and updateCols


sql/rowwriter.go, line 227 [r1] (raw file):
why are we passing full column descriptors instead of just column IDs (or index in TableDesc.Columns)?


sql/rowwriter.go, line 288 [r1] (raw file):
Here we are making three rowWriters.. Might not matter but seems a bit wasteful. At the very least we could stop making rw and just using rd or ri for that functionality (maybe make rw a pointer to rd.rowWriter). Or do we expect these rowWriters to be different from each other (I haven't looked closely if they handle mutations differently)?


sql/rowwriter.go, line 319 [r1] (raw file):
updateRow. Same as before, the comment should say that we are adding ops to the batch, not actually performing the update.


sql/rowwriter.go, line 437 [r1] (raw file):
what does the map contain? all the columns? If yes why do we take it as an argument instead of computing it in here?


sql/rowwriter.go, line 452 [r1] (raw file):
all these functions should be unexported.

Also, I just noticed the receiver, all these should take a pointer receiver or we may be making full copies of these structs every time! Applies to rowUpdater and rowInserter as well.


Comments from Reviewable

@danhhz
Copy link
Contributor Author

danhhz commented Apr 22, 2016

Thanks for the review! RFAL


Review status: 0 of 4 files reviewed at latest revision, 14 unresolved discussions, some commit checks pending.


sql/delete.go, line 161 [r1] (raw file):
Done.


sql/rowwriter.go, line 32 [r1] (raw file):
Done.


sql/rowwriter.go, line 35 [r1] (raw file):
Ugh. It's really complicated, depends on which operation(s) you're doing, and probably needs a cleanup. I added docs all over the place, but I'm going to clean it up more once I see how upsert affects everything.


sql/rowwriter.go, line 94 [r1] (raw file):
Done.


sql/rowwriter.go, line 96 [r1] (raw file):
Oops! Good catch.


sql/rowwriter.go, line 107 [r1] (raw file):
Fair, I was 50/50 myself.


sql/rowwriter.go, line 114 [r1] (raw file):
Done.


sql/rowwriter.go, line 142 [r1] (raw file):
Done.


sql/rowwriter.go, line 224 [r1] (raw file):
Done.


sql/rowwriter.go, line 227 [r1] (raw file):
That's what was happening before. Trying to keep the refactor minimal, I guess.


sql/rowwriter.go, line 288 [r1] (raw file):
The indexes given to each are different. I want to look into unifying them, but I'm worried that upsert might invalidate anything I do now. Added a TODO.


sql/rowwriter.go, line 319 [r1] (raw file):
Done.


sql/rowwriter.go, line 437 [r1] (raw file):
Added a comment.

I realize I should have been way more clear about the role I saw of this commit. It's mostly setting up the structure and trying to change as little as possible otherwise.

REPLACE is going to build on this, and I'm hoping it'll help me see some parts of this that are still not clear. Which component is responsible for deciding which columns have to be provided (and sharing colIDtoRowIndex) is part of that.


sql/rowwriter.go, line 452 [r1] (raw file):
Done.

(Argh, I was trying to reduce allocations. Copying the struct shouldn't be that expensive, but I forgot it would do that for every row while the alloc only happens once per statement.)


Comments from Reviewable

@RaduBerinde
Copy link
Member

:lgtm:


Reviewed 4 of 4 files at r1, 4 of 4 files at r2.
Review status: all files reviewed at latest revision, 9 unresolved discussions, all commit checks successful.


sql/rowwriter.go, line 288 [r1] (raw file):
Ah, then it's fine, I didn't have a clear understanding of how the column map differs between them.
One thing that we could do is avoid initializing rd and ri if primaryKeyColChange is false (right?)


sql/rowwriter.go, line 437 [r1] (raw file):
Ah, got it. You can ignore some of the comments I made so far about the columns passed to the "constructors"


sql/rowwriter.go, line 452 [r1] (raw file):
Reducing allocations is a bit perpendicular to this. A pointer receiver just means that the function can modify the struct in place. It doesn't imply that you have to call it through a pointer, and it doesn't even necessarily imply that the struct has to be on the heap (though it can cause that to happen). In our case it would have been on the heap anyway because it's all part of some bigger struct that is we allocate (like insertNode).

Anyway, I doubt it would matter.rowHelper already does some internal allocations anyway (for the slices, maps etc) and (as you pointed out) it's a once-per-statement kind of thing.


sql/rowwriter.go, line 55 [r2] (raw file):
It's fine either way bit I wanted to point out that the make* functions could have stayed as they were. There is no requirement that just because a struct has pointer-receiver methods that it has to be addressed through a pointer. It can be a non-pointer field in a struct.


sql/rowwriter.go, line 122 [r2] (raw file):
can be a non-pointer rowHelper if you are not sharing pointers across structs


sql/rowwriter.go, line 129 [r2] (raw file):
newRowInserter


sql/rowwriter.go, line 133 [r2] (raw file):
do we require all non-nullable columns to be part of cols? (i'm assuming yes)


sql/rowwriter.go, line 251 [r2] (raw file):
mentioning the 1:1 relationship between the map and the values is fine but we're still not specifying what columns these are expected to be.. I assume it has to be the union of:

  • all the columns we are actually changing
  • all PK columns
  • all columns that are part of an index
    This needs to be specified somewhere..

Comments from Reviewable

@danhhz danhhz force-pushed the rowwriter branch 3 times, most recently from 8f43eff to 6a75969 Compare April 25, 2016 18:15
Separates the mapping between table rows/indexes and key/values for the
mutative operations. This will be useful immediately for implementing UPSERT and
eventually for TableWriter.

I tried to keep this as pure a refactor as I could, but it was a lot simpler to
model an UPDATE with a PK change as a delete followed by an update (instead of
migrating the mess I committed before).

I'm still not clear which component decides which columns are required, but I
think the additionally complexity that UPSERT/REPLACE adds will make it more
obvious one way or another. Once that happens it should be more obvious how to
handle all these colIDtoRowIndex maps.
@danhhz
Copy link
Contributor Author

danhhz commented Apr 25, 2016

Review status: 0 of 4 files reviewed at latest revision, 9 unresolved discussions, all commit checks successful.


sql/rowwriter.go, line 288 [r1] (raw file):
Good call. Done.


sql/rowwriter.go, line 55 [r2] (raw file):
Got it, I'm still figuring out when I should use which. Just asked David and Nathan and it got the impression that it's more idiomatic for me to switch these back, so I will.


sql/rowwriter.go, line 122 [r2] (raw file):
Done.


sql/rowwriter.go, line 129 [r2] (raw file):
Switched back to make


sql/rowwriter.go, line 133 [r2] (raw file):
There's a separate check in Insert. I could be convinced otherwise, but I think this layer doesn't care about nullability


sql/rowwriter.go, line 251 [r2] (raw file):
Okay, added that information.

This code review has convinced me that the colIDtoRowIndex map is a bit of a maintenance nightmare. I'm going to make a pointed effort to fix it up somehow in the next commit


Comments from Reviewable

@danhhz danhhz merged commit 5f6e9e1 into cockroachdb:master Apr 25, 2016
@danhhz danhhz deleted the rowwriter branch April 25, 2016 19:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants