rfc: hooks to help with cluster version upgrades #9404
Conversation
> that we can delete the function.
> - The `information_schema` tables could be persisted, removing the specialness
>   around leasing them.
I do like this idea but it needs to be spelled out in more detail for the current set of examples you list here. The reduction in complexity is not obvious from reading the doc.
I think this is valuable even if it's not a net reduction in complexity. As I said in #7073, I think that allowing every node to spam `CREATE TABLE IF NOT EXISTS` commands at startup is not the best way to do migrations and will tie our hands when our needs get more complicated. I'd rather designate one node to manage the migration.
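For concreteness, one way to designate a single migration coordinator is a lease acquired with a conditional put and kept alive by periodic heartbeats. The Go sketch below is illustrative only; the key name, record shape, and store interface are assumptions, not CockroachDB's actual API.

```go
package migrations

import (
	"context"
	"time"
)

// leaseKey is a hypothetical system key under which the coordinator lease lives.
const leaseKey = "/System/Migrations/Lease"

// leaseRecord is an assumed shape for the lease value.
type leaseRecord struct {
	Owner      string
	Expiration time.Time
}

// leaseStore is a stand-in for the KV operations the coordinator needs.
type leaseStore interface {
	// Get returns the current lease record, or nil if the key is absent.
	Get(ctx context.Context, key string) (*leaseRecord, error)
	// ConditionalPut writes next only if the current value still equals
	// expected (nil meaning "absent"); it returns false on a mismatch.
	ConditionalPut(ctx context.Context, key string, next, expected *leaseRecord) (bool, error)
}

// acquireLease takes the migration lease if it is unowned or expired. The
// winner runs the outstanding migrations and must keep extending Expiration
// (heartbeating); everyone else waits for the lease to be released or expire.
func acquireLease(ctx context.Context, s leaseStore, owner string, ttl time.Duration) (bool, error) {
	cur, err := s.Get(ctx, leaseKey)
	if err != nil {
		return false, err
	}
	if cur != nil && time.Now().Before(cur.Expiration) {
		return false, nil // another node holds a live lease
	}
	next := &leaseRecord{Owner: owner, Expiration: time.Now().Add(ttl)}
	return s.ConditionalPut(ctx, leaseKey, next, cur)
}
```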
Okay, I reworked the doc quite a bit. Let me know if this is any more clear
Generally LGTM, although it could use more details about the exact leasing process and how the `SystemVersion` table gets bootstrapped.
> migration hook.
>
> A new kv entry `SystemVersion` (ClusterVersion?) is introduced in the system
> keys to keep track of which of these hooks was the last to run. It stores a
In past systems I've found it useful to make the "version" table store a set of strings (i.e. all the migrations that have been applied) instead of a single value. This facilitates development by letting different migrations be merged to the mainline out of order.
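For concreteness, a minimal sketch of that bookkeeping, storing one key per applied migration rather than a single version value. The key prefix and client interface below are assumptions for illustration, not part of the RFC.

```go
package migrations

import "context"

// appliedPrefix is a hypothetical key prefix; each applied migration gets its
// own entry under it instead of bumping one SystemVersion value.
const appliedPrefix = "/System/Migrations/Applied/"

// kvClient is a stand-in for whatever KV API the coordinator ends up using.
type kvClient interface {
	Get(ctx context.Context, key string) ([]byte, error)
	Put(ctx context.Context, key string, value []byte) error
}

// isApplied reports whether the named migration has already run on this cluster.
func isApplied(ctx context.Context, db kvClient, name string) (bool, error) {
	v, err := db.Get(ctx, appliedPrefix+name)
	if err != nil {
		return false, err
	}
	return v != nil, nil
}

// markApplied records a migration as done. Because each migration has its own
// key, independent migrations can merge to mainline (and run) in any order.
func markApplied(ctx context.Context, db kvClient, name string) error {
	return db.Put(ctx, appliedPrefix+name, []byte("applied"))
}
```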
Agreed. I was wary of overengineering, but since you're concerned about this too, I added something to address it. (Plus I tweaked it a bit so you don't have to know the cockroach version your code is going into, which might not always be straightforward.)
> to run the `./cockroach cluster-upgrade` cli command using a binary of the
> version being upgraded to. For ease of deployment, rolling one node will also do
> this upgrade before its normal startup. Any nodes rolled after the first, but
> before the cluster is upgraded, will panic to make it very clear the roll is
This seems backwards to me. It's easier for the new code to be compatible with the old data than for the migration to be performed while the old code is still running. I would prefer to roll out new code everywhere (in backwards-compatibility mode) and then perform the update once there are no more old nodes.
That was my gut feeling too, but if you're adding a system table, then you have to handle it not existing everywhere and there's no problem adding it with old code. I suspect many of our uses will fall into this camp.
And when you need the intermediate state, you can always do two migrations.
Unless you're going to continue hard-coding descriptors for these dynamically-created tables, you have to handle the case that the table exists but hasn't been loaded. So I think you could have a single piece of code that loads the table descriptor and creates it (or waits for someone else to do so) if it doesn't exist yet.
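A sketch of the single code path being suggested here, under assumed names and a placeholder catalog interface (this is not the actual CockroachDB descriptor API):

```go
package migrations

import (
	"context"
	"errors"
)

// TableDescriptor is a placeholder for the real descriptor type.
type TableDescriptor struct {
	Name string
}

// errDescriptorNotFound is what the placeholder catalog returns for a missing table.
var errDescriptorNotFound = errors.New("descriptor not found")

// catalog abstracts the two operations the helper needs.
type catalog interface {
	LookupDescriptor(ctx context.Context, name string) (*TableDescriptor, error)
	CreateTable(ctx context.Context, schema string) error
}

// ensureTable loads the descriptor for name, first creating the table if it
// does not exist yet, so callers never need to care whether the migration
// that adds the table has already run on this cluster.
func ensureTable(ctx context.Context, c catalog, name, schema string) (*TableDescriptor, error) {
	desc, err := c.LookupDescriptor(ctx, name)
	if err == nil {
		return desc, nil
	}
	if !errors.Is(err, errDescriptorNotFound) {
		return nil, err
	}
	// The table is missing: create it (CREATE TABLE IF NOT EXISTS semantics keep
	// this safe if another node races us), then load the descriptor it produced.
	if err := c.CreateTable(ctx, schema); err != nil {
		return nil, err
	}
	return c.LookupDescriptor(ctx, name)
}
```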
Force-pushed 58485d7 to b005638.
> amount of work involved in this can be kept bounded using the `/SystemVersion`
> described below). If not, it runs them via a migration coordinator. (We should
> also do this check when a node rejoins a cluster it's been partitioned from, but
> I'm not sure how to do that.)
Why? As long as the migration process completed, we know we won't regress. It may be worth repeating the check periodically to see if the node that started the migration process died without finishing it, but I don't see any reason to tie that to rejoining after a partition.
> migration hook.
>
> Migration hooks consist of a name, a work function, and optionally a minimum
> version. In the steady state, there is a map in code between migration names and
I'm not sure that we need this map; I doubt we'll ever have enough migrations here that it's worthwhile to roll them up into a single value. Maintaining this map seems like it would be as much of a headache as giving all the migrations a version number instead of a name.
> The work function must be idempotent (in case a migration coordinator crashes)
> and runnable on some previous version of the cluster. If there is some minimum
> cluster version it needs, that is checked against the one written to
I would define the migrations in an ordered list in the code, and then always execute them in that order (skipping any that are already applied).
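To illustrate the ordered-list alternative, a rough sketch follows; the names, fields, and the way the applied set is persisted are all placeholders, not the RFC's actual definitions.

```go
package migrations

import "context"

// migration is roughly the hook shape the RFC describes: a name, an optional
// minimum cluster version, and an idempotent work function (the coordinator
// may crash and re-run it).
type migration struct {
	name       string
	minVersion string // empty means no minimum cluster version required
	workFn     func(ctx context.Context) error
}

// registry is append-only: new migrations are added at the end, so every
// cluster replays outstanding migrations in the same order.
var registry = []migration{
	{name: "create-system-jobs-table", workFn: createJobsTable},
	{name: "add-root-user", workFn: addRootUser},
}

// runOutstanding walks the registry in order, skipping migrations already
// recorded as applied and recording each new one as it completes.
func runOutstanding(ctx context.Context, applied map[string]bool) error {
	for _, m := range registry {
		if applied[m.name] {
			continue
		}
		if err := m.workFn(ctx); err != nil {
			return err
		}
		applied[m.name] = true // a real implementation persists this durably
	}
	return nil
}

// Placeholder work functions standing in for real migrations.
func createJobsTable(ctx context.Context) error { return nil }
func addRootUser(ctx context.Context) error     { return nil }
```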
> The migration name to version map can be used to order migrations when a cluster
> needs to catch up many versions.
>
> We will introduce a thin cli wrapper `./cockroach cluster-upgrade` around
It sounds like you're talking about running `cluster-upgrade` with a new binary while the cluster is still running an old version. This sounds like it could be tricky from an operational logistics perspective. It's also troubling from a security perspective since this command would presumably need an all-powerful node certificate instead of a user certificate.
> production scenarios. For ease of small clusters and local development, rolling
> one node will also do this upgrade before its normal startup. Any nodes rolled
> after the first, but before the cluster is upgraded, will panic to make it very
> clear the roll is unhealthy.
This seems dangerous. How does a node tell that it is not the first? (what if the first crashes?) What is the recovery procedure if you've rolled out the new binary everywhere and every node but one has crashed? We don't generally guarantee the ability to roll back to an older version.
> Altering system tables will require more than one migration. For example, we'd
> like to remove the reference to `experimental_unique_bytes` from the
> `system.eventlog` table so we can delete the function. A migration is run to add
> a new column with a new name and the code in this Cockroach version sets both on
Why a new column? Don't we just need to change the default value to `uuid_v4`? Or are you talking about changing it to an int?
> `system.eventlog` table so we can delete the function. A migration is run to add
> a new column with a new name and the code in this Cockroach version sets both on
> mutations and fetches both (preferring the new one) on reads. Once the entire
> cluster is on this version, another code change stops reading and writing the
Walk through the entire process of a multi-step update like this from the operator's perspective. Is it `cluster-upgrade` (add column), rollout binary version N (read and write both), `cluster-upgrade` (backfill new column), rollout binary version N+1 (stop using old column), `cluster-upgrade` (delete old column), rollout binary version N+2 (maybe not necessary)?
That sounds terrible; we really don't want to put our users through a process like that. It's definitely not worth it to delete a tiny experimental function, although I assume we'll eventually have better reasons for making a similarly complex change. I think we need to be able to tolerate intermediate states so this can look like a single step (from the user's perspective).
We don't need to solve this part now; it's more important that we get something that works for the simple cases. But this is part of why I still think it would be better to roll out the new binary before starting the migration process.
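For illustration, the two migrations in that sequence might look roughly like the following; the column names, SQL statements, and executor interface are assumptions for this sketch, not the actual schema change.

```go
package migrations

import "context"

// sqlExecutor is a stand-in for whatever internal SQL executor migrations use.
type sqlExecutor interface {
	Exec(ctx context.Context, stmt string) error
}

// Step 1: run before the binary that writes both columns is rolled out. It
// adds the replacement column; that binary then writes both columns and
// prefers the new one on reads.
func addEventlogReplacementColumn(ctx context.Context, ex sqlExecutor) error {
	return ex.Exec(ctx,
		`ALTER TABLE system.eventlog ADD COLUMN "uniqueIDv2" BYTES DEFAULT uuid_v4()`)
}

// Step 2: run only once every node is on the later binary that no longer
// reads or writes the old column; it drops the column and with it the last
// reference to experimental_unique_bytes.
func dropEventlogLegacyColumn(ctx context.Context, ex sqlExecutor) error {
	return ex.Exec(ctx,
		`ALTER TABLE system.eventlog DROP COLUMN "uniqueID"`)
}
```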
> cluster is on this version, another code change stops reading and writing the
> old value. Finally, a second mutation is run to remove the old column. This
> could be done in fewer steps if SQL had something like the "unknown fields"
> support in Protocol Buffers.
It might be worth considering a SQL extension for this (or adopting one if another database has done it). This is a problem for user schema changes too.
Reviewed 1 of 1 files at r2. docs/RFCS/cluster_upgrade_tool.md, line 38 at r2 (raw file):
Is this a system-wide kv entry or a node-local kv entry? If the former, then how are upgrades supposed to be coordinated across multiple nodes? docs/RFCS/cluster_upgrade_tool.md, line 46 at r2 (raw file):
|
Review status: all files reviewed at latest revision, 11 unresolved discussions, all commit checks successful. docs/RFCS/cluster_upgrade_tool.md, line 61 at r2 (raw file):
|
Review status: all files reviewed at latest revision, 14 unresolved discussions, all commit checks successful. docs/RFCS/cluster_upgrade_tool.md, line 30 at r2 (raw file):
Would be good to be able to get a list of the migrations that are required to upgrade to a version. Should be easy to provide with the info available. docs/RFCS/cluster_upgrade_tool.md, line 55 at r2 (raw file):
Ah, per my earlier comment, there should be a docs/RFCS/cluster_upgrade_tool.md, line 61 at r2 (raw file):
|
@a-robinson I'm assuming you're taking over this PR? |
Correct. I've been tied up with some lingering views complexities, but I hope to have some amendments sent out for this by Monday. |
Sorry for the delay, @vivekmenezes! PTAL, I've expanded on the original RFC with a focus on getting something done in the short term while not limiting our longer term options. Review status: 0 of 1 files reviewed at latest revision, 14 unresolved discussions, some commit checks pending. docs/RFCS/cluster_upgrade_tool.md, line 46 at r2 (raw file):
|
Review status: 0 of 1 files reviewed at latest revision, 14 unresolved discussions, some commit checks failed. docs/RFCS/cluster_upgrade_tool.md, line 46 at r2 (raw file):
|
(but that's expected from me). I particularly like the way you've handled the short-term design while laying out how it plays into the options for later plans. Please wait for @bdarnell to be happy before merging. Review status: 0 of 1 files reviewed at latest revision, 14 unresolved discussions, some commit checks failed. Comments from Reviewable |
> * Add a `system.jobs` table (#7073)
> * Add root user and authentication to the `system.users` table (#9877)
> * (maybe) Remove the reference to `experimental_unique_bytes` from the
I think this is a "definitely", not a "maybe".
> the hook version.
>
> Our most pressing initial motivations fall into the first model, so it will be
> the focus of this RFC.
This model doesn't handle "changing the encoding of the `RaftAppliedIndexKey` and `LeaseAppliedIndexKey` keys" or "switching from nanoseconds to microseconds for storing timestamps and intervals". I think that's fine, but perhaps worth calling out that the types of migrations this enables are mostly constrained to system tables and their schemas.
Review status: 0 of 1 files reviewed at latest revision, 16 unresolved discussions, some commit checks failed. docs/RFCS/cluster_upgrade_tool.md, line 24 at r3 (raw file):
|
Force-pushed ceb571d to 3c5eec1.
Review status: 0 of 1 files reviewed at latest revision, 18 unresolved discussions, some commit checks failed. docs/RFCS/cluster_upgrade_tool.md, line 24 at r3 (raw file):
|
Review status: 0 of 1 files reviewed at latest revision, 19 unresolved discussions, some commit checks failed. docs/RFCS/cluster_upgrade_tool.md, line 75 at r3 (raw file):
|
Thanks for the reviews! Review status: 0 of 1 files reviewed at latest revision, 18 unresolved discussions, some commit checks failed. docs/RFCS/cluster_upgrade_tool.md, line 24 at r3 (raw file):
|
Force-pushed 3c5eec1 to 05ade6f.
Review status: 0 of 1 files reviewed at latest revision, 18 unresolved discussions, some commit checks failed. docs/RFCS/cluster_upgrade_tool.md, line 208 at r4 (raw file):
|
Force-pushed 05ade6f to bd426ec.
Review status: 0 of 1 files reviewed at latest revision, 18 unresolved discussions. docs/RFCS/cluster_upgrade_tool.md, line 208 at r4 (raw file):
|
Force-pushed bd426ec to 25950e0.
Rebased to appease teamcity |
Force-pushed 25950e0 to dde63d2.
Force-pushed dde63d2 to 6bf9050.
Merged the two commits together at Tamir's suggestion. |
Review status: 0 of 1 files reviewed at latest revision, 19 unresolved discussions, some commit checks pending. docs/RFCS/cluster_upgrade_tool.md, line 151 at r5 (raw file):
Does the admin need to look at the release notes? Can you imagine an implementation where you point the new version at the current cluster and it runs a diff and suggests commands that need to run before the cluster can be upgraded? Comments from Reviewable |
Review status: 0 of 1 files reviewed at latest revision, 19 unresolved discussions, some commit checks pending. docs/RFCS/cluster_upgrade_tool.md, line 208 at r4 (raw file):
|
Review status: 0 of 1 files reviewed at latest revision, 19 unresolved discussions, some commit checks pending. docs/RFCS/cluster_upgrade_tool.md, line 151 at r5 (raw file):
|
Review status: 0 of 1 files reviewed at latest revision, 19 unresolved discussions, all commit checks successful. docs/RFCS/cluster_upgrade_tool.md, line 208 at r4 (raw file):
|
Review status: 0 of 1 files reviewed at latest revision, 19 unresolved discussions, all commit checks successful. docs/RFCS/cluster_upgrade_tool.md, line 208 at r4 (raw file):
|
Reviewed 1 of 1 files at r5. docs/RFCS/cluster_upgrade_tool.md, line 20 at r5 (raw file):
heh, this heading is weird. Examples of what? Maybe it's clear when rendered (since this is a subhead of Motivation). Feel free to push back. docs/RFCS/cluster_upgrade_tool.md, line 48 at r5 (raw file):
"in terms which" -- something off about this sentence. docs/RFCS/cluster_upgrade_tool.md, line 59 at r5 (raw file):
Can we enforce this using the same mechanism as schema leases? Something to think about. docs/RFCS/cluster_upgrade_tool.md, line 82 at r5 (raw file):
nit: either backward-{in,}compatible, or backwards-{in,}compatible, but not both (the difference is the docs/RFCS/cluster_upgrade_tool.md, line 97 at r5 (raw file):
rather than "heartbeating a kv entry", can we describe this as a "migration lease"? I'm imagining the machinery to be virtually identical to range leases. docs/RFCS/cluster_upgrade_tool.md, line 101 at r5 (raw file):
"each missing migration" is mentioned twice in this sentence. docs/RFCS/cluster_upgrade_tool.md, line 105 at r5 (raw file):
Should be easy enough to smoke-test this in the migration coordinator tests (i.e. before any migrations have been added). docs/RFCS/cluster_upgrade_tool.md, line 115 at r5 (raw file):
s/created/creates/? docs/RFCS/cluster_upgrade_tool.md, line 142 at r5 (raw file):
s/and which/and the/ docs/RFCS/cluster_upgrade_tool.md, line 145 at r5 (raw file):
also might be helped by using a "version lease" (same mechanism as schema leases) docs/RFCS/cluster_upgrade_tool.md, line 168 at r5 (raw file):
nit:s/Cockroach\b/CockroachDB/g docs/RFCS/cluster_upgrade_tool.md, line 170 at r5 (raw file):
How do you determine you're a one-node cluster? docs/RFCS/cluster_upgrade_tool.md, line 172 at r5 (raw file):
s/startup, as/startup as/ docs/RFCS/cluster_upgrade_tool.md, line 226 at r5 (raw file):
s/short-term,/,/, s/backwards /backwards-/ docs/RFCS/cluster_upgrade_tool.md, line 233 at r5 (raw file):
"version leases" may be relevant here as well. docs/RFCS/cluster_upgrade_tool.md, line 256 at r5 (raw file):
s/or/of/ docs/RFCS/cluster_upgrade_tool.md, line 258 at r5 (raw file):
"version leases" again docs/RFCS/cluster_upgrade_tool.md, line 295 at r5 (raw file):
couple of docs/RFCS/cluster_upgrade_tool.md, line 300 at r5 (raw file):
trailing space, i think? Comments from Reviewable |
6bf9050
to
f475049
Compare
Thanks for the review! Review status: 0 of 1 files reviewed at latest revision, 38 unresolved discussions. docs/RFCS/cluster_upgrade_tool.md, line 20 at r5 (raw file):
|
Reviewed 1 of 1 files at r6. docs/RFCS/cluster_upgrade_tool.md, line 170 at r5 (raw file):
|
f475049
to
c39a9e3
Compare
Review status: all files reviewed at latest revision, 29 unresolved discussions, all commit checks successful. docs/RFCS/cluster_upgrade_tool.md, line 100 at r6 (raw file):
|
Reviewed 1 of 1 files at r7. Comments from Reviewable |
@vivekmenezes This is an idea I had that might help with #7073, among other things. WDYT?