Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

spec: Add efficient volume cloning support #244

Merged

Conversation

Akrog
Copy link
Contributor

@Akrog Akrog commented May 31, 2018

With the new CSI snapshot feature it's now possible to create volumes
from a snapshots. This may lead to users abusing it for unintended
functionality, such as having a golden image to source new volumes.

Most storage backends support efficient volume cloning, so it would be
benefitial to expose this functionality in the CSI spec.

By supporting cloning we avoid users creating a volume, and then a
snapshot of that volume, just so it can be used as the source for new
volumes.

The VolumeContentSource, that was added by the snapshots feature, will
be used for this feature. New VolumeSource message is added to the
VolumeContentSource.

spec.md Outdated
@@ -602,6 +602,12 @@ This RPC will be called by the CO to provision a new volume on behalf of a user
This operation MUST be idempotent.
If a volume corresponding to the specified volume `name` already exists and is compatible with the specified `capacity_range`, `volume_capabilities` and `parameters` in the `CreateVolumeRequest`, the Plugin MUST reply `0 OK` with the corresponding `CreateVolumeResponse`.

Plugins MAY create 3 types of volumes:

- Empty volumes. When plugin supports `CREATE_DELETE_VOLUME` capability.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: the next two are "optional capability" but this is simply "capability" -- but it's also optional. why the difference?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad, I'll update it.

spec.md Outdated
Plugins MAY create 3 types of volumes:

- Empty volumes. When plugin supports `CREATE_DELETE_VOLUME` capability.
- From an existing snapshot. When plugin supports `CREATE_DELETE_SNAPSHOT` optional capability .
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: extra space prior to the period

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

spec.md Outdated
@@ -796,6 +810,8 @@ The CO MUST implement the specified error recovery behavior when it encounters t
| Operation pending for volume | 10 ABORTED | Indicates that there is a already an operation pending for the specified volume. In general the Cluster Orchestrator (CO) is responsible for ensuring that there is no more than one call "in-flight" per volume at a given time. However, in some circumstances, the CO MAY lose state (for example when the CO crashes and restarts), and MAY issue multiple calls simultaneously for the same volume. The Plugin, SHOULD handle this as gracefully as possible, and MAY return this error code to reject secondary calls. | Caller SHOULD ensure that there are no other calls pending for the specified volume, and then retry with exponential back off. |
| Unsupported `capacity_range` | 11 OUT_OF_RANGE | Indicates that the capacity range is not allowed by the Plugin, for example when trying to create a volume smaller than the source snapshot. More human-readable information MAY be provided in the gRPC `status.message` field. | Caller MUST fix the capacity range before retrying. |
| Call not implemented | 12 UNIMPLEMENTED | CreateVolume call is not implemented by the plugin or disabled in the Plugin's current mode of operation. | Caller MUST NOT retry. Caller MAY call `ControllerGetCapabilities` or `NodeGetCapabilities` to discover Plugin capabilities. |
| Source incompatible or not supported | 3 INVALID_ARGUMENT | Besides the general cases, this code MUST also be used to indicate when plugin supporting CREATE_DELETE_VOLUME cannot create a volume from the requested source (`SnapshotSource` or `VolumeSource`). Failure may be caused by not supporting the source (CO shouldn't have provided that source) or incompatibility between `parameters` from the source and the ones requested for the new volume. More human-readable information SHOULD be provided in the gRPC `status.message` field if the problem is the source. | On source related issues, caller MUST use different parameters, a different source, or no source at all. |
| Source does not exist | 5 NOT_FOUND | Indicates that the source specified exist. | Caller MUST verify that the `volume_content_source` is correct, the source is accessible, and has not been deleted, before retrying with exponential back off. |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo? suggest: "Indicates that the specified source does not exist."

Also, the comma before "before" seems strange.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

spec.md Outdated
@@ -1125,6 +1141,11 @@ message ControllerServiceCapability {
// with the snapshot_id as the filter to query whether the
// uploading process is complete or not.
LIST_SNAPSHOTS = 6;
// Plugins supporting efficient volume cloning MAY report this
// capability. Source volume must be managed by the same plugin.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we only use a single space between sentences.

also, suggest "A source volume..."

Copy link
Contributor Author

@Akrog Akrog May 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I've been writing too much on RST lately and I use double space there for readability.

spec.md Outdated
@@ -1125,6 +1141,11 @@ message ControllerServiceCapability {
// with the snapshot_id as the filter to query whether the
// uploading process is complete or not.
LIST_SNAPSHOTS = 6;
// Plugins supporting efficient volume cloning MAY report this
// capability. Source volume must be managed by the same plugin.
// Not all volume sources and parameters combinations MAY work.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MAY indicates a recommendation, or "soft" requirement - does it make sense in this context? or would "may" (non-RFC verbage) suffice?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Normal may will suffice in my opinion. Changing it.

spec.md Outdated
// Plugins supporting efficient volume cloning MAY report this
// capability. Source volume must be managed by the same plugin.
// Not all volume sources and parameters combinations MAY work.
// CLONE_VOLUME is NOT REQUIRED.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"... is NOT REQUIRED" seems redundant. It's a capability field because it's optional. I don't think there's a need to state that here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, I just copied from the CREATE_DELETE_SNAPSHOT in case this was something that should be done in the specs. Removing it.

spec.md Outdated

- Empty volumes. When plugin supports `CREATE_DELETE_VOLUME` capability.
- From an existing snapshot. When plugin supports `CREATE_DELETE_SNAPSHOT` optional capability .
- From an existing volume. When plugin supports efficient cloning, and reports the optional capability `CLONE_VOLUME`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like an implementation detail. Should we support naive copies and leave it up to the plugin to handle it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to indicate that the idea is for efficient cloning. Not an attach vol1, attach vol2, then use dd.
Plugins can ignore it and internally do that, that's why I didn't add any MUST or REQUIRED.
But plugins using the host assisted cloning mechanism (dd) would give terrible performance and in my opinion would be frustrating.
Do you think I should remove the "efficient" adjective?

@jdef
Copy link
Member

jdef commented May 31, 2018

Are any COs currently in a position to benefit from this?

@cpuguy83
Copy link
Contributor

Not here, but it's something we've discussed. I think it makes sense to support this at the CSI level at least.

@Akrog
Copy link
Contributor Author

Akrog commented Jun 6, 2018

Looks like some non CSI providers are currently implementing this via PVC annotations: kubernetes/cloud-provider-openstack#183

@aglitke
Copy link

aglitke commented Jun 11, 2018

In addition to openstack/cinder, I am aware of at least two other kubernetes storage providers which have some implementation: Gluster and NetApp Trident. On the kubernetes side we are in early discussions with the storage-SIG on how to standardize on an approach. Once this is done, kubernetes would be able to support this API by converting PVC annotations into a volume_source.

@j-griffith
Copy link

@Akrog interested in dusting this off and seeing if we can get it merged? I'm happy to help out if you like. We should be able to just model the create-from-snapshot efforts exactly like you're doing and make the change pretty minimal.

@Akrog
Copy link
Contributor Author

Akrog commented Aug 20, 2018

@j-griffith Yes, I'm interested in dusting this one off. I hadn't noticed the patch was in conflict, fixed now.

spec.md Outdated

- Empty volumes. When plugin supports `CREATE_DELETE_VOLUME` optional capability.
- From an existing snapshot. When plugin supports `CREATE_DELETE_SNAPSHOT` optional capability.
- From an existing volume. When plugin supports efficient cloning, and reports the optional capability `CLONE_VOLUME`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: why is "efficient" a prerequisite for this capability? either a plugin has the cap or it doesn't.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A plugin could do host assisted "cloning" by creating a new volume, attaching both volumes, and dd-ing contents from source to destination.

I don't think it would be a good idea mixing both concepts (fast storage backend cloning and slow host assisted cloning) under the same capability. That would probably lead to unfulfilled expectations from the caller side.

We could add two capabilities instead of one, CLONE_VOLUME and EFFICIENT_CLONING.
That way plugins for backends supporting efficient cloning would report them both, and those doing host assisted cloning would just report the CLONE_VOLUME capability. But my initial though is that this would be more confusing/troublesome.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I think that in discussions we agreed that we didn't need/want to specify here and wanted to keep this generic in terms of the implementation with the caveat that if we need to we could consider additions for host assisted type cloning in the future? If folks agree it'd be great to remove the specifics here and see if we can move forward with getting this merged.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I removed "efficient" from here, and added a comment below on the attaching and copying as agreed in the meeting.

Copy link

@j-griffith j-griffith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like if we remove the specifics regarding different implementations we can get this going again?

spec.md Outdated

- Empty volumes. When plugin supports `CREATE_DELETE_VOLUME` optional capability.
- From an existing snapshot. When plugin supports `CREATE_DELETE_SNAPSHOT` optional capability.
- From an existing volume. When plugin supports efficient cloning, and reports the optional capability `CLONE_VOLUME`.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I think that in discussions we agreed that we didn't need/want to specify here and wanted to keep this generic in terms of the implementation with the caveat that if we need to we could consider additions for host assisted type cloning in the future? If folks agree it'd be great to remove the specifics here and see if we can move forward with getting this merged.

spec.md Outdated
@@ -1009,6 +1023,8 @@ The CO MUST implement the specified error recovery behavior when it encounters t
| Operation pending for volume | 10 ABORTED | Indicates that there is a already an operation pending for the specified volume. In general the Cluster Orchestrator (CO) is responsible for ensuring that there is no more than one call "in-flight" per volume at a given time. However, in some circumstances, the CO MAY lose state (for example when the CO crashes and restarts), and MAY issue multiple calls simultaneously for the same volume. The Plugin, SHOULD handle this as gracefully as possible, and MAY return this error code to reject secondary calls. | Caller SHOULD ensure that there are no other calls pending for the specified volume, and then retry with exponential back off. |
| Unsupported `capacity_range` | 11 OUT_OF_RANGE | Indicates that the capacity range is not allowed by the Plugin, for example when trying to create a volume smaller than the source snapshot. More human-readable information MAY be provided in the gRPC `status.message` field. | Caller MUST fix the capacity range before retrying. |
| Call not implemented | 12 UNIMPLEMENTED | CreateVolume call is not implemented by the plugin or disabled in the Plugin's current mode of operation. | Caller MUST NOT retry. Caller MAY call `ControllerGetCapabilities` or `NodeGetCapabilities` to discover Plugin capabilities. |
| Source incompatible or not supported | 3 INVALID_ARGUMENT | Besides the general cases, this code MUST also be used to indicate when plugin supporting CREATE_DELETE_VOLUME cannot create a volume from the requested source (`SnapshotSource` or `VolumeSource`). Failure may be caused by not supporting the source (CO shouldn't have provided that source) or incompatibility between `parameters` from the source and the ones requested for the new volume. More human-readable information SHOULD be provided in the gRPC `status.message` field if the problem is the source. | On source related issues, caller MUST use different parameters, a different source, or no source at all. |
| Source does not exist | 5 NOT_FOUND | Indicates that the specified source does not exist. | Caller MUST verify that the `volume_content_source` is correct, the source is accessible, and has not been deleted before retrying with exponential back off. |

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can leverage this to solve the potential of some devices only allowing RO volumes to be cloned by @humblec on the Kubernetes implementation side. IE even if we don't add caveats like RO or optimized etc to the spec we can respond with a reasonable error message if that's a requirement and it's not satisfied. In other words this LGTM and provides what we'd need.

@saad-ali saad-ali added this to the v1.0 milestone Nov 7, 2018
spec.md Outdated
@@ -1012,6 +1026,8 @@ The CO MUST implement the specified error recovery behavior when it encounters t
| Unable to provision in `accessible_topology` | 8 RESOURCE_EXHAUSTED | Indicates that although the `accessible_topology` field is valid, a new volume can not be provisioned with the specified topology constraints. More human-readable information MAY be provided in the gRPC `status.message` field. | Caller MUST ensure that whatever is preventing volumes from being provisioned in the specified location (e.g. quota issues) is addressed before retrying with exponential backoff. |
| Unsupported `capacity_range` | 11 OUT_OF_RANGE | Indicates that the capacity range is not allowed by the Plugin, for example when trying to create a volume smaller than the source snapshot. More human-readable information MAY be provided in the gRPC `status.message` field. | Caller MUST fix the capacity range before retrying. |
| Call not implemented | 12 UNIMPLEMENTED | CreateVolume call is not implemented by the plugin or disabled in the Plugin's current mode of operation. | Caller MUST NOT retry. Caller MAY call `ControllerGetCapabilities` or `NodeGetCapabilities` to discover Plugin capabilities. |
| Source incompatible or not supported | 3 INVALID_ARGUMENT | Besides the general cases, this code MUST also be used to indicate when plugin supporting CREATE_DELETE_VOLUME cannot create a volume from the requested source (`SnapshotSource` or `VolumeSource`). Failure may be caused by not supporting the source (CO shouldn't have provided that source) or incompatibility between `parameters` from the source and the ones requested for the new volume. More human-readable information SHOULD be provided in the gRPC `status.message` field if the problem is the source. | On source related issues, caller MUST use different parameters, a different source, or no source at all. |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: please re-order the new error row entries here so that the error codes flow from lowest to highest

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I failed to notice that errors were ordered.

spec.md Outdated
Plugins MAY create 3 types of volumes:

- Empty volumes. When plugin supports `CREATE_DELETE_VOLUME` optional capability.
- From an existing snapshot. When plugin supports `CREATE_DELETE_SNAPSHOT` optional capability.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When plugin supports CREATE_DELETE_VOLUME and CREATE_DELETE_SNAPSHOT optional capabilities.

spec.md Outdated

- Empty volumes. When plugin supports `CREATE_DELETE_VOLUME` optional capability.
- From an existing snapshot. When plugin supports `CREATE_DELETE_SNAPSHOT` optional capability.
- From an existing volume. When plugin supports cloning, and reports the optional capability `CLONE_VOLUME`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When plugin supports cloning, and reports the optional capabilities CREATE_DELETE_VOLUME and CLONE_VOLUME.

spec.md Outdated
@@ -691,8 +697,16 @@ message VolumeContentSource {
string id = 1;
}

message VolumeSource {
// Contains identity information for the existing source volume.
// This field is REQUIRED. Plugins reporting CLONE_VOLUME
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: be consistent about one or two spaces after period.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry about that. I always leave two, but I know in this spec one is preferred, so I try to, but muscle memory sometimes gets the best of me. ;-)

spec.md Outdated
// combinations may work. Plugins MUST NOT implement this feature
// by attaching 2 volumes and copying the content, but COs may
// implement it that way for plugins that don't support this
// feature.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should just drop:

      // Plugins MUST NOT implement this feature
      // by attaching 2 volumes and copying the content, but COs may
      // implement it that way for plugins that don't support this
      // feature.

Seems like an implementation detail. Good plugins won't do it, we don't need to spell it out.

Copy link
Contributor Author

@Akrog Akrog Nov 8, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll remove it, but after the "efficient cloning" discussion on this PR, we decided in last weeks meeting (if I understood it right) to spell it out in the comments to ensure that plugins and COs were clear on this point, and users could expect a consistent behavior across plugins: cloning is a fast operation.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll remove it, but after the "efficient cloning" discussion on this PR, we decided in last weeks meeting (if I understood it right) to spell it out in the comments to ensure that plugins and COs were clear on this point, and users could expect a consistent behavior across plugins: cloning is a fast operation.

We did talk about some comments but I think having it in the form of "MUST NOT" is too strong here, and it is an implementation detail so if someone wants to do it then hey, that's up to them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We did talk about some comments but I think having it in the form of "MUST NOT" is too strong here, and it is an implementation detail so if someone wants to do it then hey, that's up to them.

That is the situation I wanted to avoid, unmet expectations of clone on some plugins/backends. If a backend doesn't support efficient cloning and they report they do, then they are not following the specs, and users can call them out on it.

But maybe it's not such a big deal to people as I'm making it to be.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Meh... I don't know. Anyway, LGTM; I'm still not a contributor so you just get my moral support in the form of a +1

Copy link
Member

@saad-ali saad-ali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@saad-ali
Copy link
Member

saad-ali commented Nov 8, 2018

Please rebase and we will merge this.

Copy link
Member

@jieyu jieyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. please rebase

@xing-yang
Copy link
Contributor

LGTM but need a rebase.

With the new CSI snapshot feature it's now possible to create volumes
from a snapshots.  This may lead to users abusing it for unintended
functionality, such as having a golden image to source new volumes.

Most storage backends support efficient volume cloning, so it would be
benefitial to expose this functionality in the CSI spec.

By supporting cloning we avoid users creating a volume, and then a
snapshot of that volume, just so it can be used as the source for new
volumes.

The VolumeContentSource, that was added by the snapshots feature, will
be used for this feature.  New VolumeSource message is added to the
VolumeContentSource.
@j-griffith
Copy link

LGTM

1 similar comment
@xing-yang
Copy link
Contributor

LGTM

@jieyu jieyu merged commit 7a8d2d5 into container-storage-interface:master Nov 9, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants