Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

build cache format in registry is incorrect #173

Closed
manishtomar opened this issue Oct 23, 2019 · 22 comments
Closed

build cache format in registry is incorrect #173

manishtomar opened this issue Oct 23, 2019 · 22 comments

Comments

@manishtomar
Copy link

manishtomar commented Oct 23, 2019

Following is an example of cache pushed to hub registry using --cache-to=type=registry flag:

{
  "schemaVersion": 2,
  "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
  "manifests": [
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "digest": "sha256:0057xxx",
      "size": 24532,
      "annotations": {
        "buildkit/createdat": "2019-10-21T12:27:45.739079211Z",
        "containerd.io/uncompressed": "sha256:b23fxxx"
      }
    },
   ...
}

The media type inside the manifests array should be either application/vnd.docker.distribution.manifest.v2+json or application/vnd.docker.distribution.manifest.v1+json according to the docker manifest list spec. Moreover while it is ok to use annotations with docker images since it is backward compatible the right place to use them should be OCI index/images. cc @dmcgowan

This is reported in docker/hub-feedback#1906.

@thaJeztah
Copy link
Member

ping @tonistiigi ptal

@xinfengliu
Copy link

A customer running DTR 2.7.x ran into a similar issue:

docker@dtr1:~/tmp$ docker buildx bake -f bb-build.yml  --set bb.cache-to=192.168.105.44/admin/test-buildx:cache --set bb.output=type=registry,push=true
[+] Building 4.3s (6/6) FINISHED
 => [internal] load build definition from test-buildx.dockerfile                                                               0.0s
 => => transferring dockerfile: 43B                                                                                            0.0s
 => [internal] load .dockerignore                                                                                              0.0s
 => => transferring context: 2B                                                                                                0.0s
 => [internal] load metadata for docker.io/library/busybox:latest                                                              3.7s
 => CACHED [1/1] FROM docker.io/library/busybox@sha256:6915be4043561d64e0ab0f8f098dc2ac48e077fe23f488ac24b665166898115a        0.0s
 => => resolve docker.io/library/busybox@sha256:6915be4043561d64e0ab0f8f098dc2ac48e077fe23f488ac24b665166898115a               0.0s
 => exporting to image                                                                                                         0.2s
 => => exporting layers                                                                                                        0.0s
 => => exporting manifest sha256:83e76e889aa8761d630d2b7d9f2ab2a86b7a4b4befbf35169a0f67c8ce9980fa                              0.0s
 => => exporting config sha256:a32c174ba1603e28e82147491b37102deb9054e49c4fa37b8e6c29e81067ac85                                0.0s
 => => pushing layers                                                                                                          0.1s
 => => pushing manifest for 192.168.105.44/admin/test-buildx:test                                                              0.0s
 => exporting cache                                                                                                            0.3s
 => => preparing build cache for export                                                                                        0.0s
 => => writing config sha256:d234c457916abb4259aa3ed69789842c587b8563689ed0de0436a815b6c5f9fb                                  0.1s
 => => writing manifest sha256:acad012385c4693fedf97c6a9731371c9b0c619087a67ba84920872a6c7e9c66                                0.2s

From above output, it doesn't seem that a manifest list will be set for cache tag.
However, from the DTR log,http.request.contenttype is application/vnd.docker.distribution.manifest.list.v2+json:

{
  "auth.user.name": "08e84535-c0b0-4fc7-8e0c-9fb17201b3ba",
  "go.version": "go1.12.9",
  "http.request.contenttype": "application/vnd.docker.distribution.manifest.list.v2+json",
  "http.request.host": "192.168.105.44",
  "http.request.id": "8fe19f58-84b0-4bd6-af2a-c06c04096899",
  "http.request.method": "PUT",
  "http.request.remoteaddr": "172.18.0.1",
  "http.request.uri": "/v2/admin/test-buildx/manifests/cache",
  "http.request.useragent": "containerd/1.3.0+unknown",
  "level": "info",
  "msg": "dispatching tag payload",
  "payload": {
    "namespace": "admin",
    "repository": "test-buildx",
    "tag": "cache",
    "digest": "sha256:acad012385c4693fedf97c6a9731371c9b0c619087a67ba84920872a6c7e9c66",
    "imageName": "admin/test-buildx:cache",
    "os": "",
    "architecture": "",
    "author": "admin",
    "pushedAt": "2020-01-10T13:20:35.687868126Z"
  },
  "time": "2020-01-10T13:20:35.687875356Z",
  "vars.name": "admin/test-buildx",
  "vars.reference": "cache"
}

This results to the entry for the cache tag in manifests table of DTR rethinkdb shows up as manifest list:

{ Annotations: null,
    architecture: '',
    config: <Buffer >,
    configDigest: '',
    configMediaType: '',
    createdAt: 2020-01-10T13:20:35.618Z,
    deleted: false,
    digest: 'sha256:acad012385c4693fedf97c6a9731371c9b0c619087a67ba84920872a6c7e9c66',
    dockerfile: [],
    layers: [],
    manifestListDigests: [ '/@sha256:d234c457916abb4259aa3ed69789842c587b8563689ed0de0436a815b6c5f9fb' ],
    markedAt: 2020-01-10T13:20:35.634Z,
    mediaType: 'application/vnd.docker.distribution.manifest.list.v2+json',
    originalAuthor: '08e84535-c0b0-4fc7-8e0c-9fb17201b3ba',
    os: '',
    payload: <Buffer 7b 22 73 63 68 65 6d 61 56 65 72 73 69 6f 6e 22 3a 32 2c 22 6d 65 64 69 61 54 79 70 65 22 3a 22 61 70 70 6c 69 63 61 74 69 6f 6e 2f 76 6e 64 2e 64 6f ... >,
    pk: 'admin/test-buildx@sha256:acad012385c4693fedf97c6a9731371c9b0c619087a67ba84920872a6c7e9c66',
    repository: 'admin/test-buildx',
    size: 0 },

This causes DTR error when listing tags because sha256:d234c457916abb4259aa3ed69789842c587b8563689ed0de0436a815b6c5f9fb contained in manifestListDigests is not in manifests table. It should have been a configDigest.

@tonistiigi
Copy link
Member

@xinfengliu that's an issue on how objects are tracked internally in the registry implementation(or registry manager in this case). The descriptors in the manifests lists are pulled from either manifest store or blob store(this case).

@crazy-max
Copy link
Member

Fixed through docker/hub-feedback#1906

@joaodrp
Copy link

joaodrp commented Jul 2, 2021

@crazy-max @thaJeztah,

As far as I can tell, this issue is not fixed. I'm on v0.5.1-docker 11057da37336192bfc57d81e02359ba7ba848e4a and just bumped into it (credit to @deleteriousEffect).

Steps to reproduce:

$ cat Dockerfile
FROM alpine:3.14.0
RUN echo 'foo' > /too

$ docker buildx build \
    --cache-from=type=registry,ref=alpine \
    --cache-to=type=registry,ref=myregistry.com/myrepo/cache,mode=max  \
    --platform linux/amd64,linux/arm64,linux/arm/v7 \
    -t myregistry.com/myrepo:latest \
    --push -f Dockerfile .

Looking at myrepo/cache on the storage backend (using distribution/distribution as registry), locating the manifest and inspecting its payload shows the following:

{
    "schemaVersion": 2,
    "mediaType": "application/vnd.oci.image.index.v1+json",
    "manifests": [
        {
            "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
            "digest": "sha256:136482bf81d1fa351b424ebb8c7e34d15f2c5ed3fc0b66b544b8312bda3d52d9",
            "size": 2427917,
            "annotations": {
                "buildkit/createdat": "2021-07-02T17:08:09.095229615Z",
                "containerd.io/uncompressed": "sha256:3461705ddf3646694bec0ac1cc70d5d39631ca2f4317b451c01d0f0fd0580c90"
            }
        },
        {
            "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
            "digest": "sha256:5843afab387455b37944e709ee8c78d7520df80f8d01cf7f861aae63beeddb6b",
            "size": 2811478,
            "annotations": {
                "buildkit/createdat": "2021-07-02T17:08:09.105436766Z",
                "containerd.io/uncompressed": "sha256:72e830a4dff5f0d5225cdc0a320e85ab1ce06ea5673acfe8d83a7645cbd0e9cf"
            }
        },
        {
            "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
            "digest": "sha256:58ab47519297212468320b23b8100fc1b2b96e8d342040806ae509a778a0a07a",
            "size": 2709626,
            "annotations": {
                "buildkit/createdat": "2021-07-02T17:08:09.119131923Z",
                "containerd.io/uncompressed": "sha256:5e04d10b60a48b2fef3614c8b2bf64312b03380ac01bcec220e16885fe660be5"
            }
        },
        {
            "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
            "digest": "sha256:885ec02eb06bf860aabdcba20550c15a1559db252b8e2c2b82b9e4bc0b0db488",
            "size": 109,
            "annotations": {
                "buildkit/createdat": "2021-07-02T17:08:10.210558638Z",
                "containerd.io/uncompressed": "sha256:ff9f76d538cb1447482480082a666c51274fa481c8b6cf5d5585f03b1cc0c8f6"
            }
        },
        {
            "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
            "digest": "sha256:9bed0b03f6313c7c311f0fa11e915b167e9ed346dfa1d7e3fa2694648bab79be",
            "size": 109,
            "annotations": {
                "buildkit/createdat": "2021-07-02T17:08:09.979974774Z",
                "containerd.io/uncompressed": "sha256:9c87c34fcbdd36a0ddffc61a4e9f8107faae60cafa6f214f1f6514bdd848202e"
            }
        },
        {
            "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
            "digest": "sha256:9bed0b03f6313c7c311f0fa11e915b167e9ed346dfa1d7e3fa2694648bab79be",
            "size": 109,
            "annotations": {
                "buildkit/createdat": "2021-07-02T17:08:09.979974774Z",
                "containerd.io/uncompressed": "sha256:9c87c34fcbdd36a0ddffc61a4e9f8107faae60cafa6f214f1f6514bdd848202e"
            }
        },
        {
            "mediaType": "application/vnd.buildkit.cacheconfig.v0",
            "digest": "sha256:7aa23086ec6b7e0603a4dc72773ee13acf22cdd760613076ea53b1028d7a22d8",
            "size": 1654
        }
    ]
}

According to the OCI spec, image indexes are meant to aggregate manifests, not layers or configs.

In this index, all elements within manifests are layers, except the last one, which is a config. So none of them should be here? I suspect that this should have been saved as a manifest, not an index. Something like this:

{
    "schemaVersion": 2,
    "config": {
        "mediaType": "application/vnd.buildkit.cacheconfig.v0",
        "size": 1654,
        "digest": "sha256:7aa23086ec6b7e0603a4dc72773ee13acf22cdd760613076ea53b1028d7a22d8"
      },
    "layers": [
        {
            "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
            "digest": "sha256:136482bf81d1fa351b424ebb8c7e34d15f2c5ed3fc0b66b544b8312bda3d52d9",
            "size": 2427917,
            "annotations": {
                "buildkit/createdat": "2021-07-02T17:08:09.095229615Z",
                "containerd.io/uncompressed": "sha256:3461705ddf3646694bec0ac1cc70d5d39631ca2f4317b451c01d0f0fd0580c90"
            }
        },
        {
            "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
            "digest": "sha256:5843afab387455b37944e709ee8c78d7520df80f8d01cf7f861aae63beeddb6b",
            "size": 2811478,
            "annotations": {
                "buildkit/createdat": "2021-07-02T17:08:09.105436766Z",
                "containerd.io/uncompressed": "sha256:72e830a4dff5f0d5225cdc0a320e85ab1ce06ea5673acfe8d83a7645cbd0e9cf"
            }
        },
        {
            "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
            "digest": "sha256:58ab47519297212468320b23b8100fc1b2b96e8d342040806ae509a778a0a07a",
            "size": 2709626,
            "annotations": {
                "buildkit/createdat": "2021-07-02T17:08:09.119131923Z",
                "containerd.io/uncompressed": "sha256:5e04d10b60a48b2fef3614c8b2bf64312b03380ac01bcec220e16885fe660be5"
            }
        },
        {
            "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
            "digest": "sha256:885ec02eb06bf860aabdcba20550c15a1559db252b8e2c2b82b9e4bc0b0db488",
            "size": 109,
            "annotations": {
                "buildkit/createdat": "2021-07-02T17:08:10.210558638Z",
                "containerd.io/uncompressed": "sha256:ff9f76d538cb1447482480082a666c51274fa481c8b6cf5d5585f03b1cc0c8f6"
            }
        },
        {
            "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
            "digest": "sha256:9bed0b03f6313c7c311f0fa11e915b167e9ed346dfa1d7e3fa2694648bab79be",
            "size": 109,
            "annotations": {
                "buildkit/createdat": "2021-07-02T17:08:09.979974774Z",
                "containerd.io/uncompressed": "sha256:9c87c34fcbdd36a0ddffc61a4e9f8107faae60cafa6f214f1f6514bdd848202e"
            }
        },
        {
            "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
            "digest": "sha256:9bed0b03f6313c7c311f0fa11e915b167e9ed346dfa1d7e3fa2694648bab79be",
            "size": 109,
            "annotations": {
                "buildkit/createdat": "2021-07-02T17:08:09.979974774Z",
                "containerd.io/uncompressed": "sha256:9c87c34fcbdd36a0ddffc61a4e9f8107faae60cafa6f214f1f6514bdd848202e"
            }
        }
    ]
}

This looks like a serious bug, which has been around for a while already (judging by the creation date of this issue). Can you please confirm this behavior?

@tonistiigi
Copy link
Member

Cache manifest is not an image. It is a list of individual descriptors. Buildkit will analyze the annotations and pull one or some of them. Some of the objects have layer types but they are not layered on top of each other or make up any "cache object" together.

from moby/buildkit#2220 (comment)

This is a common pattern used by other tools like containerd checkpoints or some cnab collections. Docker spec does not forbid specific mediatypes in the descriptors. Reference registry has always supported this since 2016 so no Buildkit specific hacks were done to enable this. OCI spec was written years after(by other people), doesn't strictly forbid it but leaves it to the implementation.

All the blobs are pushed with blob API, not manifest. So registries should not hardcode any buildkit specific mediatypes but just should not handle descriptors as manifests if they don't have a manifest mediatype. If you want to propose changes to OCI spec that make this more clear I support that.

The issue in here was that some internal tooling that did tracking of how objects relate to each other was looking up the descriptor from the manifest store, although the descriptor did not have a manifest mediatype.

@deleteriousEffect
Copy link

OCI spec was written years after(by other people), doesn't strictly forbid it but leaves it to the implementation.

I don't believe that this is true reading the spec: https://github.com/opencontainers/image-spec/blob/master/image-index.md

manifests array of objects

This REQUIRED property contains a list of manifests for specific platforms. While this property MUST be present, the size of the array MAY be zero.

A layer is not a manifest, so having a layer within the manifests array here violates the spec from my understanding.

@tonistiigi
Copy link
Member

"a layer" is as meaningless concept as "a manifest". The objects are descriptors, with digest, mediatype, size, annotations etc. A long time ago I had a PR to rename one of these keys (I think it was "layers" in distribution manifest, not this specific case) but it was decided against the rename because it only carried textual value and would have broken existing tooling.

The reason I said "implementation-specific" was because "An encountered mediaType that is unknown to the implementation MUST be ignored."

I'm not anti-OCI, buildkit uses it wherever possible (eg. these mediatypes should actually be docker but we changed them in a recent release because some registries refused to work without the rename and all our registries worked with either), but I'm not willing to break existing users because someone wrote a new doc or didn't understand the implications of code that they copy-pasted.

If OCI provides an alternative structure that is widely supported and someone makes a Buildkit cache implementation with it it will not be blocked by the maintainers.

@joaodrp
Copy link

joaodrp commented Jul 2, 2021

Even the Docker spec says:

The manifest list is the “fat manifest” which points to specific image manifests

https://docs.docker.com/registry/spec/manifest-v2-2/#manifest-list

For OCI, the spec is clear as well, an index should reference manifests only (plain text JSON payload with all the required fields).

What I'm seeing in this index is a bunch of tarballs (doesn't matter their purpose or nature), one config, and some annotations.

All of this data could live in a simple manifest, making buildx conformant with the OCI spec. As is, it's not.

Beyond conformance, the side effect of the current behaviour is that registries (and non-buildx clients) will face unexpected errors when attempting to parse one of the index references, expecting it to be a manifest when it in fact is a tarball.

@joaodrp
Copy link

joaodrp commented Jul 2, 2021

If OCI provides an alternative structure that is widely supported and someone makes a Buildkit cache implementation with it it will not be blocked by the maintainers.

Wouldn't an Image Manifest suffice for this purpose? Why is a (non conformant) Image Index being used for this?

@tonistiigi
Copy link
Member

tonistiigi commented Jul 2, 2021

is that registries (and non-buildx clients) will face unexpected errors when attempting to parse one of the index references, expecting it to be a manifest when it in fact is a tarball.

Wouldn't an Image Manifest suffice for this purpose?

No, it is the opposite. Eg.

» docker pull ghcr.io/akihirosuda/hello-apt-transport-oci:latest                                                                                                           !4160
latest: Pulling from akihirosuda/hello-apt-transport-oci
0929565a7dc0: Extracting [==================================================>]  1.108kB/1.108kB
bc0c033c4038: Download complete
invalid rootfs in image configuration

breaks in the middle of extraction because image manifest validation rules are not satisfied(you can also see it pulled a blob that was definitely not a layer or part of an image). I believe also breaks on push but didn't bother to test atm. Even fails with containerd.

While

docker pull tonistiigi/test:cache-example                                                                                                                                !4173
cache-example: Pulling from tonistiigi/test
no matching manifest for linux/amd64 in the manifest list entries

Errors with image not being found for the current arch. (If it would be added you could even pull the image part).

So the difference is that latter is compatible with clients/registries starting from Jan 2016. While the OCI (artifact) thing is a new concept that does not maintain backward compatibility (not just the media-types but rules about how layers are validated). It only works if both registry and client understand this new concept(and want to enable it I believe).

Again, this is not to say that buildkit will never support the artifact rules. Just that if someone is advocating for a new backward-incompatible solution to replace the already existing backward-compatible solution they need to put in the work required for the transition and they can't expect that all existing users cases should be broken because of them.

@joaodrp
Copy link

joaodrp commented Jul 2, 2021

@tonistiigi, I understand the concerns about changing this. That and semantics aside, can we agree on the following?

1. As-is, if a client, registry or human, looks at a cache manifest and sees the below, it's reasonable to expect that such manifest conforms with the OCI Image Index specification:

{
    "schemaVersion": 2,
    "mediaType": "application/vnd.oci.image.index.v1+json",
    ...
}

2. An OCI Image Manifest would be enough to describe a cache image, there is no technical need (ignoring the fact that some tools expect it to be like this, because it's been for a while) for an OCI Image Index.

As an example, this:

{
    "schemaVersion": 2,
    "mediaType": "application/vnd.oci.image.index.v1+json",
    "manifests": [
        {
            "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
            "digest": "sha256:136482bf81d1fa351b424ebb8c7e34d15f2c5ed3fc0b66b544b8312bda3d52d9",
            "size": 2427917,
            "annotations": {
                "buildkit/createdat": "2021-07-02T17:08:09.095229615Z",
                "containerd.io/uncompressed": "sha256:3461705ddf3646694bec0ac1cc70d5d39631ca2f4317b451c01d0f0fd0580c90"
            }
        },
        ...
        {
            "mediaType": "application/vnd.buildkit.cacheconfig.v0",
            "digest": "sha256:7aa23086ec6b7e0603a4dc72773ee13acf22cdd760613076ea53b1028d7a22d8",
            "size": 1654
        }
    ]
}

Could be represented as:

{
    "schemaVersion": 2,
    "config": {
        "mediaType": "application/vnd.buildkit.cacheconfig.v0",
        "size": 1654,
        "digest": "sha256:7aa23086ec6b7e0603a4dc72773ee13acf22cdd760613076ea53b1028d7a22d8"
      },
    "layers": [
        {
            "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
            "digest": "sha256:136482bf81d1fa351b424ebb8c7e34d15f2c5ed3fc0b66b544b8312bda3d52d9",
            "size": 2427917,
            "annotations": {
                "buildkit/createdat": "2021-07-02T17:08:09.095229615Z",
                "containerd.io/uncompressed": "sha256:3461705ddf3646694bec0ac1cc70d5d39631ca2f4317b451c01d0f0fd0580c90"
            }
        },
        ...
    ]
}

@tonistiigi
Copy link
Member

@joaodrp

  1. Yes, this object was changed to oci mediatype later, without revalidation with the spec. I think it was done to make Google's Artifact registry (or maybe Quay) work. I do think that the current object is actually spec-compatible. Even if you disagree with manifest vs layer, the spec clearly says to ignore all objects with non-manifest mediatype unless registry has own implementation for supporting them. Therefore registry either should accept this object as empty or support it if it understands how to deal with non-manifests.

  2. Two aspects of this.

  • First, as far as I understand there are two quite different things, OCI Image manifest and OCI artifact support (or OCI Image manifest with artifact support?). Former is quite widely supported now and mostly is just a rename of Docker's mediatypes. Latter is a new set of rules that, for example, are not compatible with any version of Docker, I believe are not compatible with Docker Hub and shouldn't be compatible with any client/registry pre-2020. As shown before, that compatibility gap is made worse by the old clients not being able to detect that they are not compatible and failing in some internal extraction/validation routines. Cache manifests can't be implemented with image manifests; theoretically, I think they can be implemented with artifacts.

  • Secondly, I don't think cache manifest is compatible with OCI Image manifest spec, even with artifacts. For example, the spec has very strict requirements for the layers objects, how they need to be ordered and how they are applied on top of each other. As I mentioned, all cache manifest objects are independent of each other and pulled separately, there is no order, and they don't apply on top of each other. This is quite important for compatibility with other clients that would just pull all of them and try to merge them.

@joaodrp
Copy link

joaodrp commented Jul 5, 2021

I just found that there is an oci-mediatypes option that we can set to false. This results in an application/vnd.docker.distribution.manifest.list.v2+json being used instead of an OCI Index. Example:

{
    "schemaVersion": 2,
    "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
    "manifests": [
        {
            "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
            "digest": "sha256:136482bf81d1fa351b424ebb8c7e34d15f2c5ed3fc0b66b544b8312bda3d52d9",
            "size": 2427917,
            "annotations": {
                "buildkit/createdat": "2021-07-02T17:08:09.095229615Z",
                "containerd.io/uncompressed": "sha256:3461705ddf3646694bec0ac1cc70d5d39631ca2f4317b451c01d0f0fd0580c90"
            }
        },
     ...

But the underlying problem persists, as it's not compliant with the spec, this time the Docker Image v2 spec, which says:

(manifests.mediaType): The MIME type of the referenced object. This will generally be application/vnd.docker.distribution.manifest.v2+json, but it could also be application/vnd.docker.distribution.manifest.v1+json if the manifest list references a legacy schema-1 manifest.

@joaodrp
Copy link

joaodrp commented Jul 5, 2021

Yes, this object was changed to oci mediatype later, without revalidation with the spec. I think it was done to make Google's Artifact registry (or maybe Quay) work.

@tonistiigi, thanks for providing some historical context. From what I see, it was due to Quay. However, today, it's still not possible to push/pull cache images to/from Quay, just like for the majority of the SaaS registries (including ECR, GCR, Artifactory).

The underlying reason seems to be the same as the one we're concerned about here. So I don't think it's fair to say this is "because someone wrote a new doc or didn't understand the implications of code that they copy-pasted".

Even if you disagree with manifest vs layer, the spec clearly says to ignore all objects with non-manifest mediatype unless registry has own implementation for supporting them

The spec says: "An encountered mediaType that is unknown to the implementation MUST be ignored." The problem is that the media type being used (application/vnd.oci.image.layer.v1.tar+gzip) is not unknown to OCI registries. It's well known, and it's the media type of an OCI layer, not a manifest.

Still, even with a fake media type, this would fail against most registries because these are not uploaded through the manifest API but rather the blob API (because they are not manifests), so they are not linked in the right places then found when the index arrives. So this ties back to the manifest vs. layer conversation, but we don't seem to agree on that.

Please note that the only reason this currently works against the Distribution registry (maintainer here) is because of an old workaround (which converged blobs and manifests links), for which there is a PR to revert it (distribution/distribution#3365). So the existing compatibility with this specific registry is due to a side effect, not by design (IMO).


First, as far as I understand there are two quite different things, OCI Image manifest and OCI artifact support (or OCI Image manifest with artifact support?).

That's correct, and this seems to be a great fit for artifacts. However, even for those, the manifest vs. layer is respected whenever applicable. Otherwise, it's not compatible with most registries, like cache images. For Helm charts (layer order doesn't matter as well), for example (source):

Index:

{
  "schemaVersion": 2,
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "digest": "sha256:31fb454efb3c69fafe53672598006790122269a1b3b458607dbe106aba7059ef",
      "size": 354,
      "annotations": {
        "org.opencontainers.image.ref.name": "localhost:5000/myrepo/mychart:2.7.0"
      }
    }
  ]
}

Manifest:

{
  "schemaVersion": 2,
  "config": {
    "mediaType": "application/vnd.cncf.helm.config.v1+json",
    "digest": "sha256:8ec7c0f2f6860037c19b54c3cfbab48d9b4b21b485a93d87b64690fdb68c2111",
    "size": 117
  },
  "layers": [
    {
      "mediaType": "application/tar+gzip",
      "digest": "sha256:1b251d38cfe948dfc0a5745b7af5ca574ecb61e52aed10b19039db39af6e1617",
      "size": 2487
    }
  ]
}

This does not go against any of the relationship/reference rules, so it should work with most registries if they follow the spec and "ignore an unknown media type", which applies perfectly to this case.


Cache manifests can't be implemented with image manifests

For example, the spec has very strict requirements for the layers objects, how they need to be ordered and how they are applied on top of each other. As I mentioned, all cache manifest objects are independent of each other and pulled separately, there is no order, and they don't apply on top of each other. This is quite important for compatibility with other clients that would just pull all of them and try to merge them.

Yeah, the order is important, but I'd argue that that only matters if we're concerned about creating containers based on these images, which is not the case for cache images, as you mention.


Based on all of this, if a manifest was used instead of an index/list, the result would be:

  • Cache images could be pushed and pulled from all OCI compliant registries without any customizations in any of them. As of today, it does not work for most;
  • Pulling an image with Docker would still fail as it does today (just like it does for Helm images), just with a different error message;
  • Buildkit could still pick whatever layers it wants from the manifest. Their order in the JSON payload wouldn't matter.
  • Anything else I'm missing here?

In regards to backward compatibility, maybe we could:

  • Build/export: From now onwards, a manifest would be used to represent all new cache images;
  • Import: If a pulled cache image is an index/list, then parse it as it was done before (the code would be kept around for backward compatibility). If it's a manifest, parse it with the new logic.

Does this seem technically feasible?

@thaJeztah
Copy link
Member

As far as I'm aware, manifest lists were created to provide a list of assets that allow for a "shallow" pull (i.e., implementation to make a selection from the list based on their metadata and only pull what it needs / understands).

This is contrary to "image manifests", which is a lot more "strict"; it defines a list of layers that MUST all be pulled (if mediative is known), and applied in order, the end result being a rootfs; https://github.com/opencontainers/image-spec/blob/069d186692cf734b8f5119a0cfdf4ff1ad741934/manifest.md#image-manifest-property-descriptions

layers array of objects

Each item in the array MUST be a descriptor. The array MUST have the base layer at index 0. Subsequent layers MUST then follow in stack order (i.e. from layers[0] to layers[len(layers)-1]). The final filesystem layout MUST match the result of applying the layers to an empty directory. The ownership, mode, and other attributes of the initial empty directory are unspecified.

From that, I think a manifest list is a closer match to an list (index) of cache "blobs" that may be pulled by the client (BuildKit), based on metadata.

For Helm charts (layer order doesn't matter as well)
This does not go against any of the relationship/reference rules, so it should work with most registries if they follow the spec and "ignore an unknown media type", which applies perfectly to this case.

Given the spec being explicit about how layers should be handled, I think "layer order doesn't matter" is outside of the spec.

That said; there's still a lot of ambiguity in the specs which (IMO) should be addressed w.r.t. "ignoring unknown mediatypes" (see the discussion on opencontainers/image-spec#803).

@joaodrp
Copy link

joaodrp commented Jul 5, 2021

As far as I'm aware, manifest lists were created to provide a list of assets that allow for a "shallow" pull (i.e., implementation to make a selection from the list based on their metadata and only pull what it needs / understands).

I understand that nowadays we have different needs, but the Docker spec is very clear and precise about this IMO, and registries have been following that:

The manifest list is the “fat manifest” which points to specific image manifests for one or more platforms.


Given the spec being explicit about how layers should be handled, I think "layer order doesn't matter" is outside of the spec.

Fully agree. I guess the point is, while we don't have an ultimate solution (which will take a long time to be adopted), what's the easiest and smallest change that can:

  • Preserve UX (end-to-end, not just on the client/buildkit side)
  • Make this compatible with all or almost all OCI compliant registries

I might be biased, but changing how several registries work to conform with something out of spec (IMO) seems to be the hardest way.

@SteveLasker
Copy link

I realize this has been closed, however the discussion for using an oci.image.manifest, consistent with how OCI Artifacts.
The manifest is the same, which all registries support. Having different layer mediaTypes, or different manifest.config are consistent with artifacts, and how helm charts are persisted across most registries. Even Docker Hub is completing their work to enable helm, which would support all artifact types that use different config.mediaTypes.

As for chasing registries, or asking registries to chase buildx, this has sorta happened already, in an inconsistent way. ACR didn't support buildx caches, because it was submitting blobs in the index. It didn't fail, as ACR didn't validate the index.manifests collection to realize they were blobs. However, this "sneaked" into the registry, and these blobs were garbage collected, because ACR couldn't find them in any valid manifest. ACR did "fix this", as we had customers complain their build cache layers were missing.
My point is, ACR fixed it to support the scenario. Howeer, this was a one-off scenario. Rarther than asking every registry to support on-off fixes, how about we align on a common pattern that supports Helm, WASM, OPA, BuildX, and all new artifact types?
In this case, switching builx to use oci.manifest, setting the manifest.config.mediaType and using the existing manifest.layers collection. Since it's only the buildx client that pulls this, it doesn't matter it's not an original overlay collection.

As users have issues with specific registries, they log bugs to support buildx, which follows the oci artifact approach, and now we start creating a standard for all artifact types across all registries.

@tonistiigi
Copy link
Member

tonistiigi commented Jul 12, 2021

I realize this has been closed,

This issue is closed because it was based on an internal tool failure that is not relevant anymore.

If someone creates a new issue (in Buildkit, not buildx), proposing cache manifests based on another OCI approved structure(with info about current registry support and spec compatibility) it will not be closed. Even if the build cache use-case matches more with the design of indexes, I personally have no issue with stating that consistency with other tools is more important. Historically we considered both but found that only with indexes we could avoid breaking old clients in bad ways and didn't have to force everyone to update their registries.

Once there is a solid proposal, I don't know how well we will be able to prioritize this work atm. It's possible we would need help from the community. Once the code is there we need to consider the support of the current registries at that time to understand if we can switch the default or need to make this opt-in at first. For some time importing should definitely support both formats. Once we can switch the default we can start deprecating the old formats.

how about we align on a common pattern that supports

I agree, but the index pattern was first used years before these other tools started to use a different method and this split emerged. It was done with the oversight of the people who wrote the initial manifest list spec and implementation, when docker distribution/hub was likely the only manifest implementation out there. I admit that in the process, spec wording was not updated that could have avoided this whole issue. But because of this history, we can't expect that the workflows of the previous users should be broken because there is a new spec but provide time and resources for a smooth migration. For the publicly available code that still seems to be used and looks like the same pattern to me, you might also want to look at https://github.com/containerd/containerd/blob/main/task_opts.go#L51-L72 .

@SteveLasker
Copy link

Thanks @tonistiigi,
The proposal @joaodrp made here would be consistent with the helm approach and quickly working across all registries.

The index support is interesting, as not all registries supported oci.index. Only in the last year has this started to get round out by the major cloud providers.
Then, many, including ACR, or distribution/distribution implementations didn't do enough validation on ingress, so users wound up with inconsistent, and unexpected results.

Appreciate the dialog and we'll open the conversation in buildkit to see how this can be standardized, in a consistent manner, support existing customers.

For the spec wordings, IMHO, some things only get realized when ambiguity is surfaced. There are lots of great studies where humans tend to read things with our own biases. This is the beauty of two people reading the same book with different outcomes. Only through questions like these do we realize where clarity is needed.

@joaodrp
Copy link

joaodrp commented Jul 13, 2021

Thanks for following up on this @SteveLasker.

I'll open a more detailed issue in the buildkit repo later today.

@stevvooe
Copy link

Just chiming in here: a manifest can be any typed content. The spec does not restrict it to just "manifests".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants