Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move signature content from a blob to annotations #123

Open
SteveLasker opened this issue Jan 14, 2022 · 6 comments
Open

Move signature content from a blob to annotations #123

SteveLasker opened this issue Jan 14, 2022 · 6 comments
Milestone

Comments

@SteveLasker
Copy link
Contributor

Capturing proposal from @justincormack to lift signature data to annotations: https://github.com/justincormack/sign-index

Details to be completed, with an excerpt from Justin:

Annotations in the prototype are

  • org.notaryproject.signature.version The version of the signing protocol, 0.2 in this prototype. Maybe should
    remove this and have ``org.notaryproject.signature.v1. ...` instead.
  • org.notaryproject.signature.type The type of the signature. I implemented ssh in this prototype for
    ssh signatures as these are now supported in git and so are likely
    to be useful. Also everyone has an ssh key so this is easy to test out. They support embedded certificate chains
    and many signature formats, and have native Go support, although the prototype shells out. We could add a small
    number of options here, such as the JWT format we are discussing.
  • org.notaryproject.signature.identity is a hint for the identity of the signer. For SSH signatures this
    is the email address, which is used to look up signatures; it could be a public key identifier. If more identifiers
    are needed for finding keys we should add additional fields, but for many signing schemes a single one is enough.
  • org.notaryproject.signature.descriptor This is the signed descriptor, base64 encoded. As this is what
    is actually signed, all the data in it can be trusted. We verify that the descriptor for the image that is signed
    in the image index matches this descriptor.
  • org.notaryproject.signature.data This is the signature data, base64 encoded.
@imjasonh
Copy link

imjasonh commented Feb 8, 2022

Is there any concern about this approach potentially increasing the size of manifests to an unhealthy degree? I seem to recall that being a source of concern in opencontainers/image-spec#826, which tried to address the same sort of concern about multiplying connections to fetch small blobs.

FWIW, my position remains that multi-hundred-KB, even multiple-single-digit-MB, manifests should be acceptable to registries and clients, so I don't really share the concern about stuffing this data into annotations either. I just want to understand if there's some fundamental reason annotations are a more suitable place for this potentially-large-ish data, while inlined data blobs aren't.

@SteveLasker
Copy link
Contributor Author

Hi Jason,
I'm fully supportive of annotations and would like to create standards around them so they can be used for indexing and querying across all content the user has access to within the registry. Lifting these as annotations enables generic annotation APIs to query for specific signatures.

Without rehashing the feedback on 826 the differences in this PR are:

  1. The signature would be persisted as a set of annotations, that are strings, and short.
  2. The ORAS Artifact manifest doesn't require blobs, so a user can push a set of annotations without blobs.
  3. The relevant issue to this pr is the data proposal dupes the data stored in blobs. As a registry operator, I'm always thinking about the customer impact. These annotations are small, never shared across different artifacts, so there are no caching/data de-duping scenarios that are lost. A data element is an open door, with no implied constraints.

@imjasonh
Copy link

imjasonh commented Feb 8, 2022

Thanks, that's helpful context.

I guess my question is, what guarantees the signatures will be short, since that seems to be an implicit assumption?

For example, it's sometimes a feature of signing schemes that signers can also include other metadata they'd like to state and possibly have verifiers verify (e.g., the Git commit SHA the signed image was built from). I don't see that listed above, so maybe that's just out of scope for your needs, and that's fine. But if you do decide to add support for this in the future, it's worth considering that that means users can violate your assumption about these values being "short".

It's worth thinking through what the consequences would be if that assumption was violated.

A data element is an open door, with no implied constraints.

Also worth noting: as we learned in #826, OCI's image-spec is also "an open door, with no implied constraints", so long as it requires that implementations ignore arbitrary unknown fields.

It's still incumbent on registry operators to block overly large manifests (based on their own definition of "large"), and for clients to be aware that registries might do that. That's the case whether largeness comes from data fields, or annotations, or any other arbitrary field that OCI requires be supported.

@SteveLasker
Copy link
Contributor Author

Great questions:

Gaurentees of shortness

This is a challenge. This is one reason I'm supportive of annotations, that have an implied length, where data carries an implied larger payload. We will have more details for how we manage the aspects of the payload in some updates to the signature spec.

As for violating the assumption, this is why I'd again raise the question of annotation breakers, and manifest main breakers. Perhaps that's a topic to revisit in the refType working group, as the attendees were the most vocal.

it's still incumbent on registry operators to block overly large manifests (based on their own definition of "large"), and for clients to be aware that registries might do that. That's the case whether largeness comes from data fields, or annotations, or any other arbitrary field that OCI requires to be supported.

This is where I'm most concerned. One of the biggest values of registries, and the standards is the ability for a user to promote content from public sources, across clouds and on-prem. As discussed in the 826 feedback promoting the use of arbitrary, unbounded properties in the manifest will degrade the interop and promotion scenarios as different registries limit different sizes.
Imagine roads where bridges have no minimal height expectations. A standard of 13.5' is set, and we know what happens when bridges don't support those constraints, and stupid humans are, well human.

@imjasonh
Copy link

This is a challenge. This is one reason I'm supportive of annotations, that have an implied length, where data carries an implied larger payload. We will have more details for how we manage the aspects of the payload in some updates to the signature spec.

That's all well and good, but the implication that annotations are short is not spelled out in any spec. https://github.com/opencontainers/image-spec/blob/main/annotations.md makes no mention of assumed or recommended length limits, only:

This property contains arbitrary metadata.

If your signature spec intends to suggest/recommend/enforce length limits that's fine, but it could also do the same thing with referenced blobs, and inline them. You could also add another field, that's supposed to be supported by registries already.

Re: manifest size concerns: opencontainers/distribution-spec@dd38b7e was recently merged to recommend registries enforce size limits, and to advise clients that size limits may be enforced.

You could propose some similar language to limit annotations as well, though I believe no registries limit annotations today at all, only total manifest request size.

@dtzar dtzar modified the milestones: alpha-2, rc.1 Jun 8, 2022
@NiazFK NiazFK modified the milestones: RC-1, Discuss Jul 14, 2022
@iamsamirzon
Copy link

This is an optimization which can be deferred. This will not be a breaking change if we decide to implement it later.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Todo
Development

No branches or pull requests

5 participants