Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sanitize: Add a config-sanitization command #219

Closed
wants to merge 1 commit into from

Conversation

wking
Copy link
Contributor

@wking wking commented Sep 22, 2016

Based on image-spec, which currently exposes:

  • User (user and group by ID or name).
    Represented in runtime-spec by process.user.*, although I'm clearing
    additionalGids for now because the old formats gave no way to
    represent that.
  • Memory (limit).
    Represented in runtime-spec by linux.resources.memory.limit and
    solaris.cappedMemory.physical.
  • MemorySwap (memory + swap limit).
    Represented in runtime-spec by linux.resources.memory.swap and
    solaris.cappedMemory.physical + solaris.cappedMemory.swap.
  • CpuShares (relative weight vs. other containers).
    Represented in runtime-spec by linux.resources.cpu.shares. Solaris
    has an ncpus property, but no shares analog.
  • ExposedPorts (a set of port/protocol entries).
    This is not covered by runtime-spec, but image-spec was planning on
    stuffing it into annotations
    .
  • Env (array of environment variables).
    Represented in runtime-spec by process.env.
  • Entrypoint (array of positional arguments).
    Represented in runtime-spec by process.cmd.
  • Cmd (array of positional arguments).
    Represented in runtime-spec by process.cmd.
  • Volumes (set of directories which should have data volumes mounted
    on them).
    Represented in runtime-spec by mounts, although Volumes doesn't
    include source locations.
  • WorkingDir (initial working directory of container process).
    Represented in runtime-spec by process.cwd.

Entrypoint and Cmd are slightly different and have a few different forms each. Both are represented in the runtime-spec config by process.args.

This commit sets all other config settings to their default values, and it clears mounts[].source to mimic the Volumes information. image-spec currently sets up source-less bind-mounts when translating Volumes.

With the sanitization command, any runtime configuration can be sanitized to be just as safe and limited as the current image-spec config. That allows us to replace the current:

  1. Unpack the rootfs.
  2. Fetch the config, translate it to runtime-spec's format, and save it as config.json.

with:

  1. Unpack the bundle.
  2. Sanitize config.json if the file exists in the unpacked bundle.

That gets image-spec out of the business of cherry-picking portions of runtime-spec's format that are portable and/or extending it to support things like ExposedPorts. Miss-matches made some images impossible (there was no way to say “containers launched from this image will need a terminal” without using annotations) and the translation was sometimes awkward (opencontainers/image-spec#87 suggested putting Entrypoint in annotations, and the translation only supported integer IDs for users and groups while claiming support for names as well).

I'm sure there's room for improving the set of properties that are being sanitized, but I'd rather punt on that level of discussion until there's a consensus about whether this approach has legs at all ;).

Based on image-spec, which currently exposes [1]:

* User (user and group by ID or name)
  Represented in runtime-spec by process.user.*, although I'm clearing
  additionalGids for now because the old formats gave no way to
  represent that.
* Memory (limit)
  Represented in runtime-spec by linux.resources.memory.limit and
  solaris.cappedMemory.physical.
* MemorySwap (memory + swap limit)
  Represented in runtime-spec by linux.resources.memory.swap and
  solaris.cappedMemory.physical + solaris.cappedMemory.swap.
* CpuShares (relative weight vs. other containers)
  Represented in runtime-spec by linux.resources.cpu.shares.  Solaris
  has an ncpus property, but no shares analog.
* ExposedPorts (a set of port/protocol entries)
  This is not covered by runtime-spec, but image-spec was planning on
  stuffing it into annotations [2].
* Env (array of environment variables)
  Represented in runtime-spec by process.env.
* Entrypoint (array of positional arguments)
  Represented in runtime-spec by process.cmd.
* Cmd (array of positional arguments)
  Represented in runtime-spec by process.cmd.
* Volumes (set of directories which should have data volumes mounted
  on them)
  Represented in runtime-spec by mounts, although Volumes doesn't
  include source locations.
* WorkingDir (initial working directory of container process)
  Represented in runtime-spec by process.cwd.

Entrypoint and Cmd are slightly different and have a few different
forms each [3,4].  Both are represented in the runtime-spec config by
process.args.

This commit sets all other config settings to their default values,
and it clears mounts[].source to mimic the Volumes information.
image-spec currently sets up source-less bind-mounts when translating
Volumes [5].

With the sanitization command, *any* runtime configuration can be
sanitized to be just as safe and limited as the current image-spec
config.

[1]: https://github.com/opencontainers/image-spec/blob/v0.5.0/serialization.md#container-runconfig-field-descriptions
[2]: opencontainers/image-spec#87 (comment)
     Subject: Convert a serialization config JSON to a OCI runtime
       configuration.
[3]: https://docs.docker.com/engine/reference/builder/#entrypoint
[4]: https://docs.docker.com/engine/reference/builder/#cmd
[5]: https://github.com/opencontainers/image-spec/blob/v0.5.0/image/config.go#L127-L136

Signed-off-by: W. Trevor King <wking@tremily.us>
@Mashimiao
Copy link

sorry, I can't get the point.
You mean image should contain config.json and reset items to be null which also exposed by image-spec?

@wking
Copy link
Contributor Author

wking commented Sep 22, 2016

On Thu, Sep 22, 2016 at 01:45:10AM -0700, Ma Shimiao wrote:

You mean image should contain config.json…

Yes. And then it can drop ‘config’ from 1 as redundant.

… and reset items to be null which also exposed by image-spec?

I think you have that turned around. It's resetting (to the default)
everything which is not currently exposed by ‘config’.

Of course, replacing ‘config’ with a runtime-spec config.json doesn't
have to happen when this command lands. This is just tooling that
would facilitate such a pivot. Even without an image-spec pivot, this
command is useful for sanitizing third-party config.jsons (however you
acquire them) to get the same level of security currently provided by
application/vnd.oci.image.config.v1+json's ‘config’ property..

@vbatts
Copy link
Member

vbatts commented Sep 22, 2016

@wking this is more confusing in PR than it was on the call yesterday.

I get that having the image and runtime spec decoupled is something to think through, but this PR seems like left-field.

@wking
Copy link
Contributor Author

wking commented Sep 22, 2016

On Thu, Sep 22, 2016 at 11:59:29AM -0700, Vincent Batts wrote:

@wking this is more confusing in PR than it was on the call yesterday.

This PR is just a mask over config.json that passes through fields
which image-spec currently sets from 1 and stuff that seemed very
tightly related (e.g. the Solaris memory properties). I don't see
anything I can to to make that goal less confusing.

On the other hand, perhaps the implementation would be less confusing
if I replaced ccd3f6e's blacklist approach to mutating the original
config instance with a whitelist approach that created a new config
instance and populated it from the original config. Something like:

// Santize removes dangerous and questionably-portable properties from container configurations.
func Sanitize(config *rspec.Spec) (sanitizedConfig *rspec.Spec, err error) {
sanitizedConfig.ociVersion = config.ociVersion
sanitizedConfig.Platform = config.Platform

return santizedConfig, err
}

If you feel like that would help, I'm happy to reroll.

@wking
Copy link
Contributor Author

wking commented Oct 12, 2016 via email

wking added a commit to wking/image-spec that referenced this pull request Nov 6, 2016
We probably need to keep application/vnd.oci.image.config.v1+json
untouched, since e94aa35 (schema: add a docker v2.2 backwards compat
test, 2016-06-15, opencontainers#145) and other maintainer activity suggest a goal
of bit-for-bit compatibility with the current Docker schemas
(excepting media types).  However, requiring Docker support doesn't
mean we can't *also* require support for configuration formats that
are easier for image authors to use.

Of course, with the (greatly) increased flexibility comes a lot more
risk.  Image consumers in general, and runtime-spec-based-image
consumers in particular, should use a sanitization tool like [1].

The runtime-spec config also lacks support for diffIDs, but local
image tooling is still welcome to record the digests of uncompressed
layers and use that for local optimizations.  You have to fetch the
compressed layer at least once to perform the uncompression, but you'd
have to do that to verify the old diffID anyway.

[1]: opencontainers/runtime-tools#219

Signed-off-by: W. Trevor King <wking@tremily.us>
wking added a commit to wking/image-spec that referenced this pull request Nov 8, 2016
We probably need to keep application/vnd.oci.image.config.v1+json
untouched, since e94aa35 (schema: add a docker v2.2 backwards compat
test, 2016-06-15, opencontainers#145) and other maintainer activity suggest a goal
of bit-for-bit compatibility with the current Docker schemas
(excepting media types).  However, requiring Docker support doesn't
mean we can't *also* require support for configuration formats that
are easier for image authors to use.

Of course, with the (greatly) increased flexibility comes a lot more
risk.  Image consumers in general, and runtime-spec-based-image
consumers in particular, should use a sanitization tool like [1].

The runtime-spec config also lacks support for diffIDs, but local
image tooling is still welcome to record the digests of uncompressed
layers and use that for local optimizations.  You have to fetch the
compressed layer at least once to perform the uncompression, but you'd
have to do that to verify the old diffID anyway.

[1]: opencontainers/runtime-tools#219

Signed-off-by: W. Trevor King <wking@tremily.us>
@Mashimiao
Copy link

@wking Do we still need this PR? image-tools already has unpack command.

@wking
Copy link
Contributor Author

wking commented Nov 30, 2017

Do we still need this PR? image-tools already has unpack command.

I still prefer this approach. But given the maintainer resistance to any type of config sharing (e.g. discussion in opencontainers/runtime-spec#611), and the apparent ongoing enthusiasm for translating from the image-spec config (e.g. opencontainers/image-spec#711), I'm not holding my breath on this landing here ;). And folk who do like this approach can always do their own whitelisting or blacklisting (e.g. with jq). So if a maintainer wants to close this PR, that's fine with me.

@Mashimiao
Copy link

OK, I'm gonging to close this PR,

@Mashimiao Mashimiao closed this Nov 30, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants