RPM with Copy on Write #1470

malmond77 · 2020-12-29T20:53:59Z

This is part of https://fedoraproject.org/wiki/Changes/RPMCoW

The majority of changes are in two new programs:

rpm2extents

Modelled as a 'stream processor'. It reads a regular .rpm file on stdin,
and produces a modified .rpm file on stdout. The lead, signature and
headers are preserved 1:1 to allow all the normal metadata inspection,
signature verification to work as expected. Only the 'payload' is
modified.

The primary motivation for this tool is to re-organize the payload as a
sequence of raw file extents (hence the name). The files are organized
by their digest identity instead of path/filename. If any digest is
repeated, then the file is skipped/de-duped. Only regular files are
represented. All other entries like directories, symlinks, devices are
fully described in the headers and are omitted.

The files are padded so they start on sysconf(_SC_PAGESIZE) boundries
to permit 'reflink' syscalls to work in the reflink plugin.

At the end of the file is a footer with 3 sections:

List of calculated digests of the input stream. This is used in
librepo because the file written is a derivative, and not the
same as the repo metadata describes. rpm2extents takes one or more
positional arguments that described which digest algorithms are
desired. This is often just SHA256. This program is only measuring
and recording the digest - it does not express an opinion on whether
the file is correct. Due to the API on most compression libraries
directly reading the source file, the whole file digest is measured
using a subprocess and pipes. I don't love it, but it works.
Sorted List of file content digests + offset pairs. This is used in
the plugin with a trivial binary search to locate the start of file
content. The size is not needed because it's part of normal headers.
(offset of 1., offset of 2., 8 byte MAGIC value) triple

reflink plugin

Looks for the 8 byte magic value at the end of the rpm file. If present
it alters the RPMTAG_PAYLOADFORMAT in memory to clon, and reads in
the digest-> offset table.

rpmPackageFilesInstall() in fsm.c is
modified to alter the enumeration strategy from
rpmfiNewArchiveReader() to rpmfilesIter() if not cpio. This is
needed because there is no cpio to enumerate. In the same function, if
rpmpluginsCallFsmFilePre() returns RPMRC_PLUGIN_CONTENTS then
fsmMkfile() is skipped as it is assumed the plugin did the work.

The majority of the work is in reflink_fsm_file_pre() - the per file
hook for RPM plugins. If the file enumerated in
rpmPackageFilesInstall() is a regular file, this function will look up
the offset in the digest->offset table and will try to reflink it, then
fall back to a regular copy. If reflinking does work: we will have
reflinked a whole number of pages, so we truncate the file to the
expected size. Therefore installing most files does involve two writes:
the reflink of the full size, then a fork/copy on write for the last
page worth.

If the file passed to reflink_fsm_file_pre() is anything other than a
regular file, it return RPMRC_OK so the normal mechanics of
rpmPackageFilesInstall() are used. That handles directories, symlinks
and other non file types.

New API for internal use

rpmReadPackageRaw() is used within rpm2extents to read all the
headers without trying to validate signatures. This eliminates the
runtime dependency on rpmdb.
rpmteFd() exposes the Fd behind the rpmte, so plugins can interact
with the rpm itself.
RPMRC_PLUGIN_CONTENTS in rpmRC_e for use in
rpmpluginsCallFsmFilePre() specifically.
pgpStringVal() is used to help parse the command line in
rpm2extents - the positional arguments are strings, and this
converts the values back to the values in the table.

Nothing has been removed, and none of the changes are intended to be
used externally, so I don't think a soname bump is warranted here.

DemiMarie · 2020-12-30T01:07:32Z

How will package signatures be verified? More specifically, will rpm2extents verify the signed digest of files before decompressing them? Otherwise, this seems like a potential security risk, in case there is a bug in the decompression library.

malmond77 · 2020-12-30T05:45:00Z

@DemiMarie : this is an excellent point. There is verification of the whole rpm file in librepo (see rpm-software-management/librepo#222) and rpm signature verification is done after that, but there remains the possibility of a rogue mirror offering bad data into a compression library.

My answer for tonight is that this depends on whether you trust your mirrors. This proposal assumes you do, and I will update the linked proposal with this caveat tomorrow. The bottom line: this is opt in, and not intended to be enabled anywhere by default today.

In the immediate future, I will investigate potential solutions. The one I came up with tonight involves PAYLOADDIGESTALT or another header with a list of digests per reasonable sized block of compressed data, pre-decompression. Example: SHA256 with 1MiB block size would add a trivial overhead, but allow data to be verified before feeding into a decompression library. This would require support for a digest of the headers in the repodata's primary.xml[1], and would in turn be part of the checksum in repomd.xml and the metalink mirror infra. I am looking at [1] for related reasons.

I am definitely open to suggestions. Thank you for this very valuable feedback.

DemiMarie · 2020-12-30T06:04:52Z

@DemiMarie : this is an excellent point. There is verification of the whole rpm file in librepo (see rpm-software-management/librepo#222) and rpm signature verification is done after that, but there remains the possibility of a rogue mirror offering bad data into a compression library.

My answer for tonight is that this depends on whether you trust your mirrors. This proposal assumes you do, and I will update the linked proposal with this caveat tomorrow. The bottom line: this is opt in, and not intended to be enabled anywhere by default today.

Thank you. I certainly do not trust my mirrors, so this would not work for me.

In the immediate future, I will investigate potential solutions. The one I came up with tonight involves PAYLOADDIGESTALT or another header with a list of digests per reasonable sized block of compressed data, pre-decompression. Example: SHA256 with 1MiB block size would add a trivial overhead, but allow data to be verified before feeding into a decompression library. This would require support for a digest of the headers in the repodata's primary.xml[1], and would in turn be part of the checksum in repomd.xml and the metalink mirror infra. I am looking at [1] for related reasons.

This should be acceptable, but only if the repository metadata was signed and that signature was checked. Currently, Fedora doesn’t sign its metadata, but this may be fixed in the future. Signing the metadata is definitely a good idea for other reasons.

I am definitely open to suggestions. Thank you for this very valuable feedback.

You’re welcome! What if the RPM signature included a signature of the RPM header, and the RPM header included a list of hashes? That would allow rpmkeys -K and the remaining RPM signature verification machinery to continue to work as normal, with essentially no loss of security. It would also protect against decompression bomb denial of service attacks.

lnussel · 2021-01-04T15:01:13Z

Hi! Great to see someone thinking about how to leverage CoW file system features for RPM :-) I had some ideas on the topic a while ago too, coming from a different direction. Thought about making the build system produce (and sign) uncompressed rpms directly https://lnussel.github.io/2020/07/07/rpm-delta-updates/

malmond77 · 2021-01-04T18:30:35Z

Hi! Great to see someone thinking about how to leverage CoW file system features for RPM :-) I had some ideas on the topic a while ago too, coming from a different direction. Thought about making the build system produce (and sign) uncompressed rpms directly https://lnussel.github.io/2020/07/07/rpm-delta-updates/

That is an awesome writeup. I think we're definitely thinking along the same lines, so I'd love to chat about this. I'll email you directly for a followup.

pmatilai · 2021-01-05T11:58:27Z

Haven't had a chance to properly look review and think through the concept etc yet, but a few preliminary review remarks to follow...

lib/fsm.c

pmatilai · 2021-01-05T12:03:13Z

lib/rpmtypes.h

@@ -106,7 +106,8 @@ typedef	enum rpmRC_e {
    RPMRC_NOTFOUND	= 1,	/*!< Generic not found code. */
    RPMRC_FAIL		= 2,	/*!< Generic failure code. */
    RPMRC_NOTTRUSTED	= 3,	/*!< Signature is OK, but key is not trusted. */
-    RPMRC_NOKEY		= 4	/*!< Public key is unavailable. */
+    RPMRC_NOKEY		= 4,	/*!< Public key is unavailable. */
+    RPMRC_PLUGIN_CONTENTS = 5     /*!< fsm_file_pre plugin is handling content */


NAK for adding what effectively is an internal corner case to the highly visible RPMRC enum.
RPMRC was originally supposed to be a package open result, but is (mis)used for all sorts of bad and worse purposes throughout rpm, more likely we should move the plugins to use a separate error code system.

I did an attempt at decoupling plugins from rpmRC in #1909 .

lib/rpmplugins.c

plugins/reflink.c

pmatilai · 2021-01-05T12:41:56Z

I concur with @DemiMarie 's security concerns: we only just got the full payload pre-transaction verification in place finally in 4.14.2, but this effectively disables not just that but all digest and signature verification for the incoming package (in rpm2extent), which is nothing but an untrusted binary from somewhere AIUI. That's not acceptable, really.

Note that you don't need rpmdb to verify signatures, you just need the keys which can be populated from any source you like. It's just the default setup that relies on rpmdb.

Another point that might be of relevance is that despite saying so in the payload tag, the payload isn't always "cpio" these days, packages with large files are handled with a different format which only uses an integer as the file "header" in the payload. Which might be more reusable for your purposes, and if that was used for the package originally then the alt payload could perhaps be calculated more easily. I don't remember all the details so might be missing something here, but I think there should be something in that direction...

pmatilai · 2021-01-05T12:56:26Z

Another broader thought is that perhaps it might be better to add a new plugin slot for this kind of purpose, which gets the fd as an argument and so doesn't need rpmteFd() which is something I'm not really comfortable in exposing in the external API. That would perhaps eliminate the need for that special PLUGIN_CONTENT return too.

pmatilai · 2021-01-05T13:50:17Z

Oh and yet another related remark: nothing against having rpm support reflink where possible, it's actually something I've wanted to do for a long time. Rpm would need to track per-filesystem capabilities somehow (there are several other use-cases for that). Related to that, something mentioned in the fedora devel discussion: unpacking to per-filesystem "temporary" hidden location is something that would be useful to rpm beyond this case. And related to that in turn: a long, long overdue thing is to have rpm first unpack the whole package and only if everything up to that point succeeds, replace existing files in one final swoop. That is not an if but when, so you'll want to make sure you don't build too many assumptions around the current broken file by file unpack + replace operation.

pmatilai · 2021-01-05T13:55:28Z

@lnussel , @malmond77 - if you want to talk about CoW on rpm outside the context of this PR, please just open a ticket here instead of going private email.

malmond77 · 2021-01-05T19:36:11Z

@pmatilai thanks for all the feedback. In no specific order, here are some responses/followups:

(most important) I am looking at how to add partial or full signature/digest verification during transcode. I will use the same key store as rpm expects (so files + rpmdb).
Transcoding definitely changes the payload, so when rpm gets the file for installation: PAYLOADDIGEST definitely can't work. I will see if I can add a new kind of verification that is higher priority, and gets evidence from the transcoder step. Note that value can't be meaningfully signed locally. I've expressed on the change that transcoded rpms in cache shouldn't be moved off host/shared and this is one of the reasons.
I like the idea of staging all files before renaming in, but it adds a lot of complexity in terms of decision on how to backup/replace files. I think the point of that is to avoid replacing anything if you're out of space, or unable to write to all locations - you can bail out more gracefully. I have tried to use/reuse as much of the regular file installation mechanic as possible. If this changes, I'll be glad to adjust accordingly.
On cpio and large file support: I'll look at some examples. I think I'm not making any actual assumptions on whether something really is a cpio or not (aside from the tag) - I use the regular functions to enumerate the rpm in the transcoder. As for whether the large file format helps: I don't think so: a transcoded rpm needs raw bytes page size aligned, de-duped, and indexed by digest, not file number.
I'll look at changing the plugin API to eliminate the RPMRC value and rpmtdFd() addition. I am concerned that this might break backward compatibility and force an soname bump which is super painful when backporting the change. Is my concern founded?
Comments and bools, you got it!

pmatilai · 2021-01-07T12:07:41Z

The plugin API is internal-only still, so you can't break it by adding something new. Making it public is actually pending changing the unpacking procedure to unpack everything first and only then replace.

What I meant by the possibly useful different format in the large file support is that it's much easier to reconstruct than the cpio header. So while you obviously can't reasonably reconstruct PAYLOADDIGEST, maybe that would make it possible to reconstruct the uncompressed PAYLOADDIGESTALT. Perhaps not.

I have gut feeling that there are some things/features with mutual interests in this all.

mlschroe · 2021-01-08T12:37:46Z

I could imagine an option that overwrites the payloadloadformat. Using this does two things:

it tells rpm that the payload has been changed and needs to be unpacked in a different way
it works as a contract telling rpm that the caller has done all the needed data verification and disables payload digests/signatures

I prefer to do it with an option so that the header does not need to be modified and therefore header-only signatures still work.

I see an additional use case for this: deltarpms. If we have that option we no longer need to construct the compressed payload.

malmond77 · 2021-01-11T18:52:21Z

@pmatilai the large file format / PAYLOADDIGESTALT is a good idea for constructing original format rpms that most clients can use. I will spend more time thinking about it.

@mlschroe I agree that there should be some way to tell RPM that a(n original) rpm file already passed verify locally. I expect to do this implicitly in rpm2extents by failing the transcode if the verify fails. I will look to add a check early in the verify step that looks for the transcoded signature and if so, short circuits the other digest types, like you said in your second point :)

This is part of https://fedoraproject.org/wiki/Changes/RPMCoW The majority of changes are in two new programs: = rpm2extents Modeled as a 'stream processor'. It reads a regular .rpm file on stdin, and produces a modified .rpm file on stdout. The lead, signature and headers are preserved 1:1 to allow all the normal metadata inspection, signature verification to work as expected. Only the 'payload' is modified. The primary motivation for this tool is to re-organize the payload as a sequence of raw file extents (hence the name). The files are organized by their digest identity instead of path/filename. If any digest is repeated, then the file is skipped/de-duped. Only regular files are represented. All other entries like directories, symlinks, devices are fully described in the headers and are omitted. The files are padded so they start on `sysconf(_SC_PAGESIZE)` boundries to permit 'reflink' syscalls to work in the `reflink` plugin. At the end of the file is a footer with 3 sections: 1. List of calculated digests of the input stream. This is used in `librepo` because the file *written* is a derivative, and not the same as the repo metadata describes. `rpm2extents` takes one or more positional arguments that described which digest algorithms are desired. This is often just `SHA256`. This program is only measuring and recording the digest - it does not express an opinion on whether the file is correct. Due to the API on most compression libraries directly reading the source file, the whole file digest is measured using a subprocess and pipes. I don't love it, but it works. 2. Sorted List of file content digests + offset pairs. This is used in the plugin with a trivial binary search to locate the start of file content. The size is not needed because it's part of normal headers. 3. (offset of 1., offset of 2., 8 byte MAGIC value) triple = reflink plugin Looks for the 8 byte magic value at the end of the rpm file. If present it alters the `RPMTAG_PAYLOADFORMAT` in memory to `clon`, and reads in the digest-> offset table. `rpmPackageFilesInstall()` in `fsm.c` is modified to alter the enumeration strategy from `rpmfiNewArchiveReader()` to `rpmfilesIter()` if not `cpio`. This is needed because there is no cpio to enumerate. In the same function, if `rpmpluginsCallFsmFilePre()` returns `RPMRC_PLUGIN_CONTENTS` then `fsmMkfile()` is skipped as it is assumed the plugin did the work. The majority of the work is in `reflink_fsm_file_pre()` - the per file hook for RPM plugins. If the file enumerated in `rpmPackageFilesInstall()` is a regular file, this function will look up the offset in the digest->offset table and will try to reflink it, then fall back to a regular copy. If reflinking does work: we will have reflinked a whole number of pages, so we truncate the file to the expected size. Therefore installing most files does involve two writes: the reflink of the full size, then a fork/copy on write for the last page worth. If the file passed to `reflink_fsm_file_pre()` is anything other than a regular file, it return `RPMRC_OK` so the normal mechanics of `rpmPackageFilesInstall()` are used. That handles directories, symlinks and other non file types. = New API for internal use 1. `rpmReadPackageRaw()` is used within `rpm2extents` to read all the headers without trying to validate signatures. This eliminates the runtime dependency on rpmdb. 2. `rpmteFd()` exposes the Fd behind the rpmte, so plugins can interact with the rpm itself. 3. `RPMRC_PLUGIN_CONTENTS` in `rpmRC_e` for use in `rpmpluginsCallFsmFilePre()` specifically. 4. `pgpStringVal()` is used to help parse the command line in `rpm2extents` - the positional arguments are strings, and this converts the values back to the values in the table. Nothing has been removed, and none of the changes are intended to be used externally, so I don't think a soname bump is warranted here.

The existing code contains some variability in formatting. I'm not sure if { is meant to be on the end of the line, or on a new line, but I've standardized on the former. The indentation is intended to match the existing convention: 4 column indent, but 8 column wide tab characters. This is easy to follow/use in vim, but is surprisingly difficult to get right in vscode. I am doing this reformat here and now, and future changes will be after this. I'm keen to fold the patches together, but for now, I'm trying to keep the history of rpm-software-management#1470 linear so everyone can follow along.

There were some mismatches on field "sizes". This should eliminate the error messages.

malmond77 · 2021-02-01T05:55:58Z

I've addressed some of the nits. That's the easy bit.

For a more general update, see https://bugzilla.redhat.com/show_bug.cgi?id=1915976#c4 .

malmond77 · 2021-02-02T23:42:21Z

@pmatilai I'd like to address the remaining work with regards to timelines. I'm really keen on shipping as much as possible for Fedora 34 as it likely benefits the base code used for CentOS 9 Stream. I'm aware there are only a few more days left before Fedora 34 is frozen. I'd like to get my expectations in order, I'd like to know if the primary blockers are primarily related to the plugin API, and addressing the concern on untrusted decompression? If it helps, I'd like to chat on IRC or however you'd like, with a promise to write up a summary of our conversation for the record. Is there anyone else you'd like to weigh in here?

pmatilai · 2021-02-03T10:41:16Z

Sorry, I just don't have the bandwidth to deal with this in any sort of detail at this time. What seems clear to me though is that this isn't going to be acceptable in F34, beyond that the crystal ball grows dimmer.

That said, what little background processing I've managed to do on this, the more I think this isn't a use-case we'd want to be maintaining going forward in rpm itself. It's forced there right now because the plugin API is not public, and that API probably isn't quite enough to handle this as it is, especially getting the plugin API public is a long-standing todo-item, getting increasingly more important. As are a thousand other things, unfortunately, but at least the plugin API is something we can justify spending cycles on because it benefits so much more than just somebody's very special use-case.

I don't have time to dig into the details, but as a general guideline, try to approach this from the perspective of enhancing the plugin API to the goal of making this possible without changes to core rpm.

malmond77 · 2021-02-03T16:42:51Z

@pmatilai thanks for the feedback. I'm generally in agreement wrt the plugin API and externalizing optional code. I'm left wondering about rpm2extents.c: this isn't a plugin, but it does make extensive use of internal functions as they are the canonical implementation. Exposing those in librpm is likely not part of the goals for the regular API. What direction do you recommend I look at here?

The existing code contains some variability in formatting. I'm not sure if { is meant to be on the end of the line, or on a new line, but I've standardized on the former. The indentation is intended to match the existing convention: 4 column indent, but 8 column wide tab characters. This is easy to follow/use in vim, but is surprisingly difficult to get right in vscode. I am doing this reformat here and now, and future changes will be after this. I'm keen to fold the patches together, but for now, I'm trying to keep the history of rpm-software-management#1470 linear so everyone can follow along.

chantra · 2022-02-03T01:45:24Z

@pmatilai ,

I am actually taking @malmond77 's work and working on it. I suspect this PR got popped back into your list because of the commits going into my branch.

@malmond77 existing work, but ported to work on top of master (and more exactly to work on top of #1534 ), is https://github.com/chantra/rpm/tree/cow

From there, I have been trying to resolve @DemiMarie 's concerns in https://github.com/chantra/rpm/tree/cow_signvalidation with this approach:

while reading the file from stdin the content is passed into rpmpkgVerifySigs, at least some form of it, slightly modified to not log to stdout so we can capture the results of the signature verifications.
capture the result of that call and put it into the transcoded file footer metadata.
rpmkeys, can then check if the file is transcoded, and if so, return the rc code and text that was generated during transcoding.

This approach alllows to validate the authenticity of the rpm during transcode time. From there on, if deemed valid at transcode time, the result is essentially cached in the file.

Now, I understand you have concerns about the use of internal APIs, and I totally agree with your concerns, and as much I am willing to put some effort in making this happen, some guidance from your side will be useful and very much appreciated to define what should/could become public, and what could not. So would I appreciate a reasonable compromise that would allows moving forward without a whole refactoring of the current APIs.

In rpm-software-management#1470, it was mentioned that currently the plugin API is too coupled to rpmRC code and would benefit from [getting its own return codes](rpm-software-management#1470 (comment)) and eventually become a [public API](rpm-software-management#1470 (comment)). This diff defines a new emum `rpmPluginRC` and its accompanying `RPMPLUGINRC_{OK,NOTFOUND,FAIL}`, which are currently being used by plugins, and adjust the existing plugin to use it. CI build: https://gist.github.com/chantra/c8f61770b133b66c24cfbd5c2ed60a69

pmatilai · 2022-02-09T09:04:43Z

@chantra - just a heads-up: this is a most unfortunate time to be working on this stuff as the fsm will see an even bigger change (than #1534) in a month or two (hopefully!) to deal with the multiple open symlink CVEs.

Decoupling the plugin rc's is probably a good thing anyhow, but as for the initial use-case of that I think we just need a better API instead, one that is designed to offload payload content handling to a plugin rather than shoehorn it into the clearly ill-fitting existing one. As payload format is a per-package thing, this should be determined on rpmte level rather than per-file level, and the writing side needs to be something like rpm handing the plugin an open fd and the plugin placing the content there by whatever means.

chantra · 2022-02-09T20:43:17Z

Thanks for the heads up @pmatilai !

Decoupling the plugin rc's is probably a good thing anyhow, but as for the initial use-case of that I think we just need a better API instead, one that is designed to offload payload content handling to a plugin rather than shoehorn it into the clearly ill-fitting existing one.

Agreed, and I actually started iterating on this yesterday on my branch: https://github.com/chantra/rpm/commits/cow_signvalidation.

For this specific use-case, there is 2 entry points that we would need.

ability to "install a file"
ability to provide a RPM file iterator to rpmPackageFilesInstall and as we iterate through the content, possibly offload it to the plugin to install the file.

file install hook and its usage . Likewise for the rpmfi and its usage

A plugin hook should be able to:

process or not a call
whether it was processed or not, it should be able to pass the handling through the chain (which is the current behaviour)
it should also be able to stop the chaining. This is the current attempt signal via RPMRC_PLUGIN_CONTENT
when we run through
There is definitely ways to improve this API.

With those new hooks, @malmond77 's original patch is almost fully isolated outside of core RPM code. There is still some helper functions like pgpStringVal, rpmteFd (this is the next thing I want to look into), rpmReadPackageRaw, but other than that, no more special casing whether it is a standard cpio rpm or not.
Being able to have such hooks in the short term would at least help in easily developing a feature such as RPM CoW without getting too intermingled with RPM core code.

I would appreciate if we could come to an agreement on what those hooks could look like so those hooks could be addressed separately to the main PR and merged, allowing to then rebase this work off those hooks. Given that the API is currently internal only, it gives us some flexibility to experiment and iterate if needs be. #1909 is possibly a good place to start at?

As I was working on a (not perfect but better than nothing) signature validation solution, it feels like plugins should also be able to act on signature/digest verification.

The current work done on RPM CoW is at chantra/rpm@master...cow_signvalidation .

I still have some rough edges to address, want to test fully in prod.... but your feedback would be very appreciated so I can make progress toward something sustainable and PR-able eventually when it is a better time.

I will take some time to summarize more thoughts in https://hackmd.io/@chantra/BJ1Hedw0F once I am done with experimenting with this WIP.

DemiMarie · 2022-02-09T21:38:39Z

@malmond77 would you be willing to work on improving signature validation? Since the header is signed it is possible to include a list of digests there. RPM would need to check to check the header’s signature before it begins reading in the payload; I am not sure if it does this already. With each digest being of a 128KiB chunk, and the list of digests being in an binary entry, packages up to 2GiB are supported before running into RPM array size limits. If these limits are loosened for binary data, or if base64-encoded string arrays are used, much larger packages can be supported. Maximum amount of data that needs to be buffered is 128KiB and overhead is limited to 64 bytes (for SHA-512) in the header for each 128KiB chunk of payload. Stock RPM will simply ignore this entry.

DemiMarie · 2022-02-09T21:51:45Z

I will take some time to summarize more thoughts in https://hackmd.io/@chantra/BJ1Hedw0F once I am done with experimenting with this WIP.

If (as this document implies) the entire non-transcoded file is buffered on disk, things are far simpler. The RPMTAG_PAYLOADDIGESTALT entry can be used, which allows standard RPM tools to verify the signature. Furthermore, the file can be checked against an additional metadata entry.

chantra · 2022-02-09T22:47:46Z

@DemiMarie

If (as this document implies) the entire non-transcoded file is buffered on disk, things are far simpler.

No, it is not buffered on disk. Let me know what makes you think so, so i can clarify it.

Other than that, I am with you on improving the signature validation, but at this stage, it would require support from the packager side. TBH, being able to validate chunks on the fly may also be beneficial to non-cow RPM (if it were to no buffer files on disk before unarchiving).
The current transcoded metadata allows for an in-between solution that allows verification, but with the decompression library still being exposed to vulnerability. The attack surface can be lowered by dropping privileges/chrooting... the transcoder.

In any cases, even if chunk validation is done, a summary of the validation will still need to be store in the transcoded file metadata so the transcoded file can be "verified", or at least, it can be verified that it had been verified, after the fact.

Before getting into this though, I like to see how RPM CoW can become a thing in RPM, iterate on that, and time will come to improve signature validation.

DemiMarie · 2022-02-09T22:54:17Z

@DemiMarie

If (as this document implies) the entire non-transcoded file is buffered on disk, things are far simpler.

No, it is not buffered on disk. Let me know what makes you think so, so i can clarify it.

You mention the need for the file to be seekable.

Other than that, I am with you on improving the signature validation, but at this stage, it would require support from the packager side. TBH, being able to validate chunks on the fly may also be beneficial to non-cow RPM (if it were to no buffer files on disk before unarchiving). The current transcoded metadata allows for an in-between solution that allows verification, but with the decompression library still being exposed to vulnerability. The attack surface can be lowered by dropping privileges/chrooting... the transcoder.

That is definitely a step forward, especially if the hashing is performed in the parent process. I imagine SECCOMP_SET_MODE_STRICT would be pretty hard to break out of, and would dramatically reduce my worries about this patch. Is setting up SECCOMP_SET_MODE_STRICT before decompression a viable option?

In any cases, even if chunk validation is done, a summary of the validation will still need to be store in the transcoded file metadata so the transcoded file can be "verified", or at least, it can be verified that it had been verified, after the fact.

Absolutely (and I have no problem with this).

Before getting into this though, I like to see how RPM CoW can become a thing in RPM, iterate on that, and time will come to improve signature validation.

What if RPM CoW is merged but not enabled by default? Once signature validation is up to par with standard RPM it can be enabled by default.

chantra · 2022-02-09T23:21:59Z

You mention the need for the file to be seekable.

Ha. Yeah, so in order to be able to verify a transcoded file, you need to be able to seek in the file (the footer contains the magic + metadata).

So, something like:

cat mytranscoded.rpm | rpmkeys -Kv -

would not work, but
So, something like:

rpmkeys -Kv - <mytranscoded.rpm
or
rpmkeys -Kv mytranscoded.rpm

does. dnf currently use the latter approach.

That is definitely a step forward, especially if the hashing is performed in the parent process. I imagine SECCOMP_SET_MODE_STRICT would be pretty hard to break out of, and would dramatically reduce my worries about this patch. Is setting up SECCOMP_SET_MODE_STRICT before decompression a viable option?

rpm2extents in essence reads from stdin and writes to stdout so it should be pretty easy to sandbox. There is likely a bit more to it dues to expectations from the rpm library, but as much as privileges, it needs none. I have not looked into the specifics here yet, but from a high-level, it should be reasonably easy.

What if RPM CoW is merged but not enabled by default? Once signature validation is up to par with standard RPM it can be enabled by default.

As it is, RPM CoW only gets enabled when dnf-plugin-cow is enabled.

chantra · 2022-02-10T19:05:02Z

As payload format is a per-package thing, this should be determined on rpmte level rather than per-file level

This is somehow what is being done here. In the psm_pre hook, reflink determine that the package is a transcoded one and is to be handled by itself. I understand this hook is at rpmte level.

and the writing side needs to be something like rpm handing the plugin an open fd and the plugin placing the content there by whatever means

This is currently emulated by providing an archive_reader hook, that lets the plugin define how to iterate over each files (using header or content of payload). And then, for each entries, install of the files can be offloaded to the plugin with the install_file hook. If nothing is performed by a plugin, it defaults to core RPM behaviour.

Does that somehow align? Any changes you like to see in how this is done? Shall I start a PR on top of #1909 with those new hooks?

chantra · 2022-02-11T17:46:56Z

That is definitely a step forward, especially if the hashing is performed in the parent process. I imagine SECCOMP_SET_MODE_STRICT would be pretty hard to break out of, and would dramatically reduce my worries about this patch. Is setting up SECCOMP_SET_MODE_STRICT before decompression a viable option?

@DemiMarie so I looked a bit more into seccomp. Unfortunately strict mode is.... too strict with only read/write/_exit, nd sigreturn.
With a bit of fiddling with SECCOMP_RET_TRAP, this is the list I came up with in order to get the decompression to work:
https://gist.github.com/chantra/0fd33c338ba7465d87222b68bab76e70

As much as portability, AFAIK, not all architectures are covered, but I may be wrong. There is likely some corner cases here. For one, the FD_t layer adds stats computation to the read/write operations and within that lays calls to gettimeofday, which may or may not be a vsdo, so probably another one to add to the list.

DemiMarie · 2022-02-11T17:58:31Z

That is definitely a step forward, especially if the hashing is performed in the parent process. I imagine SECCOMP_SET_MODE_STRICT would be pretty hard to break out of, and would dramatically reduce my worries about this patch. Is setting up SECCOMP_SET_MODE_STRICT before decompression a viable option?

@DemiMarie so I looked a bit more into seccomp. Unfortunately strict mode is.... too strict with only read/write/_exit, nd sigreturn. With a bit of fiddling with SECCOMP_RET_TRAP, this is the list I came up with in order to get the decompression to work: https://gist.github.com/chantra/0fd33c338ba7465d87222b68bab76e70

That still looks fairly tight. You need to make sure that all unnecessary file descriptors have been closed, since otherwise a compromised process could use them without interference from seccomp. I suggest pointing FDs 0, 1, and 2 at /dev/null, and closing all of the others except for the ones that the decompressor is using. seccomp also supports parameter filtering, and I suggest using it for futex and other syscalls. Also, as a defense-in-depth measure, it is best to drop privileges by chrooting into /var/empty or a deleted directory, and by switching to an unprivileged user. I recommend SECCOMP_RET_KILL_PROCESS instead of SECCOMP_RET_KILL, since it kills all threads.

For sandboxing to be useful, the output of the sandboxee must be assumed untrusted. Hash computations must be done in the parent, for example. Also note that programs that call RPM may be multi-threaded, and using any non-async-signal-safe operation in the child process after fork() is undefined behavior for a multi-threaded parent. Ideally, that means you should execve() a new process that would then sandbox itself. In practice, I believe that glibc makes at least malloc() work. Using seccomp in a dedicated thread will appear to work, but it won’t actually provide any benefit since the address space is still shared.

As much as portability, AFAIK, not all architectures are covered, but I may be wrong. There is likely some corner cases here. For one, the FD_t layer adds stats computation to the read/write operations and within that lays calls to gettimeofday, which may or may not be a vsdo, so probably another one to add to the list.

That is a good idea. I recommend disabling RPM-CoW by default on architectures where a sandbox is not available.

chantra · 2022-02-11T19:05:22Z

@DemiMarie thanks for the feedback.

That still looks fairly tight. You need to make sure that all unnecessary file descriptors have been closed, since otherwise a compromised process could use them without interference from seccomp. I suggest pointing FDs 0, 1, and 2 at /dev/null, and closing all of the others except for the ones that the decompressor is using.

Agreed, cleaning up fds, in general, not just for this specific process, is still on my plate.

seccomp also supports parameter filtering, and I suggest using it for futex and other syscalls.

yes, I saw that but did not go down that rabbit-hole just yet. I merely wanted to PoC what leveraging seccomp, when available, would look like.

Also, as a defense-in-depth measure, it is best to drop privileges by chrooting into /var/empty or a deleted directory, and by switching to an unprivileged user.

yeah, @malmond77 and I were talking about plainly refusing to run this as root, forcing a caller to at least ensure that they are not calling rpm2extents as root. As much as which user. systemd dynamic users would be a good fit here vs a widely used nobody or such. As much as chrooting, it appears that rpm already has some userns/chrooting integration which makes this much easier than if we had to re-implement it.

I recommend SECCOMP_RET_KILL_PROCESS instead of SECCOMP_RET_KILL, since it kills all threads.

Good point, I just used the widely available seccomp-bpf.h from https://outflux.net/teach-seccomp/ as a PoC.

For sandboxing to be useful, the output of the sandboxee must be assumed untrusted. Hash computations must be done in the parent, for example.

This is being done in a separate process. The sandboxee here is only performing the transcoding, reading the results from digest/validation from a pipe and tacking this to the end of the transcoded file. I am not sure how we could consider the output untrusted given that it is the responsibility of the transcoder to transcode. e.g, whatever the transcoder has produced will be believed to be a legitimate transcoded version of the original content. By sandboxing we mostly want to ensure that vulnerability in the decompression library is not being exploited to gain more privileges.
To your point though, it should be the parent that collects the result of digest/signature and put this into the file (e.g spit it out to stdout).

Also note that programs that call RPM may be multi-threaded, and using any non-async-signal-safe operation in the child process after fork() is undefined behavior for a multi-threaded parent. Ideally, that means you should execve() a new process that would then sandbox itself. In practice, I believe that glibc makes at least malloc() work. Using seccomp in a dedicated thread will appear to work, but it won’t actually provide any benefit since the address space is still shared.

rpm2extents is used as part of a pipeline, taking a streamed rpm in stdin and writing its transcoded version to stdout. The one invoking rpm2extents in its pipeline should be exec-ing it. rpm2extents is single-threaded and spawn worker processes to perform individual actions.

cgwalters · 2022-02-11T19:26:00Z

Bear in mind that rpm is also used inside containers, not necessarily with CAP_SYS_ADMIN privileges as is typical on "host systems". And inside containers, one can't rely necessarily on the ability to recursively apply container/sandboxing features.

Carrying a seccomp policy is a very heavy investment.

In rpm-ostree we have a mixed usage of systemd and bwrap, but since we now are trying to also run inside a container as part of ostree native containers it's forcing us to not have those as hard dependencies.

DemiMarie · 2022-02-11T19:32:11Z

Bear in mind that rpm is also used inside containers, not necessarily with CAP_SYS_ADMIN privileges as is typical on "host systems". And inside containers, one can't rely necessarily on the ability to recursively apply container/sandboxing features.

RPM-CoW can be simply disabled in this case. It’s an optional feature.

chantra · 2022-02-11T19:33:06Z

Thanks for the heads-up @cgwalters .

Bear in mind that rpm is also used inside containers, not necessarily with CAP_SYS_ADMIN privileges as is typical on "host systems". And inside containers, one can't rely necessarily on the ability to recursively apply container/sandboxing features.

So rpm2extents needs zero capability, so I am hoping caps would not be an issue. But to your point, relying on userns to be able to chroot as an unprivileged user would probably break within a container.

Carrying a seccomp policy is a very heavy investment.

In rpm-ostree we have a mixed usage of systemd and bwrap, but since we now are trying to also run inside a container as part of ostree native containers it's forcing us to not have those as hard dependencies.

Tanks for the link.

chantra · 2022-02-14T19:13:58Z

The discussion/feedback that have been going on recently were very useful, but I would like to re-focus the original intent of this PR so we can make progress moving forward.

@pmatilai , @ffesti how would you like to proceed with this?

From my end, I see a few individual chunks that would help in getting stuff out incrementally without it being a big change.

rpm2extents change: This is the program to use in the pipeline to transcode the rpm into a RPMCoW-enabled rpm. This is mostly a standalone change, the only place where it touches rpm is to be able to leverage the existing signature verification logic. It also need the new rpmReadPackageRaw and having pgpStringVal to match digest algoritm from a string passed as an argument.
improve plugin API
- Decouple from rpmRC ([rpmplugin] decouple rpmplugin return code from rpmRC #1909)
- handle the RPMRC_PLUGIN_CONTENTS logic, e.g being able to have a plugin stop the plugin chain rather than iterating over single plugins.
- add new hooks when installing files and when getting a rpmfi
with the plugin API above, we will be able to integrate reflink as a plugin and not interference with RPM core.

I believe if we were to address the 3 items above in individual PRs, it will be easier to digest this original PR, iterate through each individual items to come up with something that we are all happy with.
Any suggestions?

In rpm-software-management#1470, it was mentioned that currently the plugin API is too coupled to rpmRC code and would benefit from [getting its own return codes](rpm-software-management#1470 (comment)) and eventually become a [public API](rpm-software-management#1470 (comment)). This diff defines a new emum `rpmPluginRC` and its accompanying `RPMPLUGINRC_{OK,NOTFOUND,FAIL}`, which are currently being used by plugins, and adjust the existing plugin to use it. CI build: https://gist.github.com/chantra/c8f61770b133b66c24cfbd5c2ed60a69

chantra · 2022-04-13T15:38:24Z

@chantra - just a heads-up: this is a most unfortunate time to be working on this stuff as the fsm will see an even bigger change (than #1534) in a month or two (hopefully!) to deal with the multiple open symlink CVEs.

@pmatilai has the storm passed? Is this now a better time to move forward with this PR? I would really love that we can start setting milestones to chip at in order to make RPM CoW mergeable. Thanks.

chantra · 2022-04-21T16:14:23Z

@pmatilai , @ffesti , have things settled down on your side that we could resume work on getting the needed helpers/foundations for this PR going?
How would you suggest we go about it?

chantra · 2022-04-27T16:23:29Z

@pmatilai , @ffesti , is there anything I can do to help getting this moving forward? Are you still busy with the CVEs and can't spare cycles? Or is there any other blockers that I am not aware of?
Thanks

pmatilai · 2022-05-05T12:17:55Z

For the time being, I think the most productive thing to do would be just opening a discussion on the API needs of the plugin-to-be. When something concrete gets decided there, it's time to talk about PRs.

chantra · 2022-05-05T21:09:15Z

Thanks @pmatilai . I will try to get something concise in a discussion thread then.

malmond77 force-pushed the cow branch 2 times, most recently from 2d0cda5 to 58f8d8b Compare December 31, 2020 18:29

pmatilai reviewed Jan 5, 2021

View reviewed changes

lib/fsm.c Outdated Show resolved Hide resolved

pmatilai reviewed Jan 5, 2021

View reviewed changes

lib/rpmplugins.c Show resolved Hide resolved

pmatilai reviewed Jan 5, 2021

View reviewed changes

plugins/reflink.c Outdated Show resolved Hide resolved

Conan-Kudo mentioned this pull request Jan 31, 2021

Add support for rpm2extents transcoder rpm-software-management/librepo#222

Open

malmond77 added 4 commits January 31, 2021 17:11

Remove use of bool type for consistency

9de362d

Fix printf formatting in reflink.c

19694b7

There were some mismatches on field "sizes". This should eliminate the error messages.

malmond77 force-pushed the cow branch from 58f8d8b to 19694b7 Compare February 1, 2021 01:32

chantra mentioned this pull request Feb 4, 2022

[rpmplugin] decouple rpmplugin return code from rpmRC #1909

Closed

lnussel mentioned this pull request Mar 22, 2022

rpm package header based db backend #1959

Closed

This was referenced Feb 1, 2023

Cow denylist refactoring rphibel/rpm#1

Closed

RPM with Copy on Write #2378

Closed

RPM with Copy on Write #1470

RPM with Copy on Write #1470

Conversation

malmond77 commented Dec 29, 2020

rpm2extents

reflink plugin

New API for internal use

DemiMarie commented Dec 30, 2020

malmond77 commented Dec 30, 2020

DemiMarie commented Dec 30, 2020

lnussel commented Jan 4, 2021

malmond77 commented Jan 4, 2021

pmatilai commented Jan 5, 2021

pmatilai Jan 5, 2021

Choose a reason for hiding this comment

chantra Feb 4, 2022

Choose a reason for hiding this comment

pmatilai commented Jan 5, 2021

pmatilai commented Jan 5, 2021 • edited Loading

pmatilai commented Jan 5, 2021

pmatilai commented Jan 5, 2021

malmond77 commented Jan 5, 2021

pmatilai commented Jan 7, 2021

mlschroe commented Jan 8, 2021

malmond77 commented Jan 11, 2021

malmond77 commented Feb 1, 2021

malmond77 commented Feb 2, 2021

pmatilai commented Feb 3, 2021

malmond77 commented Feb 3, 2021

chantra commented Feb 3, 2022

pmatilai commented Feb 9, 2022

chantra commented Feb 9, 2022

DemiMarie commented Feb 9, 2022

DemiMarie commented Feb 9, 2022

chantra commented Feb 9, 2022

DemiMarie commented Feb 9, 2022

chantra commented Feb 9, 2022

chantra commented Feb 10, 2022

chantra commented Feb 11, 2022

DemiMarie commented Feb 11, 2022

chantra commented Feb 11, 2022

cgwalters commented Feb 11, 2022

DemiMarie commented Feb 11, 2022

chantra commented Feb 11, 2022

chantra commented Feb 14, 2022

chantra commented Apr 13, 2022 • edited Loading

chantra commented Apr 21, 2022

chantra commented Apr 27, 2022

pmatilai commented May 5, 2022

chantra commented May 5, 2022

pmatilai commented Jan 5, 2021 •

edited

Loading

chantra commented Apr 13, 2022 •

edited

Loading