Improve SBOM storage/format #3068

Open
marshall007 opened this issue Oct 3, 2024 · 3 comments
Labels
enhancement ✨ New feature or request

Comments

@marshall007

Describe what should be investigated or refactored

Currently the sboms.tar layer contains both JSON documents and generated HTML for an "SBOM viewer" page for each of the images in the Zarf package. The current approach has several downsides:

  1. adds non-trivial overhead to the size of every Zarf package stored in OCI
  2. downloading individual SBOMs (e.g., for a particular image) is impossible
  3. the JSON documents are in a Syft-specific JSON format (not SPDX or CycloneDX) and thus not consumable by other tooling, like Trivy
  4. there is no way to tell what specific tooling/version was used to generate the SBOMs

$ oras blob fetch ghcr.io/defenseunicorns/packages/uds/gitlab@sha256:3269b4c33b0d452e6935fc7782dbf64da197b2e6225eb0ba7699831a6cabe877 --output sboms.tar
$ ls -lah sboms.tar
-rw-rw-r-- 1 marshall007 marshall007  77M Oct  3 15:07 sboms.tar

For comparison, compressed tarballs that contain only the JSON documents are more than 10x smaller (roughly 14x with gzip and 45x with xz):

$ ls -lah sboms.tar*
-rw-rw-r-- 1 marshall007 marshall007  77M Oct  3 15:07 sboms.tar
-rw-rw-r-- 1 marshall007 marshall007 5.4M Oct  3 15:09 sboms.tar.gz
-rw-rw-r-- 1 marshall007 marshall007 1.7M Oct  3 15:09 sboms.tar.xz
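
A rough way to reproduce this kind of comparison without shelling out is sketched below. It is only an illustration: the helper names (`size_by_extension`, `filter_tar`, `compressed_sizes`) are hypothetical, and it assumes the SBOM documents and viewer pages inside sboms.tar can be told apart by their `.json` and `.html` extensions.

```python
import gzip
import io
import lzma
import os
import tarfile
from typing import Dict


def size_by_extension(tar_bytes: bytes) -> Dict[str, int]:
    """Sum the uncompressed size of regular files in a tar archive, grouped by extension."""
    totals: Dict[str, int] = {}
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:") as tar:
        for member in tar.getmembers():
            if member.isfile():
                ext = os.path.splitext(member.name)[1].lower()
                totals[ext] = totals.get(ext, 0) + member.size
    return totals


def filter_tar(tar_bytes: bytes, keep_ext: str = ".json") -> bytes:
    """Return a new uncompressed tar holding only the files whose extension is keep_ext."""
    out = io.BytesIO()
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:") as src, \
            tarfile.open(fileobj=out, mode="w") as dst:
        for member in src.getmembers():
            if member.isfile() and os.path.splitext(member.name)[1].lower() == keep_ext:
                dst.addfile(member, src.extractfile(member))
    return out.getvalue()


def compressed_sizes(data: bytes) -> Dict[str, int]:
    """Return the byte size of data as-is, gzip-compressed, and xz-compressed."""
    return {
        "raw": len(data),
        "gzip": len(gzip.compress(data)),
        "xz": len(lzma.compress(data, format=lzma.FORMAT_XZ)),
    }


# Usage against a local copy of the layer (path is an assumption):
#   with open("sboms.tar", "rb") as f:
#       layer = f.read()
#   print(size_by_extension(layer))
#   print(compressed_sizes(filter_tar(layer)))
```

The numbers will not match `gzip`/`xz` on the command line exactly, since compression levels and tar headers differ, but the ratio between the full layer and a JSON-only archive should be comparable.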

Proposed solution

  1. adopt standard SPDX JSON format
  2. store SPDX SBOM documents as OCI artifacts using in-toto attestations
  3. if we wish to keep the HTML "SBOM viewer", consider baking it into the CLI tool (i.e. start a web server that looks at local or remote SPDX JSON documents)
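
To make the second item more concrete, here is a minimal sketch of what an unsigned in-toto Statement carrying an SPDX document could look like. The helper name `spdx_statement`, the image name, and the placeholder digest are all hypothetical; signing the statement and pushing it to a registry are out of scope, and nothing here reflects how Zarf would actually implement it.

```python
import json
from typing import Any, Dict

# Type URIs as published by the in-toto attestation framework.
STATEMENT_TYPE = "https://in-toto.io/Statement/v1"
SPDX_PREDICATE_TYPE = "https://spdx.dev/Document"


def spdx_statement(subject_name: str, subject_sha256: str,
                   spdx_doc: Dict[str, Any]) -> Dict[str, Any]:
    """Wrap an SPDX JSON document in an unsigned in-toto Statement about one image."""
    return {
        "_type": STATEMENT_TYPE,
        "subject": [{"name": subject_name, "digest": {"sha256": subject_sha256}}],
        "predicateType": SPDX_PREDICATE_TYPE,
        "predicate": spdx_doc,
    }


example = spdx_statement(
    "registry.example.com/app",  # hypothetical image name
    "0" * 64,                    # placeholder digest, not a real image
    {"spdxVersion": "SPDX-2.3", "SPDXID": "SPDXRef-DOCUMENT", "name": "app"},
)
print(json.dumps(example, indent=2))
```

Because each statement names exactly one subject digest, a consumer could fetch the SBOM for a single image instead of the whole sboms.tar layer.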
@marshall007 marshall007 added the tech-debt 💳 Debt that the team has charged and needs to repay label Oct 3, 2024
@Racer159
Contributor

Racer159 commented Oct 4, 2024

Currently Zarf has prioritized an agnostic format for SBOMs to capture the maximum amount of data that Syft (the tool Zarf uses under the hood) can give Zarf. The Syft JSON files can be downconverted to other formats and conversion is covered in the latter half of this docs section: https://docs.zarf.dev/ref/sboms/#extracting-a-packages-sbom

@schristoff schristoff added enhancement ✨ New feature or request and removed tech-debt 💳 Debt that the team has charged and needs to repay labels Oct 4, 2024
@AustinAbro321
Contributor

AustinAbro321 commented Oct 4, 2024

For the tooling/version used are you looking to see Zarf or Syft?

As of v0.41.0, the Syft JSON has .descriptor.name and .descriptor.version, which evaluate to Zarf and the Zarf version, respectively. Additionally, under the .schema field there's the schema version of the Syft JSON.
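
For anyone who wants to check these fields in their own packages, a small sketch follows. `sbom_provenance` is a hypothetical helper, the sample values are made up, and it assumes `.schema` is an object with a `version` key. Empty strings are reported as `None` so that unpopulated fields stand out.

```python
import json
from typing import Any, Dict, Optional


def sbom_provenance(sbom: Dict[str, Any]) -> Dict[str, Optional[str]]:
    """Pull the generating tool, its version, and the schema version out of a Syft JSON document."""
    descriptor = sbom.get("descriptor") or {}
    schema = sbom.get("schema") or {}
    return {
        "tool": descriptor.get("name") or None,
        "tool_version": descriptor.get("version") or None,
        "schema_version": schema.get("version") or None,
    }


# Illustrative document; the values are not taken from a real package.
sample = json.loads(
    '{"descriptor": {"name": "zarf", "version": "v0.41.0"},'
    ' "schema": {"version": "16.0.15"}}'
)
print(sbom_provenance(sample))
```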

@marshall007
Author

Thanks guys, the Syft JSON makes sense. I'm sold.

> For the tooling/version used are you looking to see Zarf or Syft?

I think I'd expect to see Syft, but maybe this is not so important after all. I'm still looking into it, but maybe all that matters is the schema version. I need to see if different Syft versions produce different results (outside of schema version changes).

> As of v0.41.0, the Syft json has .descriptor.name and .descriptor.version, which evaluate to Zarf and Zarf version respectively.

I see these fields, but Zarf is failing to populate .descriptor.version in the SBOMs I've looked at so far.


Another thing I discovered today is that Zarf is not preserving the original manifest digests in the generated SBOM. Here is the diff between the .source section of an SBOM in the Zarf package vs what I get from scanning with syft directly:

[Image: diff of the .source section, Zarf package SBOM vs. direct syft scan]
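
Since the screenshot does not survive as text, here is one way such a comparison could be reproduced from two Syft JSON documents. This is a sketch only: `flatten` and `diff_source` are hypothetical helpers, and the field names in the example (such as `manifestDigest`) are illustrative rather than copied from the original diff.

```python
from typing import Any, Dict, Tuple

ABSENT = "<absent>"  # marker for a path present in only one document


def flatten(value: Any, prefix: str = "") -> Dict[str, Any]:
    """Flatten nested dicts and lists into a {dotted.path: leaf_value} mapping."""
    if isinstance(value, dict) and value:
        items = [(f"{prefix}.{key}" if prefix else str(key), child)
                 for key, child in value.items()]
    elif isinstance(value, list) and value:
        items = [(f"{prefix}[{index}]", child) for index, child in enumerate(value)]
    else:
        # Scalars and empty containers are treated as leaves.
        return {prefix: value}
    flat: Dict[str, Any] = {}
    for path, child in items:
        flat.update(flatten(child, path))
    return flat


def diff_source(a: Dict[str, Any], b: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Return {path: (value_in_a, value_in_b)} for every leaf under .source that differs."""
    flat_a = flatten(a.get("source", {}), "source")
    flat_b = flatten(b.get("source", {}), "source")
    diffs: Dict[str, Tuple[Any, Any]] = {}
    for path in sorted(flat_a.keys() | flat_b.keys()):
        left = flat_a.get(path, ABSENT)
        right = flat_b.get(path, ABSENT)
        if left != right:
            diffs[path] = (left, right)
    return diffs


# Usage with two local files (paths are assumptions):
#   import json
#   zarf_sbom = json.load(open("from-zarf-package.json"))
#   syft_sbom = json.load(open("from-syft-scan.json"))
#   for path, (left, right) in diff_source(zarf_sbom, syft_sbom).items():
#       print(path, left, right, sep="\n  ")
```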

Projects
Status: Triage