Generate SBOMs for Python and Rust dependencies #15

Open
legoktm opened this issue Sep 17, 2024 · 8 comments

@legoktm
Member

legoktm commented Sep 17, 2024

SBOMs are the cool new thing that are now becoming a best practice. Some links from 2 seconds of web searching:

@legoktm
Member Author

legoktm commented Dec 16, 2024

Interesting note from https://github.com/rust-secure-code/cargo-auditable?tab=readme-ov-file#does-this-protect-against-supply-chain-attacks

Software Bills of Materials (SBOMs) do not prevent supply chain attacks. They cannot even be used to assess the impact of such an attack after it is discovered, because any malicious library worth its bytes will remove itself from the SBOM. This applies to nearly every language and build system, not just Rust and Cargo.

Do not rely on SBOMs when dealing with supply chain attacks!

I was trying to figure out whether this is something we should do at build time or at development time (it would theoretically be harder, but not impossible, for a malicious library to infect the latter). CPython has a partially manual process in which they commit the SBOM: https://devguide.python.org/developer-workflow/sbom/ but I think most other projects generate it at build time. Especially for Rust I think that'll be necessary: rust-lang/rfcs#3553 explains that it needs to be done with cargo's dependency resolution and also the execution of build.rs files.

@legoktm
Member Author

legoktm commented Dec 16, 2024

There are two primary SBOM standards: SPDX (Linux Foundation project, also maintains license identifier list) and CycloneDX (OWASP project). https://www.ntia.gov/files/ntia/publications/sbom_formats_survey-version-2021.pdf has a comparison of them. My take is that the format itself really doesn't matter at this stage; if needed we can use tools like https://github.com/protobom/sbom-convert to convert between different formats.

As far as tooling goes, the CycloneDX ecosystem looks much better in Python and Rust:

But there are no "official" SPDX tools AFAICT.

Other ones I looked at:

Also I expect that the various package managers and language ecosystems will have their own built-in tools so it probably doesn't matter in the long-run what we pick now.

There's a good point at psf/sboms-for-python-packages#7 (comment) that SBOMs have a legal/regulatory impact (similar-ish to copyright tracking) so it's important to be cautious on what we're promising (e.g. the CPython SBOM deliberately excludes licensing information). I read the summary of https://www.ntia.gov/files/ntia/publications/sbom_minimum_elements_report.pdf and skimmed the rest, but I haven't looked at what's required by the EU.

@legoktm
Member Author

legoktm commented Dec 16, 2024

My initial proposal is:

We generate CycloneDX SBOMs that include our Python and Rust dependencies, but do not include any Debian-installed dependencies. For now we'll create separate SBOM files for each Python component and Rust workspace, but in the future we can use tools to merge the SBOM files into one.

The SBOM files are committed to Git in a "sbom" folder; any updates to prod dependencies will require it to be updated, and we'll have CI verify that. In the folder we'll have a README with a disclaimer like: "The following SBOMs are available for the primary purpose of vulnerability scanning. They are incomplete; if you need a SBOM for regulatory or legal purposes, please contact the SecureDrop team directly."
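As a rough illustration of what the CI check could do (a minimal sketch, not what any of our repos actually run): assuming the Python dependencies are pinned as `name==version` in a requirements-style file and the committed SBOM is CycloneDX JSON with a top-level `components` array, the comparison might look like the following. The function names are made up for this example, and Rust dependencies from Cargo.lock would need a similar but separate check.

```python
import re


def normalize(name):
    """Normalize a package name PEP 503 style, so 'Foo_Bar' equals 'foo-bar'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def pinned_requirements(text):
    """Parse 'name==version' pins from requirements.txt-style text."""
    pins = {}
    for line in text.splitlines():
        # Drop comments and environment markers, then keep only the first token,
        # which skips trailing backslashes and inline --hash options.
        line = line.split("#", 1)[0].split(";", 1)[0].strip()
        if not line:
            continue
        match = re.fullmatch(r"([A-Za-z0-9._-]+)==([^\s\\]+)", line.split()[0])
        if match:
            pins[normalize(match.group(1))] = match.group(2)
    return pins


def sbom_components(sbom):
    """Map component name to version for a parsed CycloneDX JSON document."""
    return {
        normalize(component["name"]): component.get("version")
        for component in sbom.get("components", [])
    }


def sbom_drift(requirements_text, sbom):
    """Return human-readable mismatches between the pins and the SBOM."""
    pins = pinned_requirements(requirements_text)
    listed = sbom_components(sbom)
    problems = []
    for name, version in sorted(pins.items()):
        if name not in listed:
            problems.append(f"{name}=={version} missing from SBOM")
        elif listed[name] != version:
            problems.append(f"{name}: pinned {version}, SBOM lists {listed[name]}")
    for name in sorted(set(listed) - set(pins)):
        problems.append(f"{name} is in the SBOM but not pinned")
    return problems
```

CI would load the committed file with `json.load`, call `sbom_drift`, and fail the job if the returned list is non-empty.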

We'll ship the SBOM files inside the deb/rpm packages, probably under /usr/share/doc or something. This way they're signed by proxy too (but that doesn't work for the admin workstation).

@zenmonkeykstop

Admin workstation does (if used correctly) involve checking the signed tag - is that enough or should we be hashing and signing sbom/ with the release key or something similar?
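For reference, if hashing `sbom/` turned out to be needed, a minimal sketch could build a checksum manifest over the folder, which could then be signed as a single file. The function name and the two-space "digest, path" line layout are assumptions made for this example, not an existing script.

```python
import hashlib
from pathlib import Path


def sbom_manifest(directory):
    """Return one 'sha256-hex  relative/path' line per file under `directory`."""
    lines = []
    # Sort the paths so the manifest is stable regardless of filesystem order.
    for path in sorted(Path(directory).rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            lines.append(f"{digest}  {path.relative_to(directory).as_posix()}\n")
    return "".join(lines)
```

Signing the manifest rather than each SBOM keeps it to one signature per release, at the cost of having to re-check every listed file against it.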

@legoktm
Member Author

legoktm commented Dec 16, 2024

I totally overlooked that, that's good enough I think. Eventually I think we should more formally publish our SBOMs alongside e.g. our build metadata, but I think that should wait until we have a better idea of the legal/regulatory impact and we're more confident that it's complete.

@legoktm
Member Author

legoktm commented Dec 16, 2024

To get a sense of how "incomplete" the generated SBOM files are, here are the additional dependencies we have, found just by looking at the build-debs.sh scripts in securedrop-client:

  • bash (host)
  • git (host)
  • podman (host)
  • debian:bookworm container image (idk if you also need to specify the packages inside the container)
  • securedrop-builder Git repository (this is its own project because each of these pre-built wheels should have their own SBOM files)
  • rustup / rustup-init
  • Rust toolchain
  • all the Debian packages listed in the buildinfo file.

I think it's doable to get to a 100% complete SBOM, but because much of it isn't tracked in a machine-readable way, it'll require manual work that I am skeptical is worth it based on current requirements/tooling.

legoktm added a commit to freedomofpress/securedrop-client that referenced this issue Dec 17, 2024
These focus on the Python and Rust dependencies, without touching the
myriad Debian and other dependencies we pull in at build time.

This builds the foundation for us to start adding more stuff.

Refs <freedomofpress/securedrop-tooling#15>.
@adaFPF

adaFPF commented Dec 20, 2024

Just noticed this as I was prepping to log off yesterday. Figured I'd drop some commentary today for folks to come back to in the new year.

Eventually I think we should more formally publish our SBOMs alongside e.g. our build metadata, but I think that should wait until we have a better idea of the legal/regulatory impact and we're more confident that it's complete.

I agree with this in spirit, and it relates to the main reason why I haven't been pushing for our dev teams to do so. Part of the unspoken bit here is that SBOMs are less useful for software that is intended for end-users, and way more for stuff that is used to build other things. SBOMs are also the weakest instance (but also the best developed) of various efforts to make software supply chain security better.

In our context, it doesn't add much in the way of value for folks making use of our software except to soothe nerves of folks who have a shallow idea that "SBOMs are good and important for software to have." I see us taking the time to generate and bundle them as more of a "good neighbor" thing to do by virtue of increasing overall adoption of the practice and (hopefully) therefore increasing the likelihood that more of OUR dependencies will be making them available. It's not something with any urgency, though I'll allow it's a pretty easy thing to make happen.

I was trying to figure out whether this is something we should do at build time or at development time

This is something that folks have had differing opinions on, and there are tradeoffs for the decision, some of which you articulated. My own preference, with an eye towards keeping things simpler in the long term, is to generate them at build time as a part of generating build provenance.

The laziest version of SBOM generation happens at dev time--it's really not much more than signing a specially formatted version of your dependency file. That's a reason to be even more cautious when it comes to other projects' SBOMs generated at that stage.

I did enjoy seeing somebody's work at a conference recently where they poked holes in SBOMs / "improved" them by analyzing ASTs of a piece of software at dev time to discover where folks were making use of functions from inherited dependencies which weren't included in the SBOMs, which typically are explicit about only covering first-order dependencies.
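I don't have their code, but the general idea can be sketched in a few lines. This is a minimal illustration, not their tool: it assumes Python source, uses the standard `ast` module to collect imported top-level modules, and reports any that are neither standard library nor in a caller-supplied set of declared dependencies. The function names are hypothetical, and `sys.stdlib_module_names` needs Python 3.10 or newer.

```python
import ast
import sys


def imported_top_level_modules(source):
    """Return the top-level module names imported anywhere in `source`."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                modules.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level > 0 is a relative import, which stays inside our own package.
            modules.add(node.module.split(".")[0])
    return modules


def undeclared_imports(source, declared, stdlib=None):
    """List imported modules that are neither stdlib nor in `declared`."""
    if stdlib is None:
        stdlib = sys.stdlib_module_names  # available in Python 3.10+
    return sorted(imported_top_level_modules(source) - set(declared) - set(stdlib))
```

One caveat: import names don't always match distribution names (PyYAML is imported as `yaml`), so a real check would need a mapping between the two before comparing against an SBOM.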

There are two primary SBOM standards: SPDX (Linux Foundation project, also maintains license identifier list) and CycloneDX (OWASP project)

You probably saw this in your investigation, but for completeness' sake: There's also SWID and VEX, neither of which are super conformant to current expectations around SBOMs. SWID was explicitly listed as one of the "main 3" types of SBOMs by the US govt a couple years back, but by and large is considered not actually viable for use by most folks except some govt related folks. VEX ended up rightly recognized as its own thing and is now considered a separate-but-similar effort for increasing supply chain security.

my take is that the format itself really doesn't matter at this stage

Agree. No clear winner. Most major tooling that lets you drop-in an SBOM serializer handles both. If there comes a time when there's a clearly dominant form, I expect there will be scaffolding in place to help make the shift if you picked the "wrong" one.

As far as tooling goes,

One option is to go intentionally format agnostic with something like protobom. Protocol buffers themselves don't natively support Rust, but there are third-party extensions for it.

SBOMs have a legal/regulatory impact

As far as I have heard, nobody is worried about being exposed to liability for having an SBOM available. In practice, it's treated (sadly) like a lot of compliance-related requirements: a ticky box to be checked off by folks who are forced to tick them off. I have heard tales of folks engaged in software development/consumption internal to the US govt literally using an unsigned Word document to be able to say they've ticked the box, on both sides of the process.

That's not to say that won't change, particularly given the nature of the incoming US federal executive administration. I expect it's reasonable to explicitly say what is in scope for an SBOM we choose to start making available, and to stick to what current conventions are--particularly and especially limiting it to top-level dependencies only. You could go further (especially if you wanted to iterate on making the resulting SBOM comprehensive over time, which looks to be your initial proposal precisely) and be explicit about being limited to language-specific dependencies as well, expanding that as you bring other languages and artifacts explicitly into scope. I would actually avoid the vague disclaimer suggested, as I suspect it is more likely to invite bureaucratic tedium and headache than to offer any real protection against liability.

The projects I've seen that generate SBOMs rarely have things like OS images to worry about capturing, or if they do then they make them explicitly out of scope. At best, they'll treat an (unmodified, pinned) image as a whole as a dependency, with the expectation that if you want to know what you're getting upstream via it, it's up to you to investigate directly. This is in line with the overall SBOM strategy (such as there is one) of eventually working towards indirectly making transitive dependency inspection possible through wide adoption of SBOM bundling. The goal is to have dependency inspection work like dominoes. That can help add a validation layer once things get to the point where automation can bundle (in a differential fashion) information about transitive dependencies and then compare it to the SBOMs directly associated with each transitive dependency's specific release referenced by the top-level SBOM at time of software consumption.

In other words, I would expect tooling to improve over time (granted, at a slow pace) for handling things like including dependencies inherited from an OS, and not stress myself too much about that for now.

@adaFPF

adaFPF commented Dec 20, 2024

Part of the unspoken bit here is that SBOMs are less useful for software that is intended for end-users, and way more for stuff that is used to build other things.

The goal is to have dependency inspection work like dominoes

Combine these two statements, and you basically have the motivation behind me wanting to do this instead of pushing for us to make our own SBOMs. With (hopefully) a new IT person arriving soon, maybe something to revisit in 2025?
