Make our Dangerzone image reproducible #1049

Open
wants to merge 27 commits into
base: main

@apyrgio apyrgio commented Jan 14, 2025

This PR makes our container image reproducible, and enforces it with a CI job on every commit. To do so, we also change our base image from Alpine Linux to Debian Stable, and we use some prior art and tooling from the work that @AkihiroSuda has done on reproducible containers.

Fixes #1046
Fixes #1047
Fixes #1048

@apyrgio apyrgio force-pushed the 1046-reproducibility branch 2 times, most recently from c1f5d75 to e02dbfd Compare January 14, 2025 10:29
@almet almet left a comment

Nice work on this!

I added a few comments here and there, and will probably come back to it once I've run it on my laptop. It's pretty neat!

@@ -0,0 +1,103 @@
#!/bin/bash
#
Contributor

We also had a discussion about this file, and whether it should be included in this repository or not (the other option would be to download it on the fly, which could be done by the build-image.py script).

We haven't come to a conclusion on this one just yet. It feels odd to have it included here, but it might work in the short run.

Contributor Author

Hm, would a third party notice make things better? Check out 4595b3a

@almet almet added this to the 0.9.0 milestone Jan 20, 2025
@apyrgio apyrgio force-pushed the 1046-reproducibility branch 3 times, most recently from eaf3de9 to cbb7ed9 Compare January 20, 2025 15:46
@almet almet left a comment

I've checked out your branch and used it locally. I'm glad to report that it works well! 🎉

I've added a few other questions / remarks, the main one being about ARM64 support for reproducibility.

almet commented Jan 21, 2025

(Also happy to report that the size of the image dropped from 577 MB to 445 MB with the changes in this branch)

@almet almet left a comment

Thanks for all the changes, I believe we're good to merge as soon as the CI is green!

@apyrgio apyrgio force-pushed the 1046-reproducibility branch 4 times, most recently from c8a016c to ac6931f Compare January 21, 2025 22:28
almet and others added 14 commits January 23, 2025 12:52

The previous library we were using for this (`appdirs`) is dead upstream
and not supported anymore in Debian testing.

Fixes #1058

Move container-only build context (currently just the entrypoint script)
from `dangerzone/gvisor_wrapper` to `dangerzone/container_helpers`.
Update the rest of the scripts to use this location as well.

Download and copy the following artifacts that will be used for building
a Debian-based Dangerzone container image in the subsequent commits:
* The APT key for the gVisor repo [1]
* A helper script for building reproducible Debian images [2]

[1] https://gvisor.dev/archive.key
[2] https://github.com/reproducible-containers/repro-sources-list.sh/blob/d15cf12b26395b857b24fba223b108aff1c91b26/repro-sources-list.sh
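
If artifacts like these were ever downloaded at build time instead of being copied into the repo, checking them against a recorded digest would keep the build deterministic. A minimal sketch of that idea, assuming a hypothetical `fetch_pinned` helper and a digest recorded by the maintainers (neither is part of this PR):

```python
import hashlib
import urllib.request


def sha256_hex(data: bytes) -> str:
    """Return the hex-encoded SHA-256 digest of `data`."""
    return hashlib.sha256(data).hexdigest()


def verify_artifact(data: bytes, expected_sha256: str) -> bytes:
    """Return `data` unchanged if its digest matches, else raise ValueError."""
    actual = sha256_hex(data)
    if actual != expected_sha256.lower():
        raise ValueError(
            f"digest mismatch: expected {expected_sha256}, got {actual}"
        )
    return data


def fetch_pinned(url: str, expected_sha256: str, timeout: float = 30.0) -> bytes:
    """Download `url` and reject the content unless its digest matches.

    Hypothetical helper: the caller passes a URL pinned to a specific
    commit, plus the digest recorded when the artifact was reviewed.
    """
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return verify_artifact(resp.read(), expected_sha256)
```

Pinning the URL to a commit (as in [2]) fixes which revision is requested; the digest check additionally guards against the served content changing.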

Remove the need to copy the Dangerzone container image (used by the
inner container) within a wrapper gVisor image (used by the outer
container). Instead, use the root of the container filesystem for both
containers. We can do this safely because we don't mount any secrets to
the container, and because gVisor offers a read-only view of the
underlying filesystem.

Fixes #1048

Switch base image from Alpine Linux to Debian Stable, in order to reduce
our image footprint, improve our security posture, and build our
container image reproducibly.

Fixes #1046
Refs #1047

Remove all the scaffolding in our `build-image.py` script for using the
`poetry.lock` file, now that we install PyMuPDF from the Debian repos.

Rename the `vendor-pymupdf.py` script to `debian-vendor-pymupdf.py`,
since it's used only when building Debian packages.

apyrgio and others added 13 commits January 23, 2025 23:25

Remove our suggestions for not using the container cache, which stemmed
from the fact that our Dangerzone image was not reproducible. Now that
we have switched to Debian Stable and the Dockerfile is all we need to
reproducibly build the exact same container image, we can just use the
cache to speed up builds.

Add jinja2-cli as a package dependency, since it will be used to create
the Dockerfile from some user parameters and a template.

Allow updating the Dockerfile from a template and some envs, so that
it's easier to bump the dates in it.
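
A rough sketch of the template idea, using Python's standard-library `string.Template` as a stand-in for Jinja2 so that it runs without extra dependencies. The template text, the parameter names (`DEBIAN_ARCHIVE_DATE`, `DEBIAN_IMAGE_TAG`), the sample values, and the `render_dockerfile` helper are all illustrative assumptions, not the contents of the actual Dockerfile template:

```python
import os
from string import Template

# Hypothetical two-line template; the real one in this PR is a Jinja2 file.
DOCKERFILE_TEMPLATE = Template(
    "ARG DEBIAN_ARCHIVE_DATE=${DEBIAN_ARCHIVE_DATE}\n"
    "FROM debian:${DEBIAN_IMAGE_TAG}\n"
)


def render_dockerfile(defaults: dict, env=os.environ) -> str:
    """Fill the template; values found in `env` override the defaults."""
    merged = {key: env.get(key, value) for key, value in defaults.items()}
    # substitute() raises KeyError if the template names a missing parameter,
    # so a forgotten value fails loudly instead of producing a broken file.
    return DOCKERFILE_TEMPLATE.substitute(merged)


if __name__ == "__main__":
    # Placeholder values, for illustration only.
    print(render_dockerfile({
        "DEBIAN_ARCHIVE_DATE": "20250120",
        "DEBIAN_IMAGE_TAG": "bookworm-slim",
    }))
```

With this shape, bumping a date means changing one default or exporting one environment variable, then regenerating the Dockerfile.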

Update the Debian snapshot date to the current one, so that we always
scan the latest image for CVEs.

Refs #1057

Allow setting a tag for the container image, when building it with the
`build-image.py` script. This should be used for development purposes
only, since the proper image name should be dictated by the script.

Add a dev script for Linux platforms that verifies that a source image
can be reproducibly built from the current Git commit. The
reproducibility check is enforced by the `diffoci` tool, which is
downloaded as part of running the script.
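
The property being checked can be illustrated on a single tar layer. The toy sketch below is not how `diffoci` or `reproduce.py` work internally; it only shows, under the assumption of a hypothetical `build_layer` helper, that fixing file order, timestamps, and ownership makes two independent builds byte-identical:

```python
import hashlib
import io
import tarfile


def build_layer(files: dict, mtime: int = 0) -> bytes:
    """Pack `files` (path -> bytes) into an uncompressed tar, deterministically."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name in sorted(files):           # fixed ordering
            info = tarfile.TarInfo(name=name)
            info.size = len(files[name])
            info.mtime = mtime               # fixed timestamp
            info.uid = info.gid = 0          # fixed ownership
            info.uname = info.gname = ""
            info.mode = 0o644                # fixed permissions
            tar.addfile(info, io.BytesIO(files[name]))
    return buf.getvalue()


def digest(data: bytes) -> str:
    """Return the hex-encoded SHA-256 digest of `data`."""
    return hashlib.sha256(data).hexdigest()


if __name__ == "__main__":
    files = {"b.txt": b"two", "a.txt": b"one"}
    # Same inputs, same digest, regardless of insertion order.
    print(digest(build_layer(files)) == digest(build_layer(dict(sorted(files.items())))))
    # A different timestamp alone is enough to change the digest.
    print(digest(build_layer(files, mtime=1)) == digest(build_layer(files)))
```

Anything that leaks the build time or environment into the output, such as the timestamp in the second comparison, is what a reproducibility check is meant to catch.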

Add a CI job that uses the `reproduce.py` dev script to enforce image
reproducibility, for every PR that we send to the repo.

Fixes #1047
Co-authored-by: Alexis Métaireau <[email protected]>

Mask some paths of the outer container in the OCI config of the inner
container. This is done to avoid leaking any sensitive information from
Podman / Docker / gVisor, since we reuse the same rootfs.

Refs #1048
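
As a sketch of what masking can look like, the OCI runtime spec lets a config list paths under `linux.maskedPaths` that are hidden from the container. The `mask_paths` helper and the example paths below are assumptions for illustration; they are not the list used in this PR:

```python
import copy


def mask_paths(oci_config: dict, paths: list) -> dict:
    """Return a copy of `oci_config` with `paths` added to linux.maskedPaths."""
    config = copy.deepcopy(oci_config)       # leave the caller's dict untouched
    linux = config.setdefault("linux", {})
    masked = linux.setdefault("maskedPaths", [])
    for path in paths:
        if path not in masked:               # avoid duplicate entries
            masked.append(path)
    return config


if __name__ == "__main__":
    inner = {"ociVersion": "1.0.0", "linux": {"maskedPaths": ["/proc/kcore"]}}
    # Example paths only: marker files that Docker and Podman place in containers.
    print(mask_paths(inner, ["/.dockerenv", "/run/.containerenv"]))
```

Returning a modified copy keeps the base config reusable, which is convenient when the same dictionary is the starting point for more than one container.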
@apyrgio apyrgio force-pushed the 1046-reproducibility branch from 06fe63b to 3cf34e6 Compare January 24, 2025 08:14
apyrgio commented Jan 24, 2025

I've updated this PR with a way to separate the rootfs of the inner and outer container image, but reuse the same /usr dir, which takes the main bulk of the space. Check out:

I've also rebased it on top of the platformdirs branch, so that the tests can pass. I believe we can merge the symlink approach for now, with the caveat that it's too "magic". I'd prefer something more straightforward, but in this case it's not that simple.

Let me know your thoughts, thanks!

apyrgio commented Jan 24, 2025

Sigh, I just realized something important while scanning the final image with Grype: it does not detect vulnerabilities in the inner container image. It could do that for the Alpine one, but it seems it doesn't for the Debian one. This could be due to the fact that we use scratch images (see anchore/grype#383), or because we haven't copied the /var layer, which has some apt info, into the inner container image.

So, it seems that I need to dig more into it :-/

Projects
Status: PR Review
2 participants