Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adjust compiler options used to build Python #1566

Merged
merged 1 commit into from
Apr 16, 2024
Merged

Conversation

edmorley
Copy link
Member

@edmorley edmorley commented Apr 16, 2024

In order to improve parity with the upstream Docker Hub Python image builds, the build scripts used for our Python binary builds have been adjusted as follows:

After these changes, our compiler/linker options are now closer to:
https://github.com/docker-library/python/blob/330331fbe3c8d19befaba10ee329c5bf3a9dc225/3.12/slim-bookworm/Dockerfile#L70-L89

These changes are being made now since we'll soon be generating new Python binaries/archives under a new URL structure, which will provide a safer/more convenient transition point to switching to these new compiler options (vs overwriting the existing archives on S3, or only making this change for new Python releases onwards).

GUS-W-14217295.


For reference, using Ubuntu 22.04 dpkg-buildflags --get CFLAGS currently returns:
-g -O2 -ffile-prefix-map=/=. -flto=auto -ffat-lto-objects -flto=auto -ffat-lto-objects -fstack-protector-strong -Wformat -Werror=format-security

And dpkg-buildflags --get LDFLAGS returns:
-Wl,-Bsymbolic-functions -flto=auto -ffat-lto-objects -flto=auto -Wl,-z,relro

@edmorley edmorley self-assigned this Apr 16, 2024
@edmorley
Copy link
Member Author

An example build using these new options (it's based on a branch for a later PR, but uses these new options):
https://github.com/heroku/heroku-buildpack-python/actions/runs/8703318932

@edmorley edmorley marked this pull request as ready for review April 16, 2024 15:15
@edmorley edmorley requested a review from a team as a code owner April 16, 2024 15:15
@edmorley edmorley force-pushed the adjust-compiler-options branch from 46281a1 to eb87db0 Compare April 16, 2024 15:20
In order to improve parity with the upstream Docker Hub Python image builds, the build scripts used for our Python binary builds have been adjusted as follows:
- The Ubuntu security hardening compiler/linker flags are now retrieved
  using `dpkg-buildflags` and passed to the `make` invocation. See:
    - https://wiki.ubuntu.com/ToolChain/CompilerFlags
    - https://wiki.debian.org/Hardening
    - docker-library/python#810
- Configure is now called with an explicit `--build` architecture.
- The directory into which Python is installed during packaging
  has been changed to make it clearer that this it is only a
  temporary packaging path (and so why this path doesn't match
  that used in the CNB for example), since Python is relocated by
  both this buildpack and the CNB into different locations.

After these changes, our compiler/linker options are now closer to:
https://github.com/docker-library/python/blob/330331fbe3c8d19befaba10ee329c5bf3a9dc225/3.12/slim-bookworm/Dockerfile#L70-L89

These changes are being made now since we'll soon be generating new
Python binaries/archives under a new URL structure, which will provide
a safer/more convenient transition point to switching to these new
compiler options (vs overwriting the existing archives on S3, or only
making this change for new Python releases onwards).

GUS-W-14217295.
@edmorley edmorley force-pushed the adjust-compiler-options branch from eb87db0 to 6022b86 Compare April 16, 2024 15:41
@edmorley edmorley merged commit 84684a0 into main Apr 16, 2024
5 checks passed
@edmorley edmorley deleted the adjust-compiler-options branch April 16, 2024 16:04
edmorley added a commit that referenced this pull request Apr 18, 2024
As part of the CNB multi-architecture support work, we need to change
the Python runtime archive S3 URLs to include the architecture name.
In addition, for the CNB transition from "stacks" to "targets", it would
be helpful to switch from stack ID references (such as `heroku-22`) in
the URL scheme, to the distro name+version (eg `ubuntu` and `22.04`)
available to CNBs via the CNB targets feature. See:
https://github.com/buildpacks/spec/blob/buildpack/0.10/buildpack.md#targets-1

Rather than duplicate the Python archives on S3 under different
filenames/locations, it makes sense to migrate this buildpack to the new
archive names too, so the same S3 archives can be used by both this
buildpack and the CNB.

Moving to new archive names/URLs also means we can safely regenerate all
existing Python versions to pick up the changes in #1566 (and changes
made in the past, such as #1319, #1320, #1321 and #1322), since we won't
have to worry about overwriting the old archives (which is something
we've typically avoided, since it isn't compatible with the model of
being able to roll back to an older buildpack version to return to prior
behaviour).

Since we're changing the S3 URLs anyway, now is also a good time to make
another change that would otherwise cause churn in the S3 URLs again
(which affects people that pin buildpack version): Switching archive
compression format from gzip to Zstandard (something that we've been
wanting to do for a while).

Zstandard (aka zstd) is a much superior compression format over gzip
(smaller archives and much faster decompression), and is seeing
widespread adoption across multiple ecosystems (eg APT packages,
Docker images, web browsers etc).

See:
https://github.com/facebook/zstd
https://github.com/facebook/zstd/blob/dev/programs/README.md#usage-of-command-line-interface

Our base images already have `zstd` installed (and for Rust for the CNB,
there is the [zstd](https://crates.io/crates/zstd) crate available), so it's an easy switch.

Various compression levels were tested using zstd's benchmarking feature
and in the end the highest level of compression picked, since:
1. Unlike some other compression algorithms, zstd's decompression speed
   is generally not affected by the compression level.
2. We only have to perform the compression once (when compiling Python).
3. Even at the highest compression ratio, it only takes 20 seconds to
   compress the Python archives compared to the 10 minutes it takes to
   compile Python itself (when using PGO+LTO).

For the Ubuntu 22.04 Python 3.12.3 archive, switching from gzip to zstd
(level 22, with long window mode enabled) results in a 26% reduction in
compressed archive size.

GUS-W-15158299.
GUS-W-15505556.
@heroku-linguist heroku-linguist bot mentioned this pull request Apr 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants