
Inefficiencies in the build process #41

sunnyflunk opened this issue Dec 8, 2022 · 2 comments
Labels
chat: brainstorming Wild ideas to spur on the inspiration.

Comments

sunnyflunk (Contributor) commented Dec 8, 2022

I spent a couple of hours mapping out the build process and the small tweaks that can add up to a much better overall experience and performance.

Create root

There's not much value in making this parallel; just ensure all the directories are available for the next steps.

  • Allow mounting directories as tmpfs. That saves a lot of writes, and blitting/caching gets much faster; an easy 1-2s on builds with many deps. A mount sketch follows below.
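
A rough illustration of the tmpfs idea (Rust with the nix crate, just for sketching; the real boulder code, paths and size cap will differ):

```rust
use nix::mount::{mount, MsFlags};
use std::path::Path;

/// Mount a tmpfs over a scratch directory inside the build root so that
/// blitting/caching never touches the backing disk.
fn mount_scratch_tmpfs(target: &Path) -> nix::Result<()> {
    mount(
        Some("tmpfs"),             // source is ignored for tmpfs
        target,                    // e.g. the blit/cache directory in the root (illustrative)
        Some("tmpfs"),             // filesystem type
        MsFlags::MS_NOATIME,       // skip atime updates, we only care about speed
        Some("size=8G,mode=0755"), // cap it so a runaway build can't eat all RAM (made-up value)
    )
}
```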

Fetching upstreams and packages

Ideally we would fetch the index and start fetching upstreams while still calculating the deps and packages that also need fetching. That's a lot of hassle, though (and it brings back that moss-jobs style of overhead); if we simply make the index fetch and the dep calculation fast, we get most of the gains even with the current approach plus the modifications below.

This mixes up the moss and boulder work queues, so it would need a refactor, if it's worth implementing at all.

  • Fetch upstreams together with the packages! We can add both to the same work queue without any real problem (see the sketch after this list). The benefit of the current approach is an early exit if an upstream doesn't validate, but the assumption is that you'd fix it and try again, so you still want the packages fetched.
    • While adding upstreams to the queue (there's no benefit to this otherwise), we can also create a function to extract (or set up) the upstream in the build root. Fetchable is pretty awesome really! There's also the PGO build to deal with, where the upstreams have to be recreated during the build; it seems easy enough to reuse the same functions to set up the builddir extractions.
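
Roughly the shape of a shared queue item (an illustrative Rust enum, not the actual moss/boulder types; the helper functions are hypothetical stand-ins):

```rust
use std::path::{Path, PathBuf};

/// One unit of download work: upstreams and .stone packages share the same
/// queue and only differ in what happens after the fetch completes.
enum FetchTask {
    Upstream { uri: String, hash: String, extract_to: PathBuf },
    Package { uri: String, hash: String },
}

impl FetchTask {
    /// Post-fetch step run by the same worker: upstreams get extracted (or
    /// set up) straight into the build root, packages get cached.
    fn on_fetched(&self, downloaded: &Path) {
        match self {
            FetchTask::Upstream { extract_to, .. } => {
                // hypothetical helper, reusable later when the PGO build
                // needs the builddir extractions recreated
                extract_upstream(downloaded, extract_to);
            }
            FetchTask::Package { .. } => {
                // hypothetical helper: cache the stone as soon as it lands
                cache_stone(downloaded);
            }
        }
    }
}

// Hypothetical stand-ins for the real routines.
fn extract_upstream(_archive: &Path, _dest: &Path) {}
fn cache_stone(_stone: &Path) {}
```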

This is only within moss

  • Cache stones in the work queue as soon as they've been downloaded. Currently moss waits for all packages to finish fetching and only then caches the downloaded packages. See tools#23 (comment) for an easy strategy for adding caching to the same queue; a pipeline sketch also follows this list. This will save seconds when you only have a few packages to fetch.
  • There's also potential to add blitting to the work queue (i.e. cache, then blit, within the same function). Currently it's done later, one package at a time. It shouldn't need to be very parallel, as only one package is cached at a time anyway.
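
A sketch of what that pipelining could look like, assuming an async queue built on the futures crate (the download/cache functions are placeholders, not the real moss API):

```rust
use futures::stream::{self, StreamExt};

/// Download up to a few stones at once, but cache each one the moment its
/// fetch finishes instead of waiting for the whole batch. Caching stays
/// sequential, matching "only one package is cached at a time".
async fn fetch_and_cache(urls: Vec<String>) {
    stream::iter(urls)
        .map(|url| async move { download(&url).await })
        .buffer_unordered(6) // fetch concurrency; the number is illustrative
        .for_each(|stone| async move { cache_stone(&stone).await })
        .await;
}

// Placeholder stubs for the real fetch/cache routines.
async fn download(url: &str) -> String { format!("/tmp/fetched-{}", url.len()) }
async fn cache_stone(_path: &str) {}
```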

Build stages

Most of the speed-ups here come from clang and compiler options, so they won't be explored in this issue. There are still a couple of options.

  • Make fakeroot optional. There are two options: use fakeroot only for the install stage, which saves 95% of the overhead but means files can only be created during the install stage (a sketch of this follows below); alternatively, make fakeroot opt-in and overhaul file permissions in moss-format to handle it.
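
A sketch of the first option: only the install stage enters fakeroot, everything else runs the script directly (stage names and the command plumbing are illustrative, not boulder's actual API):

```rust
use std::process::{Command, ExitStatus};

/// Illustrative subset of the build stages.
#[derive(PartialEq)]
enum Stage { Setup, Build, Install, Check }

/// Run a stage script, entering fakeroot only for the install stage so the
/// rest of the build skips the fakeroot overhead entirely.
fn run_stage(stage: Stage, script: &str) -> std::io::Result<ExitStatus> {
    let mut cmd = if stage == Stage::Install {
        let mut c = Command::new("fakeroot");
        c.arg("--").arg("/bin/sh").arg("-c").arg(script);
        c
    } else {
        let mut c = Command::new("/bin/sh");
        c.arg("-c").arg(script);
        c
    };
    cmd.status()
}
```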

Analyse

This needs quite an overhaul to handle our upcoming requirements (and to stop reading files twice). Avoiding the second read doesn't save that much time by itself, but it opens new opportunities with a refactor.

  • When reading files to generate the xxhash, size, path etc., we can also collect extra information from each file: isELF? hasHashBang? isHardLink? FileType (PNG, JPG, ELF, text etc.)? Then we no longer have to reread each file to see whether it's ELF (and yet again to test for #!).
    • Use isELF for the ELF sieve.
  • Add a providers key to package_definition. Basically rundeps, but for manually adding a provider. This can cut a lot of the code handling the ld additions in glibc and then easily support all future compat-symlink providers (and there will be quite a few).
  • Deal with hardlink detection so we can add stripping back. I'm not entirely sure why, but I was seeing bad performance when rebuilding clang without stripping (it carries -Wl,-q relocation info).
    • Combine the debuginfo and strip functions. If we're producing debug files, we can strip with llvm-objcopy and save a call to llvm-strip (see the objcopy sketch after this list).
    • Also strip comments with -R .comment.
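
A sketch of the combined debuginfo/strip step: two llvm-objcopy invocations per ELF file, so the separate llvm-strip call goes away (the exact strip flags and the hardlink caveat above are left out of scope here):

```rust
use std::path::Path;
use std::process::Command;

/// Split the debug info out of `elf` into `dbg`, then strip the binary
/// (including .comment) with the same tool instead of calling llvm-strip.
fn split_and_strip(elf: &Path, dbg: &Path) -> std::io::Result<()> {
    // 1. Extract the debug sections into the .debug file.
    let split = Command::new("llvm-objcopy")
        .arg("--only-keep-debug")
        .arg(elf)
        .arg(dbg)
        .status()?;

    // 2. Strip the binary, drop .comment, and link it to the debug file.
    let strip = Command::new("llvm-objcopy")
        .arg("--strip-all")
        .arg("-R").arg(".comment")
        .arg(format!("--add-gnu-debuglink={}", dbg.display()))
        .arg(elf)
        .status()?;

    assert!(split.success() && strip.success(), "llvm-objcopy failed");
    Ok(())
}
```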

I've also considered switching from the sieve-based approach to operating directly on the files that the sieves target (they're pretty specific) rather than iterating over every file. But provided we get isELF and hasHashBang on the first read, iterating a sieve becomes basically costless, and the alternative is hard to make parallel. A single-pass read is sketched below.
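
For the single-read idea, a sketch of collecting the hash plus the cheap classification bits in one pass (assuming the xxhash-rust crate; the struct is illustrative, not the real analyse types, and a real version would stream rather than read whole files into memory):

```rust
use std::os::unix::fs::MetadataExt;
use std::path::Path;
use std::{fs, io};
use xxhash_rust::xxh3::xxh3_64;

/// Everything analyse would want from a single read of the file.
struct FileInfo {
    xxhash: u64,
    size: u64,
    is_elf: bool,       // for the ELF sieve
    has_hashbang: bool, // saves the later #! re-read
    is_hardlink: bool,  // nlink > 1, for the stripping question
}

fn analyse_one(path: &Path) -> io::Result<FileInfo> {
    let meta = fs::symlink_metadata(path)?;
    let data = fs::read(path)?; // one read feeds both the hash and the checks

    Ok(FileInfo {
        xxhash: xxh3_64(&data),
        size: meta.len(),
        is_elf: data.starts_with(b"\x7fELF"),
        has_hashbang: data.starts_with(b"#!"),
        is_hardlink: meta.nlink() > 1,
    })
}
```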

Emit packages

  • New zstd bindings that are fast. Use --long and tweak the level (is -16 best?); see the compression sketch after this list.
  • Make emitting packages parallel, since compression doesn't thread well.
  • Compress debug files either in the analyse step (in parallel) or during the build. I'm not sure which would be faster; either way there would be no need to compress the debug packages (and everyone gets smaller debug files). Ideally we want zstd compression, but support for that is still early.
  • Try sorting the content payload by file type, as in https://serpentos.com/blog/2021/10/04/optimal-file-locality. I'm not expecting a huge gain here, but it would be nice to try, and it works best at the compression levels we use.
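
A sketch of the compression side, assuming the Rust zstd crate exposes the long-distance-matching parameters on its encoder (the exact setter names are worth double-checking); level 16 and the 27-bit window are the values in question, not settled choices:

```rust
use std::io::{self, Write};
use zstd::stream::write::Encoder;

/// Compress a payload at level 16 with long-distance matching and a 27-bit
/// window, roughly the equivalent of `zstd -16 --long=27`.
fn compress_payload<W: Write>(dest: W, payload: &[u8]) -> io::Result<()> {
    let mut enc = Encoder::new(dest, 16)?;
    enc.long_distance_matching(true)?; // the --long part
    enc.window_log(27)?;               // window size, as --long=27 would set
    enc.write_all(payload)?;
    enc.finish()?;
    Ok(())
}
```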

Other ideas

  • Plumb in ABI output
  • Cache packages to a global boulder DB. The main concern is running multiple builds at once, which is fine provided we lock things so only one build handles the pre-build phase at a time (once it hands off to the build stages, the next build can start; see the lock sketch below). Only the content, layout and metadata should be shared; create a per-build DB for anything else that's needed.
  • Copy packages from the host DBs. I think I'd rather keep the host DBs and boulder separate for safety, but there's no reason you couldn't take the information from the host and reuse it (and reflink the files where supported). That could save some time and reduce the load on the servers; you could also copy from boulder to the host, since boulder will have newer versions of the packages.
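
A sketch of the locking idea for a shared boulder cache, assuming an advisory file lock via the fs2 crate (the path and function names are illustrative):

```rust
use fs2::FileExt;
use std::fs::File;
use std::io;

/// Hold an exclusive lock for the whole pre-build phase (fetching, caching,
/// blitting into the shared content/layout/metadata). Once we hand off to
/// the build stages the lock is released and the next build can start its
/// own pre-build work.
fn with_prebuild_lock<T>(work: impl FnOnce() -> T) -> io::Result<T> {
    let lock = File::create("/var/cache/boulder/.prebuild.lock")?; // illustrative path
    lock.lock_exclusive()?; // blocks until no other boulder holds it
    let result = work();
    lock.unlock()?; // dropping the File would also release it
    Ok(result)
}
```
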
joebonrichie (Contributor) commented:

> Compress debug files in either analyse step (parallel) or during build. Not sure which would be faster and then no need to compress debug packages (and smaller debug files for everyone). Ideally want zstd compression, but support is early for that.

Compressed debug sections (zlib)
272K nano-dbginfo-7.1-4-1-x86_64.stone
Uncompressed debug sections
256K nano-dbginfo-7.1-4-1-x86_64.stone

Compression of pre-compressed assets increases the .stone size

sunnyflunk (Contributor, Author) commented:

> Compression of pre-compressed assets increases the .stone size

Any compression applied to the files themselves will increase the stone size.

The plan was in relation to the eventual use of debuginfod. There, users request individual files rather than fetching whole packages, so keeping the files small matters most... otherwise you spend hours downloading debug files, Fedora-style, before you can even look at the valgrind output.

With the files compressed, there would be no need to compress the -debug packages either, which can save a lot of time given how huge they can be.

joebonrichie added a commit to joebonrichie/boulder that referenced this issue Feb 27, 2023
Quick n' Dirty Benchmarks
glibc : 28s -> 8s
curl  : 1.17s -> 668ms
nano  : 451ms -> 311ms

Sub task of serpent-os#41.
livingsilver94 added the chat: brainstorming (Wild ideas to spur on the inspiration.) label and removed the enhancement label on Jul 17, 2023