Identical code folding #38

Open
Kobzol opened this issue Jul 8, 2022 · 8 comments

@Kobzol
Contributor

Kobzol commented Jul 8, 2022

Since Rust can't do polymorphization properly yet, using generics generates a lot of duplicated functions because of monomorphization. These functions take up space in the binary, even though they consist of exactly the same instructions.

Some linkers (gold, lld) can deduplicate these identical functions using Identical Code Folding, and thus reduce binary size (and potentially also improve usage of the i-cache).
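
To make the duplication concrete, here is a minimal sketch (not part of the measurement in this issue). `pop_or_default` is a hypothetical helper. Because `u32` and `i32` have the same size and alignment, its two instantiations will most likely compile to the same machine code, which is the kind of pair ICF can fold. Whether the two addresses compared at the end are actually equal depends on the optimization level, LLVM's own function merging and the linker flags, so no output is claimed.

```rust
use std::mem::{align_of, size_of};

// Hypothetical helper (not from the issue): generic over the element type,
// so rustc emits a separate copy for every concrete `T` it is used with.
fn pop_or_default<T: Default>(v: &mut Vec<T>) -> T {
    v.pop().unwrap_or_default()
}

fn main() {
    let mut b: Vec<u32> = vec![1, 2, 3];
    let mut c: Vec<i32> = vec![1, 2, 3];
    println!("{} {}", pop_or_default(&mut b), pop_or_default(&mut c));

    // Same size and alignment: the two copies need no different instructions.
    assert_eq!(size_of::<u32>(), size_of::<i32>());
    assert_eq!(align_of::<u32>(), align_of::<i32>());

    // Take the address of each monomorphized copy.
    let f_u32: fn(&mut Vec<u32>) -> u32 = pop_or_default::<u32>;
    let f_i32: fn(&mut Vec<i32>) -> i32 = pop_or_default::<i32>;
    let same = f_u32 as *const () as usize == f_i32 as *const () as usize;

    // Not guaranteed either way; it varies with opt-level and linker options.
    println!("instantiations share one address: {same}");
}
```

Building this once with and once without `--icf=all` and comparing the last line is a cheap way to check whether folding happened for this pair.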

You can specify this linker option using a linker flag, for example like this:

$ RUSTFLAGS="-Clink-args=-fuse-ld=lld -Clink-args=-Wl,--icf=all" cargo build
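
If you'd rather not set RUSTFLAGS on every invocation, one possible alternative is a build script that passes the same two arguments through Cargo's `cargo:rustc-link-arg` instruction. This is a sketch under assumptions: `icf_link_args` is a hypothetical helper name, the linker driver is assumed to be a cc-style driver (gcc or clang) that understands `-fuse-ld` and `-Wl,`, and lld is assumed to be installed. I have not measured this variant.

```rust
// build.rs (placed next to Cargo.toml)

/// The two linker arguments from the RUSTFLAGS command above.
/// Hypothetical helper, named only for this sketch.
fn icf_link_args() -> [&'static str; 2] {
    ["-fuse-ld=lld", "-Wl,--icf=all"]
}

fn main() {
    // Each printed line asks Cargo to pass one extra argument to the linker
    // when linking this package's binaries, examples, tests and cdylibs.
    for arg in icf_link_args() {
        println!("cargo:rustc-link-arg={arg}");
    }
}
```

Unlike RUSTFLAGS, this only affects the package that owns the build script, which may or may not be what you want.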

I measured the binary size change for the following program:

fn foo() {
    let mut a: Vec<u8> = vec![1, 2, 3];
    a.pop();

    let mut b: Vec<u32> = vec![1, 2, 3];
    b.pop();

    let mut c: Vec<i32> = vec![1, 2, 3];
    c.pop();
}

fn main() {
    foo();
}

Here are the binary sizes after running strip on them:

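| Linker | Mode    | Binary size (B) | ICF (Identical Code Folding) |
|--------|---------|-----------------|------------------------------|
| gold   | debug   | 342696          | No                           |
| gold   | debug   | 330408          | Yes                          |
| gold   | release | 322216          | No                           |
| gold   | release | 318120          | Yes                          |
| lld    | debug   | 330968          | No                           |
| lld    | debug   | 321840          | Yes                          |
| lld    | release | 310616          | No                           |
| lld    | release | 306848          | Yes                          |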
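
For a quicker read of these numbers, the sketch below computes the absolute and relative savings for each linker/mode pair. The sizes are copied from the measurements above; the `Row` struct and the helper names are illustrative only.

```rust
/// One linker/mode pair from the measurements (sizes in bytes, after `strip`).
struct Row {
    linker: &'static str,
    mode: &'static str,
    without_icf: u64,
    with_icf: u64,
}

const ROWS: [Row; 4] = [
    Row { linker: "gold", mode: "debug", without_icf: 342_696, with_icf: 330_408 },
    Row { linker: "gold", mode: "release", without_icf: 322_216, with_icf: 318_120 },
    Row { linker: "lld", mode: "debug", without_icf: 330_968, with_icf: 321_840 },
    Row { linker: "lld", mode: "release", without_icf: 310_616, with_icf: 306_848 },
];

/// Bytes saved by enabling ICF.
fn saved_bytes(row: &Row) -> u64 {
    row.without_icf - row.with_icf
}

/// Savings as a percentage of the size without ICF.
fn saved_percent(row: &Row) -> f64 {
    saved_bytes(row) as f64 * 100.0 / row.without_icf as f64
}

fn main() {
    for row in &ROWS {
        println!(
            "{:<5} {:<8} saved {:>6} B ({:.2} %)",
            row.linker,
            row.mode,
            saved_bytes(row),
            saved_percent(row)
        );
    }
}
```

For this small program the savings land between roughly 1 % and 4 % of the stripped binary, and the debug builds gain more than the release builds.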
@johnthagen
Owner

johnthagen commented Jul 8, 2022

It would be interesting to see how this interacts with -Z build-std=std,panic_abort as then the linker might also be able to fold identical code in std as well? 🤔

@Kobzol
Contributor Author

Kobzol commented Jul 8, 2022

I tried it, but it didn't change the binary size. I suppose that's expected: the program links statically to libstd, so the code of libstd is already inside the binary when ICF runs, and ICF is applied to it even without build-std. It's done at the linker/binary level, so it won't do much for libstd unless we link to it as a dynamic library, I guess :)

@the8472

the8472 commented Dec 5, 2024

The upside is that this is a stable option, so unlike build-std it doesn't require nightly to get the optimization.

@janflyborg

Not here to provoke. I really want to use Rust for my applications, but something must be inherently wrong with the code generation in Rust. This is the same program in Nim (please note that it is also using generics):

proc foo() =
  var a: seq[uint8] = @[1, 2, 3]
  discard a.pop

  var b: seq[uint32] = @[1, 2, 3]
  discard b.pop

  var c: seq[int32] = @[1, 2, 3]
  discard c.pop

proc main() =
  foo()

main()

Compiled with:
$ nim c --cpu:amd64 --os:linux --mm:orc --panics:on -d:useMalloc --threads:off -d:release --opt:size --passC:-flto=auto --passL:-flto=auto -d:strip push.nim

Which is pretty standard for minimizing binaries in Nim, while still keeping stuff like bounds checking, panic handling and memory management support in the binary.

This produces a binary that is 18 kB in size with no other dependencies than libc. What is happening in the Rust compiler?

@Kobzol
Contributor Author

Kobzol commented Dec 19, 2024

Does that binary have support for Unicode, and for debug info symbolification of stack traces? Both are included in a Rust binary by default, and both contribute to its size.

@janflyborg

The answer to your first question is no. We are not using Unicode here (and neither is the Rust program) so why should that support go into the binary? I guess that your question points to a part of the problem. The Rust compiler is simply not very good at removing things that are not needed and the monomorphization implementation of generics doesn't help either.

For the second question, the answer is also no. As compiled now, it dies with a panic instead of raising exceptions (which contain complete stack traces). This leads to smaller code size and more opportunities for the compiler to do optimizations, and is normally used in production for constrained environments. However, if I turn on complete stack traces and recompile, the size of the binary goes up to 31 kB. That's still ten times smaller than what is achieved in the Rust example above.

For most people this doesn't matter and Rust is an excellent language, but when working in an embedded Linux setting (with a constrained amount of flash and RAM), this is a real issue that is preventing us from adopting Rust on a large scale.

@Kobzol
Contributor Author

Kobzol commented Dec 20, 2024

Well, for the program shown in this issue, you can get down to 15 KiB using this Cargo.toml:

[package]
name = "foo"
version = "0.1.0"
edition = "2021"

[dependencies]

[profile.release]
opt-level = "s"
codegen-units = 1
lto = "fat"
strip = true
panic = "abort"

and the following build command:

RUSTFLAGS="-Zlocation-detail=none -Zfmt-debug=none" cargo +nightly build --release -Z build-std=std,panic_abort -Zbuild-std-features="optimize_for_size" -Z build-std-features=panic_immediate_abort --target=x86_64-unknown-linux-gnu

And that's still using the standard library (including bounds checking and heap allocations); without it, it could probably get even smaller.

So it's not that small binary size is unattainable in Rust, just that it currently prefers programmer ergonomics (i.e. having panics, debuginfo, symbolification, Unicode, using a precompiled standard library etc.) by default, which is arguably the right choice for a large fraction of use-cases.

While the compiler could potentially be smarter about removing unneeded stuff by default (without forcing the user to use these flags), the question is what "unused" actually means. The program in this issue does not look like it needs panics, symbolification, backtraces or Unicode, but if it does panic at runtime (and in general, it's pretty much impossible to tell whether a program will panic or not), all of these things are needed to produce a nice readable stack trace. A big part of the resulting binary is the gimli debuginfo/symbolification library, which could in theory be removed when compiling with debuginfo stripped, but that would require the user to recompile the standard library, which makes compilation times worse and is currently not fully working or stabilized (see below).
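
As a rough illustration of why "unused" is hard to decide statically, here is a small sketch. The function names are hypothetical and this is not a measurement. Indexing a slice with `v[2]` can panic, so the panic path has to stay reachable whenever the length is only known at runtime. The `get`-based variant has no panic path of its own, though that alone does not remove the panic machinery that std already pulls in.

```rust
/// Panics if the slice has fewer than three elements, so the compiler must
/// keep a call into the panic machinery unless it can prove the length.
fn third_indexed(v: &[u32]) -> u32 {
    v[2]
}

/// Returns `None` instead of panicking; this function itself does not
/// reference the panic path.
fn third_checked(v: &[u32]) -> Option<u32> {
    v.get(2).copied()
}

fn main() {
    // The length depends on how the program is invoked, so it cannot be
    // known at compile time.
    let n = std::env::args().count() as u32;
    let v: Vec<u32> = (0..n).collect();

    println!("checked: {:?}", third_checked(&v));
    if v.len() >= 3 {
        println!("indexed: {}", third_indexed(&v));
    }
}
```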

For embedded, you have to do a lot of custom stuff now, which I agree is not great. Stabilizing build-std (which is one of the Rust Project Goals for 2025 - https://rust-lang.github.io/rust-project-goals/2025h1/build-std.html) should go a long way towards making this experience more pleasant.

(Btw, this discussion is quite off-topic for this issue :) )

@janflyborg

Thanks for the answer. Very interesting. Even if this is off-topic I have learned something new.
