Identical code folding #38
Comments
It would be interesting to see how this interacts with |
I tried it, but it didn't change the binary size. I suppose that's expected, since the program links statically to |
The upside is that this is a stable option, so unlike build-std it doesn't require nightly to get the optimization. |
Not here to provoke. I really want to use Rust for my applications, but something must be inherently wrong with the code generation in Rust. This is the same program in Nim (please note that it is also using generics):
Compiled with: Which is pretty standard for minimizing binaries in Nim, while still keeping stuff like bounds checking, panic handling and memory management support in the binary. This produces a binary that is 18 kB in size with no other dependencies than libc. What is happening in the Rust compiler? |
Does that binary have support for Unicode, and for debug info symbolification of stack traces? Both of those are included in a Rust binary by default, which contributes to its size. |
The answer to your first question is no. We are not using Unicode here (and neither is the Rust program), so why should that support go into the binary? I guess that your question points to a part of the problem. The Rust compiler is simply not very good at removing things that are not needed, and the monomorphization implementation of generics doesn't help either. For the second question, the answer is also no. As compiled now, it dies with a panic instead of raising exceptions (which contain complete stack traces). This leads to smaller code size and greater opportunities for the compiler to do optimizations, and is normally used in production for constrained environments. However, if I turn on complete stack traces and recompile, the size of the binary goes up to 31 kB. That's still ten times smaller than what is achieved in the Rust example above. For most people this doesn't matter and Rust is an excellent language, but when working in an embedded Linux setting (with a constrained amount of flash and RAM), this is a real issue that is preventing us from adopting Rust at a grand scale. |
Well, for the program shown in this issue, you can get down to 15 KiB using this:
[package]
name = "foo"
version = "0.1.0"
edition = "2021"
[dependencies]
[profile.release]
opt-level = "s"
codegen-units = 1
lto = "fat"
strip = true
panic = "abort"
and the following build command:
And that's still using the standard library (including bounds checking and heap allocations); without it, it could probably get even smaller. So it's not that small binary size is unattainable in Rust, just that it currently prefers programmer ergonomics (i.e. having panics, debuginfo, symbolification, Unicode, using a precompiled standard library etc.) by default, which is arguably the right choice for a large fraction of use-cases.

While the compiler could potentially be smarter in removing unneeded stuff by default (without forcing the user to use these flags), the question is what "unused" means. The program in this issue does not seem like it needs panics, symbolification, backtraces or Unicode, but if it does actually panic at runtime (and in general, it's pretty much impossible to tell if a program will panic or not), all these things will be needed to produce a nice readable stack trace.

A big part of the resulting binary is the gimli debuginfo/symbolification library, which could in theory be removed if we compile with debuginfo stripped, but that would require the user to recompile the standard library, which makes compilation times worse, and is currently not fully working and stabilized (see below). For embedded, you have to do a lot of custom stuff now, which I agree is not great. Stabilizing (Btw, this discussion is quite off-topic for this issue :) ) |
Thanks for the answer. Very interesting. Even if this is off-topic I have learned something new. |
Since Rust can't do polymorphization properly yet, using generics generates a lot of duplicated functions because of monomorphization. These functions take space in the binary, even though they have completely the same instructions.
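To make the duplication concrete, here is a minimal sketch (not the program measured in this issue). The helper `last_or_default` is a hypothetical name chosen for illustration. It is instantiated once for `u32` and once for `i32`. Both types have the same size and alignment, so the two monomorphized copies are likely to contain the same instructions, though whether both copies survive until link time depends on the optimization level and on inlining.

```rust
// Hypothetical helper, not taken from the issue: its body never depends on
// whether T is signed or unsigned, only on T's size and alignment.
fn last_or_default<T: Copy + Default>(items: &[T]) -> T {
    items.last().copied().unwrap_or_default()
}

fn main() {
    // Two separate instantiations: last_or_default::<u32> and
    // last_or_default::<i32>. Each may get its own copy in the binary.
    let a: u32 = last_or_default(&[1u32, 2, 3]);
    let b: i32 = last_or_default(&[-1i32, -2, -3]);
    // An empty slice falls back to T::default().
    let c: u32 = last_or_default::<u32>(&[]);
    println!("{a} {b} {c}");
}
```

Copies like these are the candidates that a linker could fold into one.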
Some linkers (gold, lld) can deduplicate these identical functions using Identical Code Folding, and thus reduce binary size (and potentially also improve usage of the i-cache). You can specify this linker option using a linker flag, for example like this:
$ RUSTFLAGS="-Clink-args=-fuse-ld=lld -Clink-args=-Wl,--icf=all" cargo build
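If you would rather not set `RUSTFLAGS` in every shell, the same flags can probably be kept in the project's `.cargo/config.toml`. This is a sketch under the same assumptions as the command above: `lld` is installed and the C compiler driver used for linking accepts `-fuse-ld=lld`. Cargo ignores `build.rustflags` from the config file when the `RUSTFLAGS` environment variable is set.

```toml
# .cargo/config.toml (assumed location: the project root)
[build]
rustflags = [
    "-Clink-arg=-fuse-ld=lld",   # ask the compiler driver to link with lld
    "-Clink-arg=-Wl,--icf=all",  # enable Identical Code Folding in the linker
]
```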
I measured the binary size change for the following program:
Here are the binary sizes after running strip on them: