Adding padding to precode #111819
Tagging subscribers to this area: @mangod9
We try to keep these precodes as small as possible. We allocate a lot of them. Is this change going to be a measurable memory consumption regression?
@jkotas there is no change in the size. The precode size is max(precode data size, precode code size), and this change pads the code up to the data size. It fixes a problem where, because the code was smaller than the slot, the rest of the slot after the real precode code was filled with code from the next block of code. That caused a StubPrecode to be misclassified as a FixupPrecode in PrecodeStubManager::DoTraceStub.
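To illustrate why padding the code does not grow the precode, here is a minimal sketch of the max() relationship described above. The function names are hypothetical and are not taken from precode.h; the 12 and 24 byte figures are the ones quoted in the PR description.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper: a precode slot is as large as the bigger of its
// code part and its data part.
constexpr std::size_t SlotSize(std::size_t codeBytes, std::size_t dataBytes) {
    return std::max(codeBytes, dataBytes);
}

// Hypothetical helper: pad the code up to the data size when it is smaller.
constexpr std::size_t PaddedCodeSize(std::size_t codeBytes, std::size_t dataBytes) {
    return std::max(codeBytes, dataBytes);
}

// 12 bytes of real code and 24 bytes of data: the slot is 24 bytes
// with or without the padding.
inline void CheckPaddingKeepsSlotSize() {
    assert(SlotSize(12, 24) == SlotSize(PaddedCodeSize(12, 24), 24));
}
```

Under these assumptions the padding only fills bytes that the slot already reserved.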
It does suggest that in the future we might be able to improve the perf by more efficiently packing the code but I'm happy to separate optional future perf improvements from the current PR which fixes a bug and incurs no perf regression.
@mikelle-rogers could you please make the same change for x86, x64 and arm too so that we are consistent? The riscv64 and loongarch64 have the code sizes larger than the data size, so there is no need to change those. You can find the FixupPrecode sizes here: runtime/src/coreclr/vm/precode.h Lines 226 to 252 in f91ff5e
And the StubPrecode sizes here: runtime/src/coreclr/vm/precode.h Lines 94 to 112 in f91ff5e
It would also be good to change the following asserts: runtime/src/coreclr/vm/precode.cpp Line 422 in f91ff5e
and runtime/src/coreclr/vm/precode.cpp Line 434 in f91ff5e
to use == instead of <=. That would make sure the code fills in the precode "slot".
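The difference between the two comparisons can be sketched as follows. This is not the actual assert from precode.cpp (which is not quoted in this thread); the helper names and the byte counts are illustrative assumptions.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical check in the style of the existing asserts: the template
// code merely has to fit into the slot.
constexpr bool FitsInSlot(std::size_t templateBytes, std::size_t slotBytes) {
    return templateBytes <= slotBytes;
}

// Hypothetical stricter check: the template code must fill the slot
// exactly, so no bytes of the neighboring template can leak in.
constexpr bool FillsSlotExactly(std::size_t templateBytes, std::size_t slotBytes) {
    return templateBytes == slotBytes;
}

inline void CheckStricterAssert() {
    // An unpadded 12-byte template passes "<=" but fails "==".
    assert(FitsInSlot(12, 24));
    assert(!FillsSlotExactly(12, 24));
    // A template padded to 24 bytes passes both.
    assert(FillsSlotExactly(24, 24));
}
```

With the stricter form, a template that is shorter than its slot would be caught at assert time instead of silently picking up bytes from whatever follows it.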
@noahfalk I considered that when I implemented these stubs during the W^X work, but it would actually waste RW memory unless the size of the data was an integer multiple of the (possibly padded) code size. Currently, we interleave code and data pages in memory 1:1. If the data size was e.g. 1.5 times the code size, then we would need 2 data pages for 1 code page, but the 2nd page would always be half empty. Regarding the StubPrecode, there actually is a different opportunity to reduce the size. The StubPrecode data has an extra pointer-sized slot for a type, because it is used both for StubPrecode and NDirectImportPrecode. I think I made that slot pointer-sized to ensure the code start is always aligned on the pointer size for perf reasons, but I am not sure. We could experiment here to see if making it smaller would work fine perf-wise or not.
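The page arithmetic behind the "half empty" remark can be sketched like this. The 4096-byte page size and the 16/24 byte sizes are assumptions chosen only to give a 1.5x ratio, and the struct and function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

struct DataLayout {
    std::size_t dataPages;    // data pages needed per code page
    std::size_t wastedBytes;  // unused bytes in those data pages
};

// For one page full of precode code, compute how many data pages are
// needed and how much of them stays unused.
constexpr DataLayout ComputeLayout(std::size_t pageSize,
                                   std::size_t codeSize,
                                   std::size_t dataSize) {
    std::size_t precodesPerCodePage = pageSize / codeSize;
    std::size_t dataBytesNeeded = precodesPerCodePage * dataSize;
    std::size_t dataPages = (dataBytesNeeded + pageSize - 1) / pageSize;  // round up
    return DataLayout{dataPages, dataPages * pageSize - dataBytesNeeded};
}

inline void CheckLayouts() {
    // Data 1.5x the code size: 2 data pages, the second one half empty.
    assert(ComputeLayout(4096, 16, 24).dataPages == 2);
    assert(ComputeLayout(4096, 16, 24).wastedBytes == 2048);
    // Equal sizes: a 1:1 interleaving with nothing wasted.
    assert(ComputeLayout(4096, 16, 16).dataPages == 1);
    assert(ComputeLayout(4096, 16, 16).wastedBytes == 0);
}
```

In this model, tighter code packing only saves memory when the data size is an integer multiple of the code size, which matches the tradeoff described above.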
Thanks for the additional background, @janvorli! I didn't mean to imply that optimizing it further would be easy or that anyone should prioritize doing it, only that in theory it seemed possible. I agree with you that it would certainly add complexity and probably involve some other tradeoffs. I'm happy to leave it to you and others working in that area to decide if and when such an optimization would be worthwhile.
Add padding to precode to align the StubPrecode with the offset of the data.
We have a single constant, StubPrecode::CodeSize, which is 24 in this case. It is used as the size of the code that gets copied from the template code. Without the padding, the effect is that instead of having 12 bytes of StubPrecode code followed by e.g. 12 zeros, we have 12 bytes of StubPrecode code followed by 12 bytes from the beginning of the FixupPrecodeCode.
That leads to stubStartAddress - FixupPrecode::FixupCodeOffset being misdetected as a FixupPrecode, because part of the FixupPrecode code really is there.
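The over-copy can be reproduced in a small simulation. This is a sketch under stated assumptions, not the runtime's code: the byte values 0xAA and 0xBB are placeholders for StubPrecode and FixupPrecode instructions, the names are hypothetical, and the real detection logic in PrecodeStubManager::DoTraceStub is not modeled.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

constexpr std::size_t kRealStubCodeSize = 12;  // bytes of actual stub code
constexpr std::size_t kCodeSize = 24;          // bytes copied per slot

// Lay out the templates back to back: stub code (0xAA), optional zero
// padding up to kCodeSize, then fixup code (0xBB).
std::vector<std::uint8_t> MakeTemplates(bool padStub) {
    std::vector<std::uint8_t> bytes(kRealStubCodeSize, 0xAA);
    if (padStub) {
        bytes.insert(bytes.end(), kCodeSize - kRealStubCodeSize, 0x00);
    }
    bytes.insert(bytes.end(), kCodeSize, 0xBB);
    return bytes;
}

// Copy kCodeSize bytes starting at the stub template into a slot.
std::array<std::uint8_t, kCodeSize> CopyStubSlot(const std::vector<std::uint8_t>& templates) {
    std::array<std::uint8_t, kCodeSize> slot{};
    std::memcpy(slot.data(), templates.data(), kCodeSize);
    return slot;
}

// True when the slot picked up any bytes of the fixup template.
bool SlotContainsFixupBytes(const std::array<std::uint8_t, kCodeSize>& slot) {
    for (std::uint8_t b : slot) {
        if (b == 0xBB) return true;
    }
    return false;
}
```

Without padding, the copied slot ends with fixup bytes, which is the condition that made the stub look like a FixupPrecode; with padding, the slot holds only stub code and zeros.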