BLST throws illegal instruction error on AMD K10 CPUs (Windows) #200
Comments
From the README:
|
We do use portable for BLST. We run on a wide range of CPUs (100,000+ users), so we know portable works very well. AMD K10s are the only problem reports we have received. Thanks! |
AMD K10 doesn't have SSSE3, e.g. From BLST: "sha256_block procedure for x86_64. This module is stripped of AVX and even scalar code paths, with a) AVX1 is [justifiably] faster than SSSE3 code path only on one" So BLST can't run on AMD K10s without throwing illegal instructions :-( |
You are mentioning src/asm/sha256-x86_64.pl but the portable one is src/asm/sha256-portable-x86_64.pl Alternatively you can compile BLST without assembly: https://github.com/supranational/blst/blob/56f9198/src/no_asm.h#L1225 |
Either way, we are building |
Ok, in the So we include for windows anyway the |
I'll note that our current released version uses some custom CMake scripts to build BLST and uses it in a Python wheel. However, our upcoming release uses a Rust crate for this and leverages the provided BLST Rust build environment from |
Actually it crashed in the same code and instruction (pshufb) in the rust version. The offset (which I got from event viewer on the 905e system) is different of course. https://github.com/supranational/blst/blob/master/src/asm/sha256-x86_64.pl#L408 |
Wow! How deep is it reasonable to go? The rationale behind omitting What to do? Note that there is |
As a point of clarification: non-assembly builds are not actually supported on x86_64 and aarch64 platforms. An attempt to build one should even fail... |
On a related note. What's your VC version? It would have to be pretty old, wouldn't it? I mean newer versions don't support older Windows, so I can imagine some jumping-through-the-hoops is going on when Windows builds are produced. Question is what's more tricky, throwing together a mingw environment or putting together out-of-support VC installation? |
Since it's not exclusively about Rust, it might be appropriate to clarify that the mingw option is not limited to Rust. Of course not. And in addition to that, if you control the build procedure, you might find it useful to know that you can compile build/assembly.S with |
Hmm, after double-checking, the suggestion that this would work on pre-SSSE3 processors turned out to be wrong. Sorry! However! It's way easier to fix that by simply harmonizing it with ELF than to square the [msvc] circle in build.rs. So the question "is this sufficient" still stands. A variant of it. In other words, the suggestion is to make |
As in #201. |
The box I am using for testing with the AMD 905e CPU is running Windows 10 Home which isn't EOL until 2025. I'll admit it is pretty slow!
Unfortunately we have to use the Visual C++ calling convention since Python is compiled using MSVC++ and we link in under that, so mingw doesn't work for our situation. We are compiling using the Visual Studio version featured by the github runners, which I think is MSVC++ 2022. Thanks for looking into this. We recently switched from using Relic to BLST as our underlying BLS library and regrettably that has left some users with these AMD chips out in the cold. |
The question was whether it's actually supported. The fact that it might be possible to trick Windows [10] into installing on unsupported hardware doesn't really mean that it qualifies for support by everybody :-)
For reference, as far as blst itself goes, x86_64-pc-windows-gnu and x86_64-pc-windows-msvc are interchangeable, because it's C, not C++. So you can compile assembly.S with the mingw toolchain or clang and link it into your C++ thing [compiled with MSVC]. As for GitHub Actions, they have both preinstalled, ready to be called. |
As an additional point, clang-cl is meant to be a drop-in replacement for cl. Yet at the same time you can compile assembly.S with it. Without bothering with --target. |
Just in case, the implied suggestion is to use clang-cl for everything, to compile build/assembly.S, src/server.c and your C++ thing. Unification! :-) |
Could we get a new release of the rust crate that supports K10? I think that may be all we need. Thanks! |
Assuming |
Double-check #201! |
As for having clang-cl compile assembly.S by cargo, no, cc-rs doesn't let you pull it off, not as is. But it's possible to use "raw" clang:
The last two steps are meant to convince you that it does call As for non-Rust builds. Other build systems ought to respect CC environment variable too. And have user-defined rules. What I'm driving at is that in general you should be able to perform a |
I'm waiting out [at least] cc-rs release that will make |
Confirmed that building with |
Thanks! |
Hey, curious if there's any update on this and if you're planning on cutting a new release sometime soon? Would be ideal to eventually not have to patch in the specific commit hash. |
Windows Event Viewer shows 0xc000001d (illegal instruction) when running BLST on a K10 AMD 905e CPU.
Disassembly of the code at the reported offset shows a pshufb instruction being executed.
According to x86 docs:
"pshufb (packed shuffle bytes) is not supported on AMD K8/K10, first included on AMD FX* (Bulldozer). It comes with SSSE3 (Supplemental SSE3) set."
But BLST uses it quite a bit for sha256 :-(
(venv) Williams-MacBook-Pro:blst bill$ grep -r pshufb *
src/asm/x86_64-xlate.pl:my $pshufb = sub {
src/asm/sha256-x86_64.pl: pshufb $TMP,@msg[0]
src/asm/sha256-x86_64.pl: pshufb $TMP,@msg[1]
src/asm/sha256-x86_64.pl: pshufb $TMP,@msg[2]
src/asm/sha256-x86_64.pl: pshufb $TMP,@msg[3]
src/asm/sha256-x86_64.pl: pshufb $t3,@x[0]
src/asm/sha256-x86_64.pl: pshufb $t3,@x[1]
src/asm/sha256-x86_64.pl: pshufb $t3,@x[2]
src/asm/sha256-x86_64.pl: pshufb $t3,@x[3]
src/asm/sha256-x86_64.pl: '&pshufb ($t3,$t4)', # sigma1(X[14..15])
src/asm/sha256-x86_64.pl: '&pshufb ($t3,$t5)',
src/asm/sha256-x86_64.pl: #&pshufb ($t3,$t4); # sigma1(X[14..15])
src/asm/sha256-x86_64.pl: #&pshufb ($t3,$t5);
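Since the released build is consumed from a Python wheel, one way to avoid a hard crash on these CPUs is to probe for SSSE3 before loading the BLST-backed module. The following is only a minimal sketch, not anything BLST or the wheel provides: `flags_have_ssse3` and `cpu_has_ssse3` are hypothetical helper names, and the value 36 for `PF_SSSE3_INSTRUCTIONS_AVAILABLE` is taken from the Windows SDK headers as far as I know and should be verified there. Windows builds that predate this feature ID may report 0 even when the CPU supports SSSE3, so a `False` result is a hint rather than proof.

```python
import ctypes
import platform
import sys

# Assumed value of PF_SSSE3_INSTRUCTIONS_AVAILABLE from the Windows SDK
# headers; verify against your SDK before relying on it.
PF_SSSE3_INSTRUCTIONS_AVAILABLE = 36


def flags_have_ssse3(flags_line):
    """Return True if a /proc/cpuinfo-style flags string lists ssse3."""
    # Split into whole tokens so that "sse3" is not mistaken for "ssse3".
    return "ssse3" in flags_line.lower().split()


def cpu_has_ssse3():
    """Best-effort SSSE3 probe: True, False, or None when undetermined."""
    if platform.machine().lower() not in ("amd64", "x86_64"):
        return None
    if sys.platform == "win32":
        # Ask the OS; older Windows builds may not know this feature ID
        # and return 0 regardless of the hardware.
        kernel32 = ctypes.windll.kernel32
        return bool(
            kernel32.IsProcessorFeaturePresent(PF_SSSE3_INSTRUCTIONS_AVAILABLE)
        )
    if sys.platform.startswith("linux"):
        try:
            with open("/proc/cpuinfo") as cpuinfo:
                for line in cpuinfo:
                    if line.startswith("flags") and ":" in line:
                        return flags_have_ssse3(line.split(":", 1)[1])
        except OSError:
            pass
    return None


if __name__ == "__main__":
    result = cpu_has_ssse3()
    if result is False:
        print("No SSSE3 reported: pshufb would raise an illegal instruction")
    elif result is None:
        print("Could not determine SSSE3 support on this platform")
    else:
        print("SSSE3 reported as available")
```

An application could call `cpu_has_ssse3()` at startup and show a readable error (or fall back to another BLS library) instead of letting the process die with 0xc000001d.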
Thanks for your efforts!!! Cheers