Use Avx10.2 Instructions in Floating Point Conversions #111775
Open: khushal1996 wants to merge 40 commits into dotnet:main from khushal1996:kcm-avx102-opt1
+126
−37
…Lower Avx10.2 nodes accordingly.
…DE." This reverts commit 067e31e.
…DE." This reverts commit 067e31e.
…embedded rounding" This reverts commit 493572f.
…DE." This reverts commit 067e31e.
This reverts commit 61719f8.
Co-authored-by: Bruce Forstall <[email protected]>
…1996/runtime into kcm-avx102-api-public-pr
dotnet-issue-labeler (bot) added the area-CodeGen-coreclr (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI) and new-api-needs-documentation labels on Jan 24, 2025
dotnet-policy-service (bot) added the community-contribution label (indicates that the PR has been added by a community member) on Jan 24, 2025
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
@@ -372,6 +373,8 @@ GenTree* Compiler::fgMorphExpandCast(GenTreeCast* tree)
#else
#if defined(TARGET_AMD64)
// Following nodes are handled when lowering the nodes
// float -> ulong/uint/int/long fro AVX10.2
Suggested change
- // float -> ulong/uint/int/long fro AVX10.2
+ // float -> ulong/uint/int/long for AVX10.2
Overview
This PR tracks optimizing x64 floating point to integer conversions using the new saturating instructions introduced in AVX10.2. We are following the spec doc to add the new instructions and optimize the x64/x86 conversions.
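To make the target behavior concrete, here is a minimal C++ sketch of what a saturating, truncating float-to-integer conversion computes. It is not code from this PR or from the JIT: the helper names `SaturatingFloatToInt32` and `SaturatingFloatToUInt32` are hypothetical, and the semantics shown (NaN converts to 0, out-of-range values clamp to the nearest representable integer) are an assumption about the saturating behavior that the new instructions are expected to provide in a single step.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical reference helper (not part of the PR): truncating float -> int32
// conversion that saturates instead of producing an undefined result.
// Assumed semantics: NaN -> 0, too large -> INT32_MAX, too small -> INT32_MIN.
inline int32_t SaturatingFloatToInt32(float value)
{
    if (std::isnan(value))
    {
        return 0;
    }
    if (value >= 2147483648.0f) // 2^31, first value above the int32 range
    {
        return INT32_MAX;
    }
    if (value <= -2147483648.0f) // -2^31, exactly INT32_MIN
    {
        return INT32_MIN;
    }
    return static_cast<int32_t>(value); // in range: plain truncation toward zero
}

// Hypothetical reference helper (not part of the PR): the unsigned variant.
// Assumed semantics: NaN -> 0, negative -> 0, too large -> UINT32_MAX.
inline uint32_t SaturatingFloatToUInt32(float value)
{
    if (std::isnan(value) || value <= 0.0f)
    {
        return 0;
    }
    if (value >= 4294967296.0f) // 2^32, first value above the uint32 range
    {
        return UINT32_MAX;
    }
    return static_cast<uint32_t>(value); // in range: plain truncation toward zero
}
```

Without a saturating instruction, the explicit range and NaN checks above have to be emitted as extra compare and fix-up code around the conversion, which is the kind of sequence the asm comparisons in this description are about.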
Testing
All of the changes made for testing are present in this branch.
Step 1: Run superpmi.exe on the library mch files with JITLateDisasm and check whether any errors occur. JITLateDisasm is used to confirm that the emitted byte stream decodes validly through the LLVM disassembler.
For this step, a new coredistools built from the LLVM repo was used. After running superpmi with JITLateDisasm, no decoding failures were detected. Please reach out for access to the superpmi logs.
Step 2: Run superpmi and check for asmdiffs and assert errors.
Below is the summary of the superpmi run between this PR and PR #111209.
The diffs make sense here: all of the diffs in the superpmi logs belong to conversion scenarios. E.g.
Since these diffs are expected, we can conclude that the superpmi run is successful.
Step 3: Run the JIT test suite using a stable subset of tests on SDE
Results
Optimized ASM
Note: Below is a case-by-case comparison between the asm generated for Avx512 vs Avx10.2. The Avx10v2 asm has been collected in SDE.
Case: Float to Int packed
**Test code**
Left Side is AVX512 vs Right Side is AVX10.2
Case: Float to UInt packed
**Test code**
Left Side is AVX512 vs Right Side is AVX10.2
Case: Double to Long packed
**Test code**
Left Side is AVX512 vs Right Side is AVX10.2
Case: Double to ULong packed
**Test code**
Left Side is AVX512 vs Right Side is AVX10.2
Case: Float to Int Scalar
**Test code**
Left Side is AVX512 vs Right Side is AVX10.2
Case: Float to UInt Scalar
**Test code**
Left Side is AVX512 vs Right Side is AVX10.2
Case: Float to Long Scalar
**Test code**
Left Side is AVX512 vs Right Side is AVX10.2
Case: Float to ULong Scalar
**Test code**
Left Side is AVX512 vs Right Side is AVX10.2
Case: Double to Long Scalar
**Test code**
Left Side is AVX512 vs Right Side is AVX10.2
Case: Double to ULong Scalar
**Test code**
Left Side is AVX512 vs Right Side is AVX10.2
Case: Double to Int Scalar
**Test code**
Left Side is AVX512 vs Right Side is AVX10.2
Case: Double to UInt Scalar
**Test code**
Left Side is AVX512 vs Right Side is AVX10.2