Added SatWidenMulPairwiseAccumulate and SatWidenMulAccumFixedPoint ops #2055

Merged

johnplatts
Contributor

Resolves issue #2050

The SatWidenMulPairwiseAccumulate(DI32, VI16 a, VI16 b, VI32 sum) op was added because AVX3_DL, PPC8, PPC9, and PPC10 can each carry out SatWidenMulPairwiseAccumulate(di32, a, b, sum) with a single instruction.
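
For readers unfamiliar with the op, here is a rough scalar sketch of what it is understood to compute per i32 lane: the two adjacent i16 products plus the matching `sum` lane, clamped to the i32 range. This is not the Highway implementation. The names `ClampToI32` and `RefSatWidenMulPairwiseAccumulate` are hypothetical, and the sketch assumes the sum is formed exactly in a wider type and clamped once at the end.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical helper: clamps a 64-bit value into the int32_t range.
inline int32_t ClampToI32(int64_t v) {
  const int64_t lo = std::numeric_limits<int32_t>::min();
  const int64_t hi = std::numeric_limits<int32_t>::max();
  return static_cast<int32_t>(std::min(hi, std::max(lo, v)));
}

// Scalar reference sketch (not Highway code).
// a and b hold 2*N int16 lanes; sum holds N int32 lanes.
// Assumed semantics: out[i] = sat(a[2i]*b[2i] + a[2i+1]*b[2i+1] + sum[i]).
inline std::vector<int32_t> RefSatWidenMulPairwiseAccumulate(
    const std::vector<int16_t>& a, const std::vector<int16_t>& b,
    const std::vector<int32_t>& sum) {
  std::vector<int32_t> out(sum.size());
  for (size_t i = 0; i < sum.size(); ++i) {
    // Widen before multiplying so the products cannot overflow.
    const int64_t p0 = int64_t{a[2 * i]} * b[2 * i];
    const int64_t p1 = int64_t{a[2 * i + 1]} * b[2 * i + 1];
    out[i] = ClampToI32(p0 + p1 + sum[i]);
  }
  return out;
}
```

The wider accumulator matters because a pair of products alone can exceed the i32 range: (-32768 * -32768) * 2 equals 2^31, one more than the largest i32 value.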

Adding the SatWidenMulPairwiseAccumulate op also allows the SatWidenMulAccumFixedPoint(di32, a, b, sum) op to be implemented more efficiently on x86 and PPC targets.

On NEON, SatWidenMulAccumFixedPoint(di32, a, b, sum) is a wrapper around the NEON vqdmlal_s16 op.

Similarly, on SVE2, SatWidenMulAccumFixedPoint(di32, a, b, sum) is a wrapper around the SVE2 svqdmlalb_s32 op.
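
As a rough illustration of the fixed-point op, the scalar sketch below computes `sum + 2 * a * b` for one lane and clamps the result to the i32 range, which is the saturating doubling multiply-accumulate pattern used for Q15 inputs with a Q31 accumulator. It is not the Highway implementation. `SaturateToI32` and `RefSatWidenMulAccumFixedPointLane` are hypothetical names, and the single final clamp is an assumption: this thread does not say whether the doubled product is saturated separately before the add, which would only matter when a = b = -32768 and `sum` is negative.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>

// Hypothetical helper: clamps a 64-bit value into the int32_t range.
inline int32_t SaturateToI32(int64_t v) {
  const int64_t lo = std::numeric_limits<int32_t>::min();
  const int64_t hi = std::numeric_limits<int32_t>::max();
  return static_cast<int32_t>(std::min(hi, std::max(lo, v)));
}

// Scalar reference sketch for one lane (not Highway code).
// Assumed semantics: sat(sum + 2 * a * b), clamped once at the end.
inline int32_t RefSatWidenMulAccumFixedPointLane(int16_t a, int16_t b,
                                                 int32_t sum) {
  // Widen first so the doubled product is exact.
  const int64_t doubled = 2 * (int64_t{a} * b);
  return SaturateToI32(doubled + sum);
}
```

Under this reading, doubling a[i] * b[i] gives the same value as a pairwise sum in which each input lane appears twice, which is one plausible way the pairwise op could serve as a building block on x86 and PPC.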

Member

@jan-wassenberg jan-wassenberg left a comment

Very nice, thanks for adding this already!
FYI a compiler issue is causing our internal CI to fail. Will investigate.

@Ryo-not-rio

Thank you for getting on this so quickly! Would it be possible to add the same for 8 bits and 16 bits as well?

Member

@jan-wassenberg jan-wassenberg left a comment

Thanks for adding this :) Are you also interested in extending it to other lane types?

@copybara-service copybara-service bot merged commit 42a181a into google:master Apr 9, 2024
33 of 34 checks passed
@johnplatts johnplatts deleted the hwy_satwidenmul_enh_040324 branch May 1, 2024 11:51