Aeroway release 1.2 #3

Open: wants to merge 133 commits into base: master
7d6737b
Addition of CI for x86 and ARMv8 build/test
fas-cc Sep 9, 2024
a83e2fc
Merge branch 'create-ci-fas' into 'cc_dev'
fas-cc Sep 9, 2024
2fd2208
Implement MulByPow2 using svscale
wbb-ccl Sep 18, 2024
74685ff
IfNegativeThenNegOrUndefIfZero for Arm SVE using svneg[type]_m
mazimkhan Sep 24, 2024
e47b77f
Merge branch 'IfNegativeThenNegOrUndefIfZero-arm_sve' into 'cc_dev'
mazimkhan Sep 24, 2024
59e162d
Merge branch 'wbb-MulByFloorPow2' into 'cc_dev'
mazimkhan Sep 24, 2024
a791552
Implement SatWidenMulPairwiseAccumulate for i32
fas-cc Sep 26, 2024
d6e096b
Merge branch 'SatWidenMulPairwiseAccumulate' into 'cc_dev'
mazimkhan Sep 26, 2024
ae178be
Don't commit build folders
fas-cc Sep 26, 2024
93ed1da
Merge branch 'fas-ignore-build-folders' into 'cc_dev'
mazimkhan Sep 26, 2024
303bfb2
StoreTruncated implementation for Arm
mazimkhan Sep 30, 2024
6a2eb34
Merge branch 'StoreTruncated-mak' into 'cc_dev'
mazimkhan Sep 30, 2024
44b854e
Complex multiply and add variants
wbb-ccl Sep 30, 2024
491bc00
Merge branch 'wbb-complex-ops' into 'cc_dev'
mazimkhan Sep 30, 2024
1ba157e
Log what the failure was in CI
wbb-ccl Oct 2, 2024
ffae4bb
MulAddLower
fas-cc Oct 3, 2024
083c3a8
Merge branch 'MulAddLower' into 'cc_dev'
fas-cc Oct 3, 2024
bf6b6ce
MulSubAdd
fas-cc Oct 3, 2024
3da93a8
Merge branch 'MulSubAdd' into 'cc_dev'
fas-cc Oct 3, 2024
06ea917
MulRound
fas-cc Oct 3, 2024
5170af8
Merge branch 'MulRound' into 'cc_dev'
fas-cc Oct 3, 2024
865d556
Revert "Merge branch 'SatWidenMulPairwiseAccumulate' into 'cc_dev'"
mazimkhan Oct 3, 2024
c0bb54a
Merge branch 'revert-d6e096bf' into 'cc_dev'
mazimkhan Oct 3, 2024
3da6515
Merge branch 'wbb-ci-show-errors' into 'cc_dev'
mazimkhan Oct 3, 2024
93a420b
MulLower
fas-cc Oct 7, 2024
6b77338
Merge branch 'MulLower' into 'cc_dev'
fas-cc Oct 7, 2024
cf192e8
SqrtLower
fas-cc Oct 7, 2024
e74b03e
Merge branch 'SqrtLower' into 'cc_dev'
fas-cc Oct 7, 2024
a96461b
Generic implementation of AllOnes and AllZeros
mazimkhan Oct 8, 2024
ef1ca4b
Merge branch 'AllOnes-mak' into 'cc_dev'
mazimkhan Oct 8, 2024
58303e9
GetExponent
wbb-ccl Oct 8, 2024
8f3f74c
Merge branch 'wbb-get-exponent' into 'cc_dev'
mazimkhan Oct 8, 2024
07f2650
Initial AddLower
FRS-CC Oct 1, 2024
61267cc
Updated SVE implementation
FRS-CC Oct 2, 2024
0894ae2
Update AddLower functionality
FRS-CC Oct 3, 2024
8e05211
Add to quick reference
FRS-CC Oct 3, 2024
4f8af66
Update test to use Iota
FRS-CC Oct 3, 2024
efd5778
Fix AddLower expected vector
FRS-CC Oct 3, 2024
6c7a326
Match reverted commit in master
FRS-CC Oct 8, 2024
d063d76
Update test for clarity
FRS-CC Oct 9, 2024
c9ea394
Merge branch 'frs_add_lower' into 'cc_dev'
FRS-CC Oct 9, 2024
239d9ff
MulLower with Fixed guard
fas-cc Oct 10, 2024
49f6cd3
Merge branch 'MulLower' into 'cc_dev'
fas-cc Oct 10, 2024
a5d0826
Add LoadHigher
FRS-CC Oct 9, 2024
8b6fae5
Restrict to 2 lane vectors
FRS-CC Oct 9, 2024
cf0402e
Use correct intrinsic for MulLower
fas-cc Oct 15, 2024
b0b2d54
Merge branch 'mul-lower-fix' into 'cc_dev'
wbb-ccl Oct 15, 2024
9383346
Muladdlower fix
fas-cc Oct 16, 2024
c94c940
Merge branch 'muladdlower-fix' into 'cc_dev'
wbb-ccl Oct 16, 2024
90f7dbf
Masked interleaves
fas-cc Oct 16, 2024
8bf1f44
Merge branch 'masked-interleaves' into 'cc_dev'
fas-cc Oct 16, 2024
6044704
Masked Absolute Functions
fas-cc Oct 16, 2024
d22f196
Merge branch 'masked-abs' into 'cc_dev'
fas-cc Oct 16, 2024
d9e0d6c
DemoteXXXTo Operations
fas-cc Oct 16, 2024
d0607ed
Merge branch 'DemoteTos' into 'cc_dev'
fas-cc Oct 16, 2024
52e9042
Masked comparison ops
wbb-ccl Oct 16, 2024
d300af2
Merge branch 'wbb-masked-comparison' into 'cc_dev'
wbb-ccl Oct 16, 2024
12579ee
Masked arithmetic ops
wbb-ccl Oct 16, 2024
ce66372
Merge branch 'wbb-masked-arithmetic' into 'cc_dev'
wbb-ccl Oct 16, 2024
8f6c59e
Masked approx and sqrt
fas-cc Oct 16, 2024
df0871e
Merge branch 'masked-approx-and-sqrt' into 'cc_dev'
fas-cc Oct 16, 2024
5485420
MaskedLeadingZeroCountOrZero
fas-cc Oct 17, 2024
2095aa2
Merge branch 'fas-MaskedLeadingZeroCountOrZero' into 'cc_dev'
fas-cc Oct 17, 2024
9e59119
Masked Reduce*
wbb-ccl Oct 17, 2024
4f21e4f
Merge branch 'wbb-masked-reduce' into 'cc_dev'
wbb-ccl Oct 17, 2024
9ec33ae
Add generic, sve and test for MaskShiftLeft/RightSameOrZero
FRS-CC Oct 16, 2024
7b7643e
Add generic, sve and test for MaskedShiftRightOr
FRS-CC Oct 16, 2024
39906cf
Add generic, sve and test for MaskedShrOr
FRS-CC Oct 17, 2024
eb3c973
Add quick reference
FRS-CC Oct 17, 2024
f8e6909
Small fixes
FRS-CC Oct 17, 2024
5b59b55
Update LoadHigher functionality
FRS-CC Oct 17, 2024
a4279c4
Clean up
FRS-CC Oct 17, 2024
05948f4
Remove scalar guards
FRS-CC Oct 17, 2024
c060be7
Further edits
FRS-CC Oct 18, 2024
14feed0
Add generic, sve and test for MaskedMaxOrZero
FRS-CC Oct 16, 2024
43c0330
Add generic, sve and test for MaskedOrOrZero
FRS-CC Oct 16, 2024
b993ece
Update quick ref
FRS-CC Oct 16, 2024
a7bc682
Fix test
FRS-CC Oct 17, 2024
7d0b5ba
Clean up functions
FRS-CC Oct 18, 2024
6543dc9
Updates to all functions
FRS-CC Oct 18, 2024
2033a88
Align wrapping
FRS-CC Oct 18, 2024
d4264eb
Merge branch 'frs_load_high' into 'cc_dev'
FRS-CC Oct 18, 2024
0efaf44
Merge branch 'frs_masked_max' into 'cc_dev'
FRS-CC Oct 18, 2024
55da2dd
Merge branch 'frs_masked_shifts' into 'cc_dev'
wbb-ccl Oct 18, 2024
0eb1f9f
MaskedLoadU and Test
fas-cc Oct 21, 2024
1e8a934
Merge branch 'MaskedLoadU' into 'cc_dev'
fas-cc Oct 21, 2024
a050eed
PairwiseAdd
FRS-CC Oct 21, 2024
6b4e984
Merge branch 'fas-test-pairwise' into 'cc_dev'
mazimkhan Oct 21, 2024
04af79d
Use GCC 12 and QEMU 9.1.0
sja-CambridgeConsultants Oct 21, 2024
6fcdfa2
Merge branch 'sja-qemu-9.1' into 'cc_dev'
mazimkhan Oct 21, 2024
1f9f905
SetOr/SetOrZero SVE and generic implementation
mazimkhan Oct 21, 2024
daf8911
Merge branch 'SetOr-mak' into 'cc_dev'
mazimkhan Oct 21, 2024
910c3cf
Return arm format from PairwiseAdd and PairwiseSub
fas-cc Oct 23, 2024
6c65228
Merge branch 'fas-armify-pairwiseadd' into 'cc_dev'
fas-cc Oct 23, 2024
79a53bc
PromoteTo functions which include Rounding
fas-cc Oct 24, 2024
4316c25
Merge branch 'frs_promote_tos' into 'cc_dev'
fas-cc Oct 24, 2024
6803826
Add Generic and tests for MaskedPromoteTo and ConvertTo or Zero
FRS-CC Oct 21, 2024
09ddb5d
Add Generic and test for MaskedDemoteToOrZero
FRS-CC Oct 22, 2024
43ed208
Add sve for MaskedConvertToOrZero
FRS-CC Oct 22, 2024
437279a
Add to quick reference
FRS-CC Oct 22, 2024
0f4aaa7
Update tests
FRS-CC Oct 23, 2024
5afc5be
Update tests
FRS-CC Oct 24, 2024
150f2bf
Move TestPairwiseAdd to avoid muddled tests
fas-cc Oct 24, 2024
59e9e08
Reformat the non compiling test
FRS-CC Oct 24, 2024
e0a5bcf
Add MultiShift
wbb-ccl Oct 25, 2024
0d0c00f
Merge branch 'wbb-multishift' into 'cc_dev'
mazimkhan Oct 25, 2024
e77dfb4
Update test name
FRS-CC Oct 25, 2024
e562a54
Add IntFromFloat test
FRS-CC Oct 25, 2024
a8a2b2f
PairwiseAdd128 implementation for Arm and generic
mazimkhan Oct 28, 2024
1bbb210
Merge branch 'mak-pairwise-add-in-128-block' into 'cc_dev'
mazimkhan Oct 28, 2024
f0e7240
Merge branch 'fas-small-pairwise-fix' into 'cc_dev'
fas-cc Oct 28, 2024
65c7850
Add uint test
FRS-CC Oct 28, 2024
550eb2c
Update tests
FRS-CC Oct 28, 2024
d0fdf01
Fas masked tbl lookups
fas-cc Oct 28, 2024
bfc64c6
Merge branch 'fas-masked-tbl-lookups' into 'cc_dev'
fas-cc Oct 28, 2024
f8eaecd
Implement the full set of masked comparison operators
wbb-ccl Oct 28, 2024
3045632
Merge branch 'wbb-extra-comparisons' into 'cc_dev'
wbb-ccl Oct 28, 2024
dc4aebf
Handle overflows in test
FRS-CC Oct 29, 2024
47be6e0
Fix tests
FRS-CC Oct 29, 2024
3a45280
Merge branch 'frs_conversions' into 'cc_dev'
FRS-CC Oct 30, 2024
76b7d73
Update to demotexxxto for F32 to F16
fas-cc Oct 30, 2024
ec5ddad
Merge branch 'fas-demote-xxx-to' into 'cc_dev'
fas-cc Oct 30, 2024
2fb22a8
Tidy up quick reference
wbb-ccl Oct 30, 2024
95d2e14
Merge branch 'wbb-reference-review' into 'cc_dev'
wbb-ccl Oct 30, 2024
c749c7f
Working version of SatWidenPWAcc
FRS-CC Oct 30, 2024
a235c98
Add undef
FRS-CC Oct 31, 2024
2836cb6
Merge branch 'frs_SatWidenMulPairwiseAccumulate' into 'cc_dev'
FRS-CC Oct 31, 2024
a5eb5f5
Remove duplicate sentence
wbb-ccl Nov 4, 2024
870b811
Merge branch 'wbb-docs-fix' into 'cc_dev'
mazimkhan Nov 4, 2024
ee38e3f
Lint pass
wbb-ccl Nov 4, 2024
7a584eb
Merge branch 'wbb-lint-pass' into 'cc_dev'
mazimkhan Nov 4, 2024
6010983
Merge branch 'master' of github.com:vranhub/aeroway into aeroway_upst…
mazimkhan Nov 5, 2024
54731f5
Revert "Addition of CI for x86 and ARMv8 build/test"
mazimkhan Nov 5, 2024
2 changes: 2 additions & 0 deletions .gitignore
@@ -2,3 +2,5 @@ docs/g3doc/*
docs/html/*
docs/md/*
docs/rst/*
+/build*
+/.vscode
1 change: 1 addition & 0 deletions BUILD
@@ -494,6 +494,7 @@ HWY_TESTS = [
("hwy/tests/", "combine_test"),
("hwy/tests/", "compare_test"),
("hwy/tests/", "compress_test"),
+("hwy/tests/", "complex_arithmetic_test"),
("hwy/tests/", "concat_test"),
("hwy/tests/", "convert_test"),
("hwy/tests/", "count_test"),
1 change: 1 addition & 0 deletions CMakeLists.txt
@@ -727,6 +727,7 @@ set(HWY_TEST_FILES
hwy/tests/cast_test.cc
hwy/tests/combine_test.cc
hwy/tests/compare_test.cc
+hwy/tests/complex_arithmetic_test.cc
hwy/tests/compress_test.cc
hwy/tests/concat_test.cc
hwy/tests/convert_test.cc
311 changes: 311 additions & 0 deletions g3doc/quick_reference.md

Large diffs are not rendered by default.

2 changes: 2 additions & 0 deletions hwy/base.h
@@ -652,6 +652,8 @@ using RemovePtr = typename RemovePtrT<T>::type;
  hwy::EnableIf<kN * sizeof(T) <= bytes>* = nullptr
#define HWY_IF_V_SIZE_GT(T, kN, bytes) \
  hwy::EnableIf<(kN * sizeof(T) > bytes)>* = nullptr
+#define HWY_IF_V_SIZE_GE(T, kN, bytes) \
+  hwy::EnableIf<(kN * sizeof(T) >= bytes)>* = nullptr

#define HWY_IF_LANES(kN, lanes) hwy::EnableIf<(kN == lanes)>* = nullptr
#define HWY_IF_LANES_LE(kN, lanes) hwy::EnableIf<(kN <= lanes)>* = nullptr
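
For reviewers less familiar with this SFINAE pattern, the sketch below shows how a constraint shaped like the new `HWY_IF_V_SIZE_GE` could select between overloads. It is not code from this PR. The `sketch` namespace, the `SKETCH_IF_V_SIZE_*` macros, and `FillsAtLeast128Bits` are hypothetical names, and the local `EnableIf` alias only assumes that `hwy::EnableIf` behaves like `std::enable_if<...>::type`.

```cpp
#include <cstddef>
#include <type_traits>

namespace sketch {

// Hypothetical stand-in for hwy::EnableIf. Assumption: the real alias yields
// void when the condition holds and has no type otherwise.
template <bool kCondition>
using EnableIf = typename std::enable_if<kCondition>::type;

// Same shape as the macro added in hwy/base.h, renamed so this example does
// not depend on the library headers. Each expands to an unnamed, defaulted
// non-type template parameter that only exists when the condition is true.
#define SKETCH_IF_V_SIZE_GE(T, kN, bytes) \
  sketch::EnableIf<(kN * sizeof(T) >= bytes)>* = nullptr
#define SKETCH_IF_V_SIZE_LT(T, kN, bytes) \
  sketch::EnableIf<(kN * sizeof(T) < bytes)>* = nullptr

// Chosen when kN lanes of T occupy 16 bytes or more.
template <typename T, std::size_t kN, SKETCH_IF_V_SIZE_GE(T, kN, 16)>
constexpr bool FillsAtLeast128Bits() {
  return true;
}

// Chosen otherwise; the two conditions are mutually exclusive, so exactly one
// overload survives substitution for any (T, kN).
template <typename T, std::size_t kN, SKETCH_IF_V_SIZE_LT(T, kN, 16)>
constexpr bool FillsAtLeast128Bits() {
  return false;
}

}  // namespace sketch

// 4 lanes of float are 16 bytes; 2 lanes are 8 bytes.
static_assert(sketch::FillsAtLeast128Bits<float, 4>(), "4 x f32 is 16 bytes");
static_assert(!sketch::FillsAtLeast128Bits<float, 2>(), "2 x f32 is 8 bytes");
```

The `>=` form presumably complements the existing `LE` and `GT` variants for ops that need a whole block of at least a given size; which ops in this PR actually use it is not visible in the rendered diff.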