SIMD enhancements #2880
Replies: 3 comments
-
So I tried this. It works well. However, using simple loops or the STL (std::transform and co.) and compiling with -O3 -march=native -ffast-math gives much better performance. The compiler auto-vectorises that code better, and fast-math is the biggest win. Interestingly, fast-math has no effect on the manual SIMD instructions. So the moral of the story is: just use STL algorithms and turn optimisations on aggressively.
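To make the comparison concrete, here is a minimal sketch of the kind of plain STL loop meant above. The function name saxpy and the element type are only illustrative assumptions, not code from the actual test; whether it really gets vectorised depends on the compiler and the flags used.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper (name chosen for illustration): out[i] = a * x[i] + y[i].
// Assumes y has at least as many elements as x.
// A simple element-wise loop like this is a typical candidate for
// auto-vectorisation when built with e.g. -O3 -march=native.
std::vector<float> saxpy(float a, const std::vector<float>& x, const std::vector<float>& y)
{
    std::vector<float> out(x.size());
    std::transform(x.begin(), x.end(), y.begin(), out.begin(),
                   [a](float xi, float yi) { return a * xi + yi; });
    return out;
}
```

With GCC, adding -fopt-info-vec to the command line reports which loops were vectorised, which is a quick way to check before benchmarking.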
-
I guess you might still have to use SIMD explicitly in non-trivial loops which the compiler cannot auto-vectorise, in which case this stuff could be cool. Reading the docs, though, it doesn't cover all instructions.
-
Yeah, it's always a mixed bag. Never know what's really going to work best until you try and compare and benchmark 🤷‍♂️😁
-
GCC and Clang support vector extensions using __attribute__((vector_size(N))). This is a platform-independent SIMD extension. We could enhance dlib's SIMD wrappers to use this when available. I'm not sure if it does runtime dispatch or if the exact instructions are decided at compile time, but it's still worth an investigation. It could potentially unlock a whole bunch of other performance enhancements without too much work.
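For reference, a minimal sketch of what the extension looks like in use. The type name float4 and the function multiply_add are hypothetical names made up for this example and are not part of dlib. As far as I understand it, the instructions are picked at compile time from the target flags (e.g. -march), so the extension itself does no runtime dispatch.

```cpp
#include <cstddef>
#include <cstring>

// GCC/Clang only: a vector of four floats (16 bytes). The name float4 is an assumption.
typedef float float4 __attribute__((vector_size(16)));

// Hypothetical helper: out[i] = a[i] * b[i] + c[i] for i in [0, n).
void multiply_add(const float* a, const float* b, const float* c, float* out, std::size_t n)
{
    std::size_t i = 0;
    // Main loop: four elements at a time using the vector type.
    for (; i + 4 <= n; i += 4) {
        float4 va, vb, vc;
        // memcpy avoids alignment and aliasing problems; compilers
        // normally turn these into plain vector loads and stores.
        std::memcpy(&va, a + i, sizeof va);
        std::memcpy(&vb, b + i, sizeof vb);
        std::memcpy(&vc, c + i, sizeof vc);
        const float4 vr = va * vb + vc;  // element-wise arithmetic on whole vectors
        std::memcpy(out + i, &vr, sizeof vr);
    }
    // Scalar tail for the remaining 0 to 3 elements.
    for (; i < n; ++i) {
        out[i] = a[i] * b[i] + c[i];
    }
}
```

The same source compiles for any target the compiler supports; on targets without suitable SIMD instructions the compiler falls back to scalar code for the vector operations.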