Slow operator== and dirty hack
-
Hello!
I have a C++ program which contains the following code fragment:
template<class T>
struct Pixel
{
    T v0, v1, v2, v3;

    inline bool operator==(Pixel<T> const &p) const
    {
        //return
        //    reinterpret_cast<unsigned int const &>(  v0) ==
        //    reinterpret_cast<unsigned int const &>(p.v0);
        return v0 == p.v0 && v1 == p.v1 && v2 == p.v2 && v3 == p.v3;
    }
};
The type T is currently unsigned char, but I am planning to use float and double as well. The program runs in 6 seconds. When I uncomment the dirty hack (the 3 commented lines in the code snippet above), the program runs in 5 seconds, which is much faster.
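For reference, the cast in the commented lines reads four unsigned char members through an unsigned int reference, which is formally undefined behavior (it breaks the aliasing rules). A minimal sketch of the same idea with std::memcpy follows. Pixel8 and packed are hypothetical names used only for illustration, and the sketch assumes the four bytes are laid out without padding:

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical non-template stand-in for Pixel<unsigned char>.
struct Pixel8 {
    unsigned char v0, v1, v2, v3;
};

// The comparison below relies on the four bytes being packed with no padding.
static_assert(sizeof(Pixel8) == sizeof(std::uint32_t),
              "Pixel8 must be exactly 4 bytes");

// Copy the four bytes into one integer. memcpy is the well-defined way
// to reinterpret the bytes of an object as another type.
inline std::uint32_t packed(Pixel8 const &p) {
    std::uint32_t bits;
    std::memcpy(&bits, &p, sizeof bits);
    return bits;
}

// Equality on the packed representation: one integer comparison.
inline bool operator==(Pixel8 const &a, Pixel8 const &b) {
    return packed(a) == packed(b);
}
```

This only matches member-wise == for integer members. For float and double a bitwise comparison gives different answers in some cases (0.0 versus -0.0, and NaN), so it cannot simply replace the generic operator.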
I investigated the problem and found that compilers are not smart enough to optimise this expression. For example, the compiler enforces the lazy (short-circuit) behavior of the && operator. You can see the assembler output here:
I investigated deeper and found the same problem for assignment!
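To illustrate the short-circuit point: one way to take the lazy evaluation out of the source is to combine the four comparisons with the bitwise & operator, which always evaluates both operands. This is only a sketch of that variation, not a measured fix; whether a given compiler then merges the comparisons into one wider compare is an assumption that has to be checked in the assembler output.

```cpp
template <class T>
struct Pixel {
    T v0, v1, v2, v3;

    // Each == yields a bool. Combining the results with & instead of &&
    // evaluates all four comparisons unconditionally, so the source no
    // longer asks for an early exit after the first mismatch.
    bool operator==(Pixel const &p) const {
        return (v0 == p.v0) & (v1 == p.v1) & (v2 == p.v2) & (v3 == p.v3);
    }
};
```

The result is the same as with &&, because the comparisons have no side effects; only the evaluation order constraint is dropped. It keeps working for float and double, since each member is still compared with ==.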
Please help me to:
- fix the code so it is both fast and elegant;
- understand what is going on here.
Thanks!
-
Try std::array!
You have 4 variables of the same type called v0..v3 -> use std::array<T, 4> v;
operator== is straightforward, just apply == to the array!
For your test code:
bool compare3(Test const &t1, Test const &t2) { return t1.v == t2.v; }
With clang (3.8, 3.9) the assembly output of compare1 and compare3 is identical, while gcc (4.9, 5, 6) calls memcmp for the array.
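Applied to the Pixel template from the question, the suggestion could look roughly like the sketch below. The member name v follows the line above; everything else is kept as in the original struct.

```cpp
#include <array>

template <class T>
struct Pixel {
    std::array<T, 4> v;  // replaces the separate members v0..v3

    // std::array already provides an element-wise operator==.
    bool operator==(Pixel const &p) const {
        return v == p.v;
    }
};
```

Individual channels are then accessed as p.v[0] to p.v[3] instead of p.v0 to p.v3, so code that uses the old member names would need to be adapted.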
-
Thanks, wob.
For now I have the following solutions:
- Align the structure. Helps for GCC only:
  struct __attribute__((aligned(4))) Test { unsigned char a,b,c,d; };
- Use std::array. Helps for Clang only:
  struct Test { std::array<unsigned char, 4> v; };
- Use a bit field. Helps for both Clang and GCC, but not for ICC:
  struct Test { unsigned int a: 8, b: 8, c: 8, d: 8; };
So, the Intel compiler is the most stupid one.
Is it possible to extend the bit-field idea to templates?
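One possible direction, sketched here purely as an assumption and not as a tested answer: keep the generic template for float and double, and add an explicit specialization for unsigned char that stores the channels as bit fields. Bit fields only work for integral types, which is why the generic case cannot use them.

```cpp
// Primary template: plain members, used for float, double and so on.
template <class T>
struct Pixel {
    T v0, v1, v2, v3;

    bool operator==(Pixel const &p) const {
        return v0 == p.v0 && v1 == p.v1 && v2 == p.v2 && v3 == p.v3;
    }
};

// Explicit specialization for unsigned char: the same member names,
// stored as four 8-bit bit fields of type unsigned int.
template <>
struct Pixel<unsigned char> {
    unsigned int v0 : 8, v1 : 8, v2 : 8, v3 : 8;

    bool operator==(Pixel const &p) const {
        return v0 == p.v0 && v1 == p.v1 && v2 == p.v2 && v3 == p.v3;
    }
};
```

Code that only reads and writes p.v0 to p.v3 keeps compiling, but bit-field members cannot be bound to non-const references or have their address taken. Whether the specialization produces the better code seen with the non-template Test struct would still have to be confirmed per compiler.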