Slow operator== and dirty hack

  • Hello!

    I have C++ program which contains the following code fragment:

    template<class T> struct Pixel {
    	T v0, v1, v2, v3;
    	inline bool operator==(Pixel<T> const &p) const {
    		//	reinterpret_cast<unsigned int const &>(  v0) ==
    		//	reinterpret_cast<unsigned int const &>(p.v0);
    		return v0 == p.v0 && v1 == p.v1 && v2 == p.v2 && v3 == p.v3;
    	}
    };

    The type T is currently unsigned char, but I am planning to use float and double as well. The program runs in 6 seconds.

    When I uncomment the dirty hack (the commented-out lines in the snippet above), the program runs in 5 seconds, which is much faster.

    I investigated the problem and found that compilers are not smart enough to optimise this expression. For example, the compiler preserves the lazy (short-circuit) behavior of the && operator. You can see the assembler output here:

    I investigated further and found the same problem with assignment!

    Please help me to:

    • fix the code so it is both fast and elegant;
    • understand what is going on here.
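
    To illustrate the short-circuit point above: && has to behave as if each comparison is skipped once an earlier one fails, which in practice often turns into one branch per member. A minimal sketch, assuming a hypothetical PixelEager struct that mirrors Pixel, uses bitwise & on the bool results so that all four comparisons are evaluated unconditionally. Whether this is actually faster depends on the compiler and the data, so it needs to be measured.

```cpp
#include <cassert>

// Hypothetical stand-in for the Pixel struct from the post.
template <class T>
struct PixelEager {
    T v0, v1, v2, v3;

    // Bitwise & on the bool results has no short-circuit rule, so all
    // four comparisons are evaluated and the compiler is free to combine
    // them without branching after each one.
    bool operator==(PixelEager const &p) const {
        return (v0 == p.v0) & (v1 == p.v1) & (v2 == p.v2) & (v3 == p.v3);
    }
};
```

    The result is the same as with && for any T whose operator== returns bool and has no side effects, which covers unsigned char, float and double.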


  • Try std::array!

    You have 4 variables of the same type called v0..v3
    -> use std::array<T, 4> v;

    operator== is straightforward, just apply == to the array!

    For your test code:

    bool compare3(Test const &t1, Test const &t2) {
      return t1.v == t2.v;
    }

    With clang (3.8, 3.9) the assembly output of compare1 and compare3 is identical, while gcc (4.9, 5, 6) calls memcmp for the array.
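
    Applied to the templated struct from the first post, the std::array suggestion could look roughly like this. PixelArr is a hypothetical name, and the sketch only shows the shape of the code; it makes no claim about which instructions a given compiler emits.

```cpp
#include <array>
#include <cassert>

// Hypothetical std::array-based variant of the Pixel struct.
template <class T>
struct PixelArr {
    std::array<T, 4> v;  // replaces v0..v3

    // std::array already provides element-wise operator==.
    bool operator==(PixelArr const &p) const { return v == p.v; }
};
```

    Element access changes from p.v0 to p.v[0], which is the main cost of this approach in existing code.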

  • Thanks, wob.

    For now I have the following solutions:

    Align the structure. Helps for GCC only:

    struct __attribute__((aligned(4))) Test {
      unsigned char a, b, c, d;
    };

    Use std::array. Helps for Clang only:

    struct Test {
      std::array<unsigned char, 4> v;
    };

    Use bit fields. Helps for both Clang and GCC, but not for ICC:

    struct Test {
      unsigned int a : 8, b : 8, c : 8, d : 8;
    };

    So the Intel compiler is the most stupid one.

    Is it possible to extend the bit-field idea to templates?
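
    Bit fields themselves cannot hold float or double, so one possible direction is to keep plain members and compare the whole struct as a single 32-bit word only when that is safe. The sketch below is an assumption, not something tested against the compilers mentioned above: PackedPixel and its equal helpers are hypothetical names, and std::memcpy is used instead of the reinterpret_cast hack to stay clear of aliasing problems. Floating-point types fall back to member-wise comparison, because bitwise equality differs from == for NaN and for 0.0 versus -0.0.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical templated pixel that picks a comparison strategy per T.
template <class T>
struct PackedPixel {
    T v0, v1, v2, v3;

    bool operator==(PackedPixel const &p) const {
        // Use the single-word path only when the struct is exactly
        // 4 bytes of integral data, where bitwise equality matches ==.
        return equal(p, std::integral_constant<bool,
                            std::is_integral<T>::value &&
                            sizeof(PackedPixel) == sizeof(std::uint32_t)>());
    }

private:
    // 4-byte integral case: copy both objects into 32-bit words and
    // compare once. Compilers typically turn these memcpy calls into
    // plain loads.
    bool equal(PackedPixel const &p, std::true_type) const {
        std::uint32_t a, b;
        std::memcpy(&a, this, sizeof a);
        std::memcpy(&b, &p, sizeof b);
        return a == b;
    }

    // General case (float, double, wider integers): member-wise.
    bool equal(PackedPixel const &p, std::false_type) const {
        return v0 == p.v0 && v1 == p.v1 && v2 == p.v2 && v3 == p.v3;
    }
};
```

    With this layout PackedPixel<unsigned char> takes the word-sized path, while PackedPixel<float> keeps the ordinary floating-point semantics.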
