Code-Analysis of the day: the potential of __builtin_expect()
Just discovered the Linux kernel folks are starting to use this fancy new
feature of GCC-3.0 (and also the triple-cursed GCC-2.96) in some places.
It lets us give the compiler a hint about which branch is expected to be
taken. I found that it does indeed increase performance, even on Intel,
where it cannot do much more than reorder the code.

The simplest example is ex's dtor:

    inline ex::~ex()
    {
        if (--bp->refcount==0)
            delete bp;
    }

which, using GCC-3.0.1, compiles to (CXXFLAGS="-O2"):

00000000 <GiNaC::ex::~ex()>:
   0:   55                      push   %ebp
   1:   89 e5                   mov    %esp,%ebp
   3:   83 ec 08                sub    $0x8,%esp
   6:   8b 45 08                mov    0x8(%ebp),%eax
   9:   8b 10                   mov    (%eax),%edx
   b:   8b 42 10                mov    0x10(%edx),%eax
   e:   48                      dec    %eax
   f:   85 c0                   test   %eax,%eax
  11:   89 42 10                mov    %eax,0x10(%edx)
  14:   74 0a                   je     20 <GiNaC::ex::~ex()+0x20>
  16:   89 ec                   mov    %ebp,%esp
  18:   5d                      pop    %ebp
  19:   c3                      ret
  1a:   8d b6 00 00 00 00       lea    0x0(%esi),%esi
  20:   85 d2                   test   %edx,%edx
  22:   74 f2                   je     16 <GiNaC::ex::~ex()+0x16>
  24:   8b 02                   mov    (%edx),%eax
  26:   83 ec 0c                sub    $0xc,%esp
  29:   52                      push   %edx
  2a:   ff 50 0c                call   *0xc(%eax)
  2d:   83 c4 10                add    $0x10,%esp
  30:   eb e4                   jmp    16 <GiNaC::ex::~ex()+0x16>

Using quite expressive macros(*) lifted from the kernel, we can try to
pessimise the code by saying that deletion is going to be likely (it
isn't):

    inline ex::~ex()
    {
        if (likely(--bp->refcount==0))
            delete bp;
    }

Under the same compiler flags as above we now get the slightly shorter
section:

00000000 <GiNaC::ex::~ex()>:
   0:   55                      push   %ebp
   1:   89 e5                   mov    %esp,%ebp
   3:   83 ec 08                sub    $0x8,%esp
   6:   8b 45 08                mov    0x8(%ebp),%eax
   9:   8b 10                   mov    (%eax),%edx
   b:   8b 42 10                mov    0x10(%edx),%eax
   e:   48                      dec    %eax
   f:   85 c0                   test   %eax,%eax
  11:   89 42 10                mov    %eax,0x10(%edx)
  14:   75 10                   jne    26 <GiNaC::ex::~ex()+0x26>
  16:   85 d2                   test   %edx,%edx
  18:   74 0c                   je     26 <GiNaC::ex::~ex()+0x26>
  1a:   8b 02                   mov    (%edx),%eax
  1c:   83 ec 0c                sub    $0xc,%esp
  1f:   52                      push   %edx
  20:   ff 50 0c                call   *0xc(%eax)
  23:   83 c4 10                add    $0x10,%esp
  26:   89 ec                   mov    %ebp,%esp
  28:   5d                      pop    %ebp
  29:   c3                      ret

However, as expected, this runs slower (approx 5%!) than the version
without the misleading compiler hint. It's quite interesting how the
blocks were reordered in this example: without the hint, the common
no-delete case falls straight through to the ret and the delete code sits
behind a taken branch; with the misleading likely(), the delete code is
placed on the fall-through path and the common case has to jump over it.

A good compiler hint would be:

    inline ex::~ex()
    {
        if (unlikely(--bp->refcount==0))
            delete bp;
    }

This, however, compiles to the same code as the one without any hint.
Interesting.

I tried to insert similar hints in several places where it appears clear
that I should know better than the compiler what is likely going to
happen next. It always turned out to produce the same code. Why is the
compiler so good at making the right decision? I found that I was always
introducing unlikely(), never likely(). Is a branch implicitly assumed to
be not taken???

Anyway, this stuff seems to be a bit premature. I just record it here in
case somebody wants to experiment with it.

Regards
    -richy.

(*) The exact definition was lifted from kernel-2.4.10-pre11:

    #define likely(x)       __builtin_expect(!!(x), 1)
    #define unlikely(x)     __builtin_expect((x), 0)

-- 
Richard B. Kreckel
<Richard.Kreckel@Uni-Mainz.DE>
<http://wwwthep.physik.uni-mainz.de/~kreckel/>
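For anybody who wants to experiment with the hints outside of GiNaC, here is a minimal self-contained sketch. Note the assumptions: Body, Handle and live_bodies are hypothetical stand-ins invented for this example (they are not GiNaC's ex and basic classes), the #else branch is a fallback I added so the file still builds on compilers without __builtin_expect(), and unlike the kernel definition quoted above, unlikely() here also normalises its argument with !!.

```cpp
// Sketch only: Body and Handle are hypothetical, not GiNaC's classes.
#include <cassert>

#if defined(__GNUC__)
#  define likely(x)   __builtin_expect(!!(x), 1)
#  define unlikely(x) __builtin_expect(!!(x), 0)
#else
   // Fallback (an assumption of this sketch): the hints become no-ops.
#  define likely(x)   (x)
#  define unlikely(x) (x)
#endif

static int live_bodies = 0;  // number of Body objects currently alive

struct Body {
    unsigned refcount;
    Body() : refcount(0) { ++live_bodies; }
    ~Body() { --live_bodies; }
};

class Handle {
public:
    Handle() : bp(new Body) { ++bp->refcount; }
    Handle(const Handle& other) : bp(other.bp) { ++bp->refcount; }
    Handle& operator=(const Handle& other)
    {
        ++other.bp->refcount;  // increment first, so self-assignment is safe
        release();
        bp = other.bp;
        return *this;
    }
    ~Handle() { release(); }
    unsigned use_count() const { return bp->refcount; }
private:
    void release()
    {
        assert(bp->refcount > 0);
        // Dropping the last reference is the rare case, so hint it as such.
        if (unlikely(--bp->refcount == 0))
            delete bp;
    }
    Body* bp;
};
```

To compare the generated code, compile once as is and once with unlikely() replaced by likely() in release(), then disassemble both object files (for instance with objdump -d). What the compiler does with the hint will depend on its version and flags.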