cpgf library documentation

A benchmark of three C++ open source callback/signal/slot libraries: cpgf callback, libsigc++, and Boost.Signals2

This benchmark compares the speed of three C++ signal/slot/callback libraries: cpgf callback, libsigc++, and Boost.Signals2 (boost::signals2). The benchmark source code is included in the cpgf library.

Test environment
Hardware
  • Intel(R) Pentium(R) Dual CPU E2180 2 GHz, 4 GB RAM (see note 1).
Software
  • Windows XP SP2.
  • VC 9 (Microsoft Visual Studio 2008 Express), with _SECURE_SCL = 0 and _HAS_ITERATOR_DEBUGGING = 0 (see note 2).
    Optimization -O2.
  • MinGW GCC 4.5.2. Optimization -O3.
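Single slot (callback) benchmark, 100M (100,000,000) iterations. Time unit: milliseconds.
See note 3 for the native call column, note 4 for Boost.Signals2, and note 5 for the callee functions.

Function                      Compiler    Native call   cpgf callback   libsigc++ 2.2.8   Boost.Signals2 1.46.0
---------------------------------------------------------------------------------------------------------------
Inline member function        VC 9        46            781             766               3547
Inline member function        GCC 4.5.2   47            1047            1000              1875
Non-inline member function    VC 9        391           766             781               3515
Non-inline member function    GCC 4.5.2   453           1047            1000              1813
Virtual member function       VC 9        359           719             672               3579
Virtual member function       GCC 4.5.2   406           1031            969               1750
Global function               VC 9        47            719             734               3578
Global function               GCC 4.5.2   47            766             781               1562
Functor object                VC 9        47            468             438               3156
Functor object                GCC 4.5.2   47            515             516               1547

Signal (callback list) benchmark, 10M (10,000,000) iterations. Time unit: milliseconds.
See note 6 for the slots connected to the signal.

Function                      Compiler    Native call   cpgf callback   libsigc++ 2.2.8   Boost.Signals2 1.46.0
---------------------------------------------------------------------------------------------------------------
Invoke empty signal           VC 9        N/A           63              93                2000
Invoke empty signal           GCC 4.5.2   N/A           31              31                1703
Invoke signal with 5 slots    VC 9        N/A           641             2734              8797
Invoke signal with 5 slots    GCC 4.5.2   N/A           578             2547              7515
Invoke signal with 10 slots   VC 9        N/A           1109            3328              15203
Invoke signal with 10 slots   GCC 4.5.2   N/A           1094            3032              13063
Invoke signal with 50 slots   VC 9        N/A           6735            7875              64813
Invoke signal with 50 slots   GCC 4.5.2   N/A           5109            6703              54969
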
Notes:
  1. The test hardware is quite out of date. Any recent hardware should perform considerably better.
  2. If you don't know why _SECURE_SCL should be 0, read my blog here.
  3. Native call is just invoking the functions directly.
  4. Boost.Signals2 (boost::signals2) is used in the benchmark instead of the older boost::signal, because Signals2 is easy to use without linking to a library, and boost::signal can't call a slot directly. Also, Signals2 is 50% faster than boost::signal.
  5. All callee functions, except the virtual functions, are non-empty. Each receives one int parameter and adds it to a global variable, so that the compilers can't optimize the functions away. Function prototype: void (int).
  6. The slots connected to the signal are a mix of functions: inline, non-inline, virtual, global, and functors.
  7. Some fast delegate code from CodeProject was also tested. It invokes a slot about twice as fast as cpgf callback and libsigc++. It is not included in the table because it can't be used to implement a signal/slot mechanism and can't handle functor objects.
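
The callee kinds described in notes 5 and 6 can be sketched as follows. This is only an illustration, not the benchmark source shipped with cpgf: every name here (g_total, Target, globalFunction, Functor, timeCallsMs) is hypothetical, and the timing helper assumes std::chrono (C++11), which may differ from the timer the real benchmark uses.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Global accumulator (hypothetical name). Adding to it gives every callee an
// observable side effect, so the compiler cannot simply drop the call.
std::int64_t g_total = 0;

struct Target {
    // Defined inside the class body, so it is implicitly inline.
    void inlineMember(int n) { g_total += n; }

    // Declared here and defined out of line below. A real benchmark would
    // likely keep the definition in a separate translation unit.
    void nonInlineMember(int n);

    // Called through the vtable; left empty, following note 5.
    virtual void virtualMember(int) {}

    virtual ~Target() = default;
};

void Target::nonInlineMember(int n) { g_total += n; }

// Global (free) function callee.
void globalFunction(int n) { g_total += n; }

// Functor object callee.
struct Functor {
    void operator()(int n) const { g_total += n; }
};

// Calls callee(1) `iterations` times and returns the elapsed milliseconds.
template <typename Callable>
double timeCallsMs(Callable&& callee, long iterations) {
    assert(iterations >= 0);
    const auto start = std::chrono::steady_clock::now();
    for (long i = 0; i < iterations; ++i) {
        callee(1);
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

For example, timeCallsMs(globalFunction, 100000000) would time the native global function case. Timing a slot would mean passing a callable that forwards to the library's callback object instead.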


Some conclusions based on the cpgf callback and libsigc++ data compiled with VC:

  • For non-inline functions, a slot (or callback) performs at the same order of magnitude as a native function call: it takes about twice as long (for example, 766 ms versus 391 ms for cpgf callback).
  • Slot invocation performance is almost the same for cpgf callback and libsigc++.
  • Invoking a single slot costs about the same as calling a normal function through two function pointers, or calling a virtual function twice.
  • Functor objects have the best performance (468 ms and 438 ms, versus roughly 670 ms to 780 ms for the other callee types), so prefer them when possible.
  • Slot invocation is fast enough for most purposes. 100M iterations need at most about 1000 ms, which is an average of 10 ns per call. So 10K slot invocations spend only 0.1 ms on the invocation itself, which is negligible even for most high-performance games. And don't forget this was measured on quite old hardware.
  • In a callback list (signal), cpgf callback spends about 13 ns per slot (641 ms for 10M invocations of a signal with 5 slots), compared with about 7 to 8 ns for a single slot call without a callback list, so it is at least 50% slower. Even so, 10K slot invocations through signals cost less than 0.2 ms, which means callbacks (signals) can be used heavily in a 60 FPS game without worrying about their overhead.
  • When a signal (callback list) has few slots (callbacks), cpgf callback performs better than libsigc++ because it has less setup overhead before dispatching. When there are many slots, the performance difference between cpgf callback and libsigc++ becomes small.
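
The per-call arithmetic behind these conclusions can be checked with a few lines of code. The sketch below is only a calculator for the numbers in the tables; the function names (nsPerCall, costMs, frameShare60Fps) are made up for this illustration and are not part of cpgf.

```cpp
#include <cassert>

// Average cost of one call in nanoseconds, given the total time in
// milliseconds for `calls` invocations (1 ms = 1,000,000 ns).
double nsPerCall(double totalMs, double calls) {
    assert(calls > 0);
    return totalMs * 1e6 / calls;
}

// Total time in milliseconds spent on `calls` invocations that each
// cost `nsEach` nanoseconds.
double costMs(double nsEach, double calls) {
    return nsEach * calls / 1e6;
}

// Fraction of one frame at 60 FPS (1000/60 ms, about 16.7 ms) that a
// cost of `ms` milliseconds takes up.
double frameShare60Fps(double ms) {
    return ms / (1000.0 / 60.0);
}
```

With the table values, nsPerCall(1000, 100e6) is 10 ns, so costMs(10, 10000) is 0.1 ms for 10K single slot calls. For the 5-slot signal under VC 9, nsPerCall(641, 10e6 * 5) is about 12.8 ns per slot, and frameShare60Fps(0.2) shows that 0.2 ms is a little over 1% of a 60 FPS frame.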