#include <QThread.h>
Basically, when more than one thread uses the same data at the same time, corruption can result because few operations on data are atomic. For example, a=a+1 may compile into three instructions on some processors: load a; add one to a; store a. If another thread performs the same operation at the same time, the interleaved loads and stores can mean that one is added instead of two. And even if an add operation is atomic on your processor, calculating an inverse square root won't be.
Because of this, threadsafe code must synchronise access to any data accessed by more than one thread, i.e. permit only one thread to use the data at a time. On Win32/64 such a construct is called a critical section (NOT a Win32 mutex - those have much more functionality and are much, much slower), and under POSIX such a construct is called a mutex.
Generally QMutex is used by mixing it into your class via inheritance, e.g.:
class MyProtectedClass : public QMutex
{
public:
    MyProtectedClass() : QMutex() { ... }
};
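The same pairing of data with its protecting mutex can be sketched in standard C++ (a hedged illustration, not the TnFOX API), using an RAII holder so that the unlock happens even if the protected code throws or returns early:

```cpp
#include <mutex>

// Pair the shared data with its mutex inside one class and hold the lock for
// the shortest scope possible. std::mutex stands in for the mixed-in QMutex.
class MyProtectedClass
{
    std::mutex lockable;   // stands in for the inherited QMutex
    int shared = 0;        // the data the mutex protects
public:
    void increment()
    {
        std::lock_guard<std::mutex> hold(lockable);  // lock() now, unlock() at scope exit
        ++shared;
    }
    int value()
    {
        std::lock_guard<std::mutex> hold(lockable);
        return shared;
    }
};
```

The scoped-holder pattern keeps every lock() paired with its unlock(), which manual calls make easy to get wrong in the presence of exceptions or early returns.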
Mutexes are the single most used item in multithreaded programming and hence their performance is extremely important. Unfortunately, the POSIX threads implementation of mutexes is mostly woeful, and critical section objects on Win32 are less than optimal as well.
Hence on Intel x86 architectures, and on all MS Windows platforms, QMutex is implemented directly using inlined FXAtomicInt operations; benchmarks have shown this to be about as optimal as it gets on the supported platforms. On those POSIX platforms which allow POSIX semaphores to be used from within signal/cleanup handlers, interaction with the kernel can be minimised almost entirely.
The worst-case scenario falls back to the mutex provided by POSIX threads, so whatever happens you get correct operation.
Where the implementation is direct (see above), a default spin count of 4000 applies (the spin count is how many times lock() retries acquisition before invoking a kernel wait). Choosing this value well can make a huge difference to performance, so you need to bear some factors in mind.
Kernel waits are extremely expensive - putting a thread to sleep costs tens of thousands of cycles, and thousands more to wake it up. If a mutex is rarely in contention (i.e. rarely claimed by more than one thread at once) then it makes sense to retry the acquisition, knowing that the chances are the current holder will relinquish it very soon.
The obvious crux is that spinning takes up processor time to the exclusion of all else, so too much spinning slows things down. Conversely, too little spinning means too many kernel waits, at possibly more than 100,000 cycles on some systems. You should set your spin count so that completing the count takes roughly as long as the average period the lock is held; that way, the other processor will usually have released the lock before the count completes.
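The spin-then-sleep idea can be sketched as follows in standard C++ (an illustration of the technique, not the real QMutex implementation; std::this_thread::yield() stands in for a true kernel wait):

```cpp
#include <atomic>
#include <thread>

// lock() first retries cheaply in user space; only after spinCount failed
// attempts does it fall back to yielding the processor, the stand-in here
// for the expensive kernel wait discussed above.
class SpinCountMutex
{
    std::atomic<bool> held{false};
    unsigned spins;
public:
    explicit SpinCountMutex(unsigned spinCount = 4000) : spins(spinCount) { }
    void lock()
    {
        for(unsigned n = 0; n < spins; ++n)
            if(!held.exchange(true, std::memory_order_acquire))
                return;                    // acquired while spinning: no kernel entry
        while(held.exchange(true, std::memory_order_acquire))
            std::this_thread::yield();     // too much contention: give up the processor
    }
    void unlock() { held.store(false, std::memory_order_release); }
};
```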
Finding where data is being altered without holding a lock can be difficult - this is where setting the debug yield flag using QMutex::setMutexDebugYield() can be useful. When set, the mutex calls FX::QThread::yield() directly after any lock, thus ensuring that anything else wanting that lock will go to sleep on it, so you can inspect the situation using the debugger. Obviously this can severely degrade performance, so set it only for short periods around the area you are investigating.
Now that QMutex has been hand tuned and optimised, we can present some figures (which were valid for v0.85):
                          FXAtomicInt     QMutex
SMP build, 1 thread :       51203277    18389113
SMP build, 2 threads:        4793978     5337603
Non-SMP build, 1 thread :  103305785    27352297
Non-SMP build, 2 threads:   54929964    10978153
However, we can see that a mutex's performance when free of contention is some 3.4 times faster than when it is in contention. This is in fact very good, and quite superior to other fast mutex implementations. On non-SMP builds the factor is 2.5, while atomic int operations run at almost exactly half speed with two threads, as you'd expect.
Non-SMP builds' QMutex is between 48% and 106% faster than the SMP build's.
As of v0.86, defining FXINLINE_MUTEX_IMPLEMENTATION makes this class inline to all code including QThread.h. This can bring a substantial performance improvement at the cost of code size.
Public Member Functions

QMUTEX_INLINEP QMutex (FXuint spinCount=4000)
QMUTEX_INLINEP bool isLocked () const
QMUTEX_INLINEP FXbool locked () const
QMUTEX_INLINEP FXuint spinCount () const
QMUTEX_INLINEP void setSpinCount (FXuint c)
QMUTEX_INLINEP void lock ()
QMUTEX_INLINEP void unlock ()
QMUTEX_INLINEP bool tryLock ()
QMUTEX_INLINEP FXbool trylock ()

Static Public Member Functions

static QMUTEX_INLINEP bool setMutexDebugYield (bool v)

Friends

class QRWMutex
QMUTEX_INLINEP FX::QMutex::QMutex (FXuint spinCount = 4000)
Constructs a mutex with the given spin count.
References FXERRHM, FXERRHOS, FX::FXRBConstruct(), FX::FXRBNew(), and FX::FXProcess::noOfProcessors().
QMUTEX_INLINEP bool FX::QMutex::isLocked () const
Returns true if the mutex is locked.
QMUTEX_INLINEP FXuint FX::QMutex::spinCount () const
Returns the current spin count.
QMUTEX_INLINEP void FX::QMutex::setSpinCount (FXuint c)
Sets the spin count.
QMUTEX_INLINEP void FX::QMutex::lock ()
If free, claims the mutex and returns immediately. If not, waits until the current holder releases it, then claims it before returning.
Reimplemented in FX::TnFXApp.
QMUTEX_INLINEP void FX::QMutex::unlock ()
Releases the mutex for others to claim. Must be called as many times as lock() was called.
Reimplemented in FX::TnFXApp.
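The "unlock() as many times as lock()" rule above means the mutex is recursive: the holding thread may call lock() again (e.g. when one protected method calls another), and the mutex is only truly released by the matching final unlock(). A standard-C++ sketch of that behaviour using std::recursive_mutex (not the TnFOX implementation):

```cpp
#include <mutex>

std::recursive_mutex m;

int inner()
{
    std::lock_guard<std::recursive_mutex> hold(m);  // second claim by the same thread: fine
    return 1;
}

int outer()
{
    std::lock_guard<std::recursive_mutex> hold(m);  // first claim
    return inner() + 1;   // a non-recursive mutex would deadlock here
}
```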
bool FX::QMutex::tryLock ()
Claims the mutex if free and returns true. If already taken, returns false immediately without waiting.
This is an overloaded member function, provided for FOX compatibility. It behaves as tryLock().
References FX::QThread::id().
QMUTEX_INLINEP bool FX::QMutex::setMutexDebugYield (bool v) [static]
Sets the debugging flag for mutexes in this process. See the main description above.