Subject: TSX lock elision for glibc v2
Date: Thursday 10th January 2013 20:19:32 UTC
This is a rework of the earlier patchkit not using IFUNC. This drops various
dependencies and simplifies things quite a lot. Some minor fixes for rwlocks,
otherwise no changes. I dropped one patch that was already merged.

---

Lock elision using TSX is a technique to optimize lock scaling. It allows
existing locks to run in parallel using hardware memory transactions.
New instructions (RTM) are used to control memory transactions.

The full series is available at
http://github.com/andikleen/glibc
git://github.com/andikleen/glibc rtm-devel4

An overview is available in
http://halobates.de/adding-lock-elision-to-linux.pdf

See http://software.intel.com/file/41604 for the full TSX specification.
Running TSX requires either new hardware with TSX support, or using the
SDE emulator
http://software.intel.com/en-us/articles/intel-software-development-emulator/

This patchkit implements a simple adaptive lock elision algorithm based
on RTM. It enables elision for the pthread mutexes and rwlocks.
The algorithm keeps track of whether a mutex elides successfully, and
stops eliding for some time when it does not. When the CPU supports RTM
the elision path is automatically tried, otherwise any elision is disabled.

The adaptation algorithm and its tuning are currently preliminary.
I cannot post performance numbers at this point.

The user can also tune this by setting the mutex type and environment
variables.

The lock transactions have an abort hook mechanism to hook into the abort
path. This is quite useful for some debugging, so I kept this functionality.

The mutexes can be configured at runtime with the PTHREAD_MUTEX
environment variable. This will force a specific lock type for all
mutexes in the program that do not have another type set explicitly.
This can be done without modifying the program.

Currently elision is enabled by default on systems that support RTM,
unless explicitly disabled either in the program or by the user.
Given more experience we can decide if that is a good idea, or if it
should be opt-in.

Limitations that may be fixable (but it's unclear if it's worth it):
-------------------------------------------------------------------
- Adaptive enabled mutexes don't track the owner, so pthread_mutex_destroy
  will not detect a busy mutex.
- Trylock on a lock that is already guaranteed to be locked will succeed.
- Unlocking an unlocked mutex will currently result in a crash (see above).
- No elision support for recursive or error check mutexes.
  Recursive may be possible, error check is unlikely.
- Some obscure cases can also fall back to non-elision.
- Internal locks in glibc (like malloc or stdio) do not elide at this point.

Changing these semantics would be possible, but has some runtime cost.
Currently I decided to not do any expensive changes, but wait for more
testing feedback.

To be fixed:
------------
- The default tuning parameters may be revised.
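
For readers unfamiliar with the adaptive scheme described earlier (try to
elide, and stop eliding for some time after a failure), here is a rough
single-threaded model of the idea. It is not the patchkit's code.
struct adaptive_mutex, SKIP_AFTER_ABORT, adaptive_lock, adaptive_unlock and
the two example callbacks are hypothetical names, and the try_elide callback
merely stands in for starting an RTM transaction, so the sketch runs without
TSX hardware. The back-off count is an arbitrary assumption, not a tuning
value from the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tuning value: how many acquisitions skip elision
   after one failed attempt. Not taken from the patchkit. */
#define SKIP_AFTER_ABORT 3

struct adaptive_mutex {
    bool locked;        /* stand-in for the real lock word */
    int  skip_elision;  /* acquisitions left that must not try to elide */
};

/* Stand-in for starting a hardware transaction: returns true if the
   critical section could run elided, false if it aborted. */
typedef bool (*try_elide_fn)(void *ctx);

/* Example callbacks: count attempts in *ctx, then commit or abort. */
bool elide_commits(void *ctx) { ++*(int *)ctx; return true; }
bool elide_aborts(void *ctx)  { ++*(int *)ctx; return false; }

/* Acquire the mutex. Returns true if the lock was elided, false if the
   lock was really taken (the fallback path). */
bool adaptive_lock(struct adaptive_mutex *m, try_elide_fn try_elide, void *ctx)
{
    assert(m != NULL && try_elide != NULL);

    if (m->skip_elision > 0) {
        /* Elision failed recently: do not try again yet. */
        m->skip_elision--;
    } else if (try_elide(ctx)) {
        /* Elided: the lock word is left untouched. */
        return true;
    } else {
        /* Abort: remember to stop eliding for a while. */
        m->skip_elision = SKIP_AFTER_ABORT;
    }

    m->locked = true;   /* fallback: take the lock for real */
    return false;
}

/* Release the mutex. An elided acquisition never set the lock word,
   so there is nothing to clear in that case. */
void adaptive_unlock(struct adaptive_mutex *m, bool elided)
{
    if (!elided)
        m->locked = false;
}
```

In this model a caller keeps the return value of adaptive_lock and passes it
to adaptive_unlock. After one abort, the next SKIP_AFTER_ABORT acquisitions
take the real lock without attempting elision, and elision is retried after
that.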