Subject: Re: CLISP multithreading (patch included)
Newsgroups: gmane.lisp.clisp.devel
Date: 2008-07-28 11:33:40 GMT

Hi Sam,

I have implemented the same thing with TLS (__thread or
xthread_key_get(), whichever is available). Here is another patch for
it that can be applied against CVS. It is not compatible with the
previous one: I have removed all of THREAD_SP_SHIFT, sp_to_thread(),
etc.

http://code.brumbar.com/clisp-mt-tls-20080728.patch

This time the code is much cleaner. The patch is big because I removed
per_thread from all global variables; there is now a single per_thread
variable, _current_thread (this might be useful for embedding). The
LISP stack for new threads is allocated with malloc(), which may not be
ideal, but I did not want to mess with memory mappings.

With this patch it should be almost straightforward to build for Win32
(native threads); only xthread_cancel() still needs to be implemented.

I have run the standard benchmarks. The full results are here:

http://code.brumbar.com/clisp-mt-bench.txt

Summary (32-bit Debian x86, kernel 2.6.18-4):

  TLS-THREAD-LINUX-X86 (with __thread)
      total 23.34946 sec    23.34946 scaled
  TLS-THREAD-LINUX-X86 (TLS via xthread_key_get / pthread_getspecific)
      total 62.44790 sec    62.44790 scaled
  SP-THREAD-LINUX-X86 (stack pointer tweaking)
      total 22.11738 sec    22.11738 scaled
  NO-THREADS-LINUX-X86 (CVS HEAD version)
      total 21.14132 sec    21.14132 scaled

The implementation with pthread_getspecific is almost 3 times slower.
There is no big difference between the other builds; the
single-threaded CVS build has a slight advantage.
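
For readers who have not used compiler-level TLS, here is a minimal
sketch of the "single per-thread pointer" idea described above. It is
not code from the patch: the struct and function names are hypothetical,
only _current_thread is a name taken from this message, and it assumes
a compiler that supports the __thread extension (GCC or Clang) on a
pthreads platform.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the real per-thread structure. */
typedef struct {
    int id;
} thread_sketch_t;

/* The only per-thread global: every thread gets its own copy.
   __thread is a compiler extension (GCC/Clang), not plain ISO C89. */
static __thread thread_sketch_t *_current_thread = NULL;

static void *worker(void *arg)
{
    thread_sketch_t *self = arg;
    assert(self != NULL);
    _current_thread = self;   /* set once when the thread starts */
    /* Reading it back is an ordinary variable access in the source. */
    return (_current_thread == self) ? self : NULL;
}

/* Starts two threads and checks that each saw its own pointer and that
   the caller's copy was left untouched. Returns 0 on success. */
int tls_sketch_run(void)
{
    thread_sketch_t a = { 1 }, b = { 2 };
    pthread_t ta, tb;
    void *ra = NULL, *rb = NULL;

    if (pthread_create(&ta, NULL, worker, &a) != 0)
        return -1;
    if (pthread_create(&tb, NULL, worker, &b) != 0) {
        pthread_join(ta, NULL);
        return -1;
    }
    pthread_join(ta, &ra);
    pthread_join(tb, &rb);

    if (_current_thread != NULL)   /* calling thread never set its copy */
        return 1;
    return (ra == &a && rb == &b) ? 0 : 1;
}
```

Calling tls_sketch_run() from main() and checking for a 0 return is
enough to try it (link with -pthread where the platform requires it).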

OS X PPC (Darwin Kernel Version 8.11.0):

  TLS-THREADS-OSX-PPC (TLS via xthread_key_get / pthread_getspecific)
      total 333.30949 sec   333.30949 scaled
  SP-THREADS-OSX-PPC (TLS via stack pointer tweaking)
      total 57.13520 sec    57.13520 scaled
  NO-THREADS-OSX-PPC (CVS HEAD version)
      total 61.29787 sec    61.29787 scaled

Here the build with TLS (via pthread_getspecific) is very slow: almost
6 times slower. The SP tweaking build, however, has a small advantage
over the single-threaded CVS build!

Some more information about the difference between __thread and
pthread_getspecific can be found here:

http://blogs.sun.com/seongbae/date/20051216

So it seems reasonable to use TLS when the compiler provides built-in
support for it: the code is straightforward and performance is good.
In all other cases SP tweaking gives much better performance.

BR
Vladimir

>> I would prefer a TLS (Thread-Local Storage - __thread / per_thread)
>> approach because it should be cheaper.
>> we can keep both though - on the platforms with TLS, declare
>> clisp_thread_t* current_thread not a function but a per_thread variable.
>
> Basically every multithreading environment I know provides a mechanism
> for TLS however not all compilers have __thread (__declspec(thread))
> support. The Apple fork of gcc does not (probably others as well).
>
> You are right - it is possible to redefine current_thread() depending
> on this. For example on osx (and platforms without compiler TLS
> support) it can be something like:
> #define current_thread() ({ var clisp_thread_t *__thr; \
> (clisp_thread_t *)xthread_key_get(cur_thr_key); })
>
> This will remove the ugliness of switching manually the stack pointer
> (plus possible unexpected consequences of this) and is supposed to be
> quite portable. If it is fine (I do not see a reason not to be) -
> there will be no reason to keep the sp_to_thread() stuff (unless
> somebody runs it on an MT platform that does not provide TLS - is there
> such platform?).
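
The current_thread() switch discussed in the quoted text could look
roughly like the sketch below. It is only an illustration under stated
assumptions, not the code in the patch: SKETCH_HAVE_COMPILER_TLS,
clisp_thread_sketch_t and the *_sketch_* helpers are made-up names, and
pthread_getspecific() is called directly where CLISP would go through
its xthread_key_get() wrapper. Only cur_thr_key and current_thread()
are names from the thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for clisp_thread_t. */
typedef struct { int id; } clisp_thread_sketch_t;

#if defined(SKETCH_HAVE_COMPILER_TLS)   /* hypothetical configure macro */
/* Compiler TLS: current_thread() is just a per-thread variable. */
static __thread clisp_thread_sketch_t *current_thread_slot;
#define current_thread()      (current_thread_slot)
#define set_current_thread(t) ((void)(current_thread_slot = (t)))
static int current_thread_sketch_init(void) { return 0; }
#else
/* Fallback: a pthread key, looked up through a library call each time. */
static pthread_key_t cur_thr_key;
static pthread_once_t cur_thr_once = PTHREAD_ONCE_INIT;
static int cur_thr_key_status = -1;
static void make_key(void)
{
    cur_thr_key_status = pthread_key_create(&cur_thr_key, NULL);
}
/* Must be called before the first current_thread(). Returns 0 on success. */
static int current_thread_sketch_init(void)
{
    pthread_once(&cur_thr_once, make_key);
    return cur_thr_key_status;
}
#define current_thread() \
    ((clisp_thread_sketch_t *)pthread_getspecific(cur_thr_key))
#define set_current_thread(t) \
    ((void)pthread_setspecific(cur_thr_key, (t)))
#endif

static void *worker(void *arg)
{
    assert(arg != NULL);
    set_current_thread((clisp_thread_sketch_t *)arg);
    return current_thread();   /* what this thread sees as "itself" */
}

/* Checks that two threads see different current_thread() values.
   Returns 0 on success. */
int current_thread_sketch_run(void)
{
    clisp_thread_sketch_t main_thr = { 0 }, other = { 1 };
    pthread_t t;
    void *seen = NULL;
    int ok;

    if (current_thread_sketch_init() != 0)
        return -1;
    set_current_thread(&main_thr);
    if (pthread_create(&t, NULL, worker, &other) != 0)
        return -1;
    pthread_join(t, &seen);

    ok = (seen == &other && current_thread() == &main_thr);
    set_current_thread(NULL);  /* do not leave a pointer to our stack */
    return ok ? 0 : 1;
}
```

The calling code is the same in both branches; only the cost of each
current_thread() lookup differs, which is the cost the benchmark
numbers above are measuring.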