http://pypy.org/tmdonate.html (Search for "haswell")
http://grokbase.com/t/python/pypy-dev/13bvt3kg70/pluggable-h...
It seems to boil down to:
* The cache size (which determines how much memory you can write to within a transaction before it has to commit) is too small, causing excessive transaction aborts.
* There is no mechanism to bypass the HTM, i.e. to write to memory within a transaction in a way that is not tracked and rolled back. This makes the small cache size worse, since every memory write counts against the limit, not just the ones you want rolled back in the case of a transaction abort.
Interestingly, this does not bode well for HTM on a platform with many smaller cores, say a hypothetical 64-core ARM. Each core would have only a tiny amount of L1 cache, severely limiting transaction size.
And many smaller cores are exactly where you'd want the benefits of HTM, since the overhead of synchronization is higher in proportion to the work each core can do.
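
To put rough numbers on the cache-size point, here is a back-of-the-envelope sketch in Python. It assumes a simplified model (my assumption, not something stated in the links above): every cache line written inside a transaction has to stay in L1 until commit, so evicting one aborts the transaction. The 32 KiB, 8-way, 64-byte-line figures match Haswell's L1 data cache; the 8 KiB, 4-way "small core" is purely hypothetical, and the function names are made up for illustration.

```python
def write_set_capacity(l1_bytes, line_bytes=64, ways=8):
    """Best-case write-set size under the assumed model (all written lines stay in L1)."""
    lines = l1_bytes // line_bytes   # total cache lines in L1
    sets = lines // ways             # number of associativity sets
    return {"lines": lines, "sets": sets, "max_bytes": lines * line_bytes}


def fits(written_addresses, l1_bytes, line_bytes=64, ways=8):
    """True if the distinct lines written fit in L1 without overflowing any set."""
    sets = l1_bytes // line_bytes // ways
    per_set = {}
    # Only distinct cache lines matter; repeated writes to a line cost nothing extra.
    for line in {addr // line_bytes for addr in written_addresses}:
        idx = line % sets            # simple modulo set indexing (assumed)
        per_set[idx] = per_set.get(idx, 0) + 1
        if per_set[idx] > ways:      # one set is full: a written line would be evicted
            return False
    return True


if __name__ == "__main__":
    # Haswell-like L1D: 32 KiB, 8-way, 64-byte lines.
    print(write_set_capacity(32 * 1024))
    # Hypothetical small core: 8 KiB, 4-way, 64-byte lines.
    print(write_set_capacity(8 * 1024, ways=4))
    # Nine writes spaced 4096 bytes apart all map to the same set.
    print(fits([i * 4096 for i in range(9)], 32 * 1024))
```

Under this model the best case for the Haswell-like cache is 512 lines (32 KiB) of writes, but the limit can be hit much earlier: nine writes spaced 4096 bytes apart all map to the same set and already overflow it. And since nothing can bypass the HTM, scratch writes you never needed rolled back count against the same budget.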