Using Barrier Elision to Improve Transactional Code Generation

With chip manufacturers such as Intel, IBM, and ARM offering native support for transactional memory in their instruction set architectures, memory transactions are on the verge of being considered a genuine application tool rather than just an interesting research topic. Despite this recent rise in popularity of hardware transactional memory (HTM), support for software transactional memory (STM) is still scarce, and the only compiler with transactional support currently available, the GNU Compiler Collection (GCC), does not generate code that achieves desirable performance. For hybrid TM (HyTM) solutions, which are frameworks that leverage the best aspects of HTM and STM, the subpar performance of the software side, caused by inefficient compiler-generated code, may prevent HyTM from delivering optimal results. This article extends previous work focused exclusively on STM implementations by presenting a detailed analysis of transactional code generated by GCC in the context of HyTM implementations. In particular, it builds on previous research on transactional memory support in the Clang/LLVM compiler framework, which is decoupled from any TM runtime, and presents the following novel contributions: (a) it shows that STM’s performance overhead, due to an excessive number of read and write barriers added by the compiler, also impacts the performance of HyTM systems; and (b) it reveals the importance of the previously proposed annotation mechanism in reducing the performance gap between HTM and STM in phased runtime systems. Furthermore, it shows that, by correctly using the annotations on just a few lines of code, it is possible to reduce the total number of instrumented barriers by 95% and to achieve speed-ups of up to 7× when compared to the original code generated by GCC and the Clang compiler.
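To make the notion of barrier elision concrete, here is a minimal C sketch. It is not the paper's annotation syntax, nor the code GCC or Clang emits: `tm_read`, `tm_write`, the counters, and both `sum_*` functions are hypothetical names introduced only for illustration. The sketch assumes that a conservative compiler instruments every memory access inside a transaction, and that accesses to transaction-private data can safely skip the barrier.

```c
#include <assert.h>

/* Hypothetical counters standing in for an STM runtime's bookkeeping. */
static long read_barriers = 0;
static long write_barriers = 0;

/* Hypothetical read barrier: a real STM would log or validate the access. */
static int tm_read(const int *addr) {
    read_barriers++;
    return *addr;
}

/* Hypothetical write barrier: a real STM would buffer or lock the location. */
static void tm_write(int *addr, int value) {
    write_barriers++;
    *addr = value;
}

void reset_barriers(void) {
    read_barriers = 0;
    write_barriers = 0;
}

long total_barriers(void) {
    return read_barriers + write_barriers;
}

/* Conservative instrumentation: every access goes through a barrier,
   including accesses to the transaction-private variable 'local'. */
void sum_instrumented(const int *shared, int n, int *out) {
    assert(n >= 0);
    int local = 0;
    for (int i = 0; i < n; i++) {
        int v = tm_read(&shared[i]);
        tm_write(&local, tm_read(&local) + v);
    }
    tm_write(out, tm_read(&local));
}

/* Elided version: 'local' is assumed to be private to the transaction,
   so only accesses to 'shared' and 'out' keep their barriers. */
void sum_elided(const int *shared, int n, int *out) {
    assert(n >= 0);
    int local = 0;
    for (int i = 0; i < n; i++) {
        local += tm_read(&shared[i]);
    }
    tm_write(out, local);
}
```

Calling `sum_instrumented` on a four-element array executes 14 barriers in this sketch, while `sum_elided` executes 5 and produces the same sum. The gap widens with the share of accesses that target private data, which is the effect the annotation mechanism described above is meant to exploit.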

Bruno Chinelato Honorio, João P. L. De Carvalho, Catalina Munoz Morales, Alexandro Baldassin, and Guido Araujo. 2022. Using Barrier Elision to Improve Transactional Code Generation. ACM Trans. Archit. Code Optim. 19, 3, Article 46 (September 2022), 23 pages. https://doi.org/10.1145/3533318
