CCES Unicamp

Integrating Multi-FPGA Acceleration to OpenMP Distributed Computing

Designing high-performance scientific applications has become a time-consuming and complex task that requires developers to master multiple frameworks and toolchains. Although reconfigurability and energy efficiency make FPGAs powerful accelerators, integrating multiple FPGAs efficiently into a distributed cluster remains cumbersome. The complexity grows considerably when applications require partitioning execution among CPUs, GPUs, and FPGAs. This paper introduces FPGA offloading support to OpenMP Cluster (OMPC), an OpenMP-only framework capable of transparently offloading computation across nodes in a cluster, which reduces developer effort and time to solution. In addition, OMPC enables true heterogeneity by allowing the programmer to assign each program kernel to the most appropriate architecture (CPU, GPU, or FPGA), depending on its workload characteristics. This is achieved by adding only a few lines of standard OpenMP code to the application. The resulting framework was applied to the heterogeneous acceleration of an image recoloring application. Experimental results demonstrate speed-up gains using different acceleration arrangements with CPUs, GPUs, and FPGAs. Measurements using Halstead metrics show that the proposed framework is faster to program. Furthermore, the solution enables transparently offloading OMPC communication tasks to multiple FPGAs, which results in speed-ups of up to 1.41x over the default communication mechanism, the Message Passing Interface (MPI), on Task Bench, a synthetic benchmark for task parallelism.
 
ROSSO, Pedro Henrique et al. Integrating Multi-FPGA Acceleration to OpenMP Distributed Computing. In: International Workshop on OpenMP. Cham: Springer Nature Switzerland, 2024. p. 49-63.
