Publication
Numerical simulations on heterogeneous systems: dynamic workload and power management
| Abstract: | Numerical simulations are among the most relevant and computationally demanding applications used by scientists and engineers. As accuracy requirements keep increasing, so does the corresponding workload and, consequently, the demand for additional computing power. HPC systems are thus a fundamental tool for the time-effective execution of such simulations, and performance maximization is therefore a pertinent and crucial subject of research. Over the last decade HPC has undergone a major shift, resulting in heterogeneous parallel computing systems that integrate devices with different architectures, exposing different instruction sets, programming and execution models, and ultimately delivering significantly different performance levels. This heterogeneity raises a variety of challenges for application developers, such as performance and code non-portability, performance imbalances and disjoint memory address spaces. These challenges not only widen the gap between peak and sustained performance, but also significantly reduce development productivity. Additionally, numerical applications often exhibit dynamic workloads with unpredictable computational requirements, which, together with the associated code divergence and branching workflow, further aggravate the heterogeneity challenge; this is defined as the Two-fold Challenge. The increasing scale of HPC systems also leads to fast-growing power consumption, making power management solutions crucially important. The design of such solutions becomes harder in the context of the Two-fold Challenge. This thesis addresses the Two-fold Challenge in the context of numerical simulations and HPC systems, focusing on optimising sustained performance and power consumption. A variety of mechanisms are proposed and validated across different parallel computing paradigms. These mechanisms include a unified execution and programming model, a transparent data management component, and heterogeneity-aware dynamic load balancing and power management systems. The contributions of this thesis are divided into three areas: efficient and effective application development and execution on heterogeneous single nodes with multiple computing devices, load and performance imbalances in heterogeneous distributed systems, and power-performance trade-offs in heterogeneous distributed systems. To foster the adoption of the proposed mechanisms, some were designed and integrated into OpenFOAM, a widely used numerical simulation library. Experimental results demonstrate the effectiveness of the proposed approaches, which yield significant performance gains and reduced power consumption in multiple scenarios. |
|---|---|
| Main authors: | Ribeiro, Roberto Carlos Sá |
| Subject: | Engineering and Technology::Electrical, Electronic and Computer Engineering |
| Year: | 2019 |
| Country: | Portugal |
| Document type: | doctoral thesis |
| Access type: | open access |
| Associated institution: | Universidade do Minho |
| Language: | English |
| Source: | RepositóriUM - Universidade do Minho |