Publication

A framework for heterogeneous many-core machines

Bibliographic details
Abstract:	Software development is known for being a complex task, especially when parallelism is involved. This complexity can, however, be reduced by dividing the software into smaller, manageable modules. This philosophy is embraced by modular programming, which promotes the separation of concerns into well-defined modules. Unfortunately, traditional parallel programming models (e.g., OpenMP and MPI) are typically non-modular, which leads to the mixing of parallelism-related and domain-related concerns. To achieve maximum performance, parallel applications should be tuned to the characteristics of the target architecture(s). However, in traditional approaches, this tuning process leads to continual and invasive adjustments of the domain code, since the parallelism-related concerns are mixed directly into it. This lack of modularity increases the complexity of parallel programming and jeopardizes application maintenance. These problems are exacerbated in hybrid parallelism (i.e., combining shared and distributed memory), which aims to exploit hierarchical systems such as clusters of multicore machines. Hence, these hybrid systems further increase the complexity of developing parallel applications and, consequently, emphasize the need for modular approaches. This thesis exploits the notion that modularity, pluggability (i.e., the ability to (un)plug modules without modifying the base code), and composability are key properties for making the development of parallel applications less complex. The first step towards achieving these properties is the separation of the parallelism-related concerns from the domain concerns and their subsequent encapsulation in proper modules. This thesis uses aspect-oriented programming (AOP) to achieve this separation and combines it with a methodology based on structured programming and design rules (i.e., designing the domain code accordingly).
The result is an aspect-oriented framework that enables the development of modular parallel applications. This framework intrinsically supports the development of applications with hybrid parallelism by composing, in a non-invasive fashion, several parallelism-related modules with a given domain code. The framework combines the efficiency and expressiveness of popular HPC parallel programming models with the modular features of aspect-oriented and object-oriented (OO) design. As a result of studying AOP in the context of parallelism, we introduce the idea of parallelism layers, which combines the simplicity of well-known OO concepts (i.e., class extension and method overriding) with the flexibility of AOP. On the one hand, this combination enables users of our framework to add parallelism to domain code using familiar concepts analogous to class extension and method overriding, but without the limitations of OO inheritance. On the other hand, programmers can exploit the advanced features of AOP, which, among other things, are helpful for extending the functionality of the framework. Hence, parallelism layers provide a simple yet flexible approach for the development of parallel applications. Finally, to reduce the complexity of parallel programming even further, we enhanced the parallelism layers with a methodology and a workflow to parallelize applications, including hybrid parallelizations, in an incremental and structured manner. We evaluated the performance and programmability of our framework in comparison to other approaches by using a set of case studies and executing them on a cluster of multicores. We illustrated, using our framework and workflow, the entire process of developing efficient and modular parallelizations, from the sequential up to the hybrid version. Moreover, we showed that our framework and workflow help to find more efficient parallelizations than the ones initially implemented.
These results showed that parallelism layers are ideal for the quick prototyping and testing of different parallel strategies. The results also showed that the parallelizations developed with the framework achieved performance comparable to the intrusive parallelizations while being less verbose. With our approach, all the hybrid versions were seamlessly implemented. These hybrids were always faster than the corresponding versions that used only MPI processes, which emphasizes the potential of hybrid parallelizations in clusters of multicores.
Main authors:	Medeiros, Bruno Silvestre
Subject:	Engineering and Technology::Electrical Engineering, Electronic Engineering, and Information Engineering
Year:	2019
Country:	Portugal
Document type:	doctoral thesis
Access type:	open access
Associated institution:	Universidade do Minho
Language:	English
Source:	RepositóriUM - Universidade do Minho
