Publication
Extracting Ontologies From Neural Networks
| Abstract: | Artificial neural network-based methods have been growing in popularity and have been successfully applied to a variety of tasks. As these systems begin to be deployed in domains where they are expected to have a certain degree of autonomy and responsibility, the need to comprehend the reasoning behind their answers is becoming a requirement. Yet neural networks are still regarded as black boxes, since their internal representations do not provide any human-understandable explanation for their outputs. A considerable amount of work has been done towards the development of methods to increase the interpretability of neural networks. However, these methods often produce interpretations that are too complex and do not have any declarative meaning, leaving the user with the burden of rationalizing them. Recent work has shown that it is possible to establish mappings between a neural network’s internal representations and a set of human-understandable concepts. In this dissertation we propose a method that leverages these mappings to induce an ontology that describes a neural network’s classification process through logical relations between human-understandable concepts. |
|---|---|
| Main authors: | Ferreira, João Miguel Dias |
| Subject: | Artificial Neural Networks; Ontologies; Rule Extraction; Inductive Logic Programming |
| Year: | 2022 |
| Country: | Portugal |
| Document type: | master's dissertation |
| Access type: | open access |
| Associated institution: | Universidade Nova de Lisboa |
| Language: | English |
| Source: | Repositório Institucional da UNL |