Publication

Extracting Ontologies From Neural Networks

Bibliographic details
Abstract:	Artificial neural network-based methods have been growing in popularity and have been successfully applied to a variety of tasks. As these systems begin to be deployed in domains where they are expected to have a certain degree of autonomy and responsibility, the need to comprehend the reasoning behind their answers is becoming a requirement. Yet neural networks are still regarded as black boxes, since their internal representations do not provide any human-understandable explanation for their outputs. A considerable amount of work has been done towards the development of methods to increase the interpretability of neural networks. However, these methods often produce interpretations that are too complex and do not have any declarative meaning, leaving the user with the burden of rationalizing them. Recent work has shown that it is possible to establish mappings between a neural network’s internal representations and a set of human-understandable concepts. In this dissertation, we propose a method that leverages these mappings to induce an ontology that describes a neural network’s classification process through logical relations between human-understandable concepts.
Main author:	Ferreira, João Miguel Dias
Subject:	Artificial Neural Networks; Ontologies; Rule Extraction; Inductive Logic Programming
Year:	2022
Country:	Portugal
Document type:	master's dissertation
Access type:	open access
Associated institution:	Universidade Nova de Lisboa
Language:	English
Source:	Repositório Institucional da UNL