Publication

SME credit application, a text classification approach

Bibliographic details
Abstract: During the SME credit application process, a credit expert gives a specific recommendation to the credit commercial advisor. This recommendation can be classified as positive, negative, or partial. This project aims to build a text classification model that assigns the recommendation text to one of these categories. To achieve this, two models are tested, both based on BERT, a state-of-the-art architecture proposed by Google in 2019. The first model uses the single-sentence BERT classification model as proposed by Google. The second model uses the SBERT architecture, in which the BERT embedding model is fine-tuned for the specific task and a max-pooling layer is added to extract a fixed-size vector for the whole document, which is then passed to a fully connected network. Results show that the second approach performed better in terms of accuracy, precision, and recall. Despite limitations in computational capacity, the limited number of labeled examples, and BERT's maximum sequence length, the model is a good first approach to the problem.
Main authors: López, Daniela Saavedra
Subject: Natural Language Processing (NLP); Banking; Credit application; Small and medium enterprise (SME); Neural Networks (NN); Bidirectional Encoder Representations from Transformers (BERT)
Year: 2020
Country: Portugal
Document type: master's dissertation
Access type: open access
Associated institution: Universidade Nova de Lisboa
Language: English
Source: Repositório Institucional da UNL
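
The second model described in the abstract pools token embeddings into one fixed-size document vector and classifies it with a fully connected layer. The sketch below illustrates only that pooling-and-classification step in plain Python. It is not the dissertation's implementation: the 4-dimensional token vectors, the weights, the bias, and the helper names (`max_pool`, `classify`) are made-up assumptions, whereas a real model would take its token embeddings from a fine-tuned BERT encoder and learn the weights during training.

```python
import math

# The three recommendation categories named in the abstract.
LABELS = ["positive", "negative", "partial"]


def max_pool(token_vectors, mask):
    """Element-wise max over real (unmasked) tokens, giving one fixed-size vector."""
    kept = [vec for vec, keep in zip(token_vectors, mask) if keep]
    if not kept:
        raise ValueError("at least one real token is required")
    return [max(column) for column in zip(*kept)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(pooled, weights, bias):
    """Single fully connected layer followed by softmax over the three labels."""
    scores = [
        sum(w * x for w, x in zip(row, pooled)) + b
        for row, b in zip(weights, bias)
    ]
    probs = softmax(scores)
    return LABELS[probs.index(max(probs))], probs


# Toy stand-ins for BERT token embeddings (hypothetical values).
tokens = [
    [0.2, -1.0, 0.5, 0.0],
    [0.9, 0.1, -0.3, 0.4],
    [5.0, 5.0, 5.0, 5.0],  # padding position, excluded by the mask
]
mask = [1, 1, 0]

# Hypothetical classifier parameters: one weight row per label.
weights = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
bias = [0.0, 0.0, 0.0]

pooled = max_pool(tokens, mask)
label, probs = classify(pooled, weights, bias)
print(pooled, label)
```

Because the pooled vector has the same length regardless of how many tokens the text contains, the classifier layer can have a fixed input size; the mask keeps padding positions from influencing the maximum.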