Publication

Critical data leak detection in institutions’ public Web sites

Bibliographic details
Abstract: The content of modern Web sites can be vulnerable to data leaks, and may itself already contain leaked data. This is especially true of large institutions' Web sites, where many users have access to large, constantly processed volumes of data that can include sensitive information. Unlike the content of databases, the content of such Web sites is much less structured, and is therefore harder to track and more vulnerable to leaks caused by human error. Most existing Data Leak Detection Systems are designed to detect leaks on networks or in highly organized and structured systems such as databases. This work describes the creation of a multi-user Data Leak Detection System capable of detecting critical types of data inside the Web sites of different institutions, using descriptive entities of those types supplied by users. It contributes to solving the problem of data leakage from educational institutions' Web sites by analyzing the problem and developing a Data Detection System that collects data from Web sites independently of search engines and, with the help of users, detects critical data types in the collected data. At the end of the detection process, the system provides the user with a basic report, so that they can examine the detected data further and decide whether to remove it from the corresponding Web pages.
Main authors: Igorevich, Vasilenko Andrey
Subject: Data leak detection; Crawling; Nutch; Solr; Information Security; GDPR
Year: 2020
Country: Portugal
Document type: master's dissertation
Access type: open access
Associated institution: Instituto Politécnico de Bragança
Language: English
Source: Biblioteca Digital do IPB