Publication
Critical data leak detection in institutions’ public Web sites
| Abstract: | The content of modern Web sites can be vulnerable to data leaks, but it can also already contain leaked data, especially on the Web sites of large institutions, where many users have access to large, constantly processed volumes of data that may include sensitive data. Unlike the content of databases, the content of such Web sites is much less structured and is therefore harder to track and even more vulnerable to leaks caused by the human factor. Most existing Data Leak Detection Systems are designed to detect data leaks on networks or in highly organized and structured systems such as databases. In this work we describe the creation of a multi-user Data Leak Detection System capable of detecting critical types of data inside the Web sites of different institutions by using descriptive entities of those types supplied by users. This work contributes to solving the problem of data leakage from educational institutions’ Web sites by analyzing the problem and developing a Data Detection System that collects data from Web sites independently of search engines and, with the help of users, detects critical data types in the collected data. At the end of the detection process, the system provides the user with a basic report, which allows them to examine the detected data further and decide whether to remove it from the corresponding Web pages. |
|---|---|
| Main authors: | Igorevich, Vasilenko Andrey |
| Subject: | Data leak detection; Crawling; Nutch; Solr; Information Security; GDPR |
| Year: | 2020 |
| Country: | Portugal |
| Document type: | master's dissertation |
| Access type: | open access |
| Associated institution: | Instituto Politécnico de Bragança |
| Language: | English |
| Source: | Biblioteca Digital do IPB |