Publication

Using text mining to diagnose and classify epilepsy in children

Bibliographic details
Abstract:	Epilepsy diagnosis can be an extremely complex process, demanding considerable time and effort from physicians and healthcare infrastructures. Physicians need to classify each specific type of epilepsy based on different data, e.g., types of seizures, events, and exam results. This work presents a text mining approach to support medical decisions relating to epilepsy diagnosis and classification in children. We propose a text mining process that, using patient medical records, applies ontologies and named entity recognition as preprocessing steps and then applies K-Nearest Neighbors, a white-box lazy method, to classify each instance. Results on real medical records suggest that the proposed framework shows good performance and yields clear interpretations, despite the small volume of available training data.
Main authors:	Luis Pereira
Other authors:	Rijo, Rui; Silva, Catarina; Agostinho, Margarida
Subject:	Data mining; Text mining; Electronic medical records; ICD codes; Machine learning; Epilepsy
Year:	2013
Country:	Portugal
Document type:	conference paper
Access type:	restricted access
Associated institution:	Instituto Politécnico de Leiria
Language:	English
Source:	IC-online
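
The abstract describes a pipeline that normalizes record text with ontologies and named entity recognition and then classifies each record with K-Nearest Neighbors. The following is only a minimal sketch of that general idea, not the authors' implementation: the `CONCEPTS` synonym map is a hypothetical stand-in for the ontology and entity recognition steps, the cosine similarity over term counts is an assumed distance measure, and the records and the `class_A`/`class_B` labels are invented toy data.

```python
import math
from collections import Counter

# Hypothetical synonym map standing in for the ontology / named-entity step;
# the paper's actual ontology and entity recognizer are not given in this record.
CONCEPTS = {
    "fit": "seizure",
    "convulsion": "seizure",
    "electroencephalogram": "eeg",
}


def preprocess(text):
    """Lowercase, split on whitespace, and map synonyms to a single concept."""
    tokens = text.lower().split()
    return Counter(CONCEPTS.get(tok, tok) for tok in tokens)


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_classify(query, training, k=3):
    """Return the majority label and the k most similar records.

    Returning the neighbors makes the decision inspectable, which is the
    sense in which a lazy nearest-neighbor method can be read as white-box.
    """
    q = preprocess(query)
    scored = sorted(
        ((cosine(q, preprocess(text)), label, text) for text, label in training),
        key=lambda item: item[0],
        reverse=True,
    )
    neighbors = scored[:k]
    votes = Counter(label for _, label, _ in neighbors)
    return votes.most_common(1)[0][0], neighbors


# Invented toy records with placeholder labels (not data from the paper).
TRAINING = [
    ("brief absence staring episode normal eeg", "class_A"),
    ("staring episode several times daily", "class_A"),
    ("convulsion during sleep abnormal eeg", "class_B"),
    ("fit at night abnormal electroencephalogram", "class_B"),
]

label, neighbors = knn_classify("night seizure abnormal eeg", TRAINING, k=3)
print(label)
for score, neighbor_label, text in neighbors:
    print(f"{score:.3f}  {neighbor_label}  {text}")
```

Printing the neighbors alongside the predicted label shows which stored records drove the vote, so a reviewer can check the evidence behind each classification.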
