Publication

Graph kernels and neural networks for predicting yields of chemical reactions

Bibliographic details
Abstract: Predicting chemical reaction yields is a widely investigated problem in drug discovery, driven by the continual emergence of diseases and viruses worldwide. As machine learning has matured, it has provided alternatives that help chemists search for more effective molecule combinations. This dissertation presents two research hypotheses, built on different foundations, that seek to improve yield prediction. The first concerns the support vector regression algorithm, which uses graph kernels to measure similarity between molecules and then performs the prediction. We propose applying a non-linearity to the Weisfeiler-Lehman graph kernel to improve the comparison between molecules and thus increase the expressiveness of the support vector regression models. The second concerns neural networks. We propose a deep learning base that addresses the problem with graph neural networks, which use graph convolutional layers and global read-out operations to extract molecular features from graph-structured data. The main focus is to ensure that all models generalise well, so that they obtain good results in experiments with unknown molecules. We performed tests on chemical data for both methods and achieved improvements. The non-linearity in graph kernels proved the most advantageous, surpassing the state-of-the-art methods in one of the two global tests performed. The graph neural networks were not as effective, although they showed competitive results. Concerning neural networks, we highlight the creation of the deep learning base and the in-depth analysis of the hyperparameters, which are intended to support further research on reaction yield prediction, as this area shows immense potential in drug discovery.
Main authors: Braga, Diogo Filipe Ribeiro Ferreira
Subjects: Machine learning; Chemistry; Graphs; Kernel methods; Graph neural networks
Year: 2022
Country: Portugal
Document type: master's dissertation
Access type: open access
Associated institution: Universidade do Minho
Language: English
Source: RepositóriUM - Universidade do Minho
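
As an illustration of the first hypothesis in the abstract, the sketch below computes a Weisfeiler-Lehman subtree kernel between two small labelled graphs and then applies a non-linearity on top of it. The abstract does not say which non-linearity the dissertation uses, so the Gaussian (RBF) form here is an assumption chosen only for illustration, and the function names, the `gamma` parameter, and the toy graphs are all hypothetical.

```python
import math
from collections import Counter


def wl_features(adjacency, labels, iterations=2):
    """Count Weisfeiler-Lehman subtree labels for one graph.

    adjacency: dict mapping node -> list of neighbouring nodes
    labels:    dict mapping node -> initial string label (e.g. atom symbol)
    """
    current = dict(labels)
    counts = Counter(current.values())
    for _ in range(iterations):
        refined = {}
        for node, neighbours in adjacency.items():
            # New label = own label plus the sorted multiset of neighbour labels.
            neighbourhood = sorted(current[n] for n in neighbours)
            refined[node] = current[node] + "(" + ",".join(neighbourhood) + ")"
        current = refined
        counts.update(current.values())
    return counts


def wl_kernel(f, g):
    """Linear WL kernel: dot product of the two label histograms."""
    return sum(count * g[label] for label, count in f.items())


def rbf_wl_kernel(f, g, gamma=0.1):
    """Assumed non-linear variant: Gaussian of the distance in WL feature space."""
    sq_dist = wl_kernel(f, f) + wl_kernel(g, g) - 2 * wl_kernel(f, g)
    return math.exp(-gamma * sq_dist)


# Toy heavy-atom graphs (hypothetical examples): C-C-O and C-O-C.
chain = {0: [1], 1: [0, 2], 2: [1]}
ethanol = wl_features(chain, {0: "C", 1: "C", 2: "O"}, iterations=1)
ether = wl_features(chain, {0: "C", 1: "O", 2: "C"}, iterations=1)

linear_similarity = wl_kernel(ethanol, ether)
nonlinear_similarity = rbf_wl_kernel(ethanol, ether)
```

A matrix of such pairwise values could then be passed to a support vector regressor that accepts a precomputed kernel, for example scikit-learn's `SVR(kernel="precomputed")`. How reactions with several molecules are combined into one kernel value is not described in the abstract and is left out of this sketch.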