Publication

In silico characterization of microbial communities interaction in soil samples

Bibliographic details
Abstract: Microbial communities, besides their many applications, can offer a low-cost solution to pollution problems. However, to exploit them, it is necessary to understand how they work and to be able to infer their potential with respect to specific metabolic networks. Because of the continuous growth of genomic data, various tools have been developed for homology and metabolic pathway inference; however, new and improved strategies and algorithms are still required. In this work, a pipeline was developed that uses clusters of orthologous data to annotate unknown sequences and then predict species' functional potential and microbial interactions. Two tools were developed for this purpose: OrtScraper, which downloads organized data in bulk from specific pathways of interest, and OrtAn, which performs the annotation based on clusters of orthologous groups. The testing and evaluation of the pipeline focused on the well-known pathway for the transformation of benzoate to acetyl-CoA (BTA). Two different genome sets were used: set A, for which the annotation of the sequences was known, and set B, for which the benzoate degradation capacity was known. Both tools achieved their intended goals, and for the annotation the best cases presented an F1 score over 0.90. The recall values of the annotation proved to be the weakest point of this pipeline, which possibly led to the unsatisfactory results in the prediction of the species' functional potential. Some improvements to the developed tools and pipeline were proposed to improve the annotation and the inference of species' functional potential.
Main authors: Gomes, Marta Lopes
Subject: Clustering; Orthologous; Homology; Annotation; Microbial communities; Functional potential
Year: 2019
Country: Portugal
Document type: master's dissertation
Access type: open access
Associated institution: Universidade do Minho
Language: English
Source: RepositóriUM - Universidade do Minho