Publication

Automation of machine learning pipelines for anomaly detection challenges

Bibliographic details
Abstract: Machine Learning (ML) and Data Science can solve a variety of real-world problems. Businesses are increasingly interested in these approaches, and as the technology evolves, new challenges emerge, mostly concerning ML model development, the deployment cycle, and data cleansing, all of which can significantly reduce the accuracy and viability of ML software systems. Development and Operations (DevOps) practices have become popular for successfully operating software systems at scale, but they must be adapted to deliver the best results when applied to ML systems. This led to the emergence of Machine Learning and Operations (MLOps), a development culture specific to ML systems and derived from DevOps principles. MLOps aims to unify the development cycle of ML-based software systems while striving for automation and monitoring, so as to enable continuous integration and delivery. The goal of this thesis is to study the available frameworks and methods for ML systems in order to develop an automated ML pipeline that ingests and manipulates high volumes of data. A sensor system that simulates the interior of a vehicle gathers enough data to feed the pipeline. Alongside the ML system, a visual interface is developed that allows control over the overall system and its data.
Main authors: Martins, Ricardo Rodrigues
Subject: Software engineering; Machine learning; MLOps; Model; Automation
Year: 2023
Country: Portugal
Document type: master's dissertation
Access type: open access
Associated institution: Universidade do Minho
Language: English
Source: RepositóriUM - Universidade do Minho