Publication

A data analysis approach to evaluate the performance of predictive models

Bibliographic details
Abstract: Bikes have become an increasingly popular mode of transportation, mainly because they are agile over short distances and environmentally sustainable. More and more cities worldwide are adopting bike-sharing systems, where users can rent a bike for a small fee. However, this shift also brings its own challenges, ranging from ensuring safety and robust infrastructure to managing costs. One prominent issue is ensuring bike availability. After all, the main value proposition of such systems is their convenience. If a user approaches a rental station and finds no bike available, or if there are no slots to return the rented bike, the system loses its purpose. Predicting the number of rentals each station will have each day is a challenge, as this value is highly unpredictable and can be influenced by various factors, from the weather to local events. This dissertation addresses the bike availability issue in a bike-sharing system. Having bikes available at the right place and the right time is the key to success. To forecast demand effectively, two predictive algorithms were implemented: SARIMAX and Gradient Boosting. SARIMAX is a variant of the well-known ARIMA, recognized for its accuracy in time series forecasting. Gradient Boosting, an algorithm based on decision trees, is widely used because of its ability to handle vast amounts of data with minimal computational resources. The core question of this dissertation is which of these algorithms best predicts the daily demand of each station, so that users always have a bike available when and where they need it. This guarantees user satisfaction and, in turn, promotes the growth of the companies managing such systems. Based on the daily rental volumes of each station, the Gradient Boosting algorithm presented the best performance. This performance improved further when the stations were divided into clusters according to their rental volume.
Main authors: Pinho, Adriana Costa
Subject: Bike-sharing system; Demand forecasting; Gradient boosting; ARIMA; SARIMA; SARIMAX
Year: 2024
Country: Portugal
Document type: master's dissertation
Access type: open access
Associated institution: Universidade do Minho
Language: English
Source: RepositóriUM - Universidade do Minho