Publication
Using Twitter News Sentiment Analysis to Forecast US GDP
| Abstract: | Sentiment analysis is an emerging field whose popularity and applicability have grown with the availability of accurate pre-trained machine learning models fine-tuned for a given domain and task (here, sentiment classification). GDP is one of the most frequently forecast variables. Its relevance and aggregate scope make it both one of the most important variables to forecast accurately and one of the hardest. This research therefore uses sentiment analysis of news articles to forecast GDP and assesses whether doing so yields forecasting accuracy gains. Previous studies have used either news article datasets from a single source or heuristics-based sentiment classification models. Here, a dataset of thousands of news articles extracted from several sources present on Twitter/X is classified with pre-trained, context-aware machine learning models. An ARIMAX model enriched with exogenous variables derived from this sentiment analysis produces rolling forecasts alongside a baseline ARIMA model, and the two sets of forecasts are compared to assess whether accuracy improves. The findings show that this approach generates statistically significant forecasting accuracy gains at the one-step-ahead and four-step-ahead horizons. These findings point to the possibility of including exogenous variables derived from news article sentiment classification in the models that policy-making agents currently use for GDP forecasting. Additionally, sources like the one explored in this research make it easy to fine-tune the specific type of news articles used as the basis for such a forecast, making this approach flexible enough to apply to adjacent scopes. |
|---|---|
| Main authors: | Matos, José Miguel Pereira de |
| Subject: | Sentiment analysis ARIMAX Time Series Forecasting Twitter SDG 8 - Decent work and economic growth |
| Year: | 2024 |
| Country: | Portugal |
| Document type: | master's dissertation |
| Access type: | open access |
| Associated institution: | Universidade Nova de Lisboa |
| Language: | English |
| Source: | Repositório Institucional da UNL |
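
The comparison described in the abstract (rolling forecasts from a baseline model and from the same model with a sentiment regressor) can be illustrated with a minimal sketch. This is not the dissertation's code. It swaps ARIMA/ARIMAX for the simplest analogue, an AR(1) fitted by least squares with and without one exogenous regressor. The data are synthetic, and every name in it (`sentiment`, `growth`, `rolling_rmse`, `ols`) is hypothetical. It also assumes the sentiment score for a period is already known when that period's value is forecast.

```python
import math
import random


def ols(rows, targets):
    """Least-squares coefficients from the normal equations (Gauss-Jordan)."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, targets))]
         for i in range(k)]
    for col in range(k):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * w for v, w in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


def rolling_rmse(y, x, first_origin, use_exog):
    """One-step-ahead rolling forecasts, refitting on an expanding window."""
    errors = []
    for t in range(first_origin, len(y)):
        rows, targets = [], []
        for s in range(1, t):  # training data: observations before time t
            row = [1.0, y[s - 1]]
            if use_exog:
                row.append(x[s])
            rows.append(row)
            targets.append(y[s])
        beta = ols(rows, targets)
        features = [1.0, y[t - 1]] + ([x[t]] if use_exog else [])
        forecast = sum(b * f for b, f in zip(beta, features))
        errors.append(y[t] - forecast)
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Synthetic data (an assumption for illustration, not real GDP or news data):
# the target depends on its own lag and on a contemporaneous sentiment score.
rng = random.Random(42)
n = 80
sentiment = [rng.gauss(0, 1) for _ in range(n)]
growth = [0.0]
for t in range(1, n):
    growth.append(0.5 * growth[-1] + 2.0 * sentiment[t] + rng.gauss(0, 0.2))

rmse_ar = rolling_rmse(growth, sentiment, first_origin=40, use_exog=False)
rmse_arx = rolling_rmse(growth, sentiment, first_origin=40, use_exog=True)
print(f"baseline AR(1) RMSE:       {rmse_ar:.3f}")
print(f"AR(1) with sentiment RMSE: {rmse_arx:.3f}")
```

Because the synthetic series is built to depend on the sentiment score, the model with the regressor has the lower error here by construction. With real data, the gap would need a formal test of equal predictive accuracy before being called significant, as the abstract reports for the one- and four-step horizons.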