Document details

Sentiment analysis on Twitter for the Portuguese language

Author(s): Duarte, Eduardo Santos

Date: 2013

Persistent ID: http://hdl.handle.net/10362/11338

Origin: Repositório Institucional da UNL

Subject(s): Sentiment analysis; Named entity recognition; Opinion mining; Semantic analysis; Social knowledge


Description

With the growth and popularity of the internet, and of social networks in particular, users can more easily share their thoughts, insights and experiences with others. Messages shared via social networks provide useful information for several applications, such as monitoring the sentiment towards specific targets or comparing public sentiment across several targets, without resorting to the traditional market research method of using surveys to ask for the public's opinion explicitly. Extracting information from the large volume of shared messages is best done by an automated program. Sentiment analysis is an automated process for determining the sentiment expressed in natural-language text. Sentiment is a broad term, but here we focus on the opinions and emotions expressed in text. Of the existing social network websites, Twitter is currently considered the best suited to this kind of analysis. Twitter allows users to share their opinions on many topics and entities by means of short messages. These messages may be malformed and contain spelling errors, so some treatment of the text, such as spell checking, may be necessary before the analysis. To know what a message is about, it is necessary to find the entities mentioned in the text (people, locations, organizations, products, etc.) and then analyse the rest of the text to determine what is said about each specific entity. By analysing many messages, we can form a general idea of what the public thinks about many different entities. Our goal is to extract as much information as possible about different entities from tweets in the Portuguese language. We present different techniques that may be used, as well as examples and results from state-of-the-art related work.
Using a semantic approach, we were able to find and extract named entities from these messages and assign a sentiment value to each entity found, producing a complete tool that is competitive with existing solutions. Sentiment classification and its assignment to entities are based on the grammatical construction of the message. The results can be viewed by the user in real time or stored for later viewing. This analysis provides ways to view and compare public sentiment regarding these entities, showing the favourite brands, companies and people, as well as how sentiment evolves over time.
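
As a rough illustration of entity-level sentiment assignment, the following minimal sketch attaches each polarity word in a message to the nearest known entity. It is not the method used in the thesis: the entity list, the polarity lexicon and the function names are hypothetical, and the nearest-entity heuristic is only a simple stand-in for an analysis based on the grammatical construction of the message.

```python
import re

# Hypothetical toy resources for illustration only; a real system would rely on
# a full sentiment lexicon and a named entity recognizer.
ENTITIES = {"benfica", "porto", "vodafone"}
POLARITY = {"bom": 1, "ótimo": 1, "adoro": 1, "mau": -1, "péssimo": -1, "odeio": -1}


def tokenize(text):
    # Lowercase the text and keep runs of word characters
    # (Unicode-aware, so accented Portuguese letters are preserved).
    return re.findall(r"\w+", text.lower())


def entity_sentiment(message):
    """Return a score per entity, adding each polarity word to its nearest entity."""
    tokens = tokenize(message)
    entity_positions = [i for i, tok in enumerate(tokens) if tok in ENTITIES]
    scores = {tokens[i]: 0 for i in entity_positions}
    if not entity_positions:
        return scores
    for i, tok in enumerate(tokens):
        if tok in POLARITY:
            # Proximity heuristic (an assumption): pick the closest entity mention.
            nearest = min(entity_positions, key=lambda p: abs(p - i))
            scores[tokens[nearest]] += POLARITY[tok]
    return scores


if __name__ == "__main__":
    print(entity_sentiment("Adoro o Benfica, mas o serviço da Vodafone é péssimo"))
```

Summing such per-message scores over many tweets would give the kind of per-entity overview the abstract describes, although a proximity heuristic will misattribute sentiment in sentences with more complex structure.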

Dissertation for obtaining the Master's degree in Informatics Engineering (Engenharia Informática)

Document Type: Master thesis
Language: English
Advisor(s): Damásio, Carlos; Gouveia, João
Contributor(s) Duarte, Eduardo Santos
