FinnSentiment -- A Finnish Social Media Corpus for Sentiment Polarity Annotation
Abstract
A Finnish social media sentiment analysis dataset with 27,000 sentences annotated for sentiment polarity is introduced to address a data shortage, along with inter-annotator agreement analysis and validation baselines.
Sentiment analysis and opinion mining are important tasks with obvious application areas in social media, e.g. when identifying hate speech and fake news. In our survey of previous work, we note that there is no large-scale social media data set with sentiment polarity annotations for Finnish. This publication aims to remedy this shortcoming by introducing a data set of 27,000 sentences, each annotated independently with sentiment polarity by three native annotators. The same three annotators worked on the whole data set, which provides a unique opportunity for further studies of annotator behaviour over time. We analyse their inter-annotator agreement and provide two baselines to validate the usefulness of the data set.
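As an illustration of what an inter-annotator agreement analysis for three annotators can look like, the following minimal sketch computes Fleiss' kappa, a common chance-corrected agreement statistic for a fixed number of raters. The abstract does not say which agreement measure the authors use, so the choice of Fleiss' kappa is an assumption, and the function name `fleiss_kappa`, the three-way label set, and the toy ratings are all hypothetical and not taken from the data set.

```python
from collections import Counter

# Hypothetical label set; the actual annotation scheme may differ.
LABELS = ("neg", "neu", "pos")


def fleiss_kappa(ratings, categories=LABELS):
    """Fleiss' kappa for items that each received the same number of ratings.

    ratings: list of per-item label lists, e.g. [["pos", "pos", "neu"], ...]
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if any(len(row) != n_raters for row in ratings):
        raise ValueError("every item needs the same number of ratings")

    counts = [Counter(row) for row in ratings]

    # Observed agreement: share of agreeing rater pairs, averaged over items.
    per_item = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_item) / n_items

    # Expected agreement from the overall label distribution.
    total = n_items * n_raters
    p_expected = sum(
        (sum(c[cat] for c in counts) / total) ** 2 for cat in categories
    )

    if p_expected == 1.0:
        # Only one label was ever used; kappa is undefined in that case.
        raise ValueError("kappa is undefined when a single label is used")
    return (p_observed - p_expected) / (1 - p_expected)


# Toy ratings (made up): four sentences, three annotators each.
toy_ratings = [
    ["pos", "pos", "pos"],
    ["neg", "neg", "neu"],
    ["neu", "neu", "neu"],
    ["pos", "neg", "neu"],
]
toy_kappa = fleiss_kappa(toy_ratings)
print(f"Fleiss' kappa on toy data: {toy_kappa:.3f}")
```

A value of 1 indicates perfect agreement, and values near 0 indicate agreement no better than chance. Because the same three annotators labelled every sentence, pairwise statistics such as Cohen's kappa for each annotator pair would be an equally reasonable alternative.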