wthoutanymmries/rusentiment

 
 

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Sentiment Annotation Guidelines

This repository contains:

  • the RuSentiment dataset for sentiment analysis of Russian social media;

  • the guidelines for annotating sentiment in social media with which RuSentiment was produced. There are two versions: one with examples in Russian (from the VKontakte social network) and one with English examples from Twitter. The guidelines were prepared as part of the RuSentiment project by the Text Machine Lab.

Both RuSentiment and the guidelines are available for non-commercial use.

Project page: https://text-machine.cs.uml.edu/projects/rusentiment/

Published paper (COLING 2018): https://aclweb.org/anthology/C18-1064

Highlights of our annotation policy:

  • negative and positive sentiment classes cover both implicit and explicit sentiment, whether it expresses an emotion or an attitude;
  • neutral class (unmarked for sentiment);
  • speech act class: social media posts often include formulaic greetings, thank-you posts and congratulatory posts, which may or may not express the actual sentiment of the sender;
  • "skip" class for unclear cases, noisy posts, and content that was likely not created by the users themselves (poems, lyrics, jokes etc.);
  • cases of mixed sentiment are annotated for the dominant sentiment of the post, and the guidelines cover 6 frequent cases of mixed sentiment to improve inter-annotator agreement;
  • hashtags and smileys are not treated as automatic sentiment labels.
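
As a rough illustration, the five classes described above can be represented as a small enumeration. This is only a sketch: the name `SentimentLabel`, the helper `is_sentiment_bearing`, and the string values are assumptions made for this example and may not match the label strings used in the dataset files.

```python
from enum import Enum


class SentimentLabel(Enum):
    """Hypothetical encoding of the five annotation classes (values are assumed)."""
    POSITIVE = "positive"
    NEGATIVE = "negative"
    NEUTRAL = "neutral"        # unmarked for sentiment
    SPEECH_ACT = "speech"      # formulaic greetings, thanks, congratulations
    SKIP = "skip"              # unclear, noisy, or not user-authored content


def is_sentiment_bearing(label: SentimentLabel) -> bool:
    """Return True only for the two classes that carry a sentiment polarity."""
    return label in (SentimentLabel.POSITIVE, SentimentLabel.NEGATIVE)


if __name__ == "__main__":
    for label in SentimentLabel:
        print(label.value, is_sentiment_bearing(label))
```

Keeping speech acts and skipped posts as separate classes, rather than folding them into neutral, mirrors the policy above: such posts are neither claimed to be sentiment-free nor forced into a polarity.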

For Russian, these guidelines yielded an annotation speed of 250-350 posts per hour, with a Fleiss' kappa of 0.654 for randomly selected posts. See our paper for details on how active learning influenced the inter-annotator agreement.
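
For readers unfamiliar with the agreement measure mentioned above, the following is a minimal sketch of the standard Fleiss' kappa computation. It is not the project's evaluation code: the function name `fleiss_kappa` and the toy ratings are made up for illustration, and the sketch assumes every post was labelled by the same number of annotators.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of items, each a list of labels (one per annotator).

    Assumes every item has the same number of annotators (at least two).
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(item) != n_raters for item in ratings):
        raise ValueError("each item needs the same number (>= 2) of ratings")

    counts = [Counter(item) for item in ratings]
    categories = sorted({label for item in ratings for label in item})

    # Observed agreement: mean over items of the share of agreeing annotator pairs.
    p_items = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_bar = sum(p_items) / n_items

    # Chance agreement: sum of squared overall category proportions.
    p_cats = [sum(c[cat] for c in counts) / (n_items * n_raters) for cat in categories]
    p_e = sum(p ** 2 for p in p_cats)

    if p_e == 1.0:
        # Only one category was ever used; kappa is undefined in this case.
        return float("nan")
    return (p_bar - p_e) / (1 - p_e)


if __name__ == "__main__":
    # Hypothetical toy data: 4 posts, 3 annotators each.
    toy = [
        ["positive", "positive", "positive"],
        ["negative", "negative", "neutral"],
        ["neutral", "neutral", "neutral"],
        ["skip", "positive", "positive"],
    ]
    print(round(fleiss_kappa(toy), 3))
```

A value of 1 indicates perfect agreement, while values near 0 indicate agreement no better than chance.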
