Data set for LREC 2020 paper "I Feel Offended, Don't Be Abusive!"

tommasoc80/AbuseEval

AbuseEval

The repository is structured as follows:

  • data/ : contains the OffensEval/OLID dataset enriched with explicit/implicit labels for offensive messages (./data/offenseval_explicit_implicit) and the newly proposed annotations for abusive messages (./data/abuseval_labels)
  • dictionary-based_experiments/ : contains the script to replicate the dictionary experiments reported in the paper (OffensEval sub-task A and AbuseEval binary classification)
  • keywords/ : contains the top 50 keywords per class (offensive and not offensive messages) from the OffensEval training and test data for sub-task A
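
To give a rough idea of what a dictionary-based classifier for sub-task A looks like, here is a minimal sketch. It is not the script shipped in dictionary-based_experiments/: the lexicon below is a hypothetical three-word placeholder, the tokenizer is a simple assumption, and only the OFF/NOT label names come from OLID sub-task A.

```python
import re

# Hypothetical mini-lexicon for illustration only; the lexicon used in the
# paper's experiments is not reproduced here.
LEXICON = {"idiot", "stupid", "moron"}

# Assumed tokenization: runs of lowercase letters and apostrophes.
TOKEN_RE = re.compile(r"[a-z']+")


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return TOKEN_RE.findall(text.lower())


def dictionary_predict(text, lexicon=LEXICON):
    """Return 'OFF' if any token appears in the lexicon, otherwise 'NOT'."""
    return "OFF" if any(tok in lexicon for tok in tokenize(text)) else "NOT"


if __name__ == "__main__":
    for msg in ["You are an idiot", "Have a nice day"]:
        print(msg, "->", dictionary_predict(msg))
```

A lookup of this kind can only catch messages that contain a listed term, so it is best read as a baseline for explicit messages rather than a full classifier.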

OLID/OffensEval Data: https://competitions.codalab.org/competitions/20011
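
If the label files need to be combined with the original OLID tweets, a join on the tweet identifier is one way to do it. The sketch below assumes tab-separated files that share an `id` column; the `abuse` column name and the label values in the demo are hypothetical placeholders, so check the actual headers in ./data/ before using it.

```python
import csv
import io


def read_tsv(handle):
    """Read a tab-separated file with a header row into a list of dicts."""
    # QUOTE_NONE keeps quotation marks inside tweets as literal characters.
    return list(csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE))


def attach_labels(tweets, labels, key="id"):
    """Join label rows onto tweet rows by a shared key (assumed to be 'id').

    Tweets without a matching label row are dropped.
    """
    by_id = {row[key]: row for row in labels}
    merged = []
    for tweet in tweets:
        extra = by_id.get(tweet[key])
        if extra is not None:
            merged.append({**tweet, **extra})
    return merged


if __name__ == "__main__":
    # In-memory stand-ins for the real files; contents are made up.
    tweets = read_tsv(io.StringIO("id\ttweet\n1\tfirst message\n2\tsecond message\n"))
    labels = read_tsv(io.StringIO("id\tabuse\n1\tLABEL_A\n2\tLABEL_B\n"))
    for row in attach_labels(tweets, labels):
        print(row)
```

With real files, replace the `io.StringIO` objects with `open(path, encoding="utf-8", newline="")` handles.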

Data Statement (Bender and Friedman, 2018)

The explicit/implicit labels in OffensEval were annotated by two annotators, a man (38, Italian) and a woman (39, Serbian), both highly educated, with a background in computational linguistics, and familiar with Twitter.

The inter-annotator agreement study for AbuseEval was carried out by three annotators: one man (38, Italian) and two women (39, Serbian; 23, Russian), all highly educated, with a background in computational linguistics, and familiar with Twitter. The full annotation of AbuseEval was carried out by one annotator (23, Russian), highly educated and with a background in computational linguistics.

All ages refer to the time of annotation: 2019.

References

@inproceedings{zampierietal2019, 
    title={{Predicting the Type and Target of Offensive Posts in Social Media}}, 
    author={Zampieri, Marcos and Malmasi, Shervin and Nakov, Preslav and Rosenthal, Sara and Farra, Noura and Kumar, Ritesh}, 
    booktitle={Proceedings of NAACL}, 
    year={2019}
} 

@inproceedings{casellietal2020, 
    title={{I Feel Offended, Don’t Be Abusive! Implicit/Explicit Messages in Offensive and Abusive Language}}, 
    author={Caselli, Tommaso and Basile, Valerio and Mitrovi\'{c}, Jelena and Kartoziya, Inga and Granitzer, Michael}, 
    booktitle={Proceedings of LREC}, 
    year={2020}
} 

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
