This file was copied from a data.world repository: crowdflower/Hate Speech Identification/twitter-hate-speech-classifier-DFE-a845520.csv
Open Issue: What do the columns mean?
This file was copied from Davidson et al.'s labeled_data.csv. The same data is also in a data.world repository: thomasrdavidson/Hate Speech and Offensive Language/labeled_data.csv
The file contains 5 columns:
count
= number of CrowdFlower (CF) users who coded each tweet (the minimum is 3; CF sometimes had more users code a tweet when it determined the judgments to be unreliable).
hate_speech
= number of CF users who judged the tweet to be hate speech.
offensive_language
= number of CF users who judged the tweet to be offensive.
neither
= number of CF users who judged the tweet to be neither hate speech nor offensive.
class
= class label chosen by the majority of CF users:
0 - hate speech
1 - offensive language
2 - neither
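
The column descriptions above suggest two sanity checks: the three vote columns should add up to count, and class should be the index of the column with the most votes. The following is a minimal sketch of those checks, not part of the original data release. The sample rows are made up for illustration, the names SAMPLE_CSV, CLASS_NAMES, VOTE_COLUMNS and check_row are hypothetical, and the assumption that every coder's judgment lands in exactly one of the three vote columns is inferred from the descriptions rather than stated by the source.

```python
import csv
import io

# Hypothetical sample rows, made up for illustration (not taken from the real file).
SAMPLE_CSV = """count,hate_speech,offensive_language,neither,class
3,0,0,3,2
3,0,3,0,1
6,4,1,1,0
"""

# Class codes as listed in the column description.
CLASS_NAMES = {0: "hate speech", 1: "offensive language", 2: "neither"}

# Vote columns, ordered so that a column's position equals its class code.
VOTE_COLUMNS = ["hate_speech", "offensive_language", "neither"]


def check_row(row):
    """Return (votes_sum_to_count, class_matches_majority) for one row."""
    votes = [int(row[column]) for column in VOTE_COLUMNS]
    # Assumption: each coder's judgment is counted in exactly one vote column.
    sums_ok = sum(votes) == int(row["count"])
    # Position of the largest vote count; a tie resolves to the lowest
    # class code here, which is an assumption, not documented behavior.
    majority = votes.index(max(votes))
    return sums_ok, majority == int(row["class"])


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
results = [check_row(row) for row in rows]

for row, (sums_ok, class_ok) in zip(rows, results):
    print(CLASS_NAMES[int(row["class"])], sums_ok, class_ok)
```

To run the same checks on the real file, replace io.StringIO(SAMPLE_CSV) with an open file handle. Rows that fail either check are worth inspecting by hand before drawing conclusions, since how ties between classes were resolved is not described here.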