Hackathon for an NLP task involving sexism classification

alexsasu/NitroNLP-Hackathon-2023

NitroNLP-Hackathon-2023

The hackathon (NitroNLP, held in 2023) posed a multi-class classification task: assigning Romanian texts from multiple sources (social media, web articles, books) to one of five classes: sexist direct, sexist descriptive, sexist reporting, non-sexist offensive, and non-sexist non-offensive. The metric of interest was weighted accuracy, because the dataset was imbalanced.
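To illustrate why plain accuracy is misleading on an imbalanced dataset, here is a minimal sketch of one common reading of "weighted accuracy": balanced accuracy, the mean of per-class recall. The competition's exact metric definition is not stated here, so treat the formula as an assumption. The function name `balanced_accuracy` and the toy labels are hypothetical.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally.

    Assumption: this is one common interpretation of "weighted accuracy";
    the competition's exact definition may differ.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Toy labels, made up for illustration (not taken from the competition data).
y_true = ["non-sexist non-offensive"] * 8 + ["sexist direct"] * 2
# A degenerate model that always predicts the majority class.
y_pred = ["non-sexist non-offensive"] * 10

plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
balanced = balanced_accuracy(y_true, y_pred)
print(f"plain accuracy: {plain:.2f}, balanced accuracy: {balanced:.2f}")
```

In this toy case the majority-class predictor looks good on plain accuracy but is penalized once every class carries the same weight.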

We first tried classical machine learning methods (Decision Tree, KNN, MLP) on a bag-of-words (BoW) representation. Because these did not yield a satisfactory weighted accuracy score, we moved on to RoBERT, a version of BERT pre-trained on a Romanian corpus, which we fine-tuned on our dataset using balanced class weights.
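The two ingredients mentioned above, a BoW representation and balanced class weights, can be sketched in plain Python as follows. This is an illustration under assumptions, not the code used in the competition: the helper names `bag_of_words` and `balanced_class_weights`, the whitespace tokenization, and the toy data are all hypothetical. The weight formula `n_samples / (n_classes * class_count)` is the widely used "balanced" heuristic; whether the fine-tuning used exactly this formula is not stated here.

```python
from collections import Counter


def bag_of_words(text, vocabulary):
    """Count how often each vocabulary word occurs in the text.

    Assumption: lowercase + whitespace tokenization, chosen for brevity.
    """
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency.

    Uses n_samples / (n_classes * class_count), so rare classes
    contribute more to the loss than frequent ones.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


# Toy data, made up for illustration only.
vocabulary = ["ana", "are", "mere", "pere"]
vector = bag_of_words("Ana are mere și Ana are pere", vocabulary)

labels = (
    ["non-sexist non-offensive"] * 6
    + ["non-sexist offensive"] * 3
    + ["sexist reporting"] * 1
)
weights = balanced_class_weights(labels)

print(vector)
for name, w in weights.items():
    print(f"{name}: {w:.3f}")
```

Weights of this form are typically passed to a class-weighted loss (for example, a weighted cross-entropy) so that errors on rare classes cost more during training.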

Weighted accuracy scores of our models:

[image: weighted accuracy scores of our models]

Our full documentation for this competition can be consulted in the "Paper.pdf" file.

Contributors:
