Authorship Authentication task done on the Reuters 50-50 Data Set. Multiple models are implemented.

Pzeyang/Authorship-Auth

 
 

Authorship-Auth

Authorship authentication on the Reuters 50-50 data set, with multiple models implemented. This repository contains the different models used to tackle the authorship attribution task. The CNN, BiLSTM, CNN_LSTM, LSTM+Attn, and BERT files each contain the model they are named after. Due to GitHub's limitations, the pretrained word vectors are not included in this repository; download them from https://nlp.stanford.edu/projects/glove/. The word vectors used in these examples are glove.6B.100d. The Ngrams file contains three models that classify the texts based on character n-grams: SVM, MultinomialNB, and LDA. The HASKER file contains an implementation of an SVM trained on string kernels.
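As a rough illustration of how the downloaded vectors might be wired into one of the neural models, the sketch below parses a GloVe text file and builds an embedding matrix. It is not taken from this repository: the function names `load_glove` and `build_embedding_matrix`, the file name `glove.6B.100d.txt`, and the `word_index` mapping (word to 1-based integer index, with row 0 reserved for padding) are assumptions made for the example.

```python
import random


def load_glove(path, dim=100):
    """Parse a GloVe text file into a dict mapping word -> list of floats.

    Each line is expected to hold a token followed by `dim`
    space-separated numbers.
    """
    vectors = {}
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            # Split from the right so the last `dim` fields are the numbers.
            parts = line.rstrip("\n").rsplit(" ", dim)
            if len(parts) != dim + 1:
                continue  # skip malformed or truncated lines
            vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def build_embedding_matrix(word_index, vectors, dim=100, seed=0):
    """Build a (len(word_index) + 1) x dim matrix as a list of rows.

    Assumption: `word_index` maps each word to a unique index in
    1..len(word_index). Row 0 stays all zeros for padding. Words missing
    from GloVe get a small random vector.
    """
    rng = random.Random(seed)
    matrix = [[0.0] * dim for _ in range(len(word_index) + 1)]
    for word, idx in word_index.items():
        vec = vectors.get(word)
        if vec is None:
            vec = [rng.uniform(-0.05, 0.05) for _ in range(dim)]
        matrix[idx] = vec
    return matrix


if __name__ == "__main__":
    # Hypothetical usage; adjust the path to wherever the file was unpacked.
    glove = load_glove("glove.6B.100d.txt", dim=100)
    toy_index = {"the": 1, "reuters": 2}
    emb = build_embedding_matrix(toy_index, glove, dim=100)
    print(len(emb), len(emb[0]))
```

The resulting matrix can be converted to an array and used to initialise an embedding layer in whichever framework the model files use.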

Languages

  • Python 100.0%