Authorship Authentication task done on the Reuters 50-50 Data Set. Multiple models are implemented.

This repository contains the different models used to tackle the authorship attribution task. The CNN, BiLSTM, CNN_LSTM, LSTM+Attn and BERT files each contain the model they are named after.

The pretrained word vectors are not included because of GitHub's file size limits. Download them from https://nlp.stanford.edu/projects/glove/ before running the models. The word vectors used in these examples are glove.6B.100d.

The Ngrams file contains 3 models that classify the texts based on character n-grams: SVM, MultinomialNB and LDA. The HASKER file contains an implementation of an SVM trained on string kernels.
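As a rough illustration of using the downloaded vectors, the sketch below reads a GloVe text file into a dictionary. It assumes the plain-text layout of the GloVe downloads, where each line holds a token followed by its space-separated vector components. The function name `load_glove` and its `dim` parameter are hypothetical and are not taken from this repository's code; the model files may load the vectors differently.

```python
from typing import Dict, List


def load_glove(path: str, dim: int = 100) -> Dict[str, List[float]]:
    """Read a GloVe text file into a {token: vector} dictionary.

    Hypothetical helper, assuming one "token v1 v2 ... v_dim" entry per line.
    """
    vectors: Dict[str, List[float]] = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            parts = line.rstrip("\n").split(" ")
            # Skip lines that do not hold exactly one token plus `dim` values.
            if len(parts) != dim + 1:
                continue
            vectors[parts[0]] = [float(value) for value in parts[1:]]
    return vectors
```

A call such as `load_glove("glove.6B.100d.txt")` would then return the 100-dimensional vectors keyed by token, assuming the file has been extracted into the working directory. Tokens missing from the dictionary still need a fallback (for example a zero vector) when building an embedding matrix.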
forked from Emposes/Authorship-Auth
Pzeyang/Authorship-Auth
Languages
- Python 100.0%