paper link
ningkko committed Jan 20, 2022
1 parent a37da13 commit 37c0059
Showing 1 changed file with 3 additions and 1 deletion.
4 changes: 3 additions & 1 deletion README.md
@@ -1,7 +1,9 @@
# TweebankNLP
[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)

-This repo contains the new `Tweebank-NER` [dataset](./Tweebank-NER-v1.0) and `Twitter-Stanza` pipeline for state-of-the-art Tweet NLP. `Tweebank-NER V1.0` is the annotated NER dataset based on Tweebank V2, the main UD treebank for English Twitter NLP tasks. The `Twitter-Stanza` pipeline provides pre-trained Tweet NLP models (NER, tokenization, lemmatization, POS tagging, dependency parsing) with state-of-the-art or competitive performance. The models are fully compatible with Stanza and provide both Python and command-line interfaces for users.
+This repo contains the new `Tweebank-NER` [dataset](./Tweebank-NER-v1.0) and `Twitter-Stanza` pipeline for state-of-the-art Tweet NLP, as described in **[Annotating the Tweebank Corpus on Named Entity Recognition and Building NLP Models for Social Media Analysis](https://arxiv.org/abs/2201.07281)**.
+
+`Tweebank-NER V1.0` is the annotated NER dataset based on Tweebank V2, the main UD treebank for English Twitter NLP tasks. The `Twitter-Stanza` pipeline provides pre-trained Tweet NLP models (NER, tokenization, lemmatization, POS tagging, dependency parsing) with state-of-the-art or competitive performance. The models are fully compatible with Stanza and provide both Python and command-line interfaces for users.


## Installation