Good Examples Make A Faster Learner: Simple Demonstration-based Learning for Low-resource NER

This repo provides the model, code, and data for our paper: Good Examples Make A Faster Learner: Simple Demonstration-based Learning for Low-resource NER (ACL 2022). [PDF]

Overview

The demonstration-based learning framework for NER integrates a prompt into the input itself to build better input representations for token classification. Concatenating a simple demonstration to the input can improve performance.
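The idea of concatenating a demonstration to the input can be sketched as below. This is a minimal illustration, not the repository's implementation: the helper name `build_demonstration_input`, the `[SEP]` separator, the `IGNORE` placeholder label, and the wording of the demonstration are all assumptions made for this example. It assumes that only the original input tokens carry labels for token classification.

```python
from typing import List, Tuple


def build_demonstration_input(
    tokens: List[str],
    labels: List[str],
    demonstration: List[str],
    sep_token: str = "[SEP]",       # assumed separator token
    ignore_label: str = "IGNORE",   # assumed placeholder for unlabeled positions
) -> Tuple[List[str], List[str]]:
    """Append a demonstration to the input tokens.

    Only the original tokens keep their labels; the separator and the
    demonstration tokens receive a placeholder label.
    """
    new_tokens = tokens + [sep_token] + demonstration
    new_labels = labels + [ignore_label] * (1 + len(demonstration))
    return new_tokens, new_labels


tokens = ["Obama", "visited", "Paris"]
labels = ["B-PER", "O", "B-LOC"]
# Hypothetical demonstration text; the real templates are defined by the repository.
demonstration = "Jobs was born in America . Jobs is PER . America is LOC .".split()

x, y = build_demonstration_input(tokens, labels, demonstration)
print(list(zip(x, y)))
```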

Setup

(Optional) Create and activate a conda or virtual environment.
Run pip install -r requirements.txt.
(Optional) Add CUDA support. We tested the repository with PyTorch 1.7.1 and CUDA 10.1.

# conda
conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=10.1 -c pytorch

# pip
pip install torch==1.7.1+cu101 torchvision==0.8.2+cu101 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html

Important: Locate your Python site-packages directory and replace bert_score/score.py with the score.py provided in this repository. Our version caches the model so that it is not reloaded on every call. For example:

cp score.py ~/.conda/envs/<ENV_NAME>/lib/python3.6/site-packages/bert_score/score.py
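If the site-packages path is not obvious, the installed location of bert_score can be looked up from Python before copying. The following is a minimal sketch using only the standard library; `find_package_file` is a hypothetical helper written for this example and is not part of the repository.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def find_package_file(package: str, filename: str) -> Optional[Path]:
    """Return the path of `filename` inside an installed package, or None."""
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None  # package is not installed, or it is not a package directory
    for location in spec.submodule_search_locations:
        candidate = Path(location) / filename
        if candidate.is_file():
            return candidate
    return None


target = find_package_file("bert_score", "score.py")
if target is None:
    print("bert_score is not installed in this environment")
else:
    # This is the file to overwrite with the repository's score.py.
    print(target)
```

The printed path can then be used as the destination of the cp command above.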

Valid Combination Table

| Prompt | Template | Description |
| --- | --- | --- |
| `max` | `no_context`, `context`, `lexical` | Entity-oriented demonstration - Popular |
| `random` | `no_context`, `context`, `lexical` | Entity-oriented demonstration - Random |
| `sbert` | `context_all`, `lexical_all` | Instance-oriented demonstration - SBERT |
| `bertscore` | `context_all`, `lexical_all` | Instance-oriented demonstration - BERTSCORE |
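Because only some prompt and template pairs are valid, it can help to check a pair before launching a run. The sketch below encodes the table above as a dictionary; `VALID_TEMPLATES` and `is_valid_combination` are hypothetical names introduced for this example, and the repository's scripts may not perform such a check themselves.

```python
# Valid templates per prompt, transcribed from the table above.
VALID_TEMPLATES = {
    "max": {"no_context", "context", "lexical"},
    "random": {"no_context", "context", "lexical"},
    "sbert": {"context_all", "lexical_all"},
    "bertscore": {"context_all", "lexical_all"},
}


def is_valid_combination(prompt: str, template: str) -> bool:
    """Return True if the template is listed for the given prompt."""
    return template in VALID_TEMPLATES.get(prompt, set())


print(is_valid_combination("max", "context"))        # entity-oriented pair
print(is_valid_combination("sbert", "context"))      # not listed for sbert
print(is_valid_combination("bertscore", "lexical_all"))
```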

Running

Possible values for:

<DATASET> : conll, ontonotes_conll, bc5cdr
<PROMPT> : from the table above
<TEMPLATE> : from the table above
<SUFFIX> : 25, 50
<TRAIN_SEED> : 42, 1337, 2021
<SAMPLE_SEED> : 42, 1337, 2021, 5555, 9999
<CHECK_POINT> : Saved checkpoint

Single run

Execute a single run.

In-domain setting

scripts/in_domain/in_domain_one.sh <DATASET> <SHOT> <PROMPT> <TEMPLATE> <TRAIN_SEED> <SAMPLE_SEED>

Domain Adaptation setting

scripts/domain_adaptation/domain_adaptation_one.sh <DATASET> <SHOT> <PROMPT> <TEMPLATE> <TRAIN_SEED> <SAMPLE_SEED> <CHECK_POINT>

Multiple runs

This setting executes all 15 runs, i.e., 5 different sub-samples × 3 training seeds.

In-domain setting
```
scripts/in_domain/in_domain_all.sh
```
- Remember to configure the parameters at the top of this script.

Domain Adaptation setting

scripts/domain_adaptation/domain_adaptation_all.sh
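The 15 runs correspond to every pairing of the 5 sample seeds with the 3 training seeds listed under "Possible values". As a rough sketch, the equivalent single-run commands for the in-domain setting could be enumerated as follows. This is an illustration only: `build_in_domain_commands` is a hypothetical helper, the `_all.sh` scripts may iterate differently, and `<SHOT>` is left as a placeholder because its values are not listed in this README.

```python
from itertools import product
from typing import List

TRAIN_SEEDS = [42, 1337, 2021]
SAMPLE_SEEDS = [42, 1337, 2021, 5555, 9999]


def build_in_domain_commands(
    dataset: str, shot: str, prompt: str, template: str
) -> List[List[str]]:
    """Build one in_domain_one.sh command per (sample seed, training seed) pair."""
    return [
        [
            "scripts/in_domain/in_domain_one.sh",
            dataset, shot, prompt, template,
            str(train_seed), str(sample_seed),
        ]
        for sample_seed, train_seed in product(SAMPLE_SEEDS, TRAIN_SEEDS)
    ]


# "<SHOT>" is kept as a placeholder; substitute a real value before running.
commands = build_in_domain_commands("conll", "<SHOT>", "max", "context")
for command in commands:
    print(" ".join(command))
print(len(commands), "runs")
```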

Running prompt Search

| Prompt | Template |
| --- | --- |
| `search` | `no_context`, `context`, `lexical` |

Search for the best entities (based on only one seed)

python3 search.py \
    --dataset <DATASET> \
    --data_dir dataset/<DATASET> \
    --model_folder models/<DATASET>/conll_max_context \
    --device cuda:0 \
    --percent_filename_suffix <SEEDED_SUFFIX> \
    --template <TEMPLATE>

Run with the best entities

python sampling_run.py \
    --train_file search_run.py \
    --dataset <DATASET> \
    --data_dir dataset/<DATASET> \
    --gpu 0 \
    --suffix <SUFFIX> \
    --template <TEMPLATE>

Citation

If you find our work helpful, please cite the following:

@InProceedings{lee2021fewner,
  author =  {Lee, Dong-Ho and Kadakia, Akshen and Tan, Kangmin and Agarwal, Mahak and Feng, Xinyu and Shibuya, Takashi and Mitani, Ryosuke and Sekiya, Toshiyuki and Pujara, Jay and Ren, Xiang},
  title =   {Good Examples Make A Faster Learner: Simple Demonstration-based Learning for Low-resource NER},
  year =    {2022},  
  booktitle = {Association for Computational Linguistics (ACL)},  
}
