ginesiametlle/semtagger

semtagger

About this repository

This repository provides a universal semantic tagger that can be trained on the Parallel Meaning Bank.

It requires a recent version of Python 3 and the packages listed in requirements.txt.

Training a neural model

$ ./run.sh --train [--model MODEL_FILE]

Using a trained model to predict sem-tags

$ ./run.sh --predict --input INPUT_CONLL_FILE --output OUTPUT_CONLL_FILE [--model MODEL_FILE]

Jointly training and predicting

$ ./run.sh --train --predict --input INPUT_CONLL_FILE --output OUTPUT_CONLL_FILE [--model MODEL_FILE]

Configuration

Edit config.sh for fine-grained control over the features used and the model architecture.

When the --model option is not provided, trained models are stored in and loaded from the directory defined in config.sh.

Comments

If you supply additional data, it is advisable to tokenize it first, for example with Elephant.

Furthermore, if you have the means to identify multiword expressions, you can represent them as a single token using white spaces, tildes or hyphens (as in ice cream, ice~cream or ice-cream).
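As an illustration of the multiword-expression convention above, the following is a minimal sketch of how known expressions could be joined with tildes before tagging. It is not part of this repository: the helper name `join_mwes`, the expression list file, and the one-sentence-per-line input are all assumptions made for the example. It also assumes the expressions contain only letters and spaces, and it does plain substring matching, so it would also rewrite `nice creamy`.

```shell
# Hypothetical helper (not provided by semtagger): join each listed
# multiword expression into a single token using tildes.
# Usage: join_mwes MWE_LIST_FILE < tokenized.txt
# MWE_LIST_FILE holds one expression per line, e.g. "ice cream".
join_mwes() {
    mwe_file=$1
    script=$(mktemp)
    # Build one sed substitution per expression, e.g. s/ice cream/ice~cream/g
    while IFS= read -r mwe; do
        [ -n "$mwe" ] || continue                  # skip empty lines
        joined=$(printf '%s' "$mwe" | tr ' ' '~')  # "ice cream" becomes "ice~cream"
        printf 's/%s/%s/g\n' "$mwe" "$joined" >> "$script"
    done < "$mwe_file"
    # Apply all substitutions to standard input, then clean up.
    sed -f "$script"
    rm -f "$script"
}
```

The function reads tokenized text on standard input and writes the rewritten text to standard output, so it can sit between the tokenizer and whatever step produces the input file for `./run.sh --predict`.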

References

  1. L. Abzianidze and J. Bos. Towards Universal Semantic Tagging. In Proceedings of the 12th International Conference on Computational Semantics (IWCS) - Short papers. Association for Computational Linguistics, 2017.

  2. J. Bjerva, B. Plank and J. Bos. Semantic Tagging with Deep Residual Networks. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pages 3531–3541. Association for Computational Linguistics, 2016.
