tsakorpus2eaf

This is a tool for transforming simple ELAN transcriptions and morphologically annotated representations of the same files in tsakorpus JSON into morphologically annotated ELAN files.

Requirements

You need Python 3.x to run the conversion. See requirements.txt for the list of required modules.

How to use

Create conf, eaf and json folders inside repository root if they are not present. Put unannotated ELAN files in the eaf folder. Put their annotated Tsakorpus versions in the json folder. Put corpus.json from the conf folder of your tsakorpus instance to conf. (The program will need gr_fields_order value from the lang_props dictionary to be able to order grammatical tags correctly.) Change the values passed to the constructor, if needed. Run tsakorpus2kenlm.py. Annotated versions will appear in eaf_analyzed.

Configuration

There are several configuration options.

The most important is the regex that described which tiers contain the transcription to be analyzed. It is passed as a string to the constructor of EafProcessor class. Only those time-aligned tiers whose name or type match this regex will be processed.
If you want the grammatical tags to be sorted nicely, you will have to provide a corpus.json file from your tsakorpus instance where the tag order is described (see tsakorpus documentation). You will also have to pass the name of the language in the lang argument for the constructor.
The constructor has optional arguments that determine how the tier types for tokens, lemmata etc. are going to be called. If you already have those tier types in your ELAN files, put their names there. Otherwise, new tier types and tiers will be created.

License

The software is distributed under MIT license (see LICENSE.md).

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
.gitignore		.gitignore
LICENSE.md		LICENSE.md
README.md		README.md
eaf_add_morphology.py		eaf_add_morphology.py
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

tsakorpus2eaf

Requirements

How to use

Configuration

License

About

Releases

Packages

Languages

License

timarkh/tsakorpus2eaf

Folders and files

Latest commit

History

Repository files navigation

tsakorpus2eaf

Requirements

How to use

Configuration

License

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages