An Integrated Corpus Tool With Multilingual Support for the Study of Language, Literature, and Translation
Updated Jun 24, 2024 - Python
Bitextor generates translation memories from multilingual websites
Python scripts preprocessing Penn Treebank and Chinese Treebank
OpusFilter - Parallel corpus processing toolkit
Utilities for Processing the Switchboard Dialogue Act Corpus
A parser for annotated MuseScore 3 files.
Utilities for Processing the Meeting Recorder Dialogue Act Corpus
Simple collocation-driven rhyme recognition. Contains pre-trained models for Czech, Dutch, English, French, German, Russian, and Spanish poetry
Plotly-Dash NLP project. Measures document similarity using Latent Dirichlet Allocation and principal component analysis, followed by KMeans clustering, with dynamic interactive visualization.
Measure the similarity of text corpora for 74 languages
A processor for KyotoCorpus, KWDLC, and AnnotatedFKCCorpus
Scripts for building a geo-located web corpus using Common Crawl data
Utilities for Processing the HCRC Map Task Corpus
uniblock: scoring and filtering corpora with Unicode block information (and more).
N-gram language model that learns n-gram probabilities from a given corpus and generates new sentences, choosing each next word according to its conditional probability given the previously generated words.
Corpus processing library
This package provides Python utility classes and static methods that wrap third-party software commonly used in text processing, such as Unitex-GramLab, TreeTagger, Apache Tika, and Google Tesseract.
Sense Tagged Instances For Finnish
Utilities for Processing the BT Oasis Corpus
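
As an illustration of the n-gram language model idea mentioned in the list above, here is a minimal bigram sketch. It is not taken from any of the listed repositories; the function names (`train_bigram_model`, `next_word_probabilities`, `generate`), the `<s>`/`</s>` boundary markers, and the toy corpus are all assumptions made for this example, and it uses unsmoothed maximum-likelihood estimates only.

```python
import random
from collections import Counter, defaultdict


def train_bigram_model(sentences):
    """Count, for each word, how often each following word occurs."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        # Hypothetical boundary markers so the model learns starts and ends.
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def next_word_probabilities(counts, prev):
    """Maximum-likelihood estimate of P(next | prev), without smoothing."""
    total = sum(counts[prev].values())
    return {word: n / total for word, n in counts[prev].items()}


def generate(counts, rng, max_len=20):
    """Sample words one at a time, conditioning on the previous word."""
    word, out = "<s>", []
    for _ in range(max_len):
        probs = next_word_probabilities(counts, word)
        word = rng.choices(list(probs), weights=list(probs.values()))[0]
        if word == "</s>":
            break
        out.append(word)
    return " ".join(out)


# Toy corpus, invented for this sketch.
corpus = ["the cat sat", "the cat ran", "the dog sat"]
model = train_bigram_model(corpus)
probs = next_word_probabilities(model, "the")
sentence = generate(model, random.Random(0))
```

Here `probs` holds the estimated distribution over words following "the", and `sentence` is one sampled sentence. A practical model would typically use longer contexts and some form of smoothing for unseen n-grams.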