An Integrated Corpus Tool With Multilingual Support for the Study of Language, Literature, and Translation
Updated Jul 19, 2024 - Python
Bitextor generates translation memories from multilingual websites
Python scripts preprocessing Penn Treebank and Chinese Treebank
OpusFilter - Parallel corpus processing toolkit
A Serverless Text Annotation Tool for Corpus Development
Utilities for Processing the Switchboard Dialogue Act Corpus
Corpus processing library
Hard-Forked from JuliaText/TextAnalysis.jl
Utilities for Processing the Meeting Recorder Dialogue Act Corpus
Reading the data from OPIEC - an Open Information Extraction corpus
ALvisNLP corpus processing engine
Plotly-Dash NLP project. Measures document similarity using Latent Dirichlet Allocation and principal component analysis, followed by KMeans clustering. The project includes dynamic visual interaction.
Corpus processing library
N-Gram language model that learns n-gram probabilities from a given corpus and generates new sentences from it based on the conditional probabilities from the generated words and phrases.
A simple collocation-driven recognition of rhymes. Contains pre-trained models for Czech, Dutch, English, French, German, Russian, and Spanish poetry
A parser for annotated MuseScore 3 files.
A processor for KyotoCorpus, KWDLC, and AnnotatedFKCCorpus
uniblock, scoring and filtering corpus with Unicode block information (and more).
Measure the similarity of text corpora for 74 languages
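For readers unfamiliar with the n-gram language model entry above, the idea of learning conditional probabilities from a corpus and sampling new sentences from them can be sketched for the bigram case as follows. This is a minimal illustration, not code from any listed repository; the function names, the `<s>`/`</s>` boundary markers, and the toy corpus are all assumptions made for the example.

```python
import random
from collections import Counter, defaultdict

START, END = "<s>", "</s>"  # hypothetical sentence-boundary markers


def train_bigram_model(sentences):
    """Count bigram transitions: {previous_word: Counter(next_word)}."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = [START] + sentence.split() + [END]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def conditional_probability(counts, prev, nxt):
    """Maximum-likelihood estimate of P(nxt | prev); 0.0 if prev is unseen."""
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    return followers[nxt] / total if total else 0.0


def generate_sentence(counts, rng, max_len=20):
    """Sample words one at a time, each conditioned on the previous word."""
    word, output = START, []
    for _ in range(max_len):
        followers = counts.get(word)
        if not followers:
            break
        word = rng.choices(list(followers), weights=list(followers.values()))[0]
        if word == END:
            break
        output.append(word)
    return " ".join(output)


# Toy corpus, purely illustrative.
corpus = ["the cat sat", "the dog sat", "the cat ran"]
model = train_bigram_model(corpus)
sentence = generate_sentence(model, random.Random(0))
```

Higher-order n-grams follow the same pattern with a tuple of preceding words as the key; real implementations usually add smoothing so that unseen word pairs do not get zero probability.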