seqtolang

seqtolang is a Python library for identifying the languages in multi-language documents.

See this post for implementation details.

Getting Started

Install from source:

$ git clone https://github.com/hiredscorelabs/seqtolang
$ cd seqtolang
$ python setup.py install

or from PyPI:

$ pip install seqtolang

Basic usage:

from seqtolang import Detector

detector = Detector()
text = "In Chinese, the French phrase 'Je rentre chez moi Je rentre chez moi' will be '我正在回家'"
languages = detector.detect(text)
print(languages)

>>> [('fr', 0.499), ('en', 0.437), ('zh', 0.062)]


tokens = detector.detect(text, aggregated=False)
print(tokens)

>>> ['eng', 'eng', 'eng', 'eng', 'eng', 'fra', 'fra', 'fra', 'fra', 'fra', 'fra', 'fra', 'fra', 'eng', 'eng', 'zho']
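
The aggregated scores in the example above are close to the share of tokens carrying each label: of the 16 per-token labels, 8 are French, 7 are English and 1 is Chinese. Whether the library computes its scores this way is not stated here, so the sketch below is only an illustration of that reading. `token_shares` is a hypothetical helper, not part of seqtolang, and it works on the plain list returned with `aggregated=False`.

```python
from collections import Counter


def token_shares(labels):
    """Return (label, share) pairs, largest share first.

    Hypothetical helper: it only counts per-token labels and is not
    part of the seqtolang API.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    # most_common() orders labels by descending count.
    return [(label, count / total) for label, count in counts.most_common()]


# Per-token labels copied from the example output above.
labels = ['eng'] * 5 + ['fra'] * 8 + ['eng'] * 2 + ['zho']
print(token_shares(labels))
# [('fra', 0.5), ('eng', 0.4375), ('zho', 0.0625)]
```

Note that the per-token output uses three-letter codes ('fra') while the aggregated output uses two-letter codes ('fr'), so the two results need a code mapping before they can be compared directly.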

seqtolang supports 36 languages:

['afr', 'eus', 'bel', 'ben', 'bul', 'cat', 'zho', 'ces', 'dan', 'nld', 'eng', 'est', 'fin', 'fra', 
'glg', 'deu', 'ell', 'hin', 'hun', 'isl', 'ind', 'gle', 'ita', 'jpn', 'kor', 'lat', 'lit', 'pol', 
'por', 'ron', 'rus', 'slk', 'spa', 'swe', 'ukr', 'vie']

Docker Example

To make the library easier to try out, a runnable Docker image is also provided. To build and run it:

$> docker build . -t seqtolang
$> docker run -e SEQTOLANG_TEXT="Good boy in chinese is 好孩子" seqtolang
['Good', 'boy', 'in', 'chinese', 'is', '好孩子']
['eng', 'eng', 'eng', 'eng', 'eng', 'zho']
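
The two lists printed by the container line up one-to-one, so they can be merged into contiguous same-language segments. The following is a minimal sketch under that assumption. `language_segments` is a hypothetical helper, not part of seqtolang, and joining tokens with a space is a simplification that suits whitespace-separated text.

```python
from itertools import groupby


def language_segments(tokens, labels):
    """Group consecutive tokens that share a label into (label, text) pairs.

    Hypothetical helper: it assumes one label per token, in the same order.
    """
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")
    segments = []
    # groupby() only merges adjacent items, which keeps the document order.
    for label, group in groupby(zip(tokens, labels), key=lambda pair: pair[1]):
        segments.append((label, " ".join(token for token, _ in group)))
    return segments


# Lists copied from the Docker example output above.
tokens = ['Good', 'boy', 'in', 'chinese', 'is', '好孩子']
labels = ['eng', 'eng', 'eng', 'eng', 'eng', 'zho']
print(language_segments(tokens, labels))
# [('eng', 'Good boy in chinese is'), ('zho', '好孩子')]
```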
