Popular repositories

  1. extractor Public

    C++ · 23 stars · 4 forks

  2. corset Public

    Corset is a web-based data selection portal that helps you get relevant data from massive amounts of parallel data.

    SCSS · 17 stars · 4 forks

  3. keops Public

    Tool for manual evaluation of parallel sentences.

    PHP · 12 stars · 4 forks

  4. DataCollection Public

    Forked from modernmt/DataCollection

    Data collection, alignment and TAUS repository

    Python · 8 stars · 3 forks

  5. cirrus-scripts Public

    Scripts for running bitextor/paracrawl/europat jobs on cirrus.ac.uk

    Shell · 7 stars

  6. human-evaluations Public

    Results of the human evaluation

    Rich Text Format · 5 stars · 3 forks

Repositories

Showing 10 of 20 repositories
  • giawarc Public

    Processing utilities for Internet Archive

    C++ · 1 star · 0 forks · 4 open issues · 1 open pull request · Updated Apr 19, 2024
  • corset Public

    Corset is a web-based data selection portal that helps you get relevant data from massive amounts of parallel data.

    SCSS · 17 stars · GPL-3.0 · 4 forks · 1 open issue · 0 open pull requests · Updated Nov 6, 2023
  • keops Public

    Tool for manual evaluation of parallel sentences.

    PHP · 12 stars · GPL-3.0 · 4 forks · 0 open issues · 0 open pull requests · Updated Oct 19, 2023
  • cirrus-scripts Public

    Scripts for running bitextor/paracrawl/europat jobs on cirrus.ac.uk

    Shell · 7 stars · 0 forks · 8 open issues · 1 open pull request · Updated Jul 18, 2023
  • giashard Public

    Sharding program for Paracrawl

    Go · 1 star · 0 forks · 1 open issue · 0 open pull requests · Updated May 10, 2023
  • europat-scripts Public

    Scripts for obtaining patent data

    Java · 4 stars · 2 forks · 1 open issue · 1 open pull request · Updated Apr 14, 2023
  • tmxutil Public

    Tools to generate & filter Europat tmx files.

    Python · 3 stars · MIT · 1 fork · 1 open issue · 0 open pull requests · Updated Jan 17, 2023
  • synthesis Public

    Data synthesis by contextualizing glossary translations

    Python · 5 stars · 3 forks · 0 open issues · 0 open pull requests · Updated Jul 1, 2021
  • opus-train Public

    Automate download and training with OPUS corpora

    Shell · 2 stars · MIT · 2 forks · 0 open issues · 0 open pull requests · Updated Jan 28, 2021
  • human-evaluations Public

    Results of the human evaluation

    Rich Text Format · 5 stars · 3 forks · 0 open issues · 0 open pull requests · Updated Dec 9, 2020
