Experiment data

@j0ma released this 25 Apr 19:36 · 1 commit to main since this release

Data for canonical name translation experiments

  • Full dump of ParaNames used to create parallel data for experiments
    • TSV formatted (full_paranames_dump.tsv.tar.gz)
    • DuckDB database (full_paranames_dump_duckdb.tar.gz, experimental)
  • Parallel data used in experiments (parallel_data_for_experiments.tar.gz)
  • Metadata (metadata.tar.gz)
    • Wikidata ID => train/dev/test split mapping
    • Sizes of each language in each split
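
As a rough illustration of how the TSV dump might be inspected without unpacking it to disk, here is a minimal Python sketch. It assumes the archive contains a single `.tsv` member whose first line is a header row; neither the member name nor the column names are documented in this release, so the helper `iter_tsv_rows` is hypothetical and simply reports whatever header it finds.

```python
import csv
import io
import tarfile
from typing import Dict, Iterator


def iter_tsv_rows(archive_path: str) -> Iterator[Dict[str, str]]:
    """Yield rows as dicts from the first .tsv member of a .tar.gz archive.

    Assumption: the first line of the TSV is a header row. The actual
    layout of the ParaNames dump is not described in the release notes.
    """
    with tarfile.open(archive_path, mode="r:gz") as archive:
        for member in archive:
            if member.isfile() and member.name.endswith(".tsv"):
                raw = archive.extractfile(member)
                # Decode the byte stream lazily so the full dump is never
                # loaded into memory at once.
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                # QUOTE_NONE: treat quote characters in names as literal text.
                reader = csv.DictReader(
                    text, delimiter="\t", quoting=csv.QUOTE_NONE
                )
                yield from reader
                return
    raise FileNotFoundError(f"no .tsv member found in {archive_path}")


# Example usage (path taken from the asset name above):
#
#   import itertools
#   rows = iter_tsv_rows("full_paranames_dump.tsv.tar.gz")
#   for row in itertools.islice(rows, 5):
#       print(row)
```

Streaming rows this way keeps memory use flat for a large dump. For heavier querying, the DuckDB asset listed above may be more convenient, though it is marked experimental.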