Home project IN9550 UiO. Neural machine translation from French to English.

NoB0/nmt-in9550


Neural Machine Translation French to English

Code for home project of IN9550 UiO.

Training

Tokenizer

Run the following command to train a tokenizer.

python -m nmt.tokenizer --vocab_size <vocabulary size>

Seq2Seq Models

The training configuration is defined in a YAML file and can be overridden using command-line arguments. Each time the command is run, the configuration is saved under data/runs/{model_name}.meta.yaml.
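As a rough illustration of the override behaviour described above, here is a minimal sketch of how file values and command-line values could be merged. It is not the code in nmt.main: the function name merge_overrides, the flags --batch_size and --epochs, and the config keys are all assumptions made for the example, and a plain dict stands in for the parsed YAML file.

```python
import argparse
from copy import deepcopy


def merge_overrides(config, overrides):
    """Return a copy of config where non-None overrides replace file values."""
    merged = deepcopy(config)
    for key, value in overrides.items():
        if value is not None:
            merged[key] = value
    return merged


# Stand-in for the values parsed from the YAML file (hypothetical keys).
file_config = {"model_name": "transformer_default", "batch_size": 64, "epochs": 10}

# Hypothetical flags; a default of None means "not given on the command line".
parser = argparse.ArgumentParser()
parser.add_argument("--batch_size", type=int, default=None)
parser.add_argument("--epochs", type=int, default=None)
args = parser.parse_args(["--batch_size", "32"])

# Command-line values win over file values; untouched keys keep the file value.
run_config = merge_overrides(file_config, vars(args))

# Path pattern taken from the text above; model_name is assumed to be a config key.
meta_path = f"data/runs/{run_config['model_name']}.meta.yaml"
```

Using None as the default for every flag makes it possible to tell "not provided" apart from an explicit value, so only the flags actually passed replace what the file defines.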

Transformer-based

python -m nmt.main -c config/transformer_default_config.yaml

RNN-based

python -m nmt.main -c config/rnn_default_config.yaml

Evaluation

The configuration file used to run evaluation is a YAML file with the following entries. An example is available here.

  • tokenizer: Path to the tokenizer
  • test_data: Path to test dataset
  • model: Path to Transformer-based Seq2Seq model
  • output_file: Path to save predictions
  • debug: Whether or not to run in debug mode
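The entries listed above can be checked before launching an evaluation. The following is a minimal sketch, assuming the YAML file has already been loaded into a dict; check_eval_config and the placeholder paths are hypothetical and not part of this repository.

```python
# Entries named in the list above.
REQUIRED_KEYS = ("tokenizer", "test_data", "model", "output_file", "debug")


def check_eval_config(config):
    """Raise ValueError if an entry is missing or debug is not a boolean."""
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ValueError(f"Missing evaluation config entries: {', '.join(missing)}")
    if not isinstance(config["debug"], bool):
        raise ValueError("'debug' should be a boolean")
    return config


# Placeholder paths, not the repository's actual files.
example_config = {
    "tokenizer": "path/to/tokenizer",
    "test_data": "path/to/test_data",
    "model": "path/to/model",
    "output_file": "path/to/predictions.txt",
    "debug": False,
}
check_eval_config(example_config)
```

Failing early on a missing entry avoids loading the tokenizer and model only to find out at the end that output_file was never set.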

Run the following command to evaluate your Transformer-based model:

python -m nmt.evaluation.transformer_evaluation -c <config_file>

Run the following command to evaluate your RNN-based model:

python -m nmt.evaluation.rnn_evaluation -c <config_file>

Note

The code in this repository is mainly based on the PyTorch tutorials.
