
Tags: svngoku/fairseq


v0.12.3

v0.12.3 release

v0.12.2

v0.12.2 release

v0.12.1

v0.12.1 release

v0.12.0

0.12.0 release

v0.10.2


v0.10.1

v0.10.0 -> v0.10.1

Bug fixes

v0.10.0

v0.9.0 -> v0.10.0

It has been nearly a year since our last release (0.9.0)! Numerous changes and
new features have landed since then, which we've tried to summarize below.
Although this release keeps the same major version as our previous release
(0.x.x), code that relies on 0.9.0 will likely need to be adapted before
updating to 0.10.0.

Looking forward, this will also be the last significant release with the
0.x.x numbering. The next release will be 1.0.0 and will include a major
migration to the [Hydra configuration system](https://github.com/facebookresearch/hydra),
with an eye towards modularizing fairseq to be more usable as a library.

Changelog:

New papers:
- [Reducing Transformer Depth on Demand with Structured Dropout (Fan et al., 2019)](https://github.com/pytorch/fairseq/tree/master/examples/layerdrop/README.md)
- [mBART: Multilingual Denoising Pre-training for Neural Machine Translation ({Liu*,Gu*,Goyal*} et al., 2020)](https://github.com/pytorch/fairseq/blob/master/examples/mbart/README.md)
- [Neural Machine Translation with Byte-Level Subwords (Wang et al., 2019)](https://github.com/pytorch/fairseq/blob/master/examples/byte_level_bpe/README.md)
- [Training with Quantization Noise for Extreme Model Compression ({Fan*,Stock*} et al., 2019)](https://github.com/pytorch/fairseq/blob/master/examples/quant_noise/README.md)
- [Monotonic Multihead Attention (Ma et al., 2020)](https://github.com/pytorch/fairseq/blob/master/examples/simultaneous_translation/README.md)
- [Unsupervised Quality Estimation for Neural Machine Translation (Fomicheva et al., 2020)](https://github.com/pytorch/fairseq/blob/master/examples/unsupervised_quality_estimation/README.md)
- [wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations (Baevski et al., 2020)](https://github.com/pytorch/fairseq/blob/master/examples/wav2vec/README.md)
- [Lexically constrained decoding with dynamic beam allocation](https://github.com/pytorch/fairseq/blob/master/examples/constrained_decoding/README.md)
- [Generating Medical Reports from Patient-Doctor Conversations Using Sequence-to-Sequence Models (Enarvi et al., 2020)](https://github.com/pytorch/fairseq/blob/master/examples/pointer_generator/README.md)
- [Linformer: Self-Attention with Linear Complexity (Wang et al., 2020)](https://github.com/pytorch/fairseq/blob/master/examples/linformer/README.md)
- [Cross-lingual Retrieval for Iterative Self-Supervised Training (Tran et al., 2020)](https://github.com/pytorch/fairseq/blob/master/examples/criss/README.md)
- [Deep Transformers with Latent Depth (Li et al., 2020)](https://github.com/pytorch/fairseq/blob/master/examples/latent_depth/README.md)
- [Better Fine-Tuning by Reducing Representational Collapse (Aghajanyan et al., 2020)](https://github.com/pytorch/fairseq/blob/master/examples/rxf/README.md)

Major new features:
- TorchScript support for Transformer and SequenceGenerator (PyTorch 1.6+ only)
- Model parallel training support (see [Megatron-11b](https://github.com/pytorch/fairseq/tree/master/examples/megatron_11b))
- TPU support via `--tpu` and `--bf16` options (7751229)
- Added [VizSeq (a visual analysis toolkit for evaluating fairseq models)](https://facebookresearch.github.io/vizseq/docs/getting_started/fairseq_example)
- Migrated to Python logging (fb76dac)
- Added “SlowMo” distributed training backend (0dac0ff)
- Added Optimizer State Sharding (ZeRO) (5d7ed6a)
- Added several features to improve speech recognition support in fairseq: CTC criterion, external ASR decoder support (currently only wav2letter decoder) with KenLM and fairseq language model fusion

Minor features:
- Added `--patience` for early stopping
- Added `--shorten-method=[none|truncate|random_crop]` to language modeling (and other) tasks
- Added `--eval-bleu` for computing BLEU scores during training (60fbf64)
- Added support for training huggingface models (e.g. `hf_gpt2`) (2728f9b)
- Added FusedLAMB optimizer (`--optimizer=lamb`) (f75411a)
- Added LSTM-based language model (`lstm_lm`) (9f4256e)
- Added dummy tasks and models for benchmarking (91f0534; a541b19)
- Added tutorial and pretrained models for paraphrasing (630701e)
- Support quantization for Transformer (6379573)
- Support multi-GPU validation in fairseq-validate (2f7e3f3)
- Support batched inference in hub interface (3b53962)
- Support for language model fusion in standard beam search (5379461)
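
The `--patience` option listed above can be pictured as a counter over validation runs. The following is a minimal sketch of that idea, not fairseq's implementation: the class name `PatienceTracker` and its arguments are hypothetical, and it assumes training should stop once the validation metric has failed to improve for `patience` consecutive validations.

```python
class PatienceTracker:
    """Hypothetical helper illustrating patience-based early stopping."""

    def __init__(self, patience, maximize=False):
        self.patience = patience  # allowed consecutive non-improving validations
        self.maximize = maximize  # True for metrics like BLEU, False for loss
        self.best = None          # best validation value seen so far
        self.bad_runs = 0         # consecutive validations without improvement

    def should_stop(self, value):
        """Record one validation result; return True when training should stop."""
        if self.patience <= 0:
            return False  # treat non-positive patience as "disabled" (assumption)
        if self.best is None:
            improved = True
        elif self.maximize:
            improved = value > self.best
        else:
            improved = value < self.best
        if improved:
            self.best = value
            self.bad_runs = 0
            return False
        self.bad_runs += 1
        return self.bad_runs >= self.patience


# Usage: validation loss stops improving after the second validation.
tracker = PatienceTracker(patience=2)
decisions = [tracker.should_stop(loss) for loss in (1.0, 0.9, 0.95, 0.91)]
```

With a maximized metric such as BLEU, the same tracker would be constructed with `maximize=True`.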

Breaking changes:
- Updated requirements to Python 3.6+ and PyTorch 1.5+
- Main entry point scripts (eval_lm.py, generate.py, etc.) moved from the root directory into `fairseq_cli`
- Changed format for generation output; `H-` now corresponds to tokenized system outputs and newly added `D-` lines correspond to detokenized outputs (f353913)
- Training stats are now logged per log-interval (displayed as `train_inner`) instead of as a rolling average over each epoch.
- SequenceGenerator/Scorer no longer prints alignments by default; re-enable with `--print-alignment`
- Print base 2 scores in generation scripts (660d69f)
- Incremental decoding interface changed to use `FairseqIncrementalState` (4e48c4a; 88185fc)
- Refactor namespaces in Criterions to support library usage (introduce `LegacyFairseqCriterion` for BC) (46b773a)
- Deprecate `FairseqCriterion::aggregate_logging_outputs` interface; use `FairseqCriterion::reduce_metrics` instead (8679339)
- Moved `fairseq.meters` to `fairseq.logging.meters` and added new metrics aggregation module (`fairseq.logging.metrics`) (1e324a5; f8b795f)
- Reset mid-epoch stats every log-interval steps (244835d)
- Ignore duplicate entries in dictionary files (dict.txt) and support manual overwrite with `#fairseq:overwrite` option (dd1298e; 937535d)
- Use 1-based indexing for epochs everywhere (aa79bb9)
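
Scripts that post-process generation output may need updating for the new `H-`/`D-` split and the base-2 scores. The sketch below is one way to read such output. It assumes tab-separated lines shaped like `H-<id>`, score, text; that layout, the sample lines, and the helper names `parse_hypotheses` and `base2_to_natural_log` are illustrative assumptions, not part of fairseq.

```python
import math


def parse_hypotheses(lines):
    """Collect tokenized (H-) and detokenized (D-) hypotheses by sentence id.

    Assumes each hypothesis line has three tab-separated fields:
    tag (e.g. H-0), score, text. Lines with any other shape are skipped.
    """
    results = {}
    for line in lines:
        parts = line.rstrip("\n").split("\t", 2)
        if len(parts) != 3:
            continue  # e.g. source lines, which carry no score field
        tag, score, text = parts
        prefix, _, sent_id = tag.partition("-")
        if prefix not in ("H", "D") or not sent_id.isdigit():
            continue
        entry = results.setdefault(int(sent_id), {})
        entry[prefix] = text
        entry["score_base2"] = float(score)
    return results


def base2_to_natural_log(score):
    """Convert a base-2 log score to a natural-log score."""
    return score * math.log(2)


# Invented sample lines, for illustration only.
sample = [
    "S-0\tHallo Welt !",
    "H-0\t-0.5\tHello world !",
    "D-0\t-0.5\tHello world!",
]
parsed = parse_hypotheses(sample)
```

Code that compared scores against thresholds tuned on natural-log values can convert with `base2_to_natural_log` rather than re-tuning.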

Minor interface changes:
- Added `FairseqTask::begin_epoch` hook (122fc1d)
- `FairseqTask::build_generator` interface changed (cd2555a)
- Change `RobertaModel` base class to `FairseqEncoder` (307df56)
- Expose `FairseqOptimizer.param_groups` property (8340b2d)
- Deprecate `--fast-stat-sync` and replace with `FairseqCriterion::logging_outputs_can_be_summed` interface (fe6c2ed)
- `--raw-text` and `--lazy-load` are fully deprecated; use `--dataset-impl` instead
- Mixture of expert tasks moved to `examples/` (8845dcf)

Performance improvements:
- Use cross entropy from apex for improved memory efficiency (5065077)
- Added buffered dataloading (`--data-buffer-size`) (4115317)
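
The general idea behind buffered data loading is to prefetch batches in the background so the training loop rarely waits on I/O. The following is a minimal sketch of that technique using only the standard library; it is not fairseq's implementation, and the function name `buffered` and its `buffer_size` argument are hypothetical stand-ins for whatever `--data-buffer-size` controls internally.

```python
import queue
import threading


def buffered(iterable, buffer_size):
    """Yield items from `iterable`, prefetching up to `buffer_size` of them
    in a background thread. `buffer_size` should be a positive integer."""
    buf = queue.Queue(maxsize=buffer_size)

    def producer():
        try:
            for item in iterable:
                buf.put(("item", item))  # blocks while the buffer is full
            buf.put(("done", None))
        except Exception as exc:
            buf.put(("error", exc))  # hand the failure to the consumer

    # Daemon thread: it will not keep the process alive if the consumer
    # stops iterating early.
    threading.Thread(target=producer, daemon=True).start()

    while True:
        kind, payload = buf.get()
        if kind == "done":
            return
        if kind == "error":
            raise payload
        yield payload


# Usage: items arrive in their original order.
batches = list(buffered(range(5), buffer_size=2))
```

A bounded queue keeps memory use predictable: the producer stalls once `buffer_size` items are waiting.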

v0.9.0

v0.8.0 -> v0.9.0 (facebookresearch#1452)

Summary:
Possibly breaking changes:
- Set global numpy seed (4a7cd58)
- Split `in_proj_weight` into separate k, v, q projections in MultiheadAttention (fdf4c3e)
- TransformerEncoder returns namedtuples instead of dict (27568a7)
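
Checkpoints saved before the `in_proj_weight` split store one packed matrix where newer code expects three. The sketch below shows the shape of such a conversion on plain Python lists (standing in for tensors). It assumes the packed rows are ordered query, key, value and that the new parameters are named `q_proj.weight`, `k_proj.weight`, and `v_proj.weight`; both are assumptions to verify against the checkpoint at hand, and `split_in_proj` is a hypothetical helper.

```python
def split_in_proj(state, prefix=""):
    """Return a copy of `state` with the packed attention weight split in three.

    `state` maps parameter names to lists of rows (a stand-in for tensors).
    Assumes the packed rows are ordered query, key, value.
    """
    new_state = dict(state)
    key = prefix + "in_proj_weight"
    if key not in new_state:
        return new_state  # nothing to convert
    rows = new_state.pop(key)
    if len(rows) % 3 != 0:
        raise ValueError(f"{key}: first dimension {len(rows)} is not divisible by 3")
    dim = len(rows) // 3
    new_state[prefix + "q_proj.weight"] = rows[:dim]
    new_state[prefix + "k_proj.weight"] = rows[dim:2 * dim]
    new_state[prefix + "v_proj.weight"] = rows[2 * dim:]
    return new_state


# Usage with a toy 6x1 "matrix" (embedding dimension 2).
old = {"attn.in_proj_weight": [[1], [2], [3], [4], [5], [6]]}
new = split_in_proj(old, prefix="attn.")
```

The input mapping is left untouched, so the conversion can be checked before the old state is discarded.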

New features:
- Add `--fast-stat-sync` option (e1ba32a)
- Add `--empty-cache-freq` option (315c463)
- Support criterions with parameters (ba5f829)

New papers:
- Simple and Effective Noisy Channel Modeling for Neural Machine Translation (49177c9)
- Levenshtein Transformer (86857a5, ...)
- Cross+Self-Attention for Transformer Models (4ac2c5f)
- Jointly Learning to Align and Translate with Transformer Models (1c66792)
- Reducing Transformer Depth on Demand with Structured Dropout (dabbef4)
- Unsupervised Cross-lingual Representation Learning at Scale (XLM-RoBERTa) (e23e5ea)
- BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension (a92bcda)
- CamemBERT: a French BERT (b31849a)

Speed improvements:
- Add CUDA kernels for LightConv and DynamicConv (f840564)
- Cythonization of various dataloading components (4fc3953, ...)
- Don't project mask tokens for MLM training (718677e)
Pull Request resolved: facebookresearch#1452

Differential Revision: D18798409

Pulled By: myleott

fbshipit-source-id: 860a0d5aaf7377c8c9bd63cdb3b33d464f0e1727

v0.8.0

v0.7.2 -> v0.8.0 (facebookresearch#1017)

Summary:
Changelog:
- Relicensed under MIT license
- Add RoBERTa
- Add wav2vec
- Add WMT'19 models
- Add initial ASR code
- Changed torch.hub interface (`generate` renamed to `translate`)
- Add `--tokenizer` and `--bpe`
- Renamed `data.transforms` -> `data.encoders` (f812e52)
- New Dataset API (optional) (654affc)
- Deprecate old Masked LM components (47fd985)
- Set mmap as default dataset format and infer format automatically (5f78106)
- Misc fixes for sampling
- Misc fixes to support PyTorch 1.2
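
Code that must work on both sides of the `generate` -> `translate` rename can look the method up at call time. This is a minimal sketch under stated assumptions: `translate_compat` is a hypothetical helper, the stand-in classes below are not fairseq classes, and it assumes the older `generate` accepted the same arguments as `translate`.

```python
def translate_compat(model, sentence, **kwargs):
    """Call `model.translate` if present, else fall back to `model.generate`."""
    method = getattr(model, "translate", None)
    if method is None:
        method = getattr(model, "generate")  # pre-rename interface (assumption)
    return method(sentence, **kwargs)


# Stand-in objects for illustration only.
class OldStyleModel:
    def generate(self, sentence, **kwargs):
        return "old:" + sentence


class NewStyleModel:
    def translate(self, sentence, **kwargs):
        return "new:" + sentence


results = [translate_compat(m, "Hello") for m in (OldStyleModel(), NewStyleModel())]
```

Preferring `translate` means the helper keeps working if a later version offers both names.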
Pull Request resolved: facebookresearch#1017

Differential Revision: D16799880

Pulled By: myleott

fbshipit-source-id: 45ad8bc531724a53063cbc24ca1c93f715cdc5a7

v0.7.2

v0.7.1 -> v0.7.2 (facebookresearch#891)

Summary:
No major API changes since the last release. We are cutting a new release because significant (possibly breaking) changes to logging, data loading, and the masked LM implementation will be merged soon.
Pull Request resolved: facebookresearch#891

Differential Revision: D16377132

Pulled By: myleott

fbshipit-source-id: f1cb88e671ccd510e53334d0f449fe18585268c7