
Commit 1d646ba

thomwolf committed Sep 26, 2019
Merge commit with 2 parents: 9676d1a + 8349d75
Showing 14 changed files with 361 additions and 19 deletions.
35 changes: 33 additions & 2 deletions docs/source/index.rst
@@ -1,9 +1,38 @@
Transformers
================================================================================================================================================

Transformers is a library of state-of-the-art pre-trained models for Natural Language Processing (NLP).
🤗 Transformers (formerly known as `pytorch-transformers` and `pytorch-pretrained-bert`) provides general-purpose architectures
(BERT, GPT-2, RoBERTa, XLM, DistilBERT, XLNet...) for Natural Language Understanding (NLU) and Natural Language Generation
(NLG) with 32+ pretrained models in 100+ languages and deep interoperability between TensorFlow 2.0 and PyTorch.

The library currently contains PyTorch implementations, pre-trained model weights, usage scripts and conversion utilities for the following models:
Features
---------------------------------------------------

- As easy to use as pytorch-transformers
- As powerful and concise as Keras
- High performance on NLU and NLG tasks
- Low barrier to entry for educators and practitioners

State-of-the-art NLP for everyone
- Deep learning researchers
- Hands-on practitioners
- AI/ML/NLP teachers and educators

Lower compute costs, smaller carbon footprint
- Researchers can share trained models instead of always retraining
- Practitioners can reduce compute time and production costs
- 8 architectures with over 30 pretrained models, some in more than 100 languages

Choose the right framework for every part of a model's lifetime
- Train state-of-the-art models in 3 lines of code
- Deep interoperability between TensorFlow 2.0 and PyTorch models
- Move a single model between TF2.0/PyTorch frameworks at will
- Seamlessly pick the right framework for training, evaluation, production (a minimal interoperability sketch follows below)
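
A minimal sketch of this TF2.0/PyTorch interoperability (the model name and the ``from_pretrained`` loading pattern are illustrative):

.. code-block:: python

    from transformers import BertTokenizer, BertModel, TFBertModel

    # Tokenizers are framework-agnostic
    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

    # The same pretrained weights can back either a PyTorch or a TensorFlow 2.0 model
    pytorch_model = BertModel.from_pretrained("bert-base-uncased")
    tf_model = TFBertModel.from_pretrained("bert-base-uncased")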

Contents
---------------------------------

The library currently contains PyTorch and TensorFlow 2.0 implementations, pre-trained model weights, usage scripts and conversion utilities for the following models:

1. `BERT <https://github.com/google-research/bert>`_ (from Google) released with the paper `BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding <https://arxiv.org/abs/1810.04805>`_ by Jacob Devlin, Ming-Wei Chang, Kenton Lee and Kristina Toutanova.
2. `GPT <https://github.com/openai/finetune-transformer-lm>`_ (from OpenAI) released with the paper `Improving Language Understanding by Generative Pre-Training <https://blog.openai.com/language-unsupervised>`_ by Alec Radford, Karthik Narasimhan, Tim Salimans and Ilya Sutskever.
@@ -14,6 +43,7 @@ The library currently contains PyTorch implementations, pre-trained model weight
7. `RoBERTa <https://github.com/pytorch/fairseq/tree/master/examples/roberta>`_ (from Facebook), released together with the paper `RoBERTa: A Robustly Optimized BERT Pretraining Approach <https://arxiv.org/abs/1907.11692>`_ by Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov.
8. `DistilBERT <https://huggingface.co/transformers/model_doc/distilbert.html>`_ (from HuggingFace) released together with the blog post `Smaller, faster, cheaper, lighter: Introducing DistilBERT, a distilled version of BERT <https://medium.com/huggingface/distilbert-8cf3380435b5>`_ by Victor Sanh, Lysandre Debut and Thomas Wolf.


.. toctree::
:maxdepth: 2
:caption: Notes
@@ -37,6 +67,7 @@ The library currently contains PyTorch implementations, pre-trained model weight
main_classes/model
main_classes/tokenizer
main_classes/optimizer_schedules
main_classes/processors

.. toctree::
:maxdepth: 2
6 changes: 6 additions & 0 deletions docs/source/main_classes/model.rst
@@ -13,3 +13,9 @@ The base class ``PreTrainedModel`` implements the common methods for loading/sav

.. autoclass:: transformers.PreTrainedModel
:members:

``TFPreTrainedModel``
~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFPreTrainedModel
:members:
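
A minimal sketch of the shared save/load workflow described above (the local directory path is illustrative):

.. code-block:: python

    from transformers import BertModel

    model = BertModel.from_pretrained("bert-base-uncased")   # download & cache pretrained weights
    model.save_pretrained("./my-bert")                       # write config + weights to a directory
    model = BertModel.from_pretrained("./my-bert")           # reload from that directory
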
58 changes: 58 additions & 0 deletions docs/source/main_classes/processors.rst
@@ -0,0 +1,58 @@
Processors
----------------------------------------------------

This library includes processors for several traditional tasks. These processors can be used to process a dataset into
examples that can be fed to a model.

Processors
~~~~~~~~~~~~~~~~~~~~~

All processors follow the same architecture, that of the
:class:`~transformers.data.processors.utils.DataProcessor`. Each processor returns a list of
:class:`~transformers.data.processors.utils.InputExample`, which can then be converted to
:class:`~transformers.data.processors.utils.InputFeatures` and fed to a model.
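
A minimal sketch of this flow, using the MRPC processor as an example (``data_dir`` is an illustrative path to a downloaded copy of the MRPC data):

.. code-block:: python

    from transformers.data.processors.glue import MrpcProcessor

    processor = MrpcProcessor()
    label_list = processor.get_labels()                        # e.g. ["0", "1"]
    train_examples = processor.get_train_examples("data_dir")  # list of InputExample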

.. autoclass:: transformers.data.processors.utils.DataProcessor
:members:


.. autoclass:: transformers.data.processors.utils.InputExample
:members:


.. autoclass:: transformers.data.processors.utils.InputFeatures
:members:


GLUE
~~~~~~~~~~~~~~~~~~~~~

`General Language Understanding Evaluation (GLUE) <https://gluebenchmark.com/>`__ is a benchmark that evaluates
the performance of models across a diverse set of existing NLU tasks. It was released together with the paper
`GLUE: A multi-task benchmark and analysis platform for natural language understanding <https://openreview.net/pdf?id=rJ4km2R5t7>`__.

This library hosts a total of 10 processors for the following tasks: MRPC, MNLI, MNLI (mismatched),
CoLA, SST2, STSB, QQP, QNLI, RTE and WNLI.

Those processors are:

- :class:`~transformers.data.processors.glue.MrpcProcessor`
- :class:`~transformers.data.processors.glue.MnliProcessor`
- :class:`~transformers.data.processors.glue.MnliMismatchedProcessor`
- :class:`~transformers.data.processors.glue.ColaProcessor`
- :class:`~transformers.data.processors.glue.Sst2Processor`
- :class:`~transformers.data.processors.glue.StsbProcessor`
- :class:`~transformers.data.processors.glue.QqpProcessor`
- :class:`~transformers.data.processors.glue.QnliProcessor`
- :class:`~transformers.data.processors.glue.RteProcessor`
- :class:`~transformers.data.processors.glue.WnliProcessor`

Additionally, the following method can be used to convert examples from a GLUE task into the
:class:`~transformers.data.processors.utils.InputFeatures` that are fed to a model.

.. autofunction:: transformers.data.processors.glue.glue_convert_examples_to_features

Example usage
^^^^^^^^^^^^^^^^^^^^^^^^^

An example using these processors is given in the
`run_glue.py <https://github.com/huggingface/pytorch-transformers/blob/master/examples/run_glue.py>`__ script.
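
A shorter, hedged sketch of the conversion step (assuming the ``tensorflow_datasets`` GLUE/MRPC split and the keyword arguments shown; check them against the installed version):

.. code-block:: python

    import tensorflow_datasets as tfds
    from transformers import BertTokenizer, glue_convert_examples_to_features

    tokenizer = BertTokenizer.from_pretrained("bert-base-cased")
    data = tfds.load("glue/mrpc")

    # Tokenize, pad and convert the raw examples into model-ready features
    train_dataset = glue_convert_examples_to_features(
        data["train"], tokenizer, max_length=128, task="mrpc"
    )
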
56 changes: 56 additions & 0 deletions docs/source/model_doc/bert.rst
@@ -70,3 +70,59 @@ BERT
.. autoclass:: transformers.BertForQuestionAnswering
:members:


``TFBertModel``
~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertModel
:members:


``TFBertForPreTraining``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertForPreTraining
:members:


``TFBertForMaskedLM``
~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertForMaskedLM
:members:


``TFBertForNextSentencePrediction``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertForNextSentencePrediction
:members:


``TFBertForSequenceClassification``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertForSequenceClassification
:members:


``TFBertForMultipleChoice``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertForMultipleChoice
:members:


``TFBertForTokenClassification``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertForTokenClassification
:members:


``TFBertForQuestionAnswering``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertForQuestionAnswering
:members:
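
Since these TF 2.0 classes are standard ``tf.keras.Model`` subclasses, a minimal fine-tuning sketch looks as follows (optimizer, loss and ``train_dataset`` are illustrative; ``train_dataset`` is assumed to be a ``tf.data.Dataset`` of ``(features, labels)`` pairs):

.. code-block:: python

    import tensorflow as tf
    from transformers import TFBertForSequenceClassification

    model = TFBertForSequenceClassification.from_pretrained("bert-base-cased")
    optimizer = tf.keras.optimizers.Adam(learning_rate=3e-5)
    loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
    model.compile(optimizer=optimizer, loss=loss, metrics=["accuracy"])
    # model.fit(train_dataset, epochs=2, steps_per_epoch=115)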

27 changes: 27 additions & 0 deletions docs/source/model_doc/distilbert.rst
@@ -41,3 +41,30 @@ DistilBERT

.. autoclass:: transformers.DistilBertForQuestionAnswering
:members:

``TFDistilBertModel``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFDistilBertModel
:members:


``TFDistilBertForMaskedLM``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFDistilBertForMaskedLM
:members:


``TFDistilBertForSequenceClassification``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFDistilBertForSequenceClassification
:members:


``TFDistilBertForQuestionAnswering``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFDistilBertForQuestionAnswering
:members:
21 changes: 21 additions & 0 deletions docs/source/model_doc/gpt.rst
@@ -34,3 +34,24 @@ OpenAI GPT

.. autoclass:: transformers.OpenAIGPTDoubleHeadsModel
:members:


``TFOpenAIGPTModel``
~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFOpenAIGPTModel
:members:


``TFOpenAIGPTLMHeadModel``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFOpenAIGPTLMHeadModel
:members:


``TFOpenAIGPTDoubleHeadsModel``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFOpenAIGPTDoubleHeadsModel
:members:
21 changes: 21 additions & 0 deletions docs/source/model_doc/gpt2.rst
@@ -34,3 +34,24 @@ OpenAI GPT2

.. autoclass:: transformers.GPT2DoubleHeadsModel
:members:


``TFGPT2Model``
~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFGPT2Model
:members:


``TFGPT2LMHeadModel``
~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFGPT2LMHeadModel
:members:


``TFGPT2DoubleHeadsModel``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFGPT2DoubleHeadsModel
:members:
21 changes: 21 additions & 0 deletions docs/source/model_doc/roberta.rst
@@ -34,3 +34,24 @@ RoBERTa

.. autoclass:: transformers.RobertaForSequenceClassification
:members:


``TFRobertaModel``
~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFRobertaModel
:members:


``TFRobertaForMaskedLM``
~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFRobertaForMaskedLM
:members:


``TFRobertaForSequenceClassification``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFRobertaForSequenceClassification
:members:
14 changes: 14 additions & 0 deletions docs/source/model_doc/transformerxl.rst
@@ -28,3 +28,17 @@ Transformer XL

.. autoclass:: transformers.TransfoXLLMHeadModel
:members:


``TFTransfoXLModel``
~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFTransfoXLModel
:members:


``TFTransfoXLLMHeadModel``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFTransfoXLLMHeadModel
:members:
28 changes: 28 additions & 0 deletions docs/source/model_doc/xlm.rst
@@ -39,3 +39,31 @@ XLM

.. autoclass:: transformers.XLMForQuestionAnswering
:members:


``TFXLMModel``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFXLMModel
:members:


``TFXLMWithLMHeadModel``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFXLMWithLMHeadModel
:members:


``TFXLMForSequenceClassification``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFXLMForSequenceClassification
:members:


``TFXLMForQuestionAnsweringSimple``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFXLMForQuestionAnsweringSimple
:members:
28 changes: 28 additions & 0 deletions docs/source/model_doc/xlnet.rst
@@ -41,3 +41,31 @@ XLNet

.. autoclass:: transformers.XLNetForQuestionAnswering
:members:


``TFXLNetModel``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFXLNetModel
:members:


``TFXLNetLMHeadModel``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFXLNetLMHeadModel
:members:


``TFXLNetForSequenceClassification``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFXLNetForSequenceClassification
:members:


``TFXLNetForQuestionAnsweringSimple``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFXLNetForQuestionAnsweringSimple
:members:
6 changes: 3 additions & 3 deletions docs/source/pretrained_models.rst
@@ -44,15 +44,15 @@ Here is the full list of the currently provided pretrained models together with
| +------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------+
| | ``bert-large-uncased-whole-word-masking-finetuned-squad`` | | 24-layer, 1024-hidden, 16-heads, 340M parameters. |
| | | | The ``bert-large-uncased-whole-word-masking`` model fine-tuned on SQuAD |
| | | (see details of fine-tuning in the `example section <https://github.com/huggingface/transformers/tree/master/examples>`__). |
| +------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------+
| | ``bert-large-cased-whole-word-masking-finetuned-squad`` | | 24-layer, 1024-hidden, 16-heads, 340M parameters |
| | | | The ``bert-large-cased-whole-word-masking`` model fine-tuned on SQuAD |
| | | (see `details of fine-tuning in the example section <https://huggingface.co/transformers/examples.html>`__) |
| +------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------+
| | ``bert-base-cased-finetuned-mrpc`` | | 12-layer, 768-hidden, 12-heads, 110M parameters. |
| | | | The ``bert-base-cased`` model fine-tuned on MRPC |
| | | (see `details of fine-tuning in the example section <https://huggingface.co/transformers/examples.html>`__) |
+-------------------+------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------+
| GPT | ``openai-gpt`` | | 12-layer, 768-hidden, 12-heads, 110M parameters. |
| | | | OpenAI GPT English model |