This report presents my project on the Transformer architecture. The thesis is written in French (see thesis.pdf), but I summarize parts of it here in the form of an article in English.
The study covers the Transformer and its applications, and compares it with other architectures such as RNNs and CNNs.
The library used is Tensor2Tensor: https://github.com/tensorflow/tensor2tensor. The .ipynb file contains a tutorial on using this library for a translation task with the Transformer.
Link to the Jupyter notebook: https://drive.google.com/open?id=1vdW5OPPphCLSaf_iiSpsuxZazscY5iyB
Machine translation is constantly evolving. It has helped the world advance in fields such as trade and business. It has made many tasks easier and has connected people from different countries who speak different languages.
Several questions arise about the Transformer: why is this technology so special compared to previous ones? Why was this architecture created? Were the existing architectures not performing well enough? What solutions does the Transformer bring? The Transformer achieved very high levels of quality and performance, surpassing the architectures used by Google Translate and Facebook. Those architectures had significant shortcomings, which the Transformer corrected while delivering much better results.
The Transformer is based on self-attention, an improved form of the attention mechanism already used by older architectures.
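
To make the idea of self-attention more concrete, here is a minimal sketch of single-head scaled dot-product self-attention in plain Python. It is not taken from the thesis or from Tensor2Tensor: the function names (`matmul`, `softmax`, `self_attention`), the projection matrices `w_q`, `w_k`, `w_v`, and the random toy input are all assumptions made for illustration, and the sketch leaves out multiple heads, masking, and training.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention (illustrative only).

    x has shape (seq_len, d_model); w_q, w_k, w_v have shape (d_model, d_k).
    """
    # Project the same input into queries, keys, and values.
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(q[0])
    # Compare every position with every other position: Q K^T / sqrt(d_k).
    k_t = [list(col) for col in zip(*k)]
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(q, k_t)]
    # Each row of weights says how much one position attends to the others.
    weights = [softmax(row) for row in scores]
    # The output is a weighted average of the value vectors.
    return matmul(weights, v), weights


def random_matrix(rows, cols):
    """Hypothetical helper: matrix of standard normal samples."""
    return [[random.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
seq_len, d_model, d_k = 4, 8, 8  # toy sizes chosen for the example
x = random_matrix(seq_len, d_model)
w_q = random_matrix(d_model, d_k)
w_k = random_matrix(d_model, d_k)
w_v = random_matrix(d_model, d_k)

output, weights = self_attention(x, w_q, w_k, w_v)
print(len(output), len(output[0]))  # 4 8
```

Every position produces its output by looking at all positions of the sequence at once, so the computation does not have to step through the sequence one token at a time as an RNN does.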
I published an article comparing the Transformer with RNNs and CNNs: https://medium.com/@yacine.benaffane/transformer-vs-rnn-and-cnn-18eeefa3602b