Very Very Large Transformers

Posted on Fri 09 Oct 2020 in Machine Translation, Language Models

Transformer Models: Basics

The transformer architecture of (Vaswani et al., 2017) has been transformative in the sense that it has quickly replaced more classical RNN architectures as the basic building block for large-scale structured prediction models. A transformer (TF) learns to transform a structured linguistic object (typically a string) into a …

