Transformer

What is a transformer?

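A transformer is a deep learning model architecture built around self-attention: each token in a sequence computes its representation by weighing every other token in that sequence. Training a representation once and re-using it by changing only the downstream layers is transfer learning (pretraining followed by fine-tuning). That is how transformers are commonly used, not what the name means.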

Types of transformers

encoder-only LLM (autoencoding models), e.g. BERT

decoder-only LLM (autoregressive models), e.g. GPT

encoder-decoder LLM (sequence-to-sequence models), e.g. T5
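
One concrete way to see the difference between these families is the attention mask each one uses. The sketch below is illustrative only; the function names `full_mask` and `causal_mask` are made up for this note and are not taken from any library. An encoder lets every position attend to every other position, while a decoder restricts each position to itself and earlier positions. An encoder-decoder model uses the first kind in its encoder and the second kind in its decoder self-attention.

```python
def full_mask(n):
    """Encoder-style mask: every position may attend to every position."""
    return [[True] * n for _ in range(n)]


def causal_mask(n):
    """Decoder-style mask: position i may attend only to positions 0..i."""
    return [[j <= i for j in range(n)] for i in range(n)]


# Print both masks for a 4-token sequence (1 = may attend, 0 = blocked).
for name, mask in (("encoder", full_mask(4)), ("decoder", causal_mask(4))):
    print(name)
    for row in mask:
        print(" ".join("1" if allowed else "0" for allowed in row))
```

The lower-triangular decoder mask is what makes the model autoregressive: a token's prediction cannot depend on tokens that come after it.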

Transformer architecture

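word embeddings + positional encoding -> encoder (multi-head self-attention + feed-forward layers) -> decoder (masked multi-head self-attention + cross-attention over the encoder output + feed-forward layers) -> output token probabilities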
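
The first steps of the pipeline above can be sketched in plain Python. This is a minimal sketch under stated assumptions, not a reference implementation: it assumes the sinusoidal positional encoding (one common choice; learned position embeddings are another), uses a single attention head, and omits the learned query/key/value projections that a real layer applies. The function names and the toy embedding values are hypothetical.

```python
import math


def positional_encoding(seq_len, d_model):
    """Sinusoidal encoding: sine on even dimensions, cosine on odd ones."""
    pe = []
    for pos in range(seq_len):
        row = []
        for i in range(d_model):
            angle = pos / (10000 ** (2 * (i // 2) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        pe.append(row)
    return pe


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q, k, v):
    """Scaled dot-product attention: softmax(q k^T / sqrt(d_k)) v."""
    d_k = len(k[0])
    out = []
    for qi in q:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d_k) for kj in k]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out.append([sum(w * vj[c] for w, vj in zip(weights, v))
                    for c in range(len(v[0]))])
    return out


# Toy input: 3 tokens with 4-dimensional embeddings (made-up values).
embeddings = [[0.1, 0.2, 0.3, 0.4],
              [0.5, 0.1, 0.0, 0.2],
              [0.3, 0.3, 0.1, 0.0]]
pe = positional_encoding(len(embeddings), 4)
# Position information is added to the embeddings element-wise.
x = [[e + p for e, p in zip(e_row, p_row)] for e_row, p_row in zip(embeddings, pe)]
# Self-attention: queries, keys, and values all come from the same sequence.
contextual = attention(x, x, x)
print(len(contextual), len(contextual[0]))
```

Multi-head attention would run several such attention computations in parallel on different learned projections of `x` and concatenate the results; it is part of the encoder and decoder layers, not of the positional encoding.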