# GPT-scratch

A simple implementation of a GPT from scratch, built as a stack of Transformer decoder blocks.

## References

1. Vaswani et al., "Attention Is All You Need", 2017. https://arxiv.org/abs/1706.03762
2. Peter Bloem, "Transformers From Scratch". https://peterbloem.nl/blog/transformers
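
## Sketch of the decoder-block stack

As a rough illustration of the architecture described above, here is a minimal PyTorch sketch of a GPT built from stacked decoder blocks. The class names (`DecoderBlock`, `GPT`) and hyperparameters are illustrative assumptions, not this repo's actual code; see the source files for the real implementation.

```python
# Illustrative sketch only; names and defaults are assumptions, not the repo's code.
import torch
import torch.nn as nn


class DecoderBlock(nn.Module):
    """One decoder block: masked self-attention + feed-forward,
    each wrapped in a residual connection with layer normalization."""

    def __init__(self, embed_dim: int, num_heads: int):
        super().__init__()
        self.ln1 = nn.LayerNorm(embed_dim)
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.ln2 = nn.LayerNorm(embed_dim)
        self.ff = nn.Sequential(
            nn.Linear(embed_dim, 4 * embed_dim),
            nn.GELU(),
            nn.Linear(4 * embed_dim, embed_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        seq_len = x.size(1)
        # Causal mask: position i may only attend to positions <= i.
        mask = torch.triu(
            torch.ones(seq_len, seq_len, dtype=torch.bool, device=x.device),
            diagonal=1,
        )
        h = self.ln1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask, need_weights=False)
        x = x + attn_out
        x = x + self.ff(self.ln2(x))
        return x


class GPT(nn.Module):
    """Minimal GPT: token + position embeddings, a stack of decoder
    blocks, and a linear head projecting back to vocabulary logits."""

    def __init__(self, vocab_size: int, max_len: int, embed_dim: int = 128,
                 num_heads: int = 4, num_layers: int = 4):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, embed_dim)
        self.pos_emb = nn.Embedding(max_len, embed_dim)
        self.blocks = nn.Sequential(
            *[DecoderBlock(embed_dim, num_heads) for _ in range(num_layers)]
        )
        self.ln_f = nn.LayerNorm(embed_dim)
        self.head = nn.Linear(embed_dim, vocab_size)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, seq_len) integer token ids
        positions = torch.arange(tokens.size(1), device=tokens.device)
        x = self.tok_emb(tokens) + self.pos_emb(positions)
        x = self.blocks(x)
        return self.head(self.ln_f(x))


if __name__ == "__main__":
    model = GPT(vocab_size=1000, max_len=64)
    logits = model(torch.randint(0, 1000, (2, 16)))
    print(logits.shape)  # torch.Size([2, 16, 1000])
```

Note that GPT-style models use only the decoder half of the original encoder-decoder Transformer [1], so there is no cross-attention: each block contains just causally masked self-attention followed by a position-wise feed-forward network.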