transformers (5 posts)

- Doing MORE To consume LESS – Flash Attention V1 (Feb 8, 2025)
- A beginner's guide to Vision Language Models (VLMs) (Dec 23, 2024)
- KV cache – The how not to waste your FLOPS starter (Nov 21, 2024)
- Attention scores, Scaling and Softmax (Nov 11, 2024)
- The Hidden Beauty of Sinusoidal Positional Encodings in Transformers (Nov 1, 2024)