Archives
- 21 Apr Speculative Decoding - Making Language Models Generate Faster Without Losing Their Minds
- 16 Mar Mixture of Experts – Scaling Transformers Without Breaking the FLOPS Bank
- 08 Feb Doing MORE to Consume LESS – Flash Attention V1
- 04 Jan Guidance – Structuring your outputs is easier than you think
- 23 Dec A beginner's guide to Vision Language Models (VLMs)
- 21 Nov KV cache – The how not to waste your FLOPS starter
- 11 Nov Attention Scores, Scaling, and Softmax
- 01 Nov The Hidden Beauty of Sinusoidal Positional Encodings in Transformers
- 28 Oct Vanishing and Exploding Gradients – A Non-Flat-Earther's Perspective
- 25 Oct Teaching an AI to Drive a Taxi – A Friendly Guide to Q-Learning
- 22 Oct Recall and Precision – A Practical Case Against Memorization