Deep-learning (3)

Attention scores, Scaling and Softmax (Nov 11, 2024)
The Hidden Beauty of Sinusoidal Positional Encodings in Transformers (Nov 1, 2024)
Vanishing and exploding Gradients – A non-flat-earther's perspective. (Oct 28, 2024)