normalization (2)

Attention scores, Scaling and Softmax
Nov 11, 2024

Vanishing and exploding Gradients – A non-flat-earther's perspective.
Oct 28, 2024