LLMs (2)

Speculative Decoding - Making Language Models Generate Faster Without Losing Their Minds
Apr 21, 2025

Mixture of Experts – Scaling Transformers Without Breaking the FLOPS Bank
Mar 16, 2025