Elements Of Mechanistic Interpretability: From Observation to Causation

We strip mechanistic interpretability down to three key experiments: watching a model 'think', finding where it stores concepts, and performing 'causal surgery' to change its 'thought process'.

Oct 26, 2025 Machine Learning, Deep-learning

SFT vs. DPO (/ RLHF) - A Visual Guide to What Your LLM Actually Learns

A visual guide and toy experiment to build intuition for the practical differences between Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO).

Aug 30, 2025 Machine Learning, Deep-learning

Do You Need A Matryoshka Model?

An analysis of the information hotspots in embeddings.

Jun 22, 2025 Machine Learning, Deep-learning

Chunking Strategies for RAG - Breaking Down Documents for Better Retrieval

A comprehensive guide to chunking strategies for Retrieval-Augmented Generation, from basic splitting to advanced semantic and agentic approaches.

May 31, 2025 Machine Learning, Deep-learning

Speculative Decoding - Making Language Models Generate Faster Without Losing Their Minds

Speculative decoding speeds up autoregressive text generation by combining a small draft model with a larger verifier model. This two-step dance slashes latency while preserving quality, an essenti...

Apr 21, 2025 Machine Learning, Deep-learning

Mixture of Experts – Scaling Transformers Without Breaking the FLOPS Bank

Mixture of Experts (MoE) lets you scale transformer models to billions of parameters without proportional compute costs. By selectively routing tokens through specialized experts, MoE achieves mass...

Mar 16, 2025 Machine Learning, Deep-learning

Doing MORE To consume LESS – Flash Attention V1

Flash Attention played a major role in making LLMs more accessible to consumers. This algorithm embodies how a set of what one might consider "trivial ideas" can come together and form a powerful s...

Feb 8, 2025 Machine Learning, Deep-learning

Guidance – Structuring your outputs is easier than you think

In this post, we explore how to simplify and optimize the output generation process in language models using guidance techniques. By pre-structuring inputs and restraining the output space, we can ...

Jan 4, 2025 Language Models, Optimization

A beginner's guide to Vision Language Models (VLMs)

The amount of visual data that we constantly ingest is massive, and our ability to function in an environment may greatly improve when we have access to this modality, thus being able to use it as a...

Dec 23, 2024 Deep-learning, Vision Language Models

KV cache – The how not to waste your FLOPS starter

You've probably heard of Transformers by now; they're everywhere, so much so that newborn babies are gonna start saying Transformers as their first word. This blog will explore an important co...

Nov 21, 2024 Machine Learning, Deep-learning