LLaDA: Large Language Diffusion with mAsking
Nie et al. Β· arXiv 2025
A masked diffusion framework for LLMs. Uses progressive masking as the forward process and learns to predict masked tokens in reverse, matching AR models at 8B scale.
Formulas broken down step by step. Walk-through examples with real numbers. Interactive visualizations you can poke at. No hand-waving.
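To make the forward process above concrete, here is a minimal sketch of progressive masking: each token is masked independently with probability t, so t=0 leaves the text intact and t=1 masks everything. The token names and `forward_mask` helper are illustrative, not from the paper's codebase.

```python
import random

MASK = "[MASK]"

def forward_mask(tokens, t, rng=random.Random(0)):
    """Masked-diffusion forward process (sketch): mask each token
    independently with probability t. The reverse model is trained
    to predict the original tokens at the masked positions."""
    return [MASK if rng.random() < t else tok for tok in tokens]

tokens = ["the", "cat", "sat", "on", "the", "mat"]
print(forward_mask(tokens, 0.5))  # roughly half the tokens masked
```

At t=1.0 every position is `[MASK]`, which is the "all-noise" state the reverse process starts from.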
Sahoo et al. Β· NeurIPS 2024
Simplifies masked discrete diffusion with a principled continuous-time ELBO. Clean, minimal design with strong perplexity results.
Shi et al. Β· arXiv 2025
Accelerates discrete diffusion LMs with adaptive noise schedules and importance sampling, reducing denoising steps by 3-10x.
Arriola et al. Β· arXiv 2025
Generates text in blocks: blocks go left-to-right (AR), tokens within each block are denoised in parallel (diffusion). Best of both worlds.
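The block-wise scheme can be sketched as a decoding loop: an outer autoregressive pass over blocks, and inner parallel denoising steps that fill a block's masked positions. The `predict` callable and the commit schedule here are stand-ins for illustration, not the paper's actual sampler.

```python
MASK = -1  # sentinel for a masked token id (illustrative)

def block_diffusion_decode(predict, prompt, n_blocks, block_size, steps):
    """Block-diffusion sketch: blocks are generated left-to-right (AR);
    tokens inside each block are refined in parallel over `steps`
    denoising iterations. `predict` fills masked positions from context."""
    seq = list(prompt)
    for _ in range(n_blocks):
        block = [MASK] * block_size
        for step in range(steps):
            # denoise: predict all masked positions in the block at once
            filled = predict(seq + block)
            # commit a growing fraction of the block each step
            keep = block_size * (step + 1) // steps
            block = filled[len(seq):len(seq) + keep] + [MASK] * (block_size - keep)
        seq += block  # block is finalized; move on autoregressively
    return seq

def toy_predict(seq):
    # stand-in denoiser: fills each mask with its position index
    return [i if tok == MASK else tok for i, tok in enumerate(seq)]

print(block_diffusion_decode(toy_predict, [0, 1], n_blocks=2, block_size=2, steps=2))
# → [0, 1, 2, 3, 4, 5]
```

Real samplers commit tokens by model confidence rather than left-to-right, but the control flow is the same: AR across blocks, parallel refinement within one.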