Memory‑saving technique that recomputes activations during backward pass to fit larger models. ← Gas Token Hard Cap →