Optimization technique that accumulates a decaying running sum of past gradients (a "velocity") and steps along it, smoothing parameter updates; the basis for SGD with momentum and one of the components of Adam.
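As a rough illustration of the definition above, here is a minimal sketch of one common form of the update, assuming a scalar parameter and a toy objective f(x) = x^2. The function names `sgd_momentum_step` and `minimize_quadratic`, and the default values for `lr` and `beta`, are hypothetical choices made for this example rather than part of any library.

```python
def sgd_momentum_step(param, grad, velocity, lr=0.1, beta=0.9):
    """One SGD-with-momentum update for a scalar parameter.

    velocity is a decaying sum of past gradients; beta controls how
    much of the previous velocity is kept (beta=0 gives plain SGD).
    """
    velocity = beta * velocity + grad   # accumulate gradient history
    param = param - lr * velocity       # step along the smoothed direction
    return param, velocity


def minimize_quadratic(steps=300, start=5.0):
    """Minimize the toy objective f(x) = x**2 (an assumption for this sketch)."""
    x, v = start, 0.0
    for _ in range(steps):
        grad = 2.0 * x                  # derivative of x**2
        x, v = sgd_momentum_step(x, grad, v)
    return x


if __name__ == "__main__":
    print(minimize_quadratic())         # a value close to the minimum at 0
```

Some formulations scale the gradient by `(1 - beta)` or fold the learning rate into the velocity; these differ from the sketch only in how `lr` is effectively scaled.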