Compressing model weights and activations to lower numerical precision, for example 8-bit or 4-bit integers, to reduce memory use and speed up inference.
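A minimal sketch of the idea, assuming NumPy and simple symmetric per-tensor int8 quantization (illustrative only, not any particular library's API):

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    # Map float weights to int8 via a single scale factor (symmetric, per-tensor).
    scale = np.max(np.abs(w)) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    # Recover approximate float values from the int8 representation.
    return q.astype(np.float32) * scale

weights = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
# int8 storage is 4x smaller than float32; rounding error per element
# is bounded by about half the scale.
print(np.max(np.abs(weights - restored)))
```

Real systems refine this with per-channel scales, zero points for asymmetric ranges, and calibration or quantization-aware training to limit accuracy loss.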