Teacher Forcing

August 23, 2025

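A training technique for autoregressive sequence models in which, at each step, the model is fed the ground-truth previous tokens as input instead of its own earlier predictions. This speeds convergence but risks exposure bias: at inference time the model must condition on its own, possibly wrong, outputs, a situation it never encountered during training.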
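As a rough illustration, the sketch below contrasts teacher-forced decoding with free-running decoding on a toy example. Everything in it is a hypothetical stand-in: `TOY_MODEL` is a hand-written lookup table with one deliberate mistake, not a trained network, and `decode`, `BOS`, and the example sentence are names invented for this sketch.

```python
from typing import Callable, List

BOS = "<bos>"  # hypothetical start-of-sequence marker

# Hypothetical toy "model": maps the previous token to a predicted next token.
# It contains one deliberate error: after "the" it predicts "dog", not "cat".
TOY_MODEL = {
    BOS: "the",
    "the": "dog",
    "cat": "sat",
    "sat": "down",
    "dog": "barked",
    "barked": "loudly",
}


def predict_next(prev_token: str) -> str:
    """Return the toy model's prediction given the previous token."""
    return TOY_MODEL.get(prev_token, "<unk>")


def decode(
    step_fn: Callable[[str], str],
    target: List[str],
    teacher_forcing: bool,
) -> List[str]:
    """Predict one token per target position.

    With teacher forcing, the next input is the ground-truth token.
    Without it, the next input is the model's own previous prediction.
    """
    prev = BOS
    outputs = []
    for gold in target:
        pred = step_fn(prev)
        outputs.append(pred)
        prev = gold if teacher_forcing else pred
    return outputs


target = ["the", "cat", "sat", "down"]
forced = decode(predict_next, target, teacher_forcing=True)
free = decode(predict_next, target, teacher_forcing=False)

# Count positions where the prediction differs from the ground truth.
forced_errors = sum(p != g for p, g in zip(forced, target))
free_errors = sum(p != g for p, g in zip(free, target))

print(forced, forced_errors)  # ['the', 'dog', 'sat', 'down'] 1
print(free, free_errors)      # ['the', 'dog', 'barked', 'loudly'] 3
```

With teacher forcing, the single mistake stays isolated because the next input is reset to the ground truth. In free-running mode the same mistake feeds into every later step, which is the compounding behavior that exposure bias refers to.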