Logprobs
Per‑token log‑probabilities emitted by models; useful for calibration and safety filters.
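A minimal PyTorch sketch (the logits and token ids are invented for illustration): take a log‑softmax over the vocabulary, then read off the entry for each emitted token.

```python
import torch

# Toy logits for a 4-token sequence over a 10-word vocabulary
# (shapes and values are illustrative).
logits = torch.randn(4, 10)
token_ids = torch.tensor([3, 1, 7, 2])  # the tokens actually emitted

# Log-softmax over the vocabulary, then pick the entry for each token.
logprobs = torch.log_softmax(logits, dim=-1)
per_token = logprobs.gather(-1, token_ids.unsqueeze(-1)).squeeze(-1)
print(per_token)  # one log-probability per token, each <= 0
```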
Linear probe
Diagnostic that fits a linear classifier on frozen embeddings to measure what the learned representations encode.
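A minimal sketch using scikit‑learn; the embeddings and labels here are synthetic stand‑ins for features exported from a frozen pretrained encoder.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Stand-ins for frozen embeddings and labels (shapes are illustrative).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 64))   # 200 examples, 64-dim embeddings
y = (X[:, 0] > 0).astype(int)    # synthetic labels for the sketch

# The probe itself: a linear classifier trained on top of the frozen features.
probe = LogisticRegression(max_iter=1000).fit(X[:150], y[:150])
print("probe accuracy:", probe.score(X[150:], y[150:]))
```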
Label smoothing
Regularization that replaces hard one‑hot labels with softened targets to reduce overconfidence.
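A PyTorch sketch of the softened targets; eps=0.1 and the shapes are illustrative. The final assert checks the manual version against PyTorch's built‑in label_smoothing argument to F.cross_entropy.

```python
import torch
import torch.nn.functional as F

def smooth_labels(targets, num_classes, eps=0.1):
    # Correct class gets probability 1 - eps; the remaining eps is
    # spread uniformly over all classes.
    one_hot = F.one_hot(targets, num_classes).float()
    return one_hot * (1.0 - eps) + eps / num_classes

logits = torch.randn(8, 5)               # batch of 8 over 5 classes
targets = torch.randint(0, 5, (8,))
soft = smooth_labels(targets, 5)
loss = -(soft * torch.log_softmax(logits, dim=-1)).sum(-1).mean()

# Matches PyTorch's built-in smoothing.
assert torch.allclose(loss, F.cross_entropy(logits, targets, label_smoothing=0.1))
```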
Latent space
Compressed representation in which generative models operate to encode semantic structure.
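A toy autoencoder sketch in PyTorch; the layer sizes are arbitrary. The 8‑dimensional bottleneck z is the latent space: inputs are encoded into it and reconstructed from it.

```python
import torch
import torch.nn as nn

# Untrained toy autoencoder; dimensions are illustrative.
encoder = nn.Linear(784, 8)
decoder = nn.Linear(8, 784)

x = torch.randn(16, 784)      # batch of flattened 28x28 inputs
z = encoder(x)                # compressed latent codes
x_hat = decoder(z)            # reconstruction from the latent space
print(z.shape, x_hat.shape)   # torch.Size([16, 8]) torch.Size([16, 784])
```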
Layer normalization
Normalization technique applied across the feature dimension for each token; stabilizes training in transformers.
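A from‑scratch PyTorch sketch without the learnable scale and shift; the assert checks it against torch.nn.functional.layer_norm.

```python
import torch
import torch.nn.functional as F

def layer_norm(x, eps=1e-5):
    # Zero mean and unit variance across the feature dimension,
    # computed independently for each token.
    mean = x.mean(dim=-1, keepdim=True)
    var = x.var(dim=-1, keepdim=True, unbiased=False)
    return (x - mean) / torch.sqrt(var + eps)

x = torch.randn(2, 4, 16)  # (batch, tokens, features)
assert torch.allclose(layer_norm(x), F.layer_norm(x, (16,)), atol=1e-5)
```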
Loss function
Objective minimized during training, e.g., cross‑entropy for classification or language modeling.
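A PyTorch sketch of the cross‑entropy case; the manual form makes explicit that it is the mean negative log‑probability assigned to the correct class.

```python
import torch
import torch.nn.functional as F

logits = torch.randn(8, 100)            # 8 predictions over 100 classes
targets = torch.randint(0, 100, (8,))

# Built-in objective and its manual equivalent.
loss = F.cross_entropy(logits, targets)
loss_manual = -torch.log_softmax(logits, -1).gather(1, targets[:, None]).mean()
assert torch.allclose(loss, loss_manual)
```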
LSTM (long short‑term memory)
Recurrent neural network architecture whose gating mechanism lets it learn long‑range dependencies.
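A minimal PyTorch usage sketch; the sizes are illustrative.

```python
import torch
import torch.nn as nn

# Single-layer LSTM; gating controls what is kept, forgotten, and emitted
# at each step, which helps gradients survive long sequences.
lstm = nn.LSTM(input_size=32, hidden_size=64, batch_first=True)
x = torch.randn(4, 100, 32)   # (batch, sequence, features)
output, (h_n, c_n) = lstm(x)
print(output.shape)           # torch.Size([4, 100, 64]): one hidden state per step
```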