Subword tokenization algorithm used by BERT-like models to balance vocabulary size and coverage by splitting rare words into learned pieces. ← WordPiece
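The splitting step described above can be sketched as a greedy longest-match over a learned vocabulary. This is a minimal illustration, not the full training procedure: the vocabulary below is a toy assumption (real models learn on the order of 30k pieces), and `wordpiece_tokenize` is a hypothetical helper name.

```python
# Minimal sketch of WordPiece-style greedy longest-match tokenization.
# The vocab is a toy assumption; real BERT models learn ~30k pieces.
def wordpiece_tokenize(word, vocab, unk="[UNK]"):
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        piece = None
        # Greedily take the longest vocab entry matching at `start`;
        # non-initial pieces carry the "##" continuation prefix.
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return [unk]  # no piece matches: whole word is unknown
        pieces.append(piece)
        start = end
    return pieces

vocab = {"un", "##aff", "##able", "play", "##ing"}
print(wordpiece_tokenize("unaffable", vocab))  # ['un', '##aff', '##able']
print(wordpiece_tokenize("playing", vocab))    # ['play', '##ing']
```

A rare word like "unaffable" is covered by reusable pieces instead of mapping to a single unknown token, which is how a small vocabulary retains coverage.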