Tokenisation

Breaking text into chunks for LLM processing

Models · Technical
Updated 2 May 2025·Reviewed

Definition

The process of breaking text into chunks ("tokens") that LLMs use as inputs. For example, the phrase "AI is smart!" might become four tokens: "AI", "is", "smart", and the punctuation mark "!". The exact split depends on the tokeniser the model uses.
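A minimal sketch of the idea, splitting on words and punctuation. This is illustrative only: real LLM tokenisers use subword schemes such as byte-pair encoding (BPE), and the function name here is hypothetical.

```python
import re

def simple_tokenize(text):
    # Split into word runs and individual punctuation marks.
    # Real LLM tokenisers (e.g. BPE) split into subword units instead.
    return re.findall(r"\w+|[^\w\s]", text)

print(simple_tokenize("AI is smart!"))  # → ['AI', 'is', 'smart', '!']
```

A production tokeniser would map each token to an integer ID from a fixed vocabulary before the model sees it.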
