Tokenisation

Breaking text into chunks for LLM processing

Updated 2 May 2025

Definition

The process of breaking text down into chunks ("tokens") that LLMs use as inputs. For example, the phrase "AI is smart!" becomes four tokens: "AI", "is", "smart", and the exclamation mark.
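A rough sketch of the idea, splitting on words and punctuation with a regular expression. This is a simplification: production LLM tokenisers typically use subword schemes such as byte-pair encoding (BPE), so real token boundaries often differ from word boundaries.

```python
import re

def simple_tokenize(text):
    # Naive illustration: treat each word and each punctuation
    # character as one token. Real tokenisers (e.g. BPE) split
    # text into subword units learned from data instead.
    return re.findall(r"\w+|[^\w\s]", text)

tokens = simple_tokenize("AI is smart!")
print(tokens)  # → ['AI', 'is', 'smart', '!']
```

A subword tokeniser might further split a rare word like "tokenisation" into pieces such as "token" and "isation", which is why token counts rarely match word counts exactly.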