Generative AI | Blog | Dr James Williams

Demystifying Token Prediction: How Temperature Actually Shapes What an LLM Says Next

A from-first-principles explanation of next-token prediction, covering logits, softmax, and temperature. Everything is demonstrated with a real bigram language model trained and run live in JavaScript, with an adjustable temperature slider and real probability bars recomputed on every step. The post closes with what changes (and what doesn't) once you scale up to an actual transformer.
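As a quick preview of the core idea, here is a minimal sketch of temperature-scaled softmax in JavaScript. It is not the demo's actual code: the function name `softmaxWithTemperature` and the example logits are made up for illustration. It only assumes the standard definition, where each logit is divided by the temperature before the softmax is applied.

```javascript
// Temperature-scaled softmax: an illustrative sketch, not the post's demo code.
// `logits` holds one raw score per candidate token (hypothetical values below).
function softmaxWithTemperature(logits, temperature) {
  if (!(temperature > 0)) {
    throw new RangeError("temperature must be a positive number");
  }
  // Divide every logit by the temperature before normalizing.
  const scaled = logits.map((z) => z / temperature);
  // Subtract the maximum for numerical stability; the probabilities are unchanged.
  const maxScaled = Math.max(...scaled);
  const exps = scaled.map((z) => Math.exp(z - maxScaled));
  const total = exps.reduce((sum, e) => sum + e, 0);
  return exps.map((e) => e / total);
}

// Made-up scores for three candidate tokens.
const exampleLogits = [2.0, 1.0, 0.1];

const sharp = softmaxWithTemperature(exampleLogits, 0.5); // low temperature
const neutral = softmaxWithTemperature(exampleLogits, 1.0); // plain softmax
const flat = softmaxWithTemperature(exampleLogits, 2.0); // high temperature

console.log({ sharp, neutral, flat });
```

In this sketch, lowering the temperature concentrates probability on the highest-scoring token and raising it spreads probability more evenly, while the ranking of the tokens stays the same.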