Demystifying Token Prediction: How Temperature Actually Shapes What an LLM Says Next
A first-principles explanation of next-token prediction: logits, softmax, and temperature, demonstrated with a real bigram language model trained and run live in JavaScript, with an adjustable temperature slider and probability bars recomputed on every step. It also covers what changes (and what doesn't) once you scale up to an actual transformer.