Watch a large language model predict text one token at a time, and see what changes when you adjust the temperature dial. A hands-on look at how LLMs generate text.
Pick a sentence starter, set the temperature, and watch the model complete it token by token. At temperature 0 the model always picks the most likely next token, so the output is effectively deterministic; higher values flatten the probability distribution, so less likely tokens are sampled more often. For most major foundation models (ChatGPT, Claude, Gemini), temperature is adjustable only through the API, not the consumer chat interface, which runs at a fixed default.
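To make the effect of the temperature setting concrete, here is a minimal Python sketch of one common approach: divide the model's raw scores (logits) by the temperature, turn them into probabilities with softmax, and sample from the result. The candidate tokens and their scores are invented for illustration, and the function names `softmax_with_temperature` and `sample_next_token` are hypothetical; a real model scores tens of thousands of tokens at every step, and providers may implement sampling differently.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing by temperature (> 0)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max so exp() cannot overflow
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_next_token(logits, temperature, rng=random):
    """Return the index of the chosen token."""
    if temperature == 0:
        # Assumption: temperature 0 is treated as greedy decoding,
        # i.e. always take the highest-scoring token.
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


# Invented candidates and scores for the starter "The cat sat on the"
tokens = ["mat", "sofa", "roof", "moon"]
logits = [4.0, 3.0, 2.0, 0.5]

for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(f"temperature {t}:", dict(zip(tokens, (round(p, 3) for p in probs))))

print("greedy pick:", tokens[sample_next_token(logits, 0)])
```

Running this shows the probability mass concentrating on the top-scoring token as the temperature drops and spreading toward the unlikely tokens as it rises, which is why low-temperature completions repeat themselves and high-temperature ones wander.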