Every time a language model like GPT-4, Claude or Mistral generates a sentence, it does something deceptively simple: It picks one word at a time. This word-by-word approach is what gives ...