Meta’s Multi-Token Prediction — What Does It Mean for AI Models & Future of Coding?
In a recent report, researchers at Meta, Université Paris-Saclay, and École des Ponts ParisTech suggest improving the accuracy and speed of large language models (LLMs) by "making them predict multiple tokens simultaneously." This departs from the classic structure of auto-regressive language models, which are designed to predict one token at a time.

Next-Token Prediction Model — Limitations & Benefits

A "token" is a small unit of text, such as a word or word fragment, that the model reads and writes. Current large language models such as GPT and Llama are trained with a next-token prediction loss. What does that mean? The model is given a sequence of tokens and must predict the next one.

How Are AI Models Trained?

At generation time, the model appends each predicted token to its input and repeats the process, one token at a time. By training this way on large corpora of text, the model learns general patterns that allow it to generate coherent text.
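The one-token-at-a-time loop described above can be sketched in a few lines of Python. This is a toy illustration, not the paper's method: the function `toy_next_token` and its lookup table stand in for a real trained model, which would instead score every token in its vocabulary and pick (or sample) the most likely one.

```python
# Minimal sketch of autoregressive (next-token) generation.
# `toy_next_token` is a hypothetical stand-in for a real LLM:
# it maps the last token of the context to a fixed next token.

def toy_next_token(context):
    """Predict one token from the current context (toy bigram rule)."""
    table = {
        "the": "model",
        "model": "predicts",
        "predicts": "one",
        "one": "token",
        "token": "<eos>",  # end-of-sequence marker
    }
    return table.get(context[-1], "<eos>")

def generate(prompt, max_new_tokens=10):
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        nxt = toy_next_token(tokens)  # predict exactly one token...
        if nxt == "<eos>":
            break
        tokens.append(nxt)  # ...append it to the input, and repeat
    return tokens

print(generate(["the"]))  # ['the', 'model', 'predicts', 'one', 'token']
```

The key point is the loop: each new token requires a full pass over the model, which is why generation is sequential and slow. Multi-token prediction, as proposed in the report, would let the model emit several tokens per pass instead of one.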