• LLMs are not just next token predictors
    Alex Grzankowski, Stephen M. Downes, and Patrick Forber
    Inquiry: An Interdisciplinary Journal of Philosophy. forthcoming.
    LLMs are statistical models of language that learn through stochastic gradient descent with a next token prediction objective, prompting a popular view among AI modelers: LLMs are just next token predictors. While LLMs are engineered using next token prediction, and trained based on their success at this task, our view is that a reduction to just next token predictors sells LLMs short. Moreover, there are important explanations of LLM behavior and capabilities that are lost when we engage in this k…