At first glance, the treatment of LLMs from the perspective of the Free Energy Principle (FEP) seems straightforward: LLMs do not fully operate under the FEP due to their lack of direct connection with the external environment and the incompleteness of their active inference. On this view, LLMs do not fully capture the consequences of their actions; hence, they will never acquire a human-like world model. While I agree that LLMs do not operate under the FEP and have major limitations resulting from their lack of connection to the external environment, I argue that they capture an important part of our world model by grounding their responses in it. Although this grounding is mediated, and therefore imperfect, I attribute to LLMs a deeper relation with the external world than many have thus far acknowledged. I explore two main directions when evaluating the possibility of LLMs encapsulating a human-like world model. First, I evaluate the similarity of LLM behavior to human behavior, with the underlying hypothesis that, under the FEP, similar behaviors exhibited in similar environments imply similar internal models. Second, I use research on mechanistic interpretability to explore whether human and LLM neural networks are similar. The conclusion is that LLMs might acquire part of our world model, even though this was not intended during their training.