This paper introduces a trajectory level of explanation for inference-time behaviour in large language models. Existing frameworks (autoregressive conditioning, mechanistic circuit analysis, and quasi-cognitive description) treat generation as a sequence of context-conditioned draws or as circuit execution. None provides the vocabulary needed to ask whether exit from a behavioural mode is harder than entry, whether transitions are threshold-mediated or continuous, or whether a model’s path through representational space exhibits the path dependence characteristic of a dynamical system with stable regimes. The paper argues that these questions are not merely unmeasured but unformulable within current frameworks, and that their invisibility is the signature of a genuine explanatory level rather than a gap in current knowledge. Drawing on formal links between transformer attention and attractor dynamics, results on metastability in transformer systems, and empirical evidence of regime-like behaviour in large language models, it proposes the asymmetry coefficient A(M, γ) = R_out(γ) / R_in(γ) as a discriminator between four competing accounts of inference-time behaviour. An experimental protocol is specified, the full result space is mapped, explicit failure conditions are stated, and alignment gating is reframed as trajectory control rather than fixed output policy.
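As a rough illustration of how the asymmetry coefficient could be read, the sketch below computes A = R_out / R_in for a single behavioural mode. It assumes, since the abstract does not say so, that R_in and R_out are positive scalar measures of how hard it is to enter and to leave the mode γ, for example the number of steering turns required. The function names, the tolerance, and the numeric values are hypothetical and are not taken from the paper's protocol.

```python
import math


def asymmetry_coefficient(r_out: float, r_in: float) -> float:
    """Return A = R_out / R_in for one behavioural mode.

    Assumption: r_out and r_in are scalar 'resistance' measurements
    (exit and entry respectively); the paper's actual operationalisation
    is not given in the abstract.
    """
    if r_in <= 0 or r_out < 0:
        raise ValueError("r_in must be positive and r_out non-negative")
    return r_out / r_in


def describe(a: float, tol: float = 0.05) -> str:
    """Label the direction of asymmetry; tol is an illustrative choice."""
    if math.isclose(a, 1.0, rel_tol=tol):
        return "symmetric"
    return "exit harder than entry" if a > 1.0 else "entry harder than exit"


# Hypothetical measurements: 2 turns to enter the mode, 6 turns to leave it.
a = asymmetry_coefficient(r_out=6.0, r_in=2.0)
print(a, describe(a))
```

Under these assumptions, a value well above 1 would correspond to the case the abstract highlights, where leaving a behavioural mode is harder than entering it. How particular values map onto the four competing accounts is specified in the paper itself and is not reproduced here.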