-
The individuation problem for large language models asks which entities associated with them, if any, should be identified as minds. We approach this problem through mechanistic interpretability, engaging in particular with recent empirical work on persona vectors, persona space, and emergent misalignment. We argue that three views are the strongest candidates: the virtual instance view and two new views we introduce, the (virtual) instance-persona view and the model-persona view. First, we arg…
-
Recent work in mechanistic interpretability has studied how large language models recall facts stored in their weights. This paper argues that factual recall points to something broader: a general kind of operation in deep learning models, which I call feature recall. The core observation is that a linear projection can be read as retrieving stored information scaled by input activations. I define feature recall, show it applies across architectures, and contrast it with the established paradigm…
-
New horizons in machine understanding: explanatory and objectual understanding in deep learning video generation models
Synthese 206 (285): 285. 2025.
OpenAI has recently released SORA, a deep learning model that can generate highly realistic videos. Its creators claim that it “understands the physical world in motion.” In this paper, I subject this claim to philosophical scrutiny. After explaining in general how stable diffusion models generate videos, I employ the concepts of explanatory and objectual understanding to determine what kind of understanding of the physical world such deep learning models for video generation might possess. Draw…
-
What makes understanding an important cognitive state? And what does having the concept of understanding do for us? This paper offers a unifying account of understanding by jointly reverse-engineering the function of both the state and the concept. We argue that we care about understanding because it grounds and predicts robust competence: the stable ability to succeed across novel scenarios. Our concept of understanding evolved as an efficient proxy to track this elusive property, allowing us t…
-
An Alternative to Cognitivism: Computational Phenomenology for Deep Learning
Minds and Machines 33 (3): 397-427. 2023.
We propose a non-representationalist framework for deep learning relying on a novel method, computational phenomenology, a dialogue between the first-person perspective (relying on phenomenology) and the mechanisms of computational models. We thereby propose an alternative to the modern cognitivist interpretation of deep learning, according to which artificial neural networks encode representations of external entities. This interpretation mainly relies on neuro-representationalism, a position th…
-
Mechanistic Indicators of Understanding in Large Language Models
Philosophical Studies. 2026.
Large language models (LLMs) are often portrayed as merely imitating linguistic patterns without genuine understanding. We argue that recent findings in mechanistic interpretability (MI), the emerging field probing the inner workings of LLMs, render this picture increasingly untenable, but only once those findings are integrated within a theoretical account of understanding. We propose a tiered framework for thinking about understanding in LLMs and use it to synthesize the most relevant findings …
-
École Polytechnique Fédérale de Lausanne
Doctoral student
Areas of Specialization
| Philosophy of Artificial Intelligence |
| Epistemology |
| Understanding |
Areas of Interest
| Philosophy of Artificial Intelligence |
| Epistemology |
| Understanding |