PhilPeople


University of California, Davis
Department of Philosophy

Associate Professor

University of Bristol

Department of Philosophy

PhD, 2011


Davis, California, United States of America

Areas of Specialization

Philosophy of Mind

Philosophy of Cognitive Science

Areas of Interest

Memory and Cognitive Science

Implicit/Explicit Rules and Representations

Modularity in Cognitive Science

Explanation in Cognitive Science

PhilPapers Editorships

Representation

Mechanistic Interpretability and Representationalism about Belief
André Curtis-Trudel and Preston Lennon

Philosophy of AI. Forthcoming.

Large language models (LLMs) exhibit impressive performance across a range of apparently cognitive tasks. Mentalists hold that this performance is best explained by the fact that LLMs have mental states, while anti-mentalists hold that this performance should be explained in some other way. In this note we address representationalist folk mentalism, which holds (a) that possessing a folk mental state like belief or desire is a matter of having an internal representation with appropriate content and (b) that LLMs have folk psychological states of this sort (or at least robust precursors to such states). Although representationalist folk mentalism appears attractive, we argue that neither probing nor intervention studies have uncovered representations of the relevant sort in state-of-the-art LLMs. However, while it would be premature to accept representationalist folk mentalism, our argument provides a roadmap for mechanistic interpretability research going forward.

Zoe Drayson
