Jacqueline Harding (Stanford University): Publications

Stanford University
Department of Philosophy

Doctoral student

Areas of Specialization

Philosophy of Artificial Intelligence

Representation in Artificial Intelligence

Ethics of Artificial Intelligence

Impact of Artificial Intelligence

Areas of Interest

Ethics of Artificial Intelligence

Philosophy of Artificial Intelligence

Impact of Artificial Intelligence

Representation in Artificial Intelligence

A Communication-First Account of Explanation
with Tobias Gerstenberg and Thomas F. Icard

Noûs. forthcoming.

This paper develops a formal account of causal explanation, grounded in a theory of conversational pragmatics, and inspired by the interventionist idea that explanation is about asking and answering what-if-things-had-been-different questions. We illustrate the fruitfulness of the account, relative to previous accounts, by showing that widely recognised "explanatory virtues" emerge naturally, as do subtle empirical patterns concerning the impact of norms on causal judgments. This shows the value of a "communication-first" approach to explanation: getting clear on explanation’s communicative dimension is an important prerequisite for philosophical work on explanation. The result is a simple but powerful framework for incorporating insights from the cognitive sciences into philosophical work on explanation, which will be useful for philosophers or cognitive scientists interested in explanation.

Causation; Pragmatics; Causal Modeling; Explanation in Cognitive Science; Pragmatic Theories of Explanation; Formal Philosophy

Taking AI Welfare Seriously
with Robert Long, Jeff Sebo, Patrick Butlin, Kathleen Finlinson, Kyle Fish, Jacob Pfau, Toni Sims, Jonathan Birch, and David Chalmers

In this report, we argue that there is a realistic possibility that some AI systems will be conscious and/or robustly agentic in the near future. That means that the prospect of AI welfare and moral patienthood — of AI systems with their own interests and moral significance — is no longer an issue only for sci-fi or the distant future. It is an issue for the near future, and AI companies and other actors have a responsibility to start taking it seriously. We also recommend three early steps that AI companies and other actors can take: They can (1) acknowledge that AI welfare is an important and difficult issue (and ensure that language model outputs do the same), (2) start assessing AI systems for evidence of consciousness and robust agency, and (3) prepare policies and procedures for treating AI systems with an appropriate level of moral concern. To be clear, our argument in this report is not that AI systems definitely are — or will be — conscious, robustly agentic, or otherwise morally significant. Instead, our argument is that there is substantial uncertainty about these possibilities, and so we need to improve our understanding of AI welfare and our ability to make wise decisions about this issue. Otherwise there is a significant risk that we will mishandle decisions about AI welfare, mistakenly harming AI systems that matter morally and/or mistakenly caring for AI systems that do not.

Philosophy, Miscellaneous; Deep Learning; Large Language Models; Machine Learning, Misc; Artificial Consciousness

What is AI safety? What do we want it to be?
with Cameron Domenico Kirk-Giannini

Philosophical Studies 182 (7): 1495-1518. 2025.

The field of AI safety seeks to prevent or reduce the harms caused by AI systems. A simple and appealing account of what is distinctive of AI safety as a field holds that this feature is constitutive: a research project falls within the purview of AI safety just in case it aims to prevent or reduce the harms caused by AI systems. Call this appealingly simple account The Safety Conception of AI safety. Despite its simplicity and appeal, we argue that The Safety Conception is in tension with at least two trends in the ways AI safety researchers and organizations think and talk about AI safety: first, a tendency to characterize the goal of AI safety research in terms of catastrophic risks from future systems; second, the increasingly popular idea that AI safety can be thought of as a branch of safety engineering. Adopting the methodology of conceptual engineering, we argue that these trends are unfortunate: when we consider what concept of AI safety it would be best to have, there are compelling reasons to think that The Safety Conception is the answer. Descriptively, The Safety Conception allows us to see how work on topics that have historically been treated as central to the field of AI safety is continuous with work on topics that have historically been treated as more marginal, like bias, misinformation, and privacy. Normatively, taking The Safety Conception seriously means approaching all efforts to prevent or mitigate harms from AI systems based on their merits rather than drawing arbitrary distinctions between them.

Artificial Intelligence Safety; Explainability in Artificial Intelligence; Interpretability in Artificial Intelligence; Large Language Models; Ethics of Artificial Intelligence, Misc

What is it for a Machine Learning Model to Have a Capability?
with Nathaniel Sharadin

British Journal for the Philosophy of Science. forthcoming.

What can contemporary machine learning (ML) models do? Given the proliferation of ML models in society, answering this question matters to a variety of stakeholders, both public and private. The evaluation of models' capabilities is rapidly emerging as a key subfield of modern ML, buoyed by regulatory attention and government grants. Despite this, the notion of an ML model possessing a capability has not been interrogated: what are we saying when we say that a model is able to do something? And what sorts of evidence bear upon this question? In this paper, we aim to answer these questions, using the capabilities of large language models (LLMs) as a running example. Drawing on the large philosophical literature on abilities, we develop an account of ML models' capabilities which can be usefully applied to the nascent science of model evaluation. Our core proposal is a conditional analysis of model abilities (CAMA): crudely, a machine learning model has a capability to X just when it would reliably succeed at doing X if it 'tried'. The main contribution of the paper is making this proposal precise in the context of ML, resulting in an operationalisation of CAMA applicable to LLMs. We then put CAMA to work, showing that it can help make sense of various features of ML model evaluation practice, as well as suggest procedures for performing fair inter-model comparisons.

Cognitive Sciences; Philosophy of Technology, Misc; Large Language Models; Neural Networks and Connectionism; Impact of Artificial Intelligence; Abilities; Explainability in Artificial Intelligence; Moral Status of Artificial Systems; Interpretability in Artificial Intelligence; Artificial Intelligence in Science

Operationalising Representation in Natural Language Processing
British Journal for the Philosophy of Science. 2023.

Despite its centrality in the philosophy of cognitive science, there has been little prior philosophical work engaging with the notion of representation in contemporary NLP practice. This paper attempts to fill that lacuna: drawing on ideas from cognitive science, I introduce a framework for evaluating the representational claims made about components of neural NLP models, proposing three criteria with which to evaluate whether a component of a model represents a property and operationalising these criteria using probing classifiers, a popular analysis technique in NLP (and deep learning more broadly). The project of operationalising a philosophically-informed notion of representation should be of interest to both philosophers of science and NLP practitioners. It affords philosophers a novel testing-ground for claims about the nature of representation, and helps NLPers organise the large literature on probing experiments, suggesting novel avenues for empirical research.

Explanation in Cognitive Science; Information Theory; Scientific Representation; Natural Language Processing; Impact of Artificial Intelligence; Neural Networks and Connectionism; Representation in Artificial Intelligence; Machine Learning; Philosophy of Artificial Intelligence, Miscellaneous

Proxy Selection in Transitive Proxy Voting
Social Choice and Welfare 58: 69-99. 2022.

Transitive proxy voting (or "liquid democracy") is a novel form of collective decision making, often framed as an attractive hybrid of direct and representative democracy. Although the ideas behind liquid democracy have garnered widespread support, there have been relatively few attempts to model it formally. This paper makes three main contributions. First, it proposes a new social choice-theoretic model of liquid democracy, which is distinguished by taking a richer formal perspective on the process by which a voter chooses a proxy. Second, it examines the model from an axiomatic perspective, proving (a) a proxy vote analogue of May's Theorem and (b) an impossibility result concerning monotonicity properties in a proxy vote setting. Third, it explores the topic of manipulation in transitive proxy votes. Two forms of manipulation specific to the proxy vote setting are defined, and it is shown that manipulation occurs in strictly more cases in proxy votes than in classical votes.

Arrow's Theorem; Game Theory and Political Philosophy

AI language models cannot replace human research participants
with William D’Alessandro, N. G. Laskowski, and Robert Long

AI and Society 39 (5): 2603-2605. 2024.

In a recent letter, Dillion et al. (2023) make various suggestions regarding the idea of artificially intelligent systems, such as large language models, replacing human subjects in empirical moral psychology. We argue that human subjects are in various ways indispensable.

Philosophy of Psychology; Philosophy of Artificial Intelligence; Meta-Ethics; Moral Psychology

Everettian Quantum Mechanics and the Metaphysics of Modality
British Journal for the Philosophy of Science 72 (4): 939-964. 2021.

This article sits at a point of intersection between the philosophy of physics and the metaphysics of modality. There are clear similarities between Everettian quantum mechanics and various modal metaphysical theories, but there have hitherto been few attempts at exploring how the two topics relate. In this article, I build on a series of recent papers by Wilson ([2011], [2012], [2013]), who argues that Everettian quantum mechanics’ connections with traditional modal metaphysics are vital in defending it against objections. I show that Wilson’s preferred version of Everettian quantum mechanics has two problems. First, it is unable to account for the contingency of various intuitively contingent modal claims. Second, it fails to yield intuitive truth values on modal claims about the number of branches in a given Everettian multiverse. Since modal claims about branch number are instrumental in decision-theoretic solutions to Everettian quantum mechanics’ problem(s) with probability, this second problem has wider dialectical implications. I suggest amendments to the underlying metaphysics that overcome these problems. The result is a more robust version of Everettian quantum mechanics.

Probabilities in Quantum Mechanics; Modal Realism
