  •  190
    Rapid progress in artificial intelligence (AI) capabilities has drawn fresh attention to the prospect of consciousness in AI. There is an urgent need for rigorous methods to assess AI systems for consciousness, but significant uncertainty about relevant issues in consciousness science. We present a method for assessing AI systems for consciousness that involves exploring what follows from existing or future neuroscientific theories of consciousness. Indicators derived from such theories can be u…
  •  782
    In this report, we argue that there is a realistic possibility that some AI systems will be conscious and/or robustly agentic in the near future. That means that the prospect of AI welfare and moral patienthood — of AI systems with their own interests and moral significance — is no longer an issue only for sci-fi or the distant future. It is an issue for the near future, and AI companies and other actors have a responsibility to start taking it seriously. We also recommend three early step…
  •  184
    Is there a tension between AI safety and AI welfare?
    with Jeff Sebo and Toni Sims
    Philosophical Studies 182 (7): 2005-2033. 2025.
    The field of AI safety considers whether and how AI development can be safe and beneficial for humans and other animals, and the field of AI welfare considers whether and how AI development can be safe and beneficial for AI systems. There is a prima facie tension between these projects, since some measures in AI safety, if deployed against humans and other animals, would raise questions about the ethics of constraint, deception, surveillance, alteration, suffering, death, disenfranchisement, and…
  •  411
    Introspective Capabilities in Large Language Models
    Journal of Consciousness Studies 30 (9): 143-153. 2023.
    This paper considers the kind of introspection that large language models (LLMs) might be able to have. It argues that LLMs, while currently limited in their introspective capabilities, are not inherently unable to have such capabilities: they already model the world, including mental concepts, and already have some introspection-like capabilities. With deliberate training, LLMs may develop introspective capabilities. The paper proposes a method for such training for introspection, situates poss…
  •  1880
    In a recent letter, Dillion et al. (2023) make various suggestions regarding the idea of artificially intelligent systems, such as large language models, replacing human subjects in empirical moral psychology. We argue that human subjects are in various ways indispensable.
  •  244
    As machine learning informs increasingly consequential decisions, different metrics have been proposed for measuring algorithmic bias or unfairness. Two popular “fairness measures” are calibration and equality of false positive rate. Each measure seems intuitively important, but notably, it is usually impossible to satisfy both measures. For this reason, a large literature in machine learning speaks of a “fairness tradeoff” between these two measures. This framing assumes that both measures are,…
  •  136
    How wishful seeing is not like wishful thinking
    Philosophical Studies 175 (6): 1401-1421. 2017.
    On a traditional view of perceptual justification, perceptual experiences always provide prima facie justification for beliefs based on them. Against this view, Matthew McGrath and Susanna Siegel argue that if an experience is formed in an epistemically pernicious way then it is epistemically downgraded. They argue that "wishful seeing"—when a subject sees something because he wants to see it—is psychologically and normatively analogous to wishful thinking. They conclude that perception can lose…