Julia Haas

DeepMind
  • The Puzzle of Evaluating Moral Cognition in Artificial Agents
    with Madeline G. Reinecke, Yiran Mao, Markus Kunesch, Edgar A. Duéñez-Guzmán, and Joel Z. Leibo
    Cognitive Science 47 (8). 2023.
    In developing artificial intelligence (AI), researchers often benchmark against human performance as a measure of progress. Is this kind of comparison possible for moral cognition? Given that human moral judgment often hinges on intangible properties like “intention” which may have no natural analog in artificial agents, it may prove difficult to design a “like‐for‐like” comparison between the moral behavior of artificial and human agents. What would a measure of moral behavior for both humans a…
  • Recovering Spinoza's theory of akrasia
    In Ursula Goldenbaum & Christopher Kluz (eds.), Doing Without Free Will: Spinoza and Contemporary Moral Problems, Lexington Books. 2015.
  • In this opinionated review, I draw attention to some of the contributions reinforcement learning can make to questions in the philosophy of mind. In particular, I highlight reinforcement learning's foundational emphasis on the role of reward in agent learning, and canvass two ways in which the framework may advance our understanding of perception and motivation.
  • The evaluative mind
    In Mind Design III. forthcoming.
    I propose that the successes and contributions of reinforcement learning urge us to see the mind in a new light, namely, to recognise that the mind is fundamentally evaluative in nature.
  • I argue for the role of reinforcement learning in the philosophy of mind. To start, I make several assumptions about the nature of reinforcement learning and its instantiation in minds like ours. I then review some of the contributions reinforcement learning methods have made across the so-called 'decision sciences.' Finally, I show how principles from reinforcement learning can shape philosophical debates regarding the nature of perception and characterisations of desire.
  • Is Synchronic Self-Control Possible?
    Review of Philosophy and Psychology 12 (2): 397-424. 2020.
    An agent exercises instrumental rationality to the degree that she adopts appropriate means to achieving her ends. Adopting appropriate means to achieving one’s ends can, in turn, involve overcoming one’s strongest desires, that is, it can involve exercising synchronic self-control. However, contra prominent approaches, I deny that synchronic self-control is possible. Specifically, I draw on computational models and empirical evidence from cognitive neuroscience to describe a naturalistic, multi…
  • Can hierarchical predictive coding explain binocular rivalry?
    Philosophical Psychology 34 (3): 424-444. 2021.
    Hohwy et al.’s (2008) model of binocular rivalry (BR) is taken as a classic illustration of predictive coding’s explanatory power. I revisit the account and show that it cannot explain the role of reward in BR. I then consider a more recent version of Bayesian model averaging, which recasts the role of reward in BR in terms of optimism bias. If we accept this account, however, then we must reconsider our conception of perception. On this latter view, I argue, organisms engage in what amounts…
  • The Neuroscience of Moral Judgment: Empirical and Philosophical Developments
    In Felipe De Brigard & Walter Sinnott-Armstrong (eds.), Neuroscience and Philosophy, MIT Press. pp. 17-47. 2022.
    We chart how neuroscience and philosophy have together advanced our understanding of moral judgment with implications for when it goes well or poorly. The field initially focused on brain areas associated with reason versus emotion in the moral evaluations of sacrificial dilemmas. But new threads of research have studied a wider range of moral evaluations and how they relate to models of brain development and learning. By weaving these threads together, we are developing a better understanding o…
  • I describe a suite of reinforcement learning environments in which artificial agents learn to value and respond to moral content and contexts. I illustrate the core principles of the framework by characterizing one such environment, or “gridworld,” in which an agent learns to trade off between monetary profit and fair dealing, as applied in a standard behavioral economic paradigm. I then highlight the core technical and philosophical advantages of the learning approach for modeling moral cogniti…
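    The profit-versus-fairness trade-off this abstract describes can be sketched as a toy one-step learning problem. Everything below is an illustrative assumption (the pie size, the fairness weight, the epsilon-greedy learner), not the paper's actual gridworld or parameters:

    ```python
    import random

    # Hypothetical sketch, not the paper's environment: a one-step "fair
    # dealing" bandit in the spirit of an ultimatum-game proposer. The agent
    # picks how much of a 10-unit pie to offer a partner; reward mixes the
    # money it keeps with a penalty for unequal splits.

    random.seed(0)

    PIE = 10
    FAIRNESS_WEIGHT = 0.6  # assumed trade-off parameter, chosen for illustration

    def reward(offer: int) -> float:
        kept = PIE - offer
        inequity = abs(kept - offer)  # Fehr-Schmidt-style inequity term
        return kept - FAIRNESS_WEIGHT * inequity

    def train(episodes: int = 20_000, eps: float = 0.1, lr: float = 0.1) -> dict:
        """Epsilon-greedy value learning over the 11 possible offers."""
        q = {offer: 0.0 for offer in range(PIE + 1)}
        for _ in range(episodes):
            if random.random() < eps:
                offer = random.choice(list(q))  # explore
            else:
                offer = max(q, key=q.get)       # exploit current estimates
            q[offer] += lr * (reward(offer) - q[offer])
        return q

    q_values = train()
    best_offer = max(q_values, key=q_values.get)
    ```

    With this fairness weight the learned optimum is the equal split (an offer of 5); lowering FAIRNESS_WEIGHT toward 0 shifts the agent back toward keeping the whole pie, which is the kind of profit-versus-fairness behavior the environment is meant to elicit.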
  • This paper presents an empirical solution to the puzzle of weakness of will. Specifically, it presents a theory of action, grounded in contemporary cognitive neuroscientific accounts of decision making, that explains the phenomenon of weakness of will without resulting in a puzzle.