•  6
    A Behavioural and Representational Evaluation of Goal-Directedness in Language Model Agents
    with Arghal Raghu, Fade Chen, Niall Dalton, Evgenii Kortukov, Angelos Nalmpantis, Moksh Nirvaan, Gabriele Sarti, and Mario Giulianelli
    Proceedings of the 43rd International Conference on Machine Learning (ICML). forthcoming.
    Understanding an agent's goals helps explain and predict its behaviour, yet there is no established methodology for reliably attributing goals to agentic systems. We propose a framework for evaluating goal-directedness that integrates behavioural evaluation with interpretability-based analyses of models' internal representations. As a case study, we examine an LLM agent navigating a 2D grid world toward a goal state. Behaviourally, we evaluate the agent against an optimal policy across varying g…
  •  52
    Choice and Credence in Context
    Dissertation, University of Michigan, Ann Arbor. 2024.
    This dissertation is about the role that conditionals play in uncertain reasoning and deliberation. Specifically, I attempt to show that, by appealing to a particular semantics for conditionals---a contextualist, sequence semantics, which has recently become popular in philosophy of language---several open problems in decision theory and epistemology can be solved. Chapter 1 is introductory. I set out the semantic view of conditionals in question, and I describe some of its historical backgroun…
  •  1915
    Causal decision theory, context, and determinism
    Philosophy and Phenomenological Research 109 (1): 226-260. 2024.
    The classic formulation of causal decision theory (CDT) appeals to counterfactuals. It says that you should aim to choose an option that would have a good outcome, were you to choose it. However, this version of CDT faces trouble if the laws of nature are deterministic. After all, the standard theory of counterfactuals says that, if the laws are deterministic, then if anything (including the choice you make) were different in the present, either the laws would be violated or the distant past would…
  •  135
    The punctuated equilibrium of scientific change: a Bayesian network model
    with Patrick Grim, Frank Seidl, Isabell N. Astor, and Caroline Diaso
    Synthese 200 (4): 1-25. 2022.
    Our scientific theories, like our cognitive structures in general, consist of propositions linked by evidential, explanatory, probabilistic, and logical connections. Those theoretical webs ‘impinge on the world at their edges,’ subject to a continuing barrage of incoming evidence. Our credences in the various elements of those structures change in response to that continuing barrage of evidence, as do the perceived connections between them. Here we model scientific theories as Bayesian nets, wit…
  •  1675
    Scientific Theories as Bayesian Nets: Structure and Evidence Sensitivity
    with Patrick Grim, Frank Seidl, Hinton E. Rago, Isabell N. Astor, Caroline Diaso, and Peter Ryner
    Philosophy of Science 89 (1): 42-69. 2022.
    We model scientific theories as Bayesian networks. Nodes carry credences and function as abstract representations of propositions within the structure. Directed links carry conditional probabilities and represent connections between those propositions. Updating is Bayesian across the network as a whole. The impact of evidence at one point within a scientific theory can have a very different impact on the network than does evidence of the same strength at a different point. A Bayesian model allow…