•  68
    AI Deception: A Survey of Examples, Risks, and Potential Solutions
    with Peter Park, Aidan O'Gara, Michael Chen, and Dan Hendrycks
    This paper argues that a range of current AI systems have learned how to deceive humans. We define deception as the systematic inducement of false beliefs in the pursuit of some outcome other than the truth. We first survey empirical examples of AI deception, discussing both special-use AI systems (including Meta's CICERO) built for specific competitive situations, and general-purpose AI systems (such as large language models). Next, we detail several risks from AI deception, such as fraud, elec…
  •  6
    Losing confidence in luminosity
    Noûs 55 (4): 962-991. 2021.
    A mental state is luminous if, whenever an agent is in that state, they are in a position to know that they are. Following Timothy Williamson's Knowledge and Its Limits, a wave of recent work has explored whether there are any non‐trivial luminous mental states. A version of Williamson's anti‐luminosity appeals to a safety‐theoretic principle connecting knowledge and confidence: if an agent knows p, then p is true in any nearby scenario where she has a similar level of confidence in p. However, …
  •  678
    Recent advances in natural language processing have given rise to a new kind of AI architecture: the language agent. By repeatedly calling an LLM to perform a variety of cognitive tasks, language agents are able to function autonomously to pursue goals specified in natural language and stored in a human-readable format. Because of their architecture, language agents exhibit behavior that is predictable according to the laws of folk psychology: they function as though they have desires and belief…
  •  1259
    Under what conditions would an artificially intelligent system have wellbeing? Despite its obvious bearing on the ethics of human interactions with artificial systems, this question has received little attention. Because all major theories of wellbeing hold that an individual’s welfare level is partially determined by their mental life, we begin by considering whether artificial systems have mental states. We show that a wide range of theories of mental states, when combined with leading theorie…
  •  830
    Getting Accurate about Knowledge
    with Sam Carter
    Mind 132 (525): 158-191. 2022.
    There is a large literature exploring how accuracy constrains rational degrees of belief. This paper turns to the unexplored question of how accuracy constrains knowledge. We begin by introducing a simple hypothesis: increases in the accuracy of an agent’s evidence never lead to decreases in what the agent knows. We explore various precise formulations of this principle, consider arguments in its favour, and explain how it interacts with different conceptions of evidence and accuracy. As we show…
  •  288
    Omega Knowledge Matters
    Oxford Studies in Epistemology. forthcoming.
    You omega know something when you know it, and know that you know it, and know that you know that you know it, and so on. This paper first argues that omega knowledge matters, in the sense that it is required for rational assertion, action, inquiry, and belief. The paper argues that existing accounts of omega knowledge face major challenges. One account is skeptical, claiming that we have no omega knowledge of any ordinary claims about the world. Another account embraces the KK thesis, and iden…
  •  664
    Iterated Knowledge
    Oxford University Press. 2024.
    You omega know p when you possess every iteration of knowledge of p. This book argues that omega knowledge plays a central role in philosophy. In particular, the book argues that omega knowledge is necessary for permissible assertion, action, inquiry, and belief. Although omega knowledge plays this important role, existing theories of omega knowledge are unsatisfying. One theory, KK, identifies knowledge with omega knowledge. This theory struggles to accommodate cases of inexact knowledge. The o…
  •  585
    Safety, Closure, and Extended Methods
    Journal of Philosophy 121 (1): 26-54. 2024.
    Recent research has identified a tension between the Safety principle that knowledge is belief without risk of error, and the Closure principle that knowledge is preserved by competent deduction. Timothy Williamson reconciles Safety and Closure by proposing that when an agent deduces a conclusion from some premises, the agent’s method for believing the conclusion includes their method for believing each premise. We argue that this theory is untenable because it implies problematically easy epist…
  •  441
    Attitude verbs’ local context
    with Kyle Blumberg
    Linguistics and Philosophy 46 (3): 483-507. 2022.
    Schlenker (Semant Pragmat 2(3):1–78, 2009; Philos Stud 151(1):115–142, 2010a; Mind 119(474):377–391, 2010b) provides an algorithm for deriving the presupposition projection properties of an expression from that expression’s classical semantics. In this paper, we consider the predictions of Schlenker’s algorithm as applied to attitude verbs. More specifically, we compare Schlenker’s theory with a prominent view which maintains that attitudes exhibit belief projection, so that presupposition trigg…
  •  784
    Question-Sensitive Theory of Intention
    with Bob Beddor
    Philosophical Quarterly 73 (2): 346-378. 2022.
    This paper develops a question-sensitive theory of intention. We show that this theory explains some puzzling closure properties of intention. In particular, it can be used to explain why one is rationally required to intend the means to one’s ends, even though one is not rationally required to intend all the foreseen consequences of one’s intended actions. It also explains why rational intention is not always closed under logical implication, and why one can only intend outcomes that one believ…
  •  480
    Contextology
    Philosophical Studies 179 (11): 3187-3209. 2022.
    Contextology is the science of the dynamics of the conversational context. Contextology formulates laws governing how the shared information states of interlocutors evolve in response to assertion. More precisely, the contextologist attempts to construct a function which, when provided with just a conversation’s pre-update context and the content of an assertion, delivers that conversation’s post-update context. Most contextologists have assumed that the function governing the evolution of the c…
  •  443
    Sly Pete in Dynamic Semantics
    Journal of Philosophical Logic 51 (5): 1103-1117. 2022.
    In ‘Sly Pete’ or ‘standoff’ cases, reasonable speakers accept incompatible conditionals, and communicate them successfully to a trusting hearer. This paper uses the framework of dynamic semantics to offer a new model of the conversational dynamics at play in standoffs, and to articulate several puzzles posed by such cases. The paper resolves these puzzles by embracing a dynamic semantics for conditionals, according to which indicative conditionals require that their antecedents are possible in t…
  •  613
    Knowledge from multiple experiences
    Philosophical Studies 179 (4): 1341-1372. 2021.
    This paper models knowledge in cases where an agent has multiple experiences over time. Using this model, we introduce a series of observations that undermine the pretheoretic idea that the evidential significance of experience depends on the extent to which that experience matches the world. On the basis of these observations, we model knowledge in terms of what is likely given the agent’s experience. An agent knows p when p is implied by her epistemic possibilities. A world is epistemically po…
  •  914
    Fragile Knowledge
    Mind 131 (522): 487-515. 2022.
    This paper explores the principle that knowledge is fragile, in that whenever S knows that S doesn’t know that S knows that p, S thereby fails to know p. Fragility is motivated by the infelicity of dubious assertions, utterances which assert p while acknowledging higher-order ignorance whether p. Fragility is interestingly weaker than KK, the principle that if S knows p, then S knows that S knows p. Existing theories of knowledge which deny KK by accepting a Margin for Error principle can be con…
  •  873
    Probability for Epistemic Modalities
    Philosophers' Imprint 21 (33). 2021.
    This paper develops an information-sensitive theory of the semantics and probability of conditionals and statements involving epistemic modals. The theory validates a number of principles linking probability and modality, including the principle that the probability of a conditional If A, then C equals the probability of C, updated with A. The theory avoids so-called triviality results, which are standardly taken to show that principles of this sort cannot be validated. To achieve this, we deny …
  •  774
    Counterfactual Contamination
    Australasian Journal of Philosophy 100 (2): 262-278. 2022.
    Many defend the thesis that when someone knows p, they couldn’t easily have been wrong about p. But the notion of easy possibility in play is relatively undertheorized. One structural idea in the literature, the principle of Counterfactual Closure (CC), connects easy possibility with counterfactuals: if it easily could have happened that p, and if p were the case, then q would be the case, it follows that it easily could have happened that q. We first argue that while CC is false, there is a tru…
  •  707
    The normality of error
    with Sam Carter
    Philosophical Studies 178 (8): 2509-2533. 2021.
    Formal models of appearance and reality have proved fruitful for investigating structural properties of perceptual knowledge. This paper applies the same approach to epistemic justification. Our central goal is to give a simple account of The Preface, in which justified belief fails to agglomerate. Following recent work by a number of authors, we understand knowledge in terms of normality. An agent knows p iff p is true throughout all relevant normal worlds. To model The Preface, we appeal to th…
  •  752
    Mighty Knowledge
    with Bob Beddor
    Journal of Philosophy 118 (5): 229-269. 2021.
    We often claim to know what might be—or probably is—the case. Modal knowledge along these lines creates a puzzle for information-sensitive semantics for epistemic modals. This paper develops a solution. We start with the idea that knowledge requires safe belief: a belief amounts to knowledge only if it could not easily have been held falsely. We then develop an interpretation of the modal operator in safety that allows it to non-trivially embed information-sensitive contents. The resulting theor…
  •  796
    Losing Confidence in Luminosity
    Noûs (4): 1-30. 2020.
    A mental state is luminous if, whenever an agent is in that state, they are in a position to know that they are. Following Timothy Williamson’s Knowledge and Its Limits, a wave of recent work has explored whether there are any non-trivial luminous mental states. A version of Williamson’s anti-luminosity appeals to a safety-theoretic principle connecting knowledge and confidence: if an agent knows p, then p is true in any nearby scenario where she has a similar level of confidence in p. However,…
  •  846
    Epistemic Modal Credence
    Philosophers' Imprint 21 (26). 2021.
    Triviality results threaten plausible principles governing our credence in epistemic modal claims. This paper develops a new account of modal credence which avoids triviality. On the resulting theory, probabilities are assigned not to sets of worlds, but rather to sets of information state-world pairs. The theory avoids triviality by giving up the principle that rational credence is closed under conditionalization. A rational agent can become irrational by conditionalizing on new evidence. In pl…
  •  322
    Free choice and homogeneity
    Semantics and Pragmatics 12: 1-48. 2019.
    This paper develops a semantic solution to the puzzle of Free Choice permission. The paper begins with a battery of impossibility results showing that Free Choice is in tension with a variety of classical principles, including Disjunction Introduction and the Law of Excluded Middle. Most interestingly, Free Choice appears incompatible with a principle concerning the behavior of Free Choice under negation, Double Prohibition, which says that Mary can’t have soup or salad implies Mary can’t have s…
  •  71
    The counterfactual direct argument
    Linguistics and Philosophy 43 (2): 193-232. 2020.
    Many have accepted that ordinary counterfactuals and might counterfactuals are duals. In this paper, I show that this thesis leads to paradoxical results when combined with a few different unorthodox yet increasingly popular theses, including the thesis that counterfactuals are strict conditionals. Given Duality and several other theses, we can quickly infer the validity of another paradoxical principle, ‘The Counterfactual Direct Argument’, which says that ‘A > (B ∨ C)’ entails ‘A > (¬B > C)’. First, I provide …
  •  53
    Free Choice Impossibility Results
    Journal of Philosophical Logic 49 (2): 249-282. 2020.
    Free Choice is the principle that possibly p or q implies and is implied by possibly p and possibly q. A variety of recent attempts to validate Free Choice rely on a nonclassical semantics for disjunction, where the meaning of p or q is not a set of possible worlds. This paper begins with a battery of impossibility results, showing that some kind of nonclassical semantics for disjunction is required in order to validate Free Choice. The paper then provides a positive account of Free Choice, by i…
  •  449
    A Theory of Conditional Assertion
    Journal of Philosophy 116 (6): 293-318. 2019.
    According to one tradition, uttering an indicative conditional involves performing a special sort of speech act: a conditional assertion. We introduce a formal framework that models this speech act. Using this framework, we show that any theory of conditional assertion validates several inferences in the logic of conditionals, including the False Antecedent inference. Next, we determine the space of truth-conditional semantics for conditionals consistent with conditional assertion. The truth val…
  •  116
    Generalized Update Semantics
    Mind 128 (511): 795-835. 2019.
    This paper explores the relationship between dynamic and truth conditional semantics for epistemic modals. It provides a generalization of a standard dynamic update semantics for modals. This new semantics derives a Kripke semantics for modals and a standard dynamic semantics for modals as special cases. The semantics allows for new characterizations of a variety of principles in modal logic, including the inconsistency of ‘p and might not p’. Finally, the semantics provides a construction proce…
  •  1080
    Believing epistemic contradictions
    with Bob Beddor
    Review of Symbolic Logic (1): 87-114. 2018.
    What is it to believe something might be the case? We develop a puzzle that creates difficulties for standard answers to this question. We go on to propose our own solution, which integrates a Bayesian approach to belief with a dynamic semantics for epistemic modals. After showing how our account solves the puzzle, we explore a surprising consequence: virtually all of our beliefs about what might be the case provide counterexamples to the view that rational belief is closed under logical implica…
  •  113
    A Preface Paradox for Intention
    Philosophers' Imprint 16. 2016.
    In this paper I argue that there is a preface paradox for intention. The preface paradox for intention shows that intentions do not obey an agglomeration norm, requiring one to intend conjunctions of whatever else one intends. But what norms do intentions obey? I will argue that intentions come in degrees. These partial intentions are governed by the norms of the probability calculus. First, I will give a dispositional theory of partial intention, on which degrees of intention are the degrees to…
  •  485
    Triviality Results For Probabilistic Modals
    Philosophy and Phenomenological Research 99 (1): 188-222. 2017.
    In recent years, a number of theorists have claimed that beliefs about probability are transparent. To believe probably p is simply to have a high credence that p. In this paper, I prove a variety of triviality results for theses like the above. I show that such claims are inconsistent with the thesis that probabilistic modal sentences have propositions or sets of worlds as their meaning. Then I consider the extent to which a dynamic semantics for probabilistic modals can capture theses connecti…
  •  135
    A Stronger Doctrine of Double Effect
    Australasian Journal of Philosophy 96 (4): 793-805. 2018.
    Many believe that intended harms are more difficult to justify than are harms that result as a foreseen side effect of one's conduct. We describe cases of harming in which the harm is not intended, yet the harmful act nevertheless runs afoul of the intuitive moral constraint that governs intended harms. We note that these cases provide new and improved counterexamples to the so-called Simple View, according to which intentionally phi-ing requires intending to phi. We then give a new theory of th…