-
AI Survival Stories: a Taxonomic Analysis of AI Existential Risk
Philosophy of AI. forthcoming.
Since the release of ChatGPT, there has been a lot of debate about whether AI systems pose an existential risk to humanity. This paper develops a general framework for thinking about the existential risk of AI systems. We analyze a two-premise argument that AI systems pose a threat to humanity. Premise one: AI systems will become extremely powerful. Premise two: if AI systems become extremely powerful, they will destroy humanity. We use these two premises to construct a taxonomy of ‘survival sto…
-
This paper offers the first careful analysis of the possibility that AI and humanity will go to war. The paper focuses on the case of artificial general intelligence, AI with broadly human capabilities. The paper uses a bargaining model of war to apply standard causes of war to the special case of AI/human conflict. The paper argues that information failures and commitment problems are especially likely in AI/human conflict. Information failures would be driven by the difficulty of measuring AI …
-
LLMs have dramatically improved in capabilities in recent years. This raises the question of whether LLMs could become genuine agents with beliefs and desires. This paper demonstrates an in-principle limit to LLM agency, based on their architecture. LLMs are next-word predictors: given a string of text, they calculate the probability that various words can come next. LLMs produce outputs that reflect these probabilities. I show that next-word predictors are exploitable. If LLMs are prompted to m…
-
AI companies are racing to create artificial general intelligence, or “AGI.” If they succeed, the result will be human-level AI systems that can independently pursue high-level goals by formulating and executing long-term plans in the real world. Leading AI researchers agree that some of these systems will likely be “misaligned,” pursuing goals that humans do not desire. This goal mismatch will put misaligned AIs and humans into strategic competition with one another. As with present-day strategi…
-
It is generally assumed that existing artificial systems are not phenomenally conscious, and that the construction of phenomenally conscious artificial systems would require significant technological progress if it is possible at all. We challenge this assumption by arguing that if Global Workspace Theory (GWT), a leading scientific theory of phenomenal consciousness, is correct, then instances of one widely implemented AI architecture, the artificial language agent, might easily be made pheno…
-
This paper examines the question of whether Large Language Models (LLMs) like ChatGPT possess minds, focusing specifically on whether they have a genuine folk psychology encompassing beliefs, desires, and intentions. We approach this question by investigating two key aspects: internal representations and dispositions to act. First, we survey various philosophical theories of representation, including informational, causal, structural, and teleosemantic accounts, arguing that LLMs satisfy key con…
-
Shutdown-seeking AI
Philosophical Studies 1-13. forthcoming.
We propose developing AIs whose only final goal is being shut down. We argue that this approach to AI safety has three benefits: (i) it could potentially be implemented in reinforcement learning, (ii) it avoids some dangerous instrumental convergence dynamics, and (iii) it creates trip wires for monitoring dangerous capabilities. We also argue that the proposal can overcome a key challenge raised by Soares et al. (2015), that shutdown-seeking AIs will manipulate humans into shutting them down. W…
-
This paper argues that a range of current AI systems have learned how to deceive humans. We define deception as the systematic inducement of false beliefs in the pursuit of some outcome other than the truth. We first survey empirical examples of AI deception, discussing both special-use AI systems (including Meta's CICERO) built for specific competitive situations, and general-purpose AI systems (such as large language models). Next, we detail several risks from AI deception, such as fraud, elec…
-
Losing confidence in luminosity
Noûs 55 (4): 962-991. 2021.
A mental state is luminous if, whenever an agent is in that state, they are in a position to know that they are. Following Timothy Williamson's Knowledge and Its Limits, a wave of recent work has explored whether there are any non-trivial luminous mental states. A version of Williamson's anti-luminosity appeals to a safety-theoretic principle connecting knowledge and confidence: if an agent knows p, then p is true in any nearby scenario where she has a similar level of confidence in p. However, …
-
Language Agents Reduce the Risk of Existential Catastrophe
AI and Society 1-11. 2023.
Recent advances in natural language processing have given rise to a new kind of AI architecture: the language agent. By repeatedly calling an LLM to perform a variety of cognitive tasks, language agents are able to function autonomously to pursue goals specified in natural language and stored in a human-readable format. Because of their architecture, language agents exhibit behavior that is predictable according to the laws of folk psychology: they function as though they have desires and belief…
-
AI Wellbeing
Asian Journal of Philosophy. forthcoming.
Under what conditions would an artificially intelligent system have wellbeing? Despite its clear bearing on the ethics of human interactions with artificial systems, this question has received little direct attention. Because all major theories of wellbeing hold that an individual’s welfare level is partially determined by their mental life, we begin by considering whether artificial systems have mental states. We show that a wide range of theories of mental states, when combined with leading th…
-
Getting Accurate about Knowledge
Mind 132 (525): 158-191. 2022.
There is a large literature exploring how accuracy constrains rational degrees of belief. This paper turns to the unexplored question of how accuracy constrains knowledge. We begin by introducing a simple hypothesis: increases in the accuracy of an agent’s evidence never lead to decreases in what the agent knows. We explore various precise formulations of this principle, consider arguments in its favour, and explain how it interacts with different conceptions of evidence and accuracy. As we show…
-
Omega Knowledge Matters
Oxford Studies in Epistemology. forthcoming.
You omega know something when you know it, and know that you know it, and know that you know that you know it, and so on. This paper first argues that omega knowledge matters, in the sense that it is required for rational assertion, action, inquiry, and belief. The paper argues that existing accounts of omega knowledge face major challenges. One account is skeptical, claiming that we have no omega knowledge of any ordinary claims about the world. Another account embraces the KK thesis, and iden…
-
Iterated Knowledge
Oxford University Press. 2024.
You omega know p when you possess every iteration of knowledge of p. This book argues that omega knowledge plays a central role in philosophy. In particular, the book argues that omega knowledge is necessary for permissible assertion, action, inquiry, and belief. Although omega knowledge plays this important role, existing theories of omega knowledge are unsatisfying. One theory, KK, identifies knowledge with omega knowledge. This theory struggles to accommodate cases of inexact knowledge. The o…
-
Safety, Closure, and Extended Methods
Journal of Philosophy 121 (1): 26-54. 2024.
Recent research has identified a tension between the Safety principle that knowledge is belief without risk of error, and the Closure principle that knowledge is preserved by competent deduction. Timothy Williamson reconciles Safety and Closure by proposing that when an agent deduces a conclusion from some premises, the agent’s method for believing the conclusion includes their method for believing each premise. We argue that this theory is untenable because it implies problematically easy epist…
-
Attitude verbs’ local context
Linguistics and Philosophy 46 (3): 483-507. 2022.
Schlenker (Semant Pragmat 2(3):1–78, 2009; Philos Stud 151(1):115–142, 2010a; Mind 119(474):377–391, 2010b) provides an algorithm for deriving the presupposition projection properties of an expression from that expression’s classical semantics. In this paper, we consider the predictions of Schlenker’s algorithm as applied to attitude verbs. More specifically, we compare Schlenker’s theory with a prominent view which maintains that attitudes exhibit belief projection, so that presupposition trigg…
-
A Question-Sensitive Theory of Intention
Philosophical Quarterly 73 (2): 346-378. 2022.
This paper develops a question-sensitive theory of intention. We show that this theory explains some puzzling closure properties of intention. In particular, it can be used to explain why one is rationally required to intend the means to one’s ends, even though one is not rationally required to intend all the foreseen consequences of one’s intended actions. It also explains why rational intention is not always closed under logical implication, and why one can only intend outcomes that one believ…
-
Contextology
Philosophical Studies 179 (11): 3187-3209. 2022.
Contextology is the science of the dynamics of the conversational context. Contextology formulates laws governing how the shared information states of interlocutors evolve in response to assertion. More precisely, the contextologist attempts to construct a function which, when provided with just a conversation’s pre-update context and the content of an assertion, delivers that conversation’s post-update context. Most contextologists have assumed that the function governing the evolution of the c…
-
Sly Pete in Dynamic Semantics
Journal of Philosophical Logic 51 (5): 1103-1117. 2022.
In ‘Sly Pete’ or ‘standoff’ cases, reasonable speakers accept incompatible conditionals, and communicate them successfully to a trusting hearer. This paper uses the framework of dynamic semantics to offer a new model of the conversational dynamics at play in standoffs, and to articulate several puzzles posed by such cases. The paper resolves these puzzles by embracing a dynamic semantics for conditionals, according to which indicative conditionals require that their antecedents are possible in t…
-
Knowledge from multiple experiences
Philosophical Studies 179 (4): 1341-1372. 2021.
This paper models knowledge in cases where an agent has multiple experiences over time. Using this model, we introduce a series of observations that undermine the pretheoretic idea that the evidential significance of experience depends on the extent to which that experience matches the world. On the basis of these observations, we model knowledge in terms of what is likely given the agent’s experience. An agent knows p when p is implied by her epistemic possibilities. A world is epistemically po…
-
Fragile Knowledge
Mind 131 (522): 487-515. 2022.
This paper explores the principle that knowledge is fragile, in that whenever S knows that S doesn’t know that S knows that p, S thereby fails to know p. Fragility is motivated by the infelicity of dubious assertions, utterances which assert p while acknowledging higher-order ignorance whether p. Fragility is interestingly weaker than KK, the principle that if S knows p, then S knows that S knows p. Existing theories of knowledge which deny KK by accepting a Margin for Error principle can be con…
-
Probability for Epistemic Modalities
Philosophers' Imprint 21 (33). 2021.
This paper develops an information-sensitive theory of the semantics and probability of conditionals and statements involving epistemic modals. The theory validates a number of principles linking probability and modality, including the principle that the probability of a conditional If A, then C equals the probability of C, updated with A. The theory avoids so-called triviality results, which are standardly taken to show that principles of this sort cannot be validated. To achieve this, we deny …
-
Counterfactual Contamination
Australasian Journal of Philosophy 100 (2): 262-278. 2022.
Many defend the thesis that when someone knows p, they couldn’t easily have been wrong about p. But the notion of easy possibility in play is relatively undertheorized. One structural idea in the literature, the principle of Counterfactual Closure (CC), connects easy possibility with counterfactuals: if it easily could have happened that p, and if p were the case, then q would be the case, it follows that it easily could have happened that q. We first argue that while CC is false, there is a tru…
-
The normality of error
Philosophical Studies 178 (8): 2509-2533. 2021.
Formal models of appearance and reality have proved fruitful for investigating structural properties of perceptual knowledge. This paper applies the same approach to epistemic justification. Our central goal is to give a simple account of The Preface, in which justified belief fails to agglomerate. Following recent work by a number of authors, we understand knowledge in terms of normality. An agent knows p iff p is true throughout all relevant normal worlds. To model The Preface, we appeal to th…
-
Mighty Knowledge
Journal of Philosophy 118 (5): 229-269. 2021.
We often claim to know what might be, or probably is, the case. Modal knowledge along these lines creates a puzzle for information-sensitive semantics for epistemic modals. This paper develops a solution. We start with the idea that knowledge requires safe belief: a belief amounts to knowledge only if it could not easily have been held falsely. We then develop an interpretation of the modal operator in safety that allows it to non-trivially embed information-sensitive contents. The resulting theor…
-
Losing Confidence in Luminosity
Noûs (4): 1-30. 2020.
A mental state is luminous if, whenever an agent is in that state, they are in a position to know that they are. Following Timothy Williamson’s Knowledge and Its Limits, a wave of recent work has explored whether there are any non-trivial luminous mental states. A version of Williamson’s anti-luminosity appeals to a safety-theoretic principle connecting knowledge and confidence: if an agent knows p, then p is true in any nearby scenario where she has a similar level of confidence in p. However, …
-
Epistemic Modal Credence
Philosophers' Imprint 21 (26). 2021.
Triviality results threaten plausible principles governing our credence in epistemic modal claims. This paper develops a new account of modal credence which avoids triviality. On the resulting theory, probabilities are assigned not to sets of worlds, but rather to sets of information state-world pairs. The theory avoids triviality by giving up the principle that rational credence is closed under conditionalization. A rational agent can become irrational by conditionalizing on new evidence. In pl…
-
Free choice and homogeneity
Semantics and Pragmatics 12: 1-48. 2019.
This paper develops a semantic solution to the puzzle of Free Choice permission. The paper begins with a battery of impossibility results showing that Free Choice is in tension with a variety of classical principles, including Disjunction Introduction and the Law of Excluded Middle. Most interestingly, Free Choice appears incompatible with a principle concerning the behavior of Free Choice under negation, Double Prohibition, which says that Mary can’t have soup or salad implies Mary can’t have s…
-
The counterfactual direct argument
Linguistics and Philosophy 43 (2): 193-232. 2020.
Many have accepted that ordinary counterfactuals and might counterfactuals are duals. In this paper, I show that this thesis leads to paradoxical results when combined with a few different unorthodox yet increasingly popular theses, including the thesis that counterfactuals are strict conditionals. Given Duality and several other theses, we can quickly infer the validity of another paradoxical principle, ‘The Counterfactual Direct Argument’, which says that ‘A> ’ entails ‘A> ’. First, I provide …
Pokfulam, Hong Kong
Areas of Specialization
Philosophy of Action
Philosophy of Language
Epistemology
Metaphysics
Philosophy of Mind
M&E, Misc
Areas of Interest
Philosophy of Action
Philosophy of Language
Formal Epistemology
Knowledge
Epistemology
Metaphysics
Philosophy of Mind
M&E, Misc