Ben Levinstein (Anthropic): Publications

Anthropic
Other

University of Illinois, Urbana-Champaign
Department of Philosophy
Associate Professor

University of Oxford
Future of Humanity Institute
Post-doctoral Fellow

Rutgers - New Brunswick
Department of Philosophy
PhD, 2013

Urbana and Champaign, Illinois, United States of America

0000-0002-7497-1091

Areas of Specialization

Metaphysics and Epistemology

The Nature of Artificial Intelligence

Value Theory

Science, Logic, and Mathematics

Areas of Interest

Metaphysics and Epistemology

Value Theory

Science, Logic, and Mathematics

Radical AI Interpretability
with Daniel Herrmann

We develop a framework for interpreting AI systems as agents, drawing on the philosophical tradition of radical interpretation and the tools of mechanistic interpretability. The core question is: given the computational facts about a system, how do we solve for its beliefs, desires, and meanings? This matters increasingly for safety. We want to be able to trust the systems we deploy, whether by understanding their goals or, more modestly, by reliably detecting deception. Interpretability researchers are building tools to read beliefs and desires off a model's internals, but there is no settled account of when such a tool has succeeded. This book supplies one. We propose criteria on both representationalist and interpretationist approaches, and tie each to tests current interpretability methods can carry out. A central lesson is that these attributions cannot be made piecemeal. Beliefs, desires, and the propositional structure they presuppose are jointly constrained, and a method that fixes one while measuring the others inherits whatever distortions that introduces. This holism becomes pressing for AI systems, which may not share the interpreter's concepts. However, it also provides leverage: a system's attitudes constrain its propositional structure, that structure constrains which attitudes can be attributed, and mechanistic interpretability can help us measure both.

Decision Theory Philosophy of AI, General Works Philosophy of Mind, Miscellaneous Philosophy of AI, Misc Artificial Intelligence Safety General Philosophy of Science Radical Interpretation

Tickles, iteration, and habits
with Kenny Easwaran and Ted Shear

Theory and Decision 100 (2): 531-559. 2026.

At first pass, Evidential Decision Theory (EDT) recommends one-boxing in Newcomb’s Problem and Causal Decision Theory (CDT) recommends two-boxing. However, it has been acknowledged that concrete instances of the problem have messy features complicating their analyses. Recently, a third competitor, Functional Decision Theory (FDT) has emerged recommending one-boxing in some versions and two-boxing in others. This paper explores the verdicts of these competing theories in a few variations of the problem involving iteration and habit. We argue that this motivates the re-evaluation of case intuitions in decision theory and greater consideration of formal and structural relationships between CDT and FDT.

Evidential Decision Theory Newcomb's Problem Causal Decision Theory

Standards for Belief Representations in LLMs
with Daniel A. Herrmann

Minds and Machines 35 (1): 1-25. 2024.

As large language models (LLMs) continue to demonstrate remarkable abilities across various domains, computer scientists are developing methods to understand their cognitive processes, particularly concerning how (and if) LLMs internally represent their beliefs about the world. However, this field currently lacks a unified theoretical foundation to underpin the study of belief in LLMs. This article begins filling this gap by proposing adequacy conditions for a representation in an LLM to count as belief-like. We argue that, while the project of belief measurement in LLMs shares striking features with belief measurement as carried out in decision theory and formal epistemology, it also differs in ways that should change how we measure belief. Thus, drawing from insights in philosophy and contemporary practices of machine learning, we establish four criteria that balance theoretical considerations with practical constraints. Our proposed criteria include accuracy, coherence, uniformity, and use, which together help lay the groundwork for a comprehensive understanding of belief representation in LLMs. We draw on empirical work showing the limitations of using various criteria in isolation to identify belief representations.

Large Language Models

Bigger, Badder Bugs
with Jack Spencer

Mind 134 (533): 134-170. 2025.

In this paper we motivate the ‘principles of trust’, chance-credence principles that are strictly stronger than the New Principle yet strictly weaker than the Principal Principle, and argue, by proving some limitative results, that the principles of trust conflict with Humean Supervenience.

Chance-Credence Principles Humeanism and Nonhumeanism about Chance

Does ChatGPT Have a Mind?
with Simon Goldstein

Philosophy of AI. forthcoming.

This paper examines the question of whether Large Language Models (LLMs) like ChatGPT possess minds, focusing specifically on whether they have a genuine folk psychology encompassing beliefs, desires, and intentions. We approach this question by investigating two key aspects: internal representations and dispositions to act. First, we survey various philosophical theories of representation, including informational, causal, structural, and teleosemantic accounts, arguing that LLMs satisfy key conditions proposed by each. We draw on recent interpretability research in machine learning to support these claims. Second, we explore whether LLMs exhibit robust dispositions to perform actions, a necessary component of folk psychology. We consider two prominent philosophical traditions, interpretationism and representationalism, to assess LLM action dispositions. While we find evidence suggesting LLMs may satisfy some criteria for having a mind, particularly in game-theoretic environments, we conclude that the data remains inconclusive. Additionally, we reply to several skeptical challenges to LLM folk psychology, including issues of sensory grounding, the "stochastic parrots" argument, and concerns about memorization. Our paper has three main upshots. First, LLMs do have robust internal representations. Second, there is an open question to answer about whether LLMs have robust action dispositions. Third, existing skeptical challenges to LLM representation do not survive philosophical scrutiny.

Philosophy of AI, General Works

Evidential Decision Theory and the Ostrich
with Yoaav Isaacs

Philosophers' Imprint 24 (1). 2024.

Evidential Decision Theory is flawed, but its flaws are not fully understood. David Lewis (1981) famously charged that EDT recommends an irrational policy of managing the news and “commends the ostrich as rational”. Lewis was right, but the case he appealed to—Newcomb’s Problem—does not demonstrate his conclusion. Indeed, decision theories other than EDT, such as Committal Decision Theory and Functional Decision Theory, agree with EDT's verdicts in Newcomb’s Problem, but their flaws, whatever they may be, do not stem from any ostrich-like recommendations. We offer a new case which shows that EDT mismanages the news, thus vindicating Lewis’s original charge. We argue that this case reveals a flaw in the “Why ain’cha rich?” defense of EDT. We argue further that this case is an advance on extant putative counterexamples to EDT.

Evidential Decision Theory

Still no lie detector for language models: probing empirical and conceptual roadblocks
with Daniel A. Herrmann

Philosophical Studies 182 (7). 2025.

We consider the questions of whether or not large language models (LLMs) have beliefs, and, if they do, how we might measure them. First, we consider whether or not we should expect LLMs to have something like beliefs in the first place. We consider some recent arguments aiming to show that LLMs cannot have beliefs. We show that these arguments are misguided. We provide a more productive framing of questions surrounding the status of beliefs in LLMs, and highlight the empirical nature of the problem. With this lesson in hand, we evaluate two existing approaches for measuring the beliefs of LLMs, one due to Azaria and Mitchell (The internal state of an LLM knows when it's lying, 2023) and the other to Burns et al. (Discovering latent knowledge in language models without supervision, 2022). Moving from the armchair to the desk chair, we provide empirical results that show that these methods fail to generalize in very basic ways. We then argue that, even if LLMs have beliefs, these methods are unlikely to be successful for conceptual reasons. Thus, there is still no lie-detector for LLMs. We conclude by suggesting some concrete paths for future work.

Large Language Models The Nature of Belief

Probability and Informed Consent
with Nir Ben-Moshe and Jonathan Livengood

Theoretical Medicine and Bioethics 44 (6): 545-566. 2023.

In this paper, we illustrate some serious difficulties involved in conveying information about uncertain risks and securing informed consent for risky interventions in a clinical setting. We argue that in order to secure informed consent for a medical intervention, physicians often need to do more than report a bare, numerical probability value. When probabilities are given, securing informed consent generally requires communicating how probability expressions are to be interpreted and communicating something about the quality and quantity of the evidence for the probabilities reported. Patients may also require guidance on how probability claims may or may not be relevant to their decisions, and physicians should be ready to help patients understand these issues.

Bayesian Reasoning, Misc Medical Epistemology Frequentism Philosophy of Medicine, Miscellaneous Medical Ethics, Misc Informed Consent in Medicine Evidence-Based Medicine Medical Methodology Biomedical Ethics, Misc Interpretation of Probability, Misc

Decision Theory without Luminosity
with Yoaav Isaacs

Mind 133 (530): 346-376. 2023.

Our decision-theoretic states are not luminous. We are imperfectly reliable at identifying our own credences, utilities and available acts, and thus can never be more than imperfectly reliable at identifying the prescriptions of decision theory. The lack of luminosity affords decision theory a remarkable opportunity — to issue guidance on the basis of epistemically inaccessible facts. We show how a decision theory can guarantee action in accordance with contingent truths about which an agent is arbitrarily uncertain. It may seem that such advantages would require dubiously adverting to externalist facts that go beyond the internalism of traditional decision theory, but this is not so. Using only the standard repertoire of decision-theoretic tools, we show how to modify existing decision theories to take advantage of this opportunity. These improved decision theories require agents to maximize conditional expected utility — expected utility conditional upon an agent’s actual decision situation. We call such modified decision theories ‘self-confident’. These self-confident decision theories have a distinct advantage over standard decision theories: their prescriptions are better.

Luminosity Decision Theory

Accuracy, Deference, and Chance
Philosophical Review 132 (1): 43-87. 2023.

Chance both guides our credences and is an objective feature of the world. How and why we should conform our credences to chance depends on the underlying metaphysical account of what chance is. I use considerations of accuracy (how close your credences come to truth-values) to propose a new way of deferring to chance. The principle I endorse, called the Trust Principle, requires chance to be a good guide to the world, permits modest chances, tells us how to listen to chance even when the chances are modest, and entails but is not entailed by the New Principle. As I show, a rational agent will obey this principle if and only if she expects chance to be at least as accurate as she is on every good way of measuring accuracy. Much of the discussion, and the technical results, extend beyond chance to deference to any kind of expert. Indeed, you will trust someone about a particular question just in case you expect that person to be more accurate than you are about that question.

Chance-Credence Principles Probabilistic Principles, Misc Scoring Rules

Deference Done Better
with Kevin Dorst, Bernhard Salow, Brooke E. Husic, and Branden Fitelson

Philosophical Perspectives 35 (1): 99-150. 2021.

There are many things—call them ‘experts’—that you should defer to in forming your opinions. The trouble is, many experts are modest: they’re less than certain that they are worthy of deference. When this happens, the standard theories of deference break down: the most popular (“Reflection”-style) principles collapse to inconsistency, while their most popular (“New-Reflection”-style) variants allow you to defer to someone while regarding them as an anti-expert. We propose a middle way: deferring to someone involves preferring to make any decision using their opinions instead of your own. In a slogan, deferring opinions is deferring decisions. Generalizing the proposal of Dorst (2020a), we first formulate a new principle that shows exactly how your opinions must relate to an expert’s for this to be so. We then build off the results of Levinstein (2019) and Campbell-Moore (2020) to show that this principle is also equivalent to the constraint that you must always expect the expert’s estimates to be more accurate than your own. Finally, we characterize the conditions an expert’s opinions must meet to be worthy of deference in this sense, showing how they sit naturally between the too-strong constraints of Reflection and the too-weak constraints of New Reflection.

Probabilistic Frameworks, Misc Judgment Aggregation Rational Requirements The Reflection Principle Epistemology of Disagreement Trust Formal Epistemology, Misc

Strict propriety is weak
with Catrin Campbell-Moore

Analysis 81 (1): 8-13. 2021.

Considerations of accuracy – the epistemic good of having credences close to truth-values – have led to the justification of a host of epistemic norms. These arguments rely on specific ways of measuring accuracy. In particular, the accuracy measure should be strictly proper. However, the main argument for strict propriety supports only weak propriety. But strict propriety follows from weak propriety given strict truth directedness and additivity. So no further argument is necessary.

Scoring Rules Epistemic Value

Act Consequentialism without Free Rides
with Preston Greene and Benjamin A. Levinstein

Philosophical Perspectives 34 (1): 88-116. 2020.

Consequentialist theories determine rightness solely based on real or expected consequences. Although such theories are popular, they often have difficulty with generalizing intuitions, which demand concern for questions like “What if everybody did that?” Rule consequentialism attempts to incorporate these intuitions by shifting the locus of evaluation from the consequences of acts to those of rules. However, detailed rule-consequentialist theories seem ad hoc or arbitrary compared to act consequentialist ones. We claim that generalizing can be better incorporated into consequentialism by keeping the locus of evaluation on acts but adjusting the decision theory behind act selection. Specifically, we should adjust which types of dependencies the theory takes to be decision-relevant. Using this strategy, we formulate a new theory, generalized act consequentialism, which we argue is more compelling than rule consequentialism both in modeling the actual reasoning of generalizers and in delivering correct verdicts.

Evidential Decision Theory Causal Decision Theory Act- and Rule-Utilitarianism Decision Theory and Ethics Act- and Rule-Consequentialism Newcomb's Problem

Cheating Death in Damascus
with Nate Soares

Journal of Philosophy 117 (5): 237-266. 2020.

Evidential Decision Theory and Causal Decision Theory are the leading contenders as theories of rational action, but both face counterexamples. We present some new counterexamples, including one in which the optimal action is causally dominated. We also present a novel decision theory, Functional Decision Theory, which simultaneously solves both sets of counterexamples. Instead of considering which physical action of theirs would give rise to the best outcomes, FDT agents consider which output of their decision function would give rise to the best outcome. This theory relies on a notion of subjunctive dependence, where multiple implementations of the same mathematical function are considered to have identical results for logical rather than causal reasons. Taking these subjunctive dependencies into account allows FDT agents to outperform CDT and EDT agents in, for example, the presence of accurate predictors.

Imprecise Epistemic Values and Imprecise Credences
Australasian Journal of Philosophy 97 (4): 741-760. 2019.

A number of recent arguments purport to show that imprecise credences are incompatible with accuracy-first epistemology. If correct, this conclusion suggests a conflict between evidential a...

The Foundations of Epistemic Decision Theory
with Jason Konek and Ben Levinstein

Mind 128 (509): 69-107. 2019.

According to accuracy-first epistemology, accuracy is the fundamental epistemic good. Epistemic norms — Probabilism, Conditionalization, the Principal Principle, etc. — have their binding force in virtue of helping to secure this good. To make this idea precise, accuracy-firsters invoke Epistemic Decision Theory (EpDT) to determine which epistemic policies are the best means toward the end of accuracy. Hilary Greaves and others have recently challenged the tenability of this programme. Their arguments purport to show that EpDT encourages obviously epistemically irrational behavior. We develop firmer conceptual foundations for EpDT. First, we detail a theory of praxic and epistemic good. Then we show that, in light of their very different good-making features, EpDT will evaluate epistemic states and epistemic acts according to different criteria. So, in general, rational preference over states and acts won’t agree. Finally, we argue that based on direction-of-fit considerations, it’s preferences over the former that matter for normative epistemology, and that EpDT, properly spelt out, arrives at the correct verdicts in a range of putative problem cases.

Formal Epistemology, Misc Scoring Rules

An objection of varying importance to epistemic utility theory
Philosophical Studies 176 (11): 2919-2931. 2019.

Some propositions are more epistemically important than others. Further, how important a proposition is is often a contingent matter—some propositions count more in some worlds than in others. Epistemic Utility Theory cannot accommodate this fact, at least not in any standard way. For EUT to be successful, legitimate measures of epistemic utility must be proper, i.e., every probability function must assign itself maximum expected utility. Once we vary the importance of propositions across worlds, however, normal measures of epistemic utility become improper. I argue there isn’t any good way out for EUT.

A Pragmatist’s Guide to Epistemic Utility
Philosophy of Science 84 (4): 613-638. 2017.

We use a theorem from M. J. Schervish to explore the relationship between accuracy and practical success. If an agent is pragmatically rational, she will quantify the expected loss of her credence with a strictly proper scoring rule. Which scoring rule is right for her will depend on the sorts of decisions she expects to face. We relate this pragmatic conception of inaccuracy to the purely epistemic one popular among epistemic utility theorists.

Scoring Rules

Accuracy Uncomposed: Against Calibrationism
Episteme 14 (1): 59-69. 2017.

Pettigrew offers new axiomatic constraints on legitimate measures of inaccuracy. His axiom called ‘Decomposition’ stipulates that legitimate measures of inaccuracy evaluate a credence function in part based on its level of calibration at a world. I argue that if calibration is valuable, as Pettigrew claims, then this fact is an explanandum for accuracy-first epistemologists, not an explanans, for three reasons. First, the intuitive case for the importance of calibration isn’t as strong as Pettigrew believes. Second, calibration is a perniciously global property that both contravenes Pettigrew’s own views about the nature of credence functions themselves and undercuts the achievements and ambitions of accuracy-first epistemology. Finally, Decomposition introduces a new kind of value compatible with but separate from accuracy-proper in violation of Pettigrew’s alethic monism.

Social Epistemology Scoring Rules

With All Due Respect: The Macro-Epistemology of Disagreement
Philosophers' Imprint 15. 2015.

In this paper, I develop a new kind of conciliatory answer to the problem of peer disagreement. Instead of trying to guide an agent’s updating behaviour in any particular disagreement, I establish constraints on an agent’s expected behaviour and argue that, in the long run, she should tend to be conciliatory toward her peers. I first claim that this macro-approach affords us new conceptual insight on the problem of peer disagreement and provides an important angle complementary to the standard micro-approaches in the literature. I then detail the import of two novel results based on accuracy-considerations that establish the following: An agent should, on average, give her peers equal weight. However, if the agent takes herself and her advisor to be reliable, she should usually give the party with a stronger opinion more weight. In other words, an agent’s response to peer disagreement should over the course of many disagreements average out to equal weight, but in any particular disagreement, her response should tend to deviate from equal weight in a way that systematically depends on the actual credences she and her advisor report.

Epistemology of Disagreement Formal Social Epistemology, Misc Judgment Aggregation Scoring Rules

Permissive Rationality and Sensitivity
Philosophy and Phenomenological Research 94 (2): 342-370. 2017.

Permissivism about rationality is the view that there is sometimes more than one rational response to a given body of evidence. In this paper I discuss the relationship between permissivism, deference to rationality, and peer disagreement. I begin by arguing that—contrary to popular opinion—permissivism supports at least a moderate version of conciliationism. I then formulate a worry for permissivism. I show that, given a plausible principle of rational deference, permissive rationality seems to become unstable and to collapse into unique rationality. I conclude with a formulation of a way out of this problem on behalf of the permissivist.

Rationality Formal Epistemology, Misc Epistemology of Disagreement Scoring Rules Ethics of Belief Epistemic Permissivism
200

Leitgeb and Pettigrew on Accuracy and Updating
Philosophy of Science 79 (3): 413-424. 2012.

Leitgeb and Pettigrew argue that (1) agents should minimize the expected inaccuracy of their beliefs and (2) inaccuracy should be measured via the Brier score. They show that in certain diachronic cases, these claims require an alternative to Jeffrey Conditionalization. I claim that this alternative is an irrational updating procedure and that the Brier score, and quadratic scoring rules generally, should be rejected as legitimate measures of inaccuracy.

Scoring Rules Updating Principles

Ben Levinstein

Radical AI Interpretability
with Daniel Herrmann

Tickles, iteration, and habits
with Kenny Easwaran and Ted Shear

Theory and Decision 100 (2): 531-559. 2026.

Standards for Belief Representations in LLMs
with Daniel A. Herrmann

Minds and Machines 35 (1): 1-25. 2024.

Bigger, Badder Bugs
with Jack Spencer

Mind 134 (533): 134-170. 2025.

Does ChatGPT Have a Mind?
with Simon Goldstein

Philosophy of AI. forthcoming.

Evidential Decision Theory and the Ostrich
with Yoaav Isaacs

Philosophers' Imprint 24 (1). 2024.

Still no lie detector for language models: probing empirical and conceptual roadblocks
with Daniel A. Herrmann

Philosophical Studies 182 (7). 2025.

Probability and Informed Consent
with Nir Ben-Moshe and Jonathan Livengood

Theoretical Medicine and Bioethics 44 (6): 545-566. 2023.

Decision Theory without Luminosity
with Yoaav Isaacs

Mind 133 (530): 346-376. 2023.

Accuracy, Deference, and Chance
Philosophical Review 132 (1): 43-87. 2023.

Deference Done Better
with Kevin Dorst, Bernhard Salow, Brooke E. Husic, and Branden Fitelson

Philosophical Perspectives 35 (1): 99-150. 2021.

Strict propriety is weak
with Catrin Campbell-Moore

Analysis 81 (1): 8-13. 2021.

Act Consequentialism without Free Rides
with Preston Greene

Philosophical Perspectives 34 (1): 88-116. 2020.

Cheating Death in Damascus
with Nate Soares

Journal of Philosophy 117 (5): 237-266. 2020.

Imprecise Epistemic Values and Imprecise Credences
Australasian Journal of Philosophy 97 (4): 741-760. 2019.

The Foundations of Epistemic Decision Theory
with Jason Konek

Mind 128 (509): 69-107. 2019.

An objection of varying importance to epistemic utility theory
Philosophical Studies 176 (11): 2919-2931. 2019.

A Pragmatist’s Guide to Epistemic Utility
Philosophy of Science 84 (4): 613-638. 2017.

Accuracy Uncomposed: Against Calibrationism
Episteme 14 (1): 59-69. 2017.

With All Due Respect: The Macro-Epistemology of Disagreement
Philosophers' Imprint 15. 2015.

Permissive Rationality and Sensitivity
Philosophy and Phenomenological Research 94 (2): 342-370. 2017.

Leitgeb and Pettigrew on Accuracy and Updating
Philosophy of Science 79 (3): 413-424. 2012.
