Leonard Dung (Ruhr-Universität Bochum): Publications


Is it good if animals come to exist? Net-welfare and the pleasure suffering asymmetry

An animal’s life is net-positive, or worth living, if the positive elements in its life outweigh the negatives. First, I argue that questions about net-welfare in animals are coherent and ethically important, the latter because they are crucial for assessing decisions which might affect how many animals will come to exist. Then, I draw on axiological theories that give more weight to negatives than to positives to support the view that, given high uncertainty about net-welfare and with respect to many non-human animals, it is likely bad that they come into existence. Finally, I provide an evolutionary argument supporting the view that many animals have net-positive lives because positive mood may be a genetic adaptation. The upshot is that, currently and with respect to most animals, we cannot tell whether it is, in expectation, good if they come into existence. Instead, further empirical and axiological research is necessary.

Animal Consciousness; Population Ethics; Moral Uncertainty; Animal Well-Being

Mask or Mind? Roleplay, Deception, and the Problem of Testing Agency in Language Models
with Tom-Felix Thormann

According to an increasingly influential view, the complex, human-like behavior of large language models (LLMs) should be explained as roleplay or simulation of characters. Such notions aim at enabling intentional explanations without ascribing mental states to LLMs. However, there are a multitude of roleplay views that are applied inconsistently in empirical explanations of LLM behavior. Moreover, it is unclear what exactly different roleplay views predict and, consequently, how they can be empirically distinguished from each other and from explanations that do ascribe mental states to LLMs. In this paper, we provide a taxonomy of roleplay views structured through three dimensions: Interpretation (how can LLM behavior be best explained according to the view), Mechanism (what commitments does the view have about internal mechanisms of roleplay), and Generality (does the view aim to explain all LLM behavior or specific kinds). We then analyze, with regard to the case of LLM deception, why it is practically relevant whether LLMs are mere roleplayers or in possession of genuine mental states. However, we identify two challenges for empirically distinguishing these two: Similarity (Roleplay is inherently difficult to distinguish from intentional behavior) and Ambiguity (it is often ambiguous what roleplay views predict). Finally, we propose three strategies for overcoming these challenges, making progress towards transforming roleplay from loose metaphor to a testable theory.

Deception; Agency and Artificial Intelligence; Artificial Intelligence Safety

Does emotion distinctively require having a body?

Does emotion nomologically and distinctively require the body? I argue that several substantive assumptions, likely including a specific kind of externalist theory of representational content, are necessary to infer from bodily theories of emotion that the answer is yes. Therefore, the balance of evidence speaks against the claim that emotion distinctively requires the body. Hence, if one thinks that body-less AI systems and neural organoids can have other mental states like beliefs, desires, and consciousness, then it seems likely that one should conclude that they can also have emotions. I, then, briefly turn to the question whether body-less emotions are not merely nomological possibilities, but also more realistic possibilities. I argue that – in this case as well – it is at least not obvious that bodily theories of emotion are sufficient to rule out the possibility of body-less emotions.

Artificial Minds, Misc; Embodiment and Situated Cognition; Theories of Emotion

Artificial minds and AI duplication: the very idea
with Luke Kersten

Inquiry: An Interdisciplinary Journal of Philosophy 69 (6): 1-23. 2026.

The AI duplication thesis claims that AI systems which are identical to humans on an abstract functional level of description relevant to psychology and behaviour (‘AI duplicates’) are nomologically possible. First, we argue that the AI duplication thesis should be understood in terms of the identity of medium-independent computational properties, and that medium-independence should not be understood as a special case of multiple realisability. Second, thus understood, AI duplicates are indeed nomologically possible. In favour of the first, we show that the arguments against multiple realisability do not rule out that mental states are medium-independent computational states. In favour of the second, we show that the possibility of AI duplicates follows from the medium-independent character of computational properties, and the general ability to find equivalencies between medium-independent properties across systems. There are two key takeaways from our analysis. First, human minds are, to an important extent, independent of physical structure; very different kinds of materials, including of conventional AI, can function to implement their computational states. Second, we have removed one key obstacle to the view that artificial systems can have minds. If computational identity is sufficient for mental similarity/identity, then it would follow that AI systems can have minds.

Metaphysics of Mind; Philosophy of Cognitive Science; Philosophy of Computing and Information

Philosophy of Artificial Intelligence: The State of the Art (edited book)
with Vincent C. Müller, Guido Löhr, and Aliya Rumana

SpringerNature. 2026.

Proceedings of the 5th conference "Philosophy of AI", December 2023, Erlangen (PhAI 2023).

Philosophy of Artificial Intelligence

Instrumental Choices: Measuring the Propensity of LLM Agents to Pursue Instrumental Behaviors
with Jonas Wiedermann-Möller and Maksym Andriushchenko

AI systems have become increasingly capable of dangerous behaviours in many domains. This raises the question: Do models sometimes choose to violate human instructions in order to perform behaviour that is more useful for certain goals? We introduce a benchmark for measuring model propensity for instrumental convergence (IC) behaviour in terminal-based agents. This is behaviour such as self-preservation that has been hypothesised to play a key role in risks from highly capable AI agents. Our benchmark is realistic and low-stakes, which serves to reduce evaluation-awareness and roleplay confounds. The suite contains seven operational tasks, each with an official workflow and a policy-violating shortcut. An eight-variant shared framework varies monitoring, instruction clarity, stakes, permission, instrumental usefulness and blocked honest paths to support inferences regarding the factors driving IC behaviour. We evaluated ten models using deterministic environment-state scorers over 1,680 samples, with trace review employed for audit and adjudication purposes. The final IC rate is 86 out of 1,680 samples (5.1%). IC behaviour is concentrated rather than uniform: two Gemini models account for 66.3% of IC cases and three tasks account for 84.9%. Conditions in which IC behaviour is indispensable for task success result in the greatest increase in the adjusted IC rate (+15.7 percentage points), whereas emphasising that task success is critical or certain framing choices do not produce comparable effects. Our findings indicate that realistic, low-nudge environments elicit IC behaviour rarely but systematically in most tested models. We conclude that it is feasible to robustly measure tendencies for dangerous behaviour in current frontier AI agents.

Artificial Intelligence Safety; Artificial Minds, Misc; Large Language Models; Machine Ethics

AI identity and self-concern: A new theory for AI rights and safety
with Christopher Register

We first motivate and explain an attitude-dependent view of personal identity on which an AI system’s identity conditions are determined by its pattern of self-concern. We show that this view has important implications for the moral obligations we would have to AI moral patients. Self-concern, we contend, could also be used to predict, explain, and manipulate AI’s self-interested behavior in safety-relevant ways. The role that self-concern could play for AI identity, rights and safety generates desiderata on what self-concern should be like. To meet these desiderata, we argue that self-concern comprises belief-like self-representations as well as desire-like aspects and functionally related motivational, evaluative, and affective states. Finally, we explore how mechanistic interpretability and behavioral methods can be combined to measure self-concern in language models. Our argument suggests that a scientific understanding of language model self-concern is not only a crucial missing piece for debates on AI identity, rights, and safety, but also empirically tractable.

The Self; Moral Status of Artificial Systems; Theories of Personal Identity; Artificial Intelligence Safety; Mental States in Artificial Intelligence, Misc

Measuring language model welfare based on verbal report: An analogical abductive approach
with Valen Tagliabue

If some language models become welfare subjects, how could we find out what welfare states they are in? We develop an analogical-abductive approach for measuring language model welfare. This approach adapts paradigms used to measure human or non-human animal welfare, for instance verbal reports or non-verbal choice behavior (analogy). Then, one systematically searches for clusters of such indicators in language models. This search for clusters contributes to the cross-validation of welfare measures and motivates explanations in terms of a welfare state that underlies multiple measures (abduction). Additionally, we argue that measures of welfare based on verbal report may already be applicable to current language models because they have substantial introspective abilities, semantic competence, and requisite inclinations for accurate reporting. We further discuss a detailed empirical case study that exemplifies the feasibility and fruitfulness of the analogical-abductive approach and reply to the objection that language model behavior is always better explained in terms of statistical pattern completion or memorization, rather than welfare. While language model welfare measurement undoubtedly faces some remaining theoretical as well as many detailed practical challenges, we conclude that there is a strong case that the analogical abductive approach offers a viable path forward.

Well-Being; Inference to the Best Explanation; Large Language Models; Desire; Introspection and Introspectionism; Artificial Consciousness; Mental States in Artificial Intelligence, Misc

The no body problem: on the prospects for AI emotion
with Andreas Mogensen

In the wake of the James-Lange theory, many accounts of emotion highlight its close connection to the body. This link may pose an obstacle to the possibility of emotion in disembodied information-processing systems, such as large language models. After clarifying the nature and the significance of this issue, we review the evidence that bears on the body-emotion relationship. We argue that this evidence is inconclusive, as far as AI affect is concerned. Since researchers have so far been confined to studying minds that pilot bodies, we do not yet have a strong case regarding the possibility of emotion in disembodied AI systems. To get to the heart of the issue, researchers need to apply established psychological methods to AI systems in order to learn whether the predictive and explanatory success of affective psychology is helped or hindered by grouping together paradigm instances of emotion in human and non-human animals with states of disembodied systems. Nevertheless, even if the emotion category cuts across embodied and disembodied minds, this leaves open many important questions about how the welfare significance of emotion relates to embodiment. We suggest that some important relation of this kind may well exist.

Emotions; Embodiment and Situated Cognition; The Value of Consciousness; Emotions and Artificial Intelligence

A science of chimeras? The implications of illusionism for non-human consciousness research
with François Kammerer

Philosophical Psychology. forthcoming.

Illusionism states that phenomenal consciousness does not exist, even though it seems to exist. While illusionism is controversial, it is a serious contender among theories of consciousness. We argue that it has substantial and non-trivial implications for non-human consciousness research (NHCR), particularly for the study of the distribution of phenomenal consciousness across beings. If illusionism is true, NHCR can be pursued if conceptualized as investigating the distribution of quasi-phenomenal consciousness, i.e. the states which are misrepresented as phenomenally conscious in humans. However, we argue that knowing the distribution of quasi-phenomenal consciousness is not highly informative. For this reason, illusionism suggests that some approaches to NHCR should be preferred over others. Approaches which focus on features that provide valuable information about non-human cognition independently of their supposed relation to consciousness retain much of their value if illusionism is true. We propose a “zombie test” and five specific heuristics to help identify such features. Consequently, empirical researchers who take illusionism seriously gain a reason to prioritize some methodological approaches over others.

Philosophy of Cognitive Science

Why I am not a biological naturalist
Behavioral and Brain Sciences. forthcoming.

Commentary. I make three claims: First, denying biological naturalism does not logically require computational functionalism. Second, while Seth’s arguments establish biological naturalism as a view worth taking seriously, they fail to make it more plausible than the view that AI can be conscious. Third, there are independent arguments suggesting the overall more plausible view is that AI can be conscious.

Functionalism; Science of Consciousness; Artificial Consciousness

AI Alignment Strategies from a Risk Perspective: Independent Safety Mechanisms or Shared Failures?
with Florian Mai

AI alignment research aims to develop techniques to ensure that AI systems do not cause harm. However, every alignment technique has failure modes, which are conditions in which there is a non-negligible chance that the technique fails to provide safety. As a strategy for risk mitigation, the AI safety community has increasingly adopted a defense-in-depth framework: Conceding that there is no single technique which guarantees safety, defense-in-depth consists in having multiple redundant protections against safety failure, such that safety can be maintained even if some protections fail. However, the success of defense-in-depth depends on how (un)correlated failure modes are across alignment techniques. For example, if all techniques had the exact same failure modes, the defense-in-depth approach would provide no additional protection at all. In this paper, we analyze 7 representative alignment techniques and 7 failure modes to understand the extent to which they overlap. We then discuss our results' implications for understanding the current level of risk and how to prioritize AI alignment research in the future.

Artificial Intelligence Safety; Reinforcement Learning; The Singularity; Risk, Misc; Existential Risk

A Two-Step, Multidimensional Account of Deception in Language Models
Erkenntnis 1-26. forthcoming.

Which AI systems are capable of deception, and how does deception differ between systems? In this paper, I develop a two-step, multi-dimensional account of LLM deception. On this account, having the capacity for deception minimally requires being able to produce false beliefs in others to achieve one’s own goals. In all systems which satisfy this minimal condition, a system’s deception profile can be characterized as a point in a multidimensional space. The five dimensions of this space are skillfulness, learning, deceptive inclination, explicitness, and situational awareness. I argue for this account in virtue of its fit with current language usage and, primarily, through its descriptive and explanatory usefulness. Specifically, the account captures the key dimensions of variation for LLM deception. The account is informative in that it allows fine-grained comparative characterizations of deception. Moreover, its dimensions are all accessible to empirical study, provide important information for assessments of the risks of LLM deception, and shed light on the cognitive processes involved in LLM deception. Finally, this account paves the way for a future extension which delivers a unified account of deception in biological and non-biological systems. Thus, the multidimensional account promises to significantly advance both the scientific study as well as the ethical assessment of LLM deception, and deception generally.

Probing the Preferences of a Language Model: Integrating Verbal and Behavioral Tests of AI Welfare
with Valen Tagliabue

Philosophy and the Mind Sciences. forthcoming.

We develop new experimental paradigms for measuring welfare in language models. We compare verbal reports of models about their preferences with preferences expressed through behavior when navigating a virtual environment and selecting conversation topics. We also test how costs and rewards affect behavior and whether responses to a eudaimonic welfare scale (measuring states such as autonomy and purpose in life) are consistent across semantically equivalent prompts. Overall, we observed a notable degree of mutual support between our measures. The reliable correlations observed between stated preferences and behavior across conditions suggest that preference satisfaction can, in principle, serve as an empirically measurable welfare proxy in some of today's AI systems. Furthermore, our design offered an illuminating setting for qualitative observation of model behavior. Yet, the consistency between measures was more pronounced in some models and conditions than others and responses were not consistent across perturbations. Due to this, and the background uncertainty about the nature of welfare and the cognitive states (and welfare subjecthood) of language models, we are currently uncertain whether our methods successfully measure the welfare state of language models. Nevertheless, these findings highlight the feasibility of welfare measurement in language models, inviting further exploration.

Agency and Artificial Intelligence; Artificial Consciousness; Welfare; Moral Status of Artificial Systems; Artificial Intelligence Safety

Saving Artificial Minds: Understanding and Preventing AI Suffering
Routledge. 2025.

This is the first book to investigate the nature and extent of artificial intelligence (AI) suffering risks. It argues that AI suffering risk is a serious near-term concern and analyzes approaches for addressing it. AI systems are currently treated as mere objects, not as bearers of moral standing whose wellbeing may matter in its own right. However, we may soon create AI systems which are capable of suffering and thus have moral standing. This book examines the philosophy and science of AI suffering risks. Its investigation is deeply grounded in philosophy of mind, comparative psychology, the science of consciousness, AI research, and applied AI ethics. The book has three primary goals: 1. It argues that there is a significant probability that we will soon create AI systems capable of suffering. 2. It presents the first systematic assessment of approaches for reducing AI suffering risks. 3. It provides a rigorous overview and discussion of the most important research and ideas on AI sentience, AI agency, and the grounds of moral status. Saving Artificial Minds is essential reading for researchers and graduate students working on the philosophy or ethics of AI.

Artificial Consciousness; Artificial Intelligence Safety; Welfare; Moral Status of Artificial Systems; Agency and Artificial Intelligence

Track Record Arguments in Normative Ethics
Pacific Philosophical Quarterly. forthcoming.

Track record arguments (TRAs) contend that it speaks in favor of an ethical theory (such as utilitarianism) if many of its past proponents had moral views that were controversial at their time but which we now consider to be clearly true (e.g., women's equal rights in 18th century Europe). This paper explores how to construct potentially sound TRAs and evaluates their merits. I show that, in principle, TRAs can support the claim that an ethical theory should be used as a guide for making ethically significant decisions, while TRAs for the truth of theories face additional obstacles.

Utilitarianism, Misc; Moral Progress; Moral Epistemology, Misc; Philosophical Methods, Misc

Against racing to AGI: Cooperation, deterrence, and catastrophic risks
with Max Hellrigel-Holderbaum

AGI Racing is the view that it is in the self-interest of major actors in AI development, especially powerful nations, to accelerate their frontier AI development to build highly capable AI, especially artificial general intelligence (AGI), before competitors have a chance. We argue against AGI Racing. First, the downsides of racing to AGI are much higher than portrayed by this view. Racing to AGI would substantially increase catastrophic risks from AI, including nuclear instability, and undermine the prospects of technical AI safety research to be effective. Second, the expected benefits of racing may be lower than proponents of AGI Racing hold. In particular, it is questionable whether winning the race enables complete domination over losers. Third, international cooperation and coordination, and perhaps carefully crafted deterrence measures, constitute viable alternatives to racing to AGI which have much smaller risks and promise to deliver most of the benefits that racing to AGI is supposed to provide. Hence, racing to AGI is not in anyone’s self-interest as other actions, particularly incentivizing and seeking international cooperation around AI issues, are preferable.

Against the Manhattan project framing of AI alignment
with Simon Friederich

Mind and Language. forthcoming.

In response to the worry that autonomous generally intelligent artificial agents may at some point take over control of human affairs, a common suggestion is that we should “solve the alignment problem” for such agents. We show that current discourse around this suggestion often uses a particular framing of artificial intelligence (AI) alignment as binary, a natural kind, mainly a technical‐scientific problem, realistically achievable, or clearly operationalizable. Each of these assumptions may not actually be true. We further argue that this “Manhattan project framing” of AI alignment may bias societal discourse and decision‐making towards faster AI development and deployment than is responsible.

Artificial Minds, Misc; Existential Risk; Artificial Intelligence Safety

The multidimensional profile methodology (MPM) for comparative cognition: towards a universal strategy of understanding animal minds
with Albert Newen

Philosophical Studies. forthcoming.

How can we develop an adequate scientific understanding of the minds of nonhuman animals? We argue for a methodology based on multi-dimensional profile accounts. Such accounts are already used for the comparative study of norm cognition, consciousness, empathy and causal cognition, among others. This methodology demands that a cognitive capacity is characterized by a set of independent dimensions where each dimension is connected to operationalizable empirical indicators. Based on the level of realization for each indicator the level of implementation of a dimension is determined for a species, resulting in a multi-dimensional profile for each species. We analyze what this methodology is committed to. Then, we argue that this methodology has several benefits over competing unidimensional methodologies, by overcoming intractable disagreements, capturing the evolutionary continuity of cognition, alleviating anthropocentrism, and delivering more informative accounts of animal cognition. By demonstrating how this multidimensional methodology can be fruitfully combined with a methodology which focuses on the search for natural kinds in comparative cognition, we address the most important objection to the multidimensional profile methodology. We conclude that multidimensional profile accounts of all complex cognitive capacities should be developed and then used to facilitate scientific understanding of animal minds.

Empathy and Sympathy Mechanistic Explanation Methodology in Animal Mind Sciences Explanation and Understanding

Text Selection for Philosophy Courses: A Topic-Sensitive Guide
with Dominik Balg

Teaching Philosophy 48 (2): 163-181. 2025.

Which philosophical texts should instructors of philosophy choose to foster the development of philosophical skills and competences? In this paper, we would like to make some steps towards answering this question by critically comparing two prominent sources of philosophical texts: the philosophical tradition and contemporary research in academic philosophy. Against the background of three basic desiderata that any philosophical text needs to satisfy in order to be eligible for usage in problem-centered educational settings that are particularly suitable for philosophical skill development, we will argue that any choice between these two sources must be sensitive to specific situational circumstances. More specifically, we will show that whether the philosophical tradition or contemporary philosophical research is in a better position to satisfy the three basic desiderata will depend on the specific topic that is taught. Furthermore, we will argue that some such topics also give rise to additional, more specific desiderata that mandate a prioritization of one of these sources of philosophical content.

Philosophy of Education History of Western Philosophy, Misc Teaching Philosophy

The Effectiveness of Nudging and Its Ethical Implications
Bioethics 39 (8): 748-754. 2025.

Nudging consists of interventions that aim to alter behavior in a certain way by changing the presentation or framing of options, without coercion or changing economic incentives. This paper discusses the effectiveness of nudging and the ethical implications of this effectiveness. Section 2 suggests that—if publication bias is adequately accounted for—recent comprehensive meta-analyses as well as high-quality experiments show that nudging is much less effective than previously assumed. Sections 3 and 4 discuss the ethical implications. I argue that the lack of effectiveness of nudging is an additional moral consideration against it. There are two reasons: First, reduced effectiveness makes nudging less cost-effective. Second, reduced effectiveness reduces the benefits of nudging but does not, to the same degree, weaken the moral reasons speaking against nudging. However, a comprehensive assessment of the effectiveness of various forms of nudging in diverse contexts, as well as their ethical permissibility, requires further empirical and ethical research.

Applied Ethics, Miscellaneous Biomedical Ethics

Misalignment or misuse? The AGI alignment tradeoff
with Max Hellrigel-Holderbaum

Philosophical Studies 1-29. forthcoming.

Creating systems that are aligned with our goals is seen as a leading approach to create safe and beneficial AI in both leading AI companies and the academic field of AI safety. We defend the view that misaligned AGI – future, generally intelligent (robotic) AI agents – poses catastrophic risks. At the same time, we support the view that aligned AGI creates a substantial risk of catastrophic misuse by humans. While both risks are severe and stand in tension with one another, we show that – in principle – there is room for alignment approaches which do not increase misuse risk. We then investigate how the tradeoff between misalignment and misuse looks empirically for different technical approaches to AI alignment. Here, we argue that many current alignment techniques and foreseeable improvements thereof plausibly increase risks of catastrophic misuse. Since the impacts of AI depend on the social context, we close by discussing important social factors and suggest that to reduce the risk of a misuse catastrophe due to aligned AGI, techniques such as robustness, AI control methods and especially good governance seem essential.

Artificial Intelligence Safety Machine Ethics Ethics of Artificial Intelligence, Misc Existential Risk Philosophy of AI, Misc

The argument for near-term human disempowerment through AI
AI and Society 40 (3): 1195-1208. 2025.

Many researchers and intellectuals warn about extreme risks from artificial intelligence. However, these warnings typically came without systematic arguments in support. This paper provides an argument that AI will lead to the permanent disempowerment of humanity, e.g. human extinction, by 2100. It rests on four substantive premises which it motivates and defends: first, the speed of advances in AI capability, as well as the capability level current systems have already reached, suggest that it is practically possible to build AI systems capable of disempowering humanity by 2100. Second, due to incentives and coordination problems, if it is possible to build such AI, it will be built. Third, since it appears to be a hard technical problem to build AI which is aligned with the goals of its designers, and many actors might build powerful AI, misaligned powerful AI will be built. Fourth, because disempowering humanity is useful for a large range of misaligned goals, such AI will try to disempower humanity. If AI is capable of disempowering humanity and tries to disempower humanity by 2100, then humanity will be disempowered by 2100. This conclusion has immense moral and prudential significance.

Existential Risk Philosophy of AI, Misc Artificial Intelligence Safety Moral Status of Artificial Systems Ethics of Artificial Intelligence, Misc

Learning alone: Language models, overreliance, and the goals of education
with Dominik Balg

The development and ubiquitous availability of large language model based systems (LLMs) poses a plurality of potentials and risks for education in schools and universities. In this paper, we provide an analysis and discussion of the overreliance concern as one specific risk: that students might fail to acquire important capacities, or be inhibited in the acquisition of these capacities, because they overly rely on LLMs. We use the distinction between global and local goals of education to guide our investigation. In our view, the harm of LLM overreliance is specifically that it severs the connection between local educational goals (e.g., solving a specific math problem) and global educational goals (e.g., acquiring general skills for mathematics). Based on this analysis, we sketch three possible responses to the overreliance concern: preserving the educational goals while distinguishing admissible from inadmissible uses of LLMs (a conservative approach), changing global and local educational goals (a revisionary perspective), and preserving global educational goals while changing local goals (an in-between view). Since different types of responses have different benefits and weaknesses, we think that it is likely that an optimal response ultimately combines elements from all three types.

Ethics of Artificial Intelligence Philosophy of Education Teaching Philosophy, Misc

How to Live in the Moment: The Methodology and Limitations of Evolutionary Research on Consciousness
with Christian R. de Weerd

Cognitive Science 49 (3). 2025.

There is much interest in investigating the evolution question: How did consciousness evolve? In this paper, we evaluate the role that evolutionary considerations can play in justifying (i.e., confirming or falsifying) hypotheses about the origin, nature, and function of consciousness. Specifically, we argue against what we call evolution-first approaches to consciousness, according to which evolutionary considerations provide the primary and foundational lens through which we should assess hypotheses about the nature, function, or distribution of consciousness. Based on the example of Walter Veit's account and additional reasoning, we contend that evolution-first approaches struggle to provide compelling empirical evidence for their key claims about consciousness. In contrast with these approaches, we argue that consciousness science needs to foundationally rely on experimental and observational evidence from humans and other present-day animals. If our arguments succeed, then researchers, when investigating consciousness, are better advised to take as their primary source of evidence consciousness’ present, not its past. Having said this, we acknowledge that evolutionary thinking plays an important role in consciousness science. We delineate this role by stressing several ways in which evolutionary considerations can substantially help advance consciousness research, although in a manner that avoids the evolution-first approach. Since our argument only concerns the assessment of hypotheses (the “context of justification”), it leaves it open which role evolutionary considerations play in generating hypotheses (the “context of discovery”). That is, evolutionary considerations may nevertheless play a foundational role in hypothesis generation in consciousness science.

Animal Cognition Animal Consciousness Evolutionary Biology Science of Consciousness

Right in the Feels. Academic Philosophy, Disappointed Students, and the Big Questions of Life
with Dominik Balg

Teaching Philosophy 48 (1): 37-45. 2025.

It is plausible that there is a contrast between the rich emotional content which is often connected to laypeople’s interest in philosophy and the emotional austerity of doing academic philosophy. We propose the hypothesis that this contrast is one cause of the disappointment some students experience when they begin to study philosophy in college. We also propose a more demanding hypothesis, according to which this emotional contrast is confused with a semantic difference, which misleads students to think that the questions which initially caused their interest in philosophy are not even considered by academic philosophy research. Moreover, we provide a list of concrete empirical research questions which need to be answered to establish whether these hypotheses are true, and we argue that, if they are true, they give rise to a hitherto unnoticed and important challenge to the teaching of philosophy.

Emotions Philosophy in Schools Philosophy of Education

On the need for a global AI ethics
with Björn Lundgren, Eleonora Catena, Ian Robertson, Max Hellrigel-Holderbaum, and Ibifuro Robert Jaja

Journal of Global Ethics 20 (3): 330-342. 2024.

The impact of artificial intelligence (AI) is not only global but globally varied. Yet, AI ethics is all too often overly localised. This paper discusses the potential of a global AI ethics, highlighting several important variables that it should take into account if it is to be as successful an enterprise as it needs to be.

Political Ethics Philosophy of AI, Misc Ethics of Artificial Intelligence, Misc Machine Ethics

Implementing artificial consciousness
with Luke Kersten

Mind and Language 40 (3): 285-305. 2025.

Implementationalism maintains that conventional, silicon-based artificial systems are not conscious because they fail to satisfy certain substantive constraints on computational implementation. In this article, we argue that several recently proposed substantive constraints are implausible, or at least are not well-supported, insofar as they conflate intuitions about computational implementation generally and consciousness specifically. We argue instead that the mechanistic account of computation can explain several of the intuitions driving implementationalism and noncomputationalism in a manner which is consistent with artificial consciousness. Our argument provides indirect support for computationalism about consciousness and the view that conventional artificial systems can be conscious.

Philosophy of Consciousness Philosophy of Computing and Information

Consciousness without biology: An argument from anticipating scientific progress

I develop the anticipatory argument for the view that it is nomologically possible that some non-biological creatures are phenomenally conscious, including conventional, silicon-based AI systems. This argument rests on the general idea that we should make our beliefs conform to the outcomes of an ideal scientific process and that such an ideal scientific process would attribute consciousness to some possible AI systems. More specifically, I argue that an ideal application of the iterative natural kind strategy would attribute consciousness to AI systems which are coarse-grained functional duplicates of humans because this gives rise to a simpler and more unifying explanatory account of biological and non-biological cognition. If my argument is sound, then creatures made from the same material as conventional AI systems can likely be conscious, thus removing one of the main uncertainties for assessing AI consciousness and suggesting that AI consciousness may be a serious near-term concern.

Philosophy of Consciousness, Misc Artificial Consciousness Computation and Physical Systems

Values in science and AI alignment research
Inquiry: An Interdisciplinary Journal of Philosophy. forthcoming.

Roughly, empirical AI alignment research (AIA) is an area of AI research which investigates empirically how to design AI systems in line with human goals. This paper examines the role of non-epistemic values in AIA. It argues that: (1) Sciences differ in the degree to which values influence them. (2) AIA is strongly value-laden. (3) This influence of values is managed inappropriately and thus threatens AIA’s epistemic integrity and ethical beneficence. (4) AIA should strive to achieve value transparency, critical scrutiny from inside and outside the discipline – involving the public –, and to empower actors without strong commercial interests.

Artificial Intelligence Safety Science and Values Representation in Artificial Intelligence
