Atoosa Kasirzadeh (Carnegie Mellon University): Publications

Carnegie Mellon University
Department of Philosophy

Assistant Professor

University of Toronto, St. George Campus

PhD, 2021

Pittsburgh, Pennsylvania, United States of America

ORCID: 0000-0002-5967-3782

Areas of Specialization

Philosophy of Artificial Intelligence

Ethics of Artificial Intelligence

General Philosophy of Science

Philosophy of Technology

Areas of Interest

Philosophical Traditions

Metaphysics and Epistemology

Value Theory

Philosophy of Mathematics

The Case for Globally Beneficial Technology
with Iason Gabriel

To whom do the fruits of advanced technological innovation belong? To their inventors, to the organizations and individuals involved in making such discoveries possible, or to still larger groups of people, potentially encompassing all of humanity? This question sits at the heart of the present investigation. The arguments developed here focus on an expansive reading of the entitlement to benefit from technological breakthroughs: we argue that they should be designed, developed, and distributed in ways that benefit everyone. This central claim, which encompasses technologies such as advanced forms of artificial intelligence, is grounded in an exploration of five moral arguments that involve human rights, beneficence, contingencies of birth, the global tree of knowledge, and global economic justice. Taken together, they underpin the argument for globally beneficial technologies.

Bridging the Gap in Responsible AI Divides
with Balint Gyevnar

Under Review. forthcoming.

Tensions between AI Safety (AIS) and AI Ethics (AIE) have increasingly surfaced in AI governance and public debates about AI, leading to what we term the “responsible AI divides.” We introduce a model that categorizes four modes of engagement with the tensions: radical confrontation, disengagement, compartmentalized coexistence, and critical bridging. We then investigate how critical bridging, with a particular focus on bridging problems, offers one of the most viable constructive paths for advancing responsible AI. Using computational tools to analyze a curated dataset of 3,550 papers, we map the research landscapes of AIE and AIS to identify both distinct and overlapping problems. Our findings point to both thematic divides and overlaps. For example, we find that AIE has long grappled with overcoming injustice and tangible AI harms, whereas AIS has primarily embodied an anticipatory approach focused on the mitigation of risks from AI capabilities. At the same time, we find significant overlap in core research concerns across both AIE and AIS around transparency, reproducibility, and inadequate governance mechanisms. As AIE and AIS continue to evolve, we recommend focusing on bridging problems as a constructive path forward for enhancing collaborative AI governance. We offer a series of recommendations to integrate shared considerations into a collaborative approach to responsible AI. Alongside our proposal, we highlight its limitations and explore open problems for future research.

Machine Learning; Philosophy of Artificial Intelligence, Miscellaneous; Impact of Artificial Intelligence; Areas of Artificial Intelligence; Ethics of Artificial Intelligence

Characterizing AI Agents for Alignment and Governance
with Iason Gabriel

Nature. forthcoming.

The creation of effective governance mechanisms for AI agents requires a deeper understanding of their core properties and how these properties relate to questions surrounding the deployment and operation of agents in the world. This paper provides a characterization of AI agents that focuses on four dimensions: autonomy, efficacy, goal complexity, and generality. We propose different gradations for each dimension, and argue that each dimension raises unique questions about the design, operation, and governance of these systems. Moreover, we draw upon this framework to construct “agentic profiles” for different kinds of AI agents. These profiles help to illuminate cross-cutting technical and non-technical governance challenges posed by different classes of AI agents, ranging from narrow task-specific assistants to highly autonomous general-purpose systems. By mapping out key axes of variation and continuity across four dimensions, agentic profiles provide developers, policymakers, and members of the public with the opportunity to develop governance approaches that better align with collective societal goals.

Ethics of Artificial Intelligence; Impact of Artificial Intelligence; Philosophy of Artificial Intelligence, Miscellaneous; Areas of Artificial Intelligence; Machine Learning

Contemporary Debates in the Ethics of Artificial Intelligence (edited book)
with Sven Nyholm and John Zerilli

Wiley-Blackwell. 2026.

_A cutting-edge selection of current issues and explorations of the ethics of artificial intelligence_

As artificial intelligence continues to influence virtually every facet of modern life, _Contemporary Debates in the Ethics of Artificial Intelligence_ offers a timely and rigorous examination of the field's most pressing questions. Equally useful in the classroom or as a reference for interdisciplinary research, this volume fosters informed and critical engagement with the ethical dimensions of artificial intelligence in today's world.

Curated by renowned scholars Sven Nyholm, Atoosa Kasirzadeh, and John Zerilli, _Contemporary Debates in the Ethics of Artificial Intelligence_ brings together a dynamic mix of established leaders and emerging voices from both philosophy and computer science. The result is a uniquely structured collection of debates that not only introduces key concepts (such as agency, moral status, and value alignment) but also challenges readers to engage deeply with controversies around bias, transparency, and the societal risks posed by AI technologies.

Providing frameworks for engaging responsibly with current and future AI technologies, _Contemporary Debates in the Ethics of Artificial Intelligence:_
- Presents a dual-perspective debate format that fosters critical thinking and comparative analysis
- Includes both foundational conceptual discussions and cutting-edge applied ethical issues
- Features original contributions from interdisciplinary experts in philosophy, law, cognitive science, and computer science
- Addresses timely topics such as algorithmic bias, opacity, value alignment, and the moral status of AI
- Explores forward-looking concerns, including the future of AI governance and long-term existential risks

_Contemporary Debates in the Ethics of Artificial Intelligence_ is ideal for undergraduate, advanced undergraduate, and graduate-level courses in philosophy, computer science, public policy, and related disciplines. It is well-suited for courses such as Ethics of Artificial Intelligence, Technology and Society, and Digital Ethics in philosophy, computer science, political science, international relations, and data science programs.

Beyond model interpretability: socio-structural explanations in machine learning
with Andrew Smart

AI and Society 40 (4): 2045-2053. 2025.

What is it to interpret the outputs of an opaque machine learning model? One approach is to develop interpretable machine learning techniques. These techniques aim to show how machine learning models function by providing either model-centric local or global explanations, which can be based on mechanistic interpretations (revealing the inner working mechanisms of models) or non-mechanistic approximations (showing input feature–output data relationships). In this paper, we draw on social philosophy to argue that interpreting machine learning outputs in certain normatively salient domains could require appealing to a third type of explanation that we call “socio-structural” explanation. The relevance of this explanation type is motivated by the fact that machine learning models are not isolated entities but are embedded within and shaped by social structures. Socio-structural explanations aim to illustrate how social structures contribute to and partially explain the outputs of machine learning models. We demonstrate the importance of socio-structural explanations by examining a racially biased healthcare allocation algorithm. Our proposal highlights the need for transparency beyond model interpretability: understanding the outputs of machine learning systems could require a broader analysis that extends beyond the understanding of the machine learning model itself.

Philosophy of Artificial Intelligence; Artificial Intelligence in Science

Two Types of AI Existential Risk: Decisive and Accumulative
Philosophical Studies 182 (7): 1975-2003. 2025.

The conventional discourse on existential risks (x-risks) from AI typically focuses on abrupt, dire events caused by advanced AI systems, particularly those that might achieve or surpass human-level intelligence. These events have severe consequences that either lead to human extinction or irreversibly cripple human civilization to a point beyond recovery. This decisive view, however, often neglects the serious possibility of AI x-risk manifesting gradually through an incremental series of smaller yet interconnected disruptions, crossing critical thresholds over time. This paper contrasts the conventional decisive AI x-risk hypothesis with what I call an accumulative AI x-risk hypothesis. While the former envisions an overt AI takeover pathway, characterized by scenarios like uncontrollable superintelligence, the latter suggests a different pathway to existential catastrophes. This involves a gradual accumulation of AI-induced threats such as severe vulnerabilities and systemic erosion of critical economic and political structures. The accumulative hypothesis suggests a boiling frog scenario where incremental AI risks slowly undermine systemic and societal resilience until a triggering event results in irreversible collapse. Through complex systems analysis, this paper examines the distinct assumptions differentiating these two hypotheses. It is then argued that the accumulative view can reconcile seemingly incompatible perspectives on AI risks. The implications of differentiating between the two types of pathway—the decisive and the accumulative—for the governance of AI as well as long-term AI safety are discussed.

Computer Ethics; Philosophy of Artificial Intelligence; Philosophy of Technology

Beyond model interpretability: socio-structural explanations in machine learning
with Andrew Smart

AI and Society 1-9. forthcoming.

What is it to interpret the outputs of an opaque machine learning model? One approach is to develop interpretable machine learning techniques. These techniques aim to show how machine learning models function by providing either model-centric local or global explanations, which can be based on mechanistic interpretations (revealing the inner working mechanisms of models) or non-mechanistic approximations (showing input feature–output data relationships). In this paper, we draw on social philosophy to argue that interpreting machine learning outputs in certain normatively salient domains could require appealing to a third type of explanation that we call “socio-structural” explanation. The relevance of this explanation type is motivated by the fact that machine learning models are not isolated entities but are embedded within and shaped by social structures. Socio-structural explanations aim to illustrate how social structures contribute to and partially explain the outputs of machine learning models. We demonstrate the importance of socio-structural explanations by examining a racially biased healthcare allocation algorithm. Our proposal highlights the need for transparency beyond model interpretability: understanding the outputs of machine learning systems could require a broader analysis that extends beyond the understanding of the machine learning model itself.

Philosophy of Artificial Intelligence

Explanation Hacking: The perils of algorithmic recourse
with E. Sullivan

In Juan Manuel Durán & Giorgia Pozzi (eds.), Philosophy of science for machine learning: Core issues and new perspectives, Springer. forthcoming.

We argue that the trend toward providing users with feasible and actionable explanations of AI decisions—known as recourse explanations—comes with ethical downsides. Specifically, we argue that recourse explanations face several conceptual pitfalls and can lead to problematic explanation hacking, which undermines their ethical status. As an alternative, we advocate that explanations of AI decisions should aim at understanding.

Understanding; Interpretability in Artificial Intelligence; Social Epistemology; Explainability in Artificial Intelligence

Intelligent capacities in artificial systems
with Victoria McGeer

In William A. Bauer & Anna Marmodoro (eds.), Artificial Dispositions: Investigating Ethical and Metaphysical Issues, Bloomsbury Academic. 2023.

This paper investigates the nature of dispositional properties in the context of artificial intelligence systems. We start by examining the distinctive features of natural dispositions according to criteria introduced by McGeer (2018) for distinguishing between object-centered dispositions (i.e., properties like ‘fragility’) and agent-based abilities, including both ‘habits’ and ‘skills’ (a.k.a. ‘intelligent capacities’, Ryle 1949). We then explore to what extent the distinction applies to artificial dispositions in the context of two very different kinds of artificial systems, one based on rule-based classical logic and the other on reinforcement learning. Here we defend three substantive claims. First, we argue that artificial systems are not equal in the kinds of dispositional properties they instantiate. In particular, we show that logical systems instantiate merely object-centered dispositions whereas reinforcement learning systems allow for the instantiation of agent-based abilities. Second, we explore the similarities and differences between the agent-centered abilities of artificial systems and those of humans, especially as relates to the important distinction made in the human case between habits and skills/intelligent capacities. The upshot is that the agent-centered abilities of truly intelligent artificial systems are distinctive enough to constitute a third type of agent-based ability — blended agent-based ability — raising substantial questions as to how we understand the nature of their agency. Third, we explore one aspect of this problem, focussing on whether systems of this type are properly considered ‘responsible agents’, at least in some contexts and for some purposes. The ramifications of our analysis will turn out to be directly relevant to various ethical concerns of artificial intelligence.

Metaphysics and Epistemology; Representation in Artificial Intelligence; Agency and Artificial Intelligence

Algorithmic Fairness and Structural Injustice: Insights from Feminist Political Philosophy
AIES '22: Proceedings of the 2022 AAAI/ACM Conference on AI, Ethics, and Society. 2022.

Data-driven predictive algorithms are widely used to automate and guide high-stakes decision making such as bail and parole recommendation, medical resource distribution, and mortgage allocation. Nevertheless, harmful outcomes biased against vulnerable groups have been reported. The growing research field known as 'algorithmic fairness' aims to mitigate these harmful biases. Its primary methodology consists in proposing mathematical metrics to address the social harms resulting from an algorithm's biased outputs. The metrics are typically motivated by -- or substantively rooted in -- ideals of distributive justice, as formulated by political and legal philosophers. The perspectives of feminist political philosophers on social justice, by contrast, have been largely neglected. Some feminist philosophers have criticized the local scope of the paradigm of distributive justice and have proposed corrective amendments to surmount its limitations. The present paper brings some key insights of feminist political philosophy to algorithmic fairness. The paper has three goals. First, I show that algorithmic fairness does not accommodate structural injustices in its current scope. Second, I defend the relevance of structural injustices -- as pioneered in the contemporary philosophical literature by Iris Marion Young -- to algorithmic fairness. Third, I take some steps in developing the paradigm of 'responsible algorithmic fairness' to correct for errors in the current scope and implementation of algorithmic fairness. I close with some reflections on directions for future research.

Ethics of Artificial Intelligence, Misc; Algorithmic Fairness

In Conversation with Artificial Intelligence: Aligning Language Models with Human Values
Philosophy and Technology 36 (2): 1-24. 2023.

Large-scale language technologies are increasingly used in various forms of communication with humans across different contexts. One particular use case for these technologies is conversational agents, which output natural language text in response to prompts and queries. This mode of engagement raises a number of social and ethical questions. For example, what does it mean to align conversational agents with human norms or values? Which norms or values should they be aligned with? And how can this be accomplished? In this paper, we propose a number of steps that help answer these questions. We start by developing a philosophical analysis of the building blocks of linguistic communication between conversational agents and human interlocutors. We then use this analysis to identify and formulate ideal norms of conversation that can govern successful linguistic communication between humans and conversational agents. Furthermore, we explore how these norms can be used to align conversational agents with human values across a range of different discursive domains. We conclude by discussing the practical implications of our proposal for the design of conversational agents that are aligned with these norms and values.

Philosophy of Technology; Philosophy of Artificial Intelligence

The Use and Misuse of Counterfactuals in Ethical Machine Learning
with Andrew Smart

In Atoosa Kasirzadeh & Andrew Smart (eds.), ACM Conference on Fairness, Accountability, and Transparency (FAccT 21). 2021.

The use of counterfactuals for considerations of algorithmic fairness and explainability is gaining prominence within the machine learning community and industry. This paper argues for more caution with the use of counterfactuals when the facts to be considered are social categories such as race or gender. We review a broad body of papers from philosophy and social sciences on social ontology and the semantics of counterfactuals, and we conclude that the counterfactual approach in machine learning fairness and social explainability can require an incoherent theory of what social categories are. Our findings suggest that most often the social categories may not admit counterfactual manipulation, and hence may not appropriately satisfy the demands for evaluating the truth or falsity of counterfactuals. This is important because the widespread use of counterfactuals in machine learning can lead to misleading results when applied in high-stakes domains. Accordingly, we argue that even though counterfactuals play an essential part in some causal inferences, their use for questions of algorithmic fairness and social explanations can create more problems than they resolve. Our positive result is a set of tenets about using counterfactuals for fairness and explanations in machine learning.

Philosophy of Technology; Conditionals; Algorithmic Fairness; Machine Ethics

Boyer-Kassem et al.'s Scientific Collaboration and Collective Knowledge (review)
BJPS Review of Books. 2018.

Science, Logic, and Mathematics

Counter Countermathematical Explanations
Erkenntnis 88 (6): 2537-2560. 2021.

Recently, there have been several attempts to generalize the counterfactual theory of causal explanations to mathematical explanations. The central idea of these attempts is to use conditionals whose antecedents express a mathematical impossibility. Such countermathematical conditionals are plugged into the explanatory scheme of the counterfactual theory and—so is the hope—capture mathematical explanations. Here, I dash the hope that countermathematical explanations simply parallel counterfactual explanations. In particular, I show that explanations based on countermathematicals are susceptible to three problems counterfactual explanations do not face. These problems seriously challenge the prospects for a counterfactual theory of explanation that is meant to cover mathematical explanations.

Science, Logic, and Mathematics

The Ethical Gravity Thesis: Marrian Levels and the Persistence of Bias in Automated Decision-making Systems
with Colin Klein

Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society (AIES '21). 2021.

Computers are used to make decisions in an increasing number of domains. There is widespread agreement that some of these uses are ethically problematic. Far less clear is where ethical problems arise, and what might be done about them. This paper expands and defends the Ethical Gravity Thesis: ethical problems that arise at higher levels of analysis of an automated decision-making system are inherited by lower levels of analysis. Particular instantiations of systems can add new problems, but not ameliorate more general ones. We defend this thesis by adapting Marr’s famous 1982 framework for understanding information-processing systems. We show how this framework allows one to situate ethical problems at the appropriate level of abstraction, which in turn can be used to target appropriate interventions.

Ethics of Artificial Intelligence, Misc

Algorithmic and human decision making: for a double standard of transparency
with Mario Günther

AI and Society 37 (1): 375-381. 2022.

Should decision-making algorithms be held to higher standards of transparency than human beings? The way we answer this question directly impacts what we demand from explainable algorithms, how we govern them via regulatory proposals, and how explainable algorithms may help resolve the social problems associated with decision making supported by artificial intelligence. Some argue that algorithms and humans should be held to the same standards of transparency and that a double standard of transparency is hardly justified. We give two arguments to the contrary and specify two kinds of situations for which higher standards of transparency are required from algorithmic decisions as compared to humans. Our arguments have direct implications on the demands from explainable algorithms in decision-making contexts such as automated transportation.

Philosophy of Artificial Intelligence; Value Theory

A New Role for Mathematics in Empirical Sciences
Philosophy of Science 88 (4): 686-706. 2021.

Mathematics is often taken to play one of two roles in the empirical sciences: either it represents empirical phenomena or it explains these phenomena by imposing constraints on them. This article identifies a third and distinct role that has not been fully appreciated in the literature on applicability of mathematics and may be pervasive in scientific practice. I call this the “bridging” role of mathematics, according to which mathematics acts as a connecting scheme in our explanatory reasoning about why and how two different descriptions of an empirical phenomenon relate to each other. I discuss two bridging roles appearing in biological and physical explanations.

Scientific Practice Mathematical Explanation Scientific Representation Explanation in Biology Models and Explanation Mathematical Structure of Quantum Mechanics

Otavio Bueno and Steven French. Applying Mathematics: Immersion, Inference, Interpretation
with James Robert Brown

Philosophy of Science 87 (1): 207-211. 2020.

Science, Logic, and Mathematics

Atoosa Kasirzadeh

The Case for Globally Beneficial Technology
with Iason Gabriel

Bridging the Gap in Responsible AI Divides
with Balint Gyevnar

Under Review. forthcoming.

Characterizing AI Agents for Alignment and Governance
with Iason Gabriel

Nature. forthcoming.

Contemporary Debates in the Ethics of Artificial Intelligence (edited book)
with Sven Nyholm and John Zerilli

Wiley-Blackwell. 2026.

Beyond model interpretability: socio-structural explanations in machine learning
with Andrew Smart

AI and Society 40 (4): 2045-2053. 2025.

Two Types of AI Existential Risk: Decisive and Accumulative
Philosophical Studies 182 (7): 1975-2003. 2025.

Beyond model interpretability: socio-structural explanations in machine learning
with Andrew Smart

AI and Society 1-9. forthcoming.

Explanation Hacking: The perils of algorithmic recourse
with E. Sullivan

In Juan Manuel Durán & Giorgia Pozzi (eds.), Philosophy of science for machine learning: Core issues and new perspectives, Springer. forthcoming.

Intelligent capacities in artificial systems
with Victoria McGeer

In William A. Bauer & Anna Marmodoro (eds.), Artificial Dispositions: Investigating Ethical and Metaphysical Issues, Bloomsbury Academic. 2023.

Algorithmic Fairness and Structural Injustice: Insights from Feminist Political Philosophy
AIES '22: Proceedings of the 2022 AAAI/ACM Conference on AI, Ethics, and Society. 2022.

In Conversation with Artificial Intelligence: Aligning Language Models with Human Values
Philosophy and Technology 36 (2): 1-24. 2023.

The Use and Misuse of Counterfactuals in Ethical Machine Learning
with Andrew Smart

In Atoosa Kasirzadeh & Andrew Smart (eds.), ACM Conference on Fairness, Accountability, and Transparency (FAccT 21). 2021.

Boyer-Kassem et al.'s Scientific Collaboration and Collective Knowledge (review)
BJPS Review of Books. 2018.

Counter Countermathematical Explanations
Erkenntnis 88 (6): 2537-2560. 2021.

The Ethical Gravity Thesis: Marrian Levels and the Persistence of Bias in Automated Decision-making Systems
with Colin Klein

Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society (AIES '21). 2021.

Algorithmic and human decision making: for a double standard of transparency
with Mario Günther

AI and Society 37 (1): 375-381. 2022.

A New Role for Mathematics in Empirical Sciences
Philosophy of Science 88 (4): 686-706. 2021.

Otavio Bueno and Steven French. Applying Mathematics: Immersion, Inference, Interpretation
with James Robert Brown

Philosophy of Science 87 (1): 207-211. 2020.
