Russell Stuart: Publications


Social Choice Should Guide AI Alignment in Dealing with Diverse Human Feedback
with Vincent Conitzer, Rachel Freedman, Jobst Heitzig, Wesley H. Holliday, Bob M. Jacobs, Nathan Lambert, Milan Mosse, Eric Pacuit, Hailey Schoelkopf, Emanuel Tewolde, and William S. Zwicker

Proceedings of the 41st International Conference on Machine Learning 41 9346-9360. 2024.

Foundation models such as GPT-4 are fine-tuned to avoid unsafe or otherwise problematic behavior, such as helping to commit crimes or producing racist text. One approach to fine-tuning, called reinforcement learning from human feedback, learns from humans' expressed preferences over multiple outputs. Another approach is constitutional AI, in which the input from humans is a list of high-level principles. But how do we deal with potentially diverging input from humans? How can we aggregate the input into consistent data about "collective" preferences or otherwise use it to make collective choices about model behavior? In this paper, we argue that the field of social choice is well positioned to address these questions, and we discuss ways forward for this agenda, drawing on discussions in a recent workshop on Social Choice for AI Ethics and Safety held in Berkeley, CA, USA in December 2023.

Reinforcement Learning; Artificial Intelligence Safety; Artificial Intelligence Methodology; Social Choice Theory, Misc; Ethics of Artificial Intelligence, Misc; Large Language Models

Artificial Intelligence
In S. Matthew Liao (ed.), Ethics of Artificial Intelligence, Oxford University Press. pp. 327-341. 2020.

This chapter argues that there is very little chance that we humans can specify our objectives completely and correctly, in such a way that the pursuit of those objectives by more capable machines is guaranteed to result in beneficial outcomes for humans. Consequently, this chapter defends and further articulates the need for “provably beneficial AI,” which is the idea that to the extent that human values are revealed in our behavior, we should be able to get machines to learn underlying human preferences from observing human behavior. It then discusses the technical challenges involved in building provably beneficial AI and responds to some possible concerns about this approach.

Rationality and Intelligence
In Renee Elio (ed.), Common sense, reasoning, & rationality, Oxford University Press. pp. 37-59. 2002.

This chapter considers how to formalize intelligence or rationality in a way that has value for the development of agents built for a specific application and of general theories of intelligence. It presents three candidates that traditionally have stood as formalizations of intelligence: perfect rationality, calculative rationality, and meta-level rationality. Perfect rationality is an abstraction that does not correspond to any physical reasoner. Calculative rationality fails to scale up to problems of sufficient and interesting complexity. Meta-level rationality pushes the problem into a never-ending regress. As an alternative, this chapter considers the notion of bounded optimality as a workable proxy for theorizing about machine intelligence. This notion rests on two crucial elements: that behaviors and decisions happen in real time and that an agent is defined by a particular (software and hardware) architecture and a particular program that runs on that architecture. Under this view, an agent is bounded optimal if it maximizes the utility of its behavior for a task within the demands of the environment. The chapter then elaborates on the role of adaptive, inductive mechanisms as the means for making gains in calculative and meta-level rationality for real-world application systems, and for bounded optimality more generally.

Artificial Intelligence: A Modern Approach
with Peter Norvig

Pearson. 2020.

"Updated edition of popular textbook on Artificial Intelligence. This edition specifically looks at ways of keeping artificial intelligence under control"--

AI content detection in the emerging information ecosystem: new obligations for media and tech companies
with Alistair Knott, Dino Pedreschi, Toshiya Jitsuzumi, Susan Leavy, David Eyers, Tapabrata Chakraborti, Andrew Trotman, Sundar Sundareswaran, Ricardo Baeza-Yates, Przemyslaw Biecek, Adrian Weller, Paul D. Teal, Subhadip Basu, Mehmet Haklidir, Virginia Morini, and Yoshua Bengio

Ethics and Information Technology 26 (4): 1-14. 2024.

The world is about to be swamped by an unprecedented wave of AI-generated content. We need reliable ways of identifying such content, to supplement the many existing social institutions that enable trust between people and organisations and ensure social resilience. In this paper, we begin by highlighting an important new development: providers of AI content generators have new obligations to support the creation of reliable detectors for the content they generate. These new obligations arise mainly from the EU’s newly finalised AI Act, but they are enhanced by the US President’s recent Executive Order on AI, and by several considerations of self-interest. These new steps towards reliable detection mechanisms are by no means a panacea—but we argue they will usher in a new adversarial landscape, in which reliable methods for identifying AI-generated content are commonly available. In this landscape, many new questions arise for policymakers. Firstly, if reliable AI-content detection mechanisms are available, who should be required to use them? And how should they be used? We argue that new duties arise for media companies, and for Web search companies, in the deployment of AI-content detectors. Secondly, what broader regulation of the tech ecosystem will maximise the likelihood of reliable AI-content detectors? We argue for a range of new duties, relating to provenance-authentication protocols, open-source AI generators, and support for research and enforcement. Along the way, we consider how the production of AI-generated content relates to ‘free expression’, and discuss the important case of content that is generated jointly by humans and AIs.

Computer Ethics

Correction: AI content detection in the emerging information ecosystem: new obligations for media and tech companies
with Alistair Knott, Dino Pedreschi, Toshiya Jitsuzumi, Susan Leavy, David Eyers, Tapabrata Chakraborti, Andrew Trotman, Sundar Sundareswaran, Ricardo Baeza-Yates, Przemyslaw Biecek, Adrian Weller, Paul D. Teal, Subhadip Basu, Mehmet Haklidir, Virginia Morini, and Yoshua Bengio

Ethics and Information Technology 26 (4): 1-2. 2024.

Computer Ethics

Generative AI models should include detection mechanisms as a condition for public release
with Alistair Knott, Dino Pedreschi, Raja Chatila, Tapabrata Chakraborti, Susan Leavy, Ricardo Baeza-Yates, David Eyers, Andrew Trotman, Paul D. Teal, Przemyslaw Biecek, and Yoshua Bengio

Ethics and Information Technology 25 (4): 1-7. 2023.

The new wave of ‘foundation models’—general-purpose generative AI models, for production of text (e.g., ChatGPT) or images (e.g., MidJourney)—represent a dramatic advance in the state of the art for AI. But their use also introduces a range of new risks, which has prompted an ongoing conversation about possible regulatory mechanisms. Here we propose a specific principle that should be incorporated into legislation: that any organization developing a foundation model intended for public use must demonstrate a reliable detection mechanism for the content it generates, as a condition of its public release. The detection mechanism should be made publicly available in a tool that allows users to query, for an arbitrary item of content, whether the item was generated (wholly or partly) by the model. In this paper, we argue that this requirement is technically feasible and would play an important role in reducing certain risks from new AI models in many domains. We also outline a number of options for the tool’s design, and summarize a number of points where further input from policymakers and researchers would be required.

Computer Ethics

Object identification: a Bayesian analysis with application to traffic surveillance
with Timothy Huang

Artificial Intelligence 103 (1-2): 77-93. 1998.

Optimal composition of real-time systems
with Shlomo Zilberstein

Artificial Intelligence 82 (1-2): 181-213. 1996.

Science, Logic, and Mathematics

Principles of metareasoning
with Eric Wefald

Artificial Intelligence 49 (1-3): 361-395. 1991.

Science, Logic, and Mathematics

A Logical Approach to Reasoning by Analogy
with Todd R. Davies

In John P. McDermott (ed.), Proceedings of the 10th International Joint Conference on Artificial Intelligence (IJCAI'87), Morgan Kaufmann Publishers. pp. 264-270. 1987.

We analyze the logical form of the domain knowledge that grounds analogical inferences and generalizations from a single instance. The form of the assumptions which justify analogies is given schematically as the "determination rule", so called because it expresses the relation of one set of variables determining the values of another set. The determination relation is a logical generalization of the different types of dependency relations defined in database theory. Specifically, we define determination as a relation between schemata of first order logic that have two kinds of free variables: (1) object variables and (2) what we call "polar" variables, which hold the place of truth values. Determination rules facilitate sound rule inference and valid conclusions projected by analogy from single instances, without implying what the conclusion should be prior to an inspection of the instance. They also provide a way to specify what information is sufficiently relevant to decide a question, prior to knowledge of the answer to the question.

Artificial Intelligence Methodology; New Riddle of Induction; Representation in Cognitive Science; Inductive Reasoning; Analogy in Science

Rationality and intelligence
Artificial Intelligence 94 (1-2): 57-77. 1997.

Science, Logic, and Mathematics

Rationality as an explanation of language?
Behavioral and Brain Sciences 10 (4): 730-731. 1987.

Philosophy of Cognitive Science; Rationality and Cognitive Science

Inductive learning by machines
Philosophical Studies 64 (1): 37-64. 1991.

Philosophy of Artificial Intelligence, Miscellaneous; Bertrand Russell

Rationality and Intelligence: A Brief Update
In Vincent C. Müller (ed.), Fundamental Issues of Artificial Intelligence, Springer. pp. 7-28. 2016.

The long-term goal of AI is the creation and understanding of intelligence. This requires a notion of intelligence that is precise enough to allow the cumulative development of robust systems and general results. The concept of rational agency has long been considered a leading candidate to fulfill this role. This paper, which updates a much earlier version (Russell, Artif Intell 94:57–77, 1997), reviews the sequence of conceptual shifts leading to a different candidate, bounded optimality, that is closer to our informal conception of intelligence and reduces the gap between theory and practice. Some promising recent developments are also described.
