A principal wants to deploy an artificial intelligence (AI) system to perform some task. But the AI may be misaligned and pursue a conflicting objective. The principal cannot restrict its options or deliver punishments. Instead, the principal can (i) simulate the task in a testing environment and (ii) impose imperfect recall on the AI, obscuring whether the task being performed is real or part of a test. By committing to a testing mechanism, the principal can screen the misaligned AI during test…
Oxford, England, United Kingdom of Great Britain and Northern Ireland
Areas of Specialization
| Normative Ethics |
| Social Choice Theory |
| Decision Theory |
| Formal Epistemology |
| Value Theory |