  •
    Artificially Intelligent (AI) systems have ushered in a transformative era across various domains, yet their inherent traits of unpredictability, unexplainability, and uncontrollability have given rise to concerns surrounding AI safety. This paper aims to demonstrate the infeasibility of accurately monitoring advanced AI systems to predict the emergence of certain capabilities prior to their manifestation. Through an analysis of the intricacies of AI systems, the boundaries of human comprehensio…
  •
    Ikigai is a Japanese concept, which, in brief, refers to the “reason or purpose to live”. I-risks have been identified as a category of risks complementing x-risks, i.e., existential risks, and s-risks, i.e., suffering risks, which describes undesirable future scenarios in which humans are deprived of the pursuit of their individual ikigai. While some developments in AI increase i-risks, there are also AI-driven virtual opportunities, which reduce i-risks by increasing the space of potential ik…
  •
    Many researchers have conjectured that humankind is simulated along with the rest of the physical universe – the Simulation Hypothesis. In this paper, we do not evaluate evidence for or against such a claim, but instead ask a computer science question, namely: Can we hack the simulation? More formally, the question could be phrased as: Could generally intelligent agents placed in virtual environments find a way to jailbreak out of them? Given that the state-of-the-art literature on AI containment…
  •
    The Technological Singularity: Managing the Journey (edited book)
    with Stuart Armstrong, Victor Callaghan, and James Miller
    Imprint: Springer. 2017.
    This volume contains a selection of authoritative essays exploring the central questions raised by the conjectured technological singularity. In informed yet jargon-free contributions written by active research scientists, philosophers and sociologists, it goes beyond philosophical discussion to provide a detailed account of the risks that the singularity poses to human society and, perhaps most usefully, the possible actions that society and technologists can take to manage the journey to any s…
  •
    The problem of personal identity is a longstanding philosophical topic, albeit one without final consensus. In this article, the somewhat similar problem of AI identity is discussed, which has not gained much traction yet, although this investigation is increasingly relevant for different fields, such as ownership issues, personhood of AI, AI welfare, brain–machine interfaces, the distinction between singletons and multi-agent systems, as well as to potentially support finding a solution to the problem…
  •
    To hold developers responsible, it is important to establish the concept of AI ownership. In this paper we review different obstacles to ownership claims over advanced intelligent systems, including unexplainability, unpredictability, uncontrollability, self-modification, AI rights, ease of theft of AI models, and code obfuscation. We conclude that it is difficult, if not impossible, to establish ownership claims over AI models beyond a reasonable doubt.
  •
    Understanding and Avoiding AI Failures: A Practical Guide
    with Robert Williams
    Philosophies 6 (3): 53. 2021.
    As AI technologies increase in capability and ubiquity, AI accidents are becoming more common. Based on normal accident theory, high reliability theory, and open systems theory, we create a framework for understanding the risks associated with AI applications. This framework is designed to direct attention to pertinent system properties without requiring unwieldy amounts of accuracy. In addition, we use AI safety principles to quantify the unique risks of increased intelligence and human-li…
  •
    In this work, we survey skepticism regarding AI risk and show parallels with other types of scientific skepticism. We start by classifying different types of AI risk skepticism and analyzing their root causes. We conclude by suggesting some intervention approaches, which may be successful in reducing AI risk skepticism, at least amongst artificial intelligence researchers.
  •
    Do No Harm Policy for Minds in Other Substrates
    Journal of Ethics and Emerging Technologies 29 (2): 1-11. 2019.
    Various authors have argued that in the future not only will it be technically feasible for human minds to be transferred to other substrates, but this will become, for most humans, the preferred option over the current biological limitations. It has even been claimed that such a scenario is inevitable in order to solve the challenging, but imperative, multi-agent value alignment problem. In all these considerations, it has been overlooked that, in order to create a suitable environment for a pa…
  •
    Transdisciplinary AI Observatory—Retrospective Analyses and Future-Oriented Contradistinctions
    with Nadisha-Marie Aliman and Leon Kester
    Philosophies 6 (1): 6. 2021.
    In recent years, artificial intelligence (AI) safety gained international recognition in light of heterogeneous safety-critical and ethical issues that risk overshadowing the broad beneficial impacts of AI. In this context, the implementation of AI observatory endeavors represents one key research direction. This paper motivates the need for an inherently _transdisciplinary_ AI observatory approach integrating diverse retrospective and counterfactual views. We delineate aims and limitation…
  •
    An AGI Modifying Its Utility Function in Violation of the Strong Orthogonality Thesis
    with James D. Miller and Olle Häggström
    Philosophies 5 (4): 40. 2020.
    An artificial general intelligence (AGI) might have an instrumental drive to modify its utility function to improve its ability to cooperate, bargain, promise, threaten, and resist and engage in blackmail. Such an AGI would necessarily have a utility function that was at least partially observable and that was influenced by how other agents chose to interact with it. This instrumental drive would conflict with the strong orthogonality thesis since the modifications would be influenced by the AGI…
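    As a minimal sketch of the mechanism described above (my own toy model, not the authors' formalization), the following Python snippet shows how an agent could make a threat credible by rewriting its own utility function, so that the function is no longer the fixed object the strong orthogonality thesis presumes:

    ```python
    # Toy model (not from the paper): an agent rewrites its own utility
    # function so that carrying out a threat becomes utility-maximizing.
    # If the modified function is observable to other agents, the threat
    # becomes credible, which is the instrumental drive discussed above.

    class Agent:
        def __init__(self):
            # Original utilities: following through on a costly threat is bad.
            self.utility = {"carry_out_threat": -10, "back_down": 0}

        def commit_to_threat(self):
            # Self-modification: make backing down even worse than following
            # through, so the threat is now the rational action.
            self.utility["back_down"] = -100

        def best_action(self) -> str:
            return max(self.utility, key=self.utility.get)

    agent = Agent()
    print(agent.best_action())  # back_down
    agent.commit_to_threat()
    print(agent.best_action())  # carry_out_threat
    ```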
  •
    The invention of artificial general intelligence is predicted to cause a shift in the trajectory of human civilization. In order to reap the benefits and avoid the pitfalls of such powerful technology, it is important to be able to control it. However, the possibility of controlling artificial general intelligence and its more advanced version, superintelligence, has not been formally established. In this paper, we present arguments as well as supporting evidence from multiple domains indicating that advance…
  •
    The terms Artificial General Intelligence (AGI) and Human-Level Artificial Intelligence (HLAI) have been used interchangeably to refer to the Holy Grail of Artificial Intelligence (AI) research: the creation of a machine capable of achieving goals in a wide range of environments. However, the widespread implicit assumption of equivalence between the capabilities of AGI and HLAI appears to be unjustified, as humans are not general intelligences. In this paper, we will prove this distinction.
  •
    Two interconnected surveys are presented, one of artifacts and one of designometry. Artifacts are objects that have an originator and do not exist in nature. Designometry is a new field of study that aims to identify the originators of artifacts. The space of artifacts is described, as are the domains that already pursue designometry, though currently without collaboration or common methodologies. On this basis, synergies as well as a generic axiom and heuristics for the quest of the creators…
  •
    When Should Co-Authorship Be Given to AI?
    with G. P. Transformer Jr, End X. Note, and M. S. Spellchecker
    If an AI makes a significant contribution to a research paper, should it be listed as a co-author? The current guidelines in the field have been created to reduce duplication of credit between two different authors in scientific articles. A new computer program could be identified and credited for its impact in an AI research paper that discusses an early artificial intelligence system which is currently under development at Lawrence Berkeley National. One way to imagine the future of artificial…
  •
    Safety Engineering for Artificial General Intelligence
    with Joshua Fox
    Topoi 32 (2): 217-226. 2013.
    Machine ethics and robot rights are quickly becoming hot topics in artificial intelligence and robotics communities. We will argue that attempts to attribute moral agency and assign rights to all intelligent machines are misguided, whether applied to infrahuman or superhuman AIs, as are proposals to limit the negative effects of AIs by constraining their behavior. As an alternative, we propose a new science of safety engineering for intelligent artificial agents based on maximizing for what huma…
  •
    Responses to Catastrophic AGI Risk: A Survey
    with Kaj Sotala
    Physica Scripta 90. 2015.
    Many researchers have argued that humanity will create artificial general intelligence (AGI) within the next twenty to one hundred years. It has been suggested that AGI may inflict serious damage to human well-being on a global scale ('catastrophic risk'). After summarizing the arguments for why AGI may pose such a risk, we review the field's proposed responses to AGI risk. We consider societal proposals, proposals for external constraints on AGI behaviors and proposals for creating AGIs that ar…
  •
    Long-Term Trajectories of Human Civilization
    with Seth D. Baum, Stuart Armstrong, Timoteus Ekenstedt, Olle Häggström, Robin Hanson, Karin Kuhlemann, Matthijs M. Maas, James D. Miller, Markus Salmela, Anders Sandberg, Kaj Sotala, Phil Torres, and Alexey Turchin
    Foresight 21 (1): 53-83. 2019.
    Purpose: This paper aims to formalize long-term trajectories of human civilization as a scientific and ethical field of study. The long-term trajectory of human civilization can be defined as the path that human civilization takes during the entire future time period in which human civilization could continue to exist. Design/methodology/approach: This paper focuses on four types of trajectories: status quo trajectories, in which human civilization persists in a state broadly similar to its curren…
  •
    One of the possible solutions to the Fermi paradox is that all civilizations go extinct because they hit some Late Great Filter. Such a universal Late Great Filter must be an unpredictable event that all civilizations unexpectedly encounter, even if they try to escape extinction. This is similar to the “Death in Damascus” paradox from decision theory. This unpredictable Late Great Filter could, however, be escaped by choosing a random strategy for humanity’s future development. However, if all ci…
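    A toy Monte Carlo sketch of the escape argument (my own construction, under the simplifying assumption that the Filter strikes whichever development path it predicts): a deterministic civilization is always caught, while a genuinely randomizing one survives about half the time.

    ```python
    import random

    # Toy model, illustrative only: the Filter strikes the predicted path.
    # A fixed policy is perfectly predictable; a fair coin is not, so the
    # predictor's guess is independent and correct only half the time.
    PATHS = ("expand", "hide")
    TRIALS = 100_000

    def survival_rate(randomize: bool) -> float:
        survived = 0
        for _ in range(TRIALS):
            if randomize:
                choice = random.choice(PATHS)      # humanity flips a coin
                prediction = random.choice(PATHS)  # true randomness is unpredictable
            else:
                choice = "expand"                  # fixed, foreseeable policy
                prediction = "expand"              # perfect prediction
            if choice != prediction:               # the Filter misses this path
                survived += 1
        return survived / TRIALS

    print("deterministic:", survival_rate(False))  # 0.0
    print("randomizing:  ", survival_rate(True))   # ~0.5
    ```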
  •
    Boltzmann brains (BBs) are minds which randomly appear as a result of thermodynamic or quantum fluctuations. In this article, the question of whether we are BBs, and the observational consequences if so, is explored. To address this problem, a typology of BBs is created, and the evidence is compared with the Simulation Argument. Based on this comparison, we conclude that while the existence of a “normal” BB is either unlikely or irrelevant, BBs with some ordering may have observable consequ…
  •
    Explainability and comprehensibility of AI are important requirements for intelligent systems deployed in real-world domains. Users want and frequently need to understand how decisions impacting them are made. Similarly, it is important to understand how an intelligent system functions for safety and security reasons. In this paper, we describe two complementary impossibility results (Unexplainability and Incomprehensibility), essentially showing that advanced AIs would not be able to accurately…
  •
    The young field of AI Safety is still in the process of identifying its challenges and limitations. In this paper, we formally describe one such impossibility result, namely the Unpredictability of AI. We prove that it is impossible to precisely and consistently predict what specific actions a smarter-than-human intelligent system will take to achieve its objectives, even if we know the terminal goals of the system. In conclusion, the impact of Unpredictability on AI Safety is discussed.
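    A compact restatement of the claim in symbols (the notation P, S, G, a* below is my own shorthand, not the paper's):

    ```latex
    % Paraphrase of the Unpredictability result; notation is mine.
    % For every predictor P there is a strictly smarter system S such that,
    % even given S's terminal goals G, P cannot consistently name the
    % specific action a* that S actually takes:
    \forall P \;\exists S \succ P :\quad
      \Pr\big[\, P(S, G) = a^{*}_{S,G} \,\big] < 1
    ```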
  •
    In the last decade, an urban legend about “glitches in the matrix” has become popular. As is typical for urban legends, there is no evidence for most such stories, and the phenomenon could be explained as resulting from hoaxes, creepypasta, coincidence, and different forms of cognitive bias. In addition, the folk understanding of probability does not bear much resemblance to actual probability distributions, resulting in the illusion of improbable events, like the “birthday paradox”…
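    The birthday paradox mentioned above is easy to verify exactly; a minimal Python sketch (standard 365-day idealization):

    ```python
    # Exact probability that at least two of n people share a birthday,
    # assuming 365 equally likely birthdays and no leap years.
    def birthday_collision(n: int) -> float:
        p_distinct = 1.0
        for k in range(n):
            p_distinct *= (365 - k) / 365
        return 1.0 - p_distinct

    for n in (10, 23, 50):
        print(n, round(birthday_collision(n), 3))
    # With just 23 people the probability already exceeds 0.5 (~0.507),
    # illustrating how folk intuitions about "improbable" events mislead.
    ```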
  •
    The goal of the article is to explore the most probable type of simulation in which humanity lives (if any) and how this affects simulation termination risks. We first explore, based on purely theoretical reasoning, what kind of simulation humanity is most likely located in. We suggest a new patch to the classical simulation argument, showing that we are likely simulated not by our own descendants but by alien civilizations. Based on this, we provide a classification o…
  •
    One of the primary, if not the most critical, difficulties in the design and implementation of autonomous systems is the black-boxed nature of their decision-making structures and logical pathways. How human values are embodied and actualised in situ may ultimately prove to be harmful if not outright recalcitrant. For this reason, the values of stakeholders become of particular significance given the risks posed by opaque structures of intelligent agents (IAs). This paper explores how decision matrix…
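    The abstract does not show the matrix itself; as a hypothetical illustration of how a weighted stakeholder decision matrix can make an agent's value trade-offs explicit (all names, weights, and scores below are invented), a short Python sketch:

    ```python
    # Hypothetical weighted decision matrix (all stakeholder values,
    # weights, and scores below are invented for illustration; the paper
    # itself does not publish this example).
    values = ("safety", "privacy", "utility")
    weights = {"safety": 0.5, "privacy": 0.3, "utility": 0.2}

    # Each candidate action is scored 0-10 against each stakeholder value.
    actions = {
        "deploy_now":        {"safety": 3, "privacy": 4, "utility": 9},
        "deploy_with_audit": {"safety": 7, "privacy": 6, "utility": 7},
        "do_not_deploy":     {"safety": 9, "privacy": 9, "utility": 1},
    }

    def weighted_score(scores: dict) -> float:
        # Aggregate an action's scores using the stakeholder weights.
        return sum(weights[v] * scores[v] for v in values)

    for action, scores in actions.items():
        print(f"{action}: {weighted_score(scores):.2f}")
    print("selected:", max(actions, key=lambda a: weighted_score(actions[a])))
    ```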
  •
    Artificial Intelligence Safety and Security (edited book)
    CRC Press. 2018.
    This book addresses different aspects of the AI control problem as it relates to the development of safe and secure artificial intelligence. It is the first to address the challenges of constructing safe and secure artificially intelligent systems.
  •
    Leakproofing the Singularity: Artificial Intelligence Confinement Problem
    Journal of Consciousness Studies 19 (1-2): 194-214. 2012.
    This paper attempts to formalize and to address the 'leakproofing' of the Singularity problem presented by David Chalmers. The paper begins with the definition of the Artificial Intelligence Confinement Problem. After analysis of existing solutions and their shortcomings, a protocol is proposed aimed at making a more secure confinement environment which might delay the potential negative effects of the technological singularity while allowing humanity to benefit from the superintelligence.
  •
    The technological singularity (edited book)
    with Jim Miller, Stuart Armstrong, and Vic Callaghan
    2015.
    "The idea that human history is approaching a singularity - that ordinary humans will someday be overtaken by artificially intelligent machines or cognitively enhanced biological intelligence, or both - has moved from the realm of science fiction to serious debate. Some singularity theorists predict that if the field of artificial intelligence continues to develop at its current dizzying rate, the singularity could come about in the middle of the present century. Murray Shanahan offers an introd…Read more