Orthogonality-based disentanglement of responsibilities for ethical intelligent systems

Nadisha Marie Aliman, Utrecht University
Leon Kester, Nederlandse Organisatie voor toegepast natuurwetenschappelijk onderzoek (TNO)
Peter Werkhoven, Utrecht University
Roman Yampolskiy, University of Louisville

Abstract

In recent years, the implementation of meaningfully controllable advanced intelligent systems whose goals are aligned with ethical values as specified by human entities has emerged as a key subject of investigation of international relevance across diverse AI-related research areas. In this paper, we present a novel transdisciplinary and Systems Engineering-oriented approach, denoted “orthogonality-based disentanglement”, which jointly tackles the underlying control problem and value alignment problem while unraveling the corresponding responsibilities of different stakeholders. The approach rests on the distinction between two orthogonal axes: one assigned to the problem-solving ability of these intelligent systems, and the other to the ethical abilities they exhibit based on quantitatively encoded human values. Moreover, we introduce the notion of explicitly formulated ethical goal functions, ideally encoding what humans should want, and exemplify a possible class of “self-aware” intelligent systems with the capability to reliably adhere to these human-defined goal functions. Beyond that, we discuss an attainable transformative socio-technological feedback loop that could result from the introduced orthogonality-based disentanglement approach and briefly elaborate on how the framework additionally provides valuable hints regarding the coordination subtask in AI Safety. Finally, we point out remaining crucial challenges as an incentive for future work.