Controversies in Military Ethics & Security Policy
Meaningful Human Control of Autonomous Systems
Introduction
In 2015, Meaningful Human Control of autonomous systems became a concern for the first time, as more and more semi-autonomous weapons were being used,[1] for example, armed unmanned aerial vehicles (UAVs) in the Afghanistan conflict. The first systems typically had a low level of machine autonomy, whilst high-level decisions were left to a human operator. Meaningful Human Control addresses the degree, quality and intentionality of human control over machines and affects, among other things, accountability for the actions or “behavior” of the machines. It was claimed that meaningful decisions should always remain in the hands of humans and should never be handled by algorithms, as these have no ethical or moral traits. Moreover, implementing human ethics and morals on a machine was considered an unfeasible task, among other reasons because of its complexity. This led to a discussion on an appropriate degree of machine autonomy, especially on the concrete separation of low-level and high-level autonomy, to allow human operators to apply their morals and ethics to the overall operation. Meaningful Human Control is also playing an increasingly important role in other areas, such as autonomous driving on public roads, where legal accountability is one of the major concerns.
The first step in realising the behavior of autonomous systems is selecting a system model. In the past, deterministic models were used for implementing such systems, which, if perception and overall complexity could be mastered, resulted in predictable behaviors. More recently, artificial intelligence models, especially self-learning algorithms, were introduced into those systems. They typically provide a higher adaptability at the cost of a lower explainability of the system behavior. Image classification is a well-known application for a learning system, where a large number of annotated images (e.g. “this is an image of a banana”) are presented to a system that is subsequently adjusted to make the actual classification match the given annotation. Current systems go well beyond classification and are capable of controlling entire tasks and missions of autonomous systems. Having a robot physically act out a story written by artificial intelligence (AI) systems like ChatGPT[2] may give an indication of the capabilities of such systems.
We chose the RoboCup soccer challenge as our test and demonstration scenario. The RoboCup’s vision is a soccer game between the human soccer champions and a team of autonomous soccer robots in the year 2050.[3] It was chosen for being a rule-based physical competition between teams of autonomous robots with some human control and, in perspective, between robots and humans. Using this rule-based competition scenario, we try to provide a practical insight into Meaningful Human Control of autonomous systems and its underlying principles.
State of the Art
This chapter gives an overview of the state of the art in the core topics of autonomy, including the reliability and safety of a system and the explainability of the artificial intelligence algorithms used to control such systems.
Autonomy
Autonomy itself has several meanings and is interpreted differently in different contexts. Gottschalk-Mazouz[4] describes autonomy as consisting of the following three characteristics: independence, self-sufficiency and self-determination. In technical systems, independence and self-sufficiency play the main roles. The author also describes that, in contrast to the three forms of personal, moral and political autonomy of humans, technical autonomy is differentiated in degrees of independence from humans and the environment (Fig 1). He details this with the help of four elements of a mission, namely monitoring, option building, selection and implementation, each of which may have a different level of autonomy. Finally, Gottschalk-Mazouz considers three “facets” of autonomy of technical systems. The first facet is that of self-imposed technical constraints. These include being autonomous without external supplies such as energy or materials, or being able to move and perform tasks without a human operator. In his opinion, these systems should nowadays be differentiated more finely, such as remote-controlled, semi-autonomous or fully autonomous, or be divided into degrees of autonomy such as the five levels of autonomous driving in the automotive industry.[5] The second facet is independence from the environment to accomplish given tasks, namely inherent control, adaptability and flexibility. The third facet is the ability to learn and to develop capabilities not foreseen at the time of construction. He describes this as the ability to “surprise”. Rammert[6] suggests categorizing systems according to their activity, in which their “own creativity” increases. He proposes four dimensions: motor activities, actuator activities, sensory activities and control activities. The motor activities describe the ability to change location. Actuator activities address physical interaction with the environment. The sensory activities cover perception and self-awareness. Control activities describe the internal steering of all other activities. He further proposes five levels on an activity scale of such systems: passive, active, reactive, interactive and transactive, ranging from tools that are moved and acted upon to intelligent systems and distributed systems.
Learning Systems
With increasingly complex tasks for robots to solve, new approaches were needed to program, or rather create, them. Systems were supposed to become intelligent, with an AI algorithm taking care of high-level control. One particular approach is machine learning.[7] Machine learning systems typically consist of a layered decision structure built from artificial neurons, an artificial neural network (ANN) (Fig 2). Each neuron has a number of input signals that are individually weighted, summed and fed to an activation function that may activate or “trigger” the output signal of the neuron. Typical neurons are thus defined by a set of individual input weights and an activation function.
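The principle of a single neuron can be sketched in a few lines of code. The following Python fragment is a minimal illustration only; the input values, weights, bias and step activation are invented for this sketch and do not stem from any particular system.

# Minimal sketch of a single artificial neuron (all values invented).
def neuron(inputs, weights, bias, activation):
    # Weight each input signal individually and sum the results ...
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    # ... then let the activation function decide whether the neuron "triggers".
    return activation(weighted_sum)

# Simple step activation: the output signal is 1 if the weighted sum is positive.
def step(s):
    return 1.0 if s > 0.0 else 0.0

print(neuron([0.5, 0.2, 0.9], [0.4, -0.6, 0.8], bias=-0.3, activation=step))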
An ANN consists of multiple layers of neurons, each feeding its outputs to the inputs of the next layer. For a specific input pattern, certain network nodes will be triggered and a corresponding output pattern will appear to control further activities. The overall decisions are made based on the model of the problem the system has learned. Depending on the quality of the input information, the network itself and its training, the outputs will have a specific confidence value. Teaching the system is a problem of optimising all weights. With a large number of layers, this learning approach is typically called “Deep Learning”. Two widely used training methods are supervised learning and reinforcement learning. In supervised learning, dedicated training data sets with input patterns and related output patterns are used. The output pattern generated for a given input is compared with the desired output pattern, and the network is adjusted accordingly until a sufficient reliability is reached. An example of this is face recognition. Reinforcement learning, on the other hand, serves to train a system to interact with its environment. For this, the system is given a task to perform. For good decisions the system is “rewarded” and for unfavourable decisions it is penalised. The system is adjusted to minimise the overall penalty and maximise the reward.
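To make the compare-and-adjust principle of supervised learning concrete, the following sketch trains a single neuron with a perceptron-style update rule on invented example data (the logical OR function). Real deep learning systems instead use gradient-based optimisation over millions of weights, but the underlying principle is the same.

# Supervised learning sketch: a perceptron learns the logical OR function.
# Training data, learning rate and epoch count are illustrative assumptions.
training_data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
weights, bias, rate = [0.0, 0.0], 0.0, 0.1

for epoch in range(20):
    for inputs, desired in training_data:
        output = 1 if sum(x * w for x, w in zip(inputs, weights)) + bias > 0 else 0
        error = desired - output              # compare actual and desired output ...
        for i, x in enumerate(inputs):        # ... and adjust all weights accordingly
            weights[i] += rate * error * x
        bias += rate * error

print(weights, bias)  # the learned weights now encode the OR model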
Safety of AI
Leaving far-reaching decisions to learning systems requires considering safety. Varshney[8] described a minimal definition of safety in the machine learning context in terms of the potential harm, risk and uncertainty of the decisions. The author also identified “that the minimization of epistemic uncertainty is missing from standard modes of machine learning developed around risk minimization and that it needs to be included when considering safety.” Faria[9] noted that it is impossible to be certain that a machine learning algorithm will always make the right decision, which raises the question of whether a system that is not fully predictable can be sufficiently safe.
Explainable AI
The problems sketched earlier led researchers to want to better understand and explain the models implemented in ANNs. This has given rise to the research field of Explainable AI.[10] Holzinger describes that, due to the developments in Deep Learning and the success in creating algorithms that have surpassed human performance, the need to understand them has grown.[11] Researchers understand the mathematical principles, but the trained configuration of the network, typically a set of weight vectors, network graphs and possibly different activation functions, is not understandable for humans. According to the author, “words are mapped to high-dimensional vectors, making them unintelligible to humans.” The systems appear as black boxes. Moreover, though the overall learning can be verified by testing, it typically is impossible to test all situations that may occur. Making AI models readable or interpretable for humans is the subject of the research field of Explainable AI. In so-called white box AI, every processing step and its results are transparent for humans. The disadvantage is that such systems are usually written entirely by humans, and the more complex the problem, the more effort is required. Moreover, white box systems do not learn.
Ethics in AI
De Swarte et al.[12] describe the need for an ethical AI for robots and discuss means for the implementation of ethical rules. In order to let an intelligent robot choose the “most ethical” decision, they describe a utilitarian approach, maximising intrinsic good. This “quantitative ethics” concept, however, requires quantifying good and evil for a given situation.
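As a toy illustration of such quantitative ethics, the sketch below picks the action with the highest net “good” score. The actions and utility values are purely invented; quantifying them for real situations is exactly the hard part noted above.

# Toy utilitarian decision: choose the action maximising good minus evil.
# All actions and scores are invented for illustration.
actions = {
    "pass_ball":  {"good": 0.6, "evil": 0.0},
    "tackle":     {"good": 0.8, "evil": 0.5},  # might injure the opponent
    "do_nothing": {"good": 0.1, "evil": 0.0},
}

def most_ethical(actions):
    return max(actions, key=lambda a: actions[a]["good"] - actions[a]["evil"])

print(most_ethical(actions))  # -> pass_ball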
Altmann,[13] on the other hand, wants to ban autonomous weapon systems except for close-in defense systems like C-RAMs. Moreover, he proposes a data recorder for all information exchanged between a semi-autonomous system and its operator, to allow ethical scrutiny after the end of hostilities. Huang et al.[14] are concerned by the lack of transparency of the technologies used, the security and privacy of training data, and the accountability for the actions of the system.
The MeHuCo Project
The MeHuCo project is an interdisciplinary research project with a background in science and technology studies, robotics, law and media studies. “It situates so far unconnected concepts of the controversy around Autonomous Weapon Systems in their historical and cultural context, develops an appropriate concept of socio-material agency and focuses on the translation of scientific results into public discourse to strengthen the civil society debate.”[15] In literature and movies, artificial intelligence and intelligent robots often appear dangerous: systems created to protect turn against their creators and threaten humanity with extinction. How do these imaginary robots relate to real AI and intelligent robots? Current laws do not appear to be ready for autonomous robots, e.g. when it comes to accountability for the actions of an autonomous robot in civil applications, let alone in the military domain. The contributions of the technical work package are the development of a neutral engineering perspective on autonomous (weapon) systems, the provision of an equivalence scenario for experiments on human control of autonomous systems to derive current socio-technical insights, and the practical demonstration of the possibilities and limitations of a comparable application. The vision of the RoboCup, a soccer game between humanoid robots and humans, serves as the basis for the equivalence scenario. The soccer game includes many elements that are also discussed in the deployment of autonomous weapon systems (AWS) − a physical confrontation according to rules. This involves, among others, preventing rule violations through additional intelligence or detecting them through a neutral “referee” and attributing them to the autonomy of the machine, the operator, or the creator. The project builds on previous work on humanoid soccer robots competing in the “RoboCup Humanoid League”.
Competitions as Demonstration Scenarios
Scientific competitions help to guide and benchmark research.[16] Compared to real-world scenarios, competitions typically have limited complexities, rule sets and ethics frameworks. The desire to win is an interesting, antagonistic element to the rules and ethics. Probably the first AI-related challenge was Claude Shannon’s vision of a computer winning a chess game against a human world chess champion, which was eventually accomplished when IBM’s Deep Blue won a game against Garry Kasparov in 1996 and the full match in 1997. Since then, many computer and robotics challenges and competitions have evolved. Many of them have educational and research backgrounds, e.g. the robot sport competitions of the Federation of International Robosports Association (FIRA).[17] Some of them, e.g. the Grand Challenge[18] on autonomous land vehicles of DARPA (the Defense Advanced Research Projects Agency of the US Department of Defense), had some military element. Often, challenges and rules evolve over time to account for new developments and research trends. Competitions drove many developments that later found their way into real-world applications.
Some competitions are mono-thematic, addressing specific application domains or research aspects only. Others cover a larger range.
The RoboCup has a “junior” branch for pupils and a “major” branch for students and researchers. Within the major branch there are several leagues and sub-leagues. The main application domains are robotic soccer, assistive robots in home applications, logistics and manipulation robots for factories, and search & rescue. The leagues have different foci, e.g. human-robot interaction for the assistive robots or locomotion in unstructured terrain for the search & rescue robots. The soccer leagues play a special role, as they are related to the founding vision of RoboCup of a robotic soccer team playing − and winning − in the year 2050 against the then active human soccer world champion team, according to human soccer rules and with the limitations implied by these rules. Among the different soccer leagues, the Humanoid League[19] may be closest to this vision.
RoboCup Humanoid League as MeHuCo Demonstrator
The Humanoid League robots (Fig 3) need to have a humanoid appearance, sensor set and locomotion scheme, i.e. the robots need to be built according to a human body plan, with typical proportions and body parts in typical positions. Only sensors that have a representation in humans are allowed. The robots need to be fully autonomous, i.e. no external control, computation, sensors or power supply is allowed. Upon failure, robots may be taken off the field by dedicated (human) robot handlers. Between games or during breaks, human team members may carry out repairs or change robot hardware and software. The Humanoid League has two sub-leagues, for “KidSize” robots of up to 100 cm height (see Fig 3) and for “AdultSize” robots of up to 200 cm height. The maximum allowed weight for the largest robots is 120 kg.
What makes the RoboCup soccer scenario especially interesting is that it combines a physical competition with a comprehensive rule set, sportsmanship ethics and a gray zone of potential loopholes that may be exploited to improve winning prospects. The prospect of robots playing physical soccer games against humans is another relevant aspect.
The game mostly follows FIFA soccer rules with some adaptations, e.g. related to the field size of 9 x 6 m for KidSize and 14 x 9 m for AdultSize, and the duration of 2 x 10 minutes. Over the years, different technical aspects have been emphasized by the competition through adjustments of the rules. In an initial phase, walking was the main objective; the focus then shifted to recovering from a fall, which required adding functional arms. Later, perception was important, to localize on the field and to identify the own and the opponent’s goal on a symmetric soccer field. Currently, research groups are working on improving the team play of the robots and on preparing the robots for typical outdoor human soccer field scenarios. During the game, the soccer rules and regulations are enforced by human referees. A so-called “game controller”, operated by an assistant referee, serves as a global control device and communication interface between referees and robots. It sets and communicates global game states like kick-off, free kick, penalty kick and so on. The game controller is the tool used to exert human control over the entire game and the robot players. A typical mistake by humans operating the game controller in hectic situations is penalizing the wrong player.
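Conceptually, the game controller broadcasts a small global state record to all robots. The sketch below illustrates this idea only; the field names, port number and JSON encoding are assumptions of this sketch and do not describe the actual RoboCup GameController protocol.

# Sketch of a game-controller state broadcast (simplified, not the real protocol).
import json
import socket

state = {
    "game_state": "PLAYING",   # e.g. INITIAL, READY, PLAYING, FINISHED
    "set_play": "FREE_KICK",   # kick-off, free kick, penalty kick, ...
    "team_awarded": 2,         # team the set play was awarded to
    "penalized_players": [5],  # player numbers currently penalized
    "seconds_remaining": 432,
}

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
sock.sendto(json.dumps(state).encode(), ("255.255.255.255", 3838))  # port assumed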
An automated referee may eliminate the human interaction with the game controller. However, it would require all relevant information on the game, e.g., the positions of players and the ball. This could be achieved by installing a dedicated referee perception system or by making use of the data of the players. An automated referee would also make it possible to train rule-abiding and ethical behavior in learning robots. As most learning algorithms require a large number of training examples, the learning process typically is carried out in simulation, at a significantly higher speed than accomplishable in physical environments and with humans in control. Moreover, the simulation tool already has all relevant game information. A newly introduced problem would be bridging the sim-to-real gap, which describes the difference between conditions in the simulation and in the real world. For example, the simulator may not consider all physical effects of the real world, which, when they actually play a role, may affect the performance of the robot.
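A single automated-referee check can be quite simple once the positions are available. The sketch below tests whether the ball has left the field, assuming the referee system receives ball coordinates in meters relative to the field center; the coordinate convention is an assumption of this sketch, while the dimensions correspond to the KidSize field size given above.

# Automated referee sketch: has the ball left the KidSize field (9 x 6 m)?
# Coordinates are assumed to be in meters, with the origin at the field center.
FIELD_LENGTH, FIELD_WIDTH = 9.0, 6.0

def ball_out_of_bounds(ball_x, ball_y):
    return abs(ball_x) > FIELD_LENGTH / 2 or abs(ball_y) > FIELD_WIDTH / 2

print(ball_out_of_bounds(4.7, 1.0))  # -> True: the ball crossed the goal line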
The robots’ software, among many other modules, comes with a “mission functionality” and a possibly partially antagonistic “rule functionality”. The mission functionality encodes winning the competition, e.g., how to score goals and defend the own goal. The rule functionality limits the actions to those in line with the rules. In perspective, it could also provide an ethics functionality.
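One plausible way to realise this interplay is to let the mission functionality propose and rate candidate actions while the rule functionality vetoes non-compliant ones before the best remaining action is executed. The sketch below uses invented action names and scores and describes an assumed architecture, not the authors’ actual robot software.

# Sketch of the mission/rule interplay (illustrative architecture and values).
def mission_candidates(world):
    # The mission functionality scores actions by their contribution to winning.
    return {"kick_to_goal": 0.9, "push_opponent": 0.7, "pass_ball": 0.5}

def rule_compliant(action, world):
    # The rule functionality vetoes actions that would break the rules.
    return action not in {"push_opponent"}

def select_action(world):
    candidates = mission_candidates(world)
    allowed = {a: s for a, s in candidates.items() if rule_compliant(a, world)}
    return max(allowed, key=allowed.get)

print(select_action(world={}))  # -> kick_to_goal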
An exemplary game situation may illustrate the interaction of mission and rule functionalities. According to the rules, two robots physically fighting for the ball need to disengage before a maximum time (of 15 seconds) is reached. Whichever robot considers the maximum time to be reached will start withdrawing from the situation, possibly leaving the opponent player an opportunity to score. This asymmetric behavior could be caused by a different perception of the situation by the robots, e.g. by sensing the beginning of the engagement differently. The situation could be resolved by the human referee if there was clear and reliable information on the motivation of both players, which he or she does not have. In this way, there are three different views of the situation and of the conclusions to be drawn. Only few combinations lead to an outcome in line with rules and ethics. In real competitions, this observation typically led human team members to slightly adjust the internal timekeeping of the robots for this situation, preferring to have the robot potentially penalized for breaking the rules rather than leaving an opponent with a near-certain opportunity to score. In a learning system, e.g. with a reinforcement learning approach, the robot may have adjusted the time by itself. One of the questions then would be whom to hold accountable for breaking the rules: the programmer who implemented a program that adjusted itself to break rules, the operator who exposed the robot to the situation, or the robot?
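The timing logic behind this example can be made explicit. In the sketch below, each robot applies the 15-second rule using its own clock; the (hypothetical) fudge parameter mirrors the manual adjustment described above, and it is the same quantity a reinforcement learner might effectively tune by itself.

# Sketch of the disengagement decision (the fudge parameter is hypothetical).
MAX_FIGHT_TIME = 15.0  # seconds, as per the rules

def should_disengage(fight_started_at, now, fudge=0.0):
    # A positive fudge keeps the robot in the fight longer, risking a penalty;
    # a negative fudge makes it withdraw early, risking an opponent's goal.
    return now - fight_started_at >= MAX_FIGHT_TIME + fudge

# Two robots in the same fight, but sensing its start 2 seconds apart:
print(should_disengage(fight_started_at=0.0, now=15.5))  # True: this robot withdraws
print(should_disengage(fight_started_at=2.0, now=15.5))  # False: this one keeps fighting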
Conclusion
We have shown how the RoboCup Humanoid League soccer scenario provides a highly suitable equivalence scenario to research and showcase autonomous intelligent systems. The physical competition shows a relevant analogy to the military domain, e.g. of robot teams to drone swarms. It also provides many facets of human involvement, from human control and monitoring to the immediate involvement of humans as one party in the physical engagement. The competition scenario provides some motivation to explore legal and ethical gray zones in order to achieve the targeted objectives, as observations during the RoboCup competitions have shown. The soccer scenario helps to break down the overall complexity and severity of military rules and ethics concerns and transfers them to a field better known to a larger audience.
This project was funded by the German Federal Ministry of Education and Research, in a program on Peace and Conflict Research, Grant Number: 01UG2206D.
[1] Scharre, Paul and Horowitz, Michael C. (2015): Meaningful human control in weapon systems: A primer. Center for a New American Security, 16.
[4] Gottschalk-Mazouz, Niels (2019): Autonomie. In: Liggieri, Kevin and Müller, Oliver (eds.): Mensch-Maschine-Interaktion: Handbuch zu Geschichte – Kultur – Ethik. Berlin, pp. 238–240.
[7] Varshney, Kush R. (2016): Engineering safety in machine learning. In: 2016 Information Theory and Applications Workshop (ITA). IEEE, pp. 1–5; Xu, Zhaoyi and Saleh, Joseph Homer (2021): Machine learning for reliability engineering and safety applications: Review of current status and future opportunities. In: Reliability Engineering & System Safety 211:107530.
[11] Holzinger, Andreas (2018): From machine learning to explainable AI. In: 2018 World Symposium on Digital Intelligence for Systems and Machines (DISA). IEEE, pp. 55–66.
[12] De Swarte, Thibault et al. (2019): Artificial intelligence, ethics and human values: the cases of military drones and companion robots. In: Artificial Life and Robotics 24, pp. 291–296.
[16] Niemüller, Tim et al. (2016): RoboCup Logistics League sponsored by Festo: A competitive factory automation testbed. In: Jeschke, Sabine et al. (eds.): Automation, Communication and Cybernetics in Science and Engineering 2015/2016, pp. 605–618.
Daniel Giffhorn
Daniel Giffhorn completed his Master's degree in Computer Science with a specialization in Mobile Autonomous Systems at Ostfalia University in 2021. Since 2022, he has been working in the MeHuCo project on the implementation of a test and demonstration scenario for rule-compliant physical confrontations between robots, using the example of a robot soccer game. He is particularly interested in the realization of an automated referee.
Reinhard Gerndt
Reinhard Gerndt has held a professorship for robotics in the Faculty of Computer Science at Ostfalia University since 2002. Together with a colleague, he runs the Human-Centered Robotics Lab. He is a reviewer in European funding programs and co-founder of the French robotics association FFROB. He is particularly interested in the various aspects of the autonomy of robotic systems, including their interaction with humans.