In “The Basic AI Drives” Steve Omohundro has argued that there is scope for predicting the goals of post-singularity entities able to modify their own software and hardware to improve their intellects. For example, systems that can alter their software or physical structure would have an incentive to make modifications that help them achieve their goals more effectively, as humans have done over historical time. A concomitant of this, he argues, is that such beings would want to ensure that such improvements do not threaten their current goals:
So how can it ensure that future self-modifications will accomplish its current objectives? For one thing, it has to make those objectives clear to itself. If its objectives are only implicit in the structure of a complex circuit or program, then future modifications are unlikely to preserve them. Systems will therefore be motivated to reflect on their goals and to make them explicit (Omohundro 2008).
I think this assumption of ethical self-transparency is interestingly problematic. Here’s why:
Omohundro makes the Cartesian assumption that the properties of a piece of hardware or software can uniquely specify the content of the system states that it orchestrates, independently of the external environment in which the system is located (otherwise probes of those states would come up with different values in different environments, and clamping states to particular values would require restricting the situations in which the system could operate).
Let us allow that there is a correct internalist account which explains why content supervenes on the state of the AI system independently of its environment.
The problem for Omohundro is that such internalist accounts are liable to be holistic. Once we disregard system-environment relations, the only properties which seem to “anchor” the meaning of a system state are its relations to other states of the relevant kind within the system. There is nothing about the shape or colour of an icon representing a station on a metro map which means “station”. It is only the conformity between the relations among the icons and the relations among the stations in the metro system it represents which does this (Churchland’s 2012 account of the meaning of prototype vectors in neural networks uses this analogy; see also Block 1986 for the inferential-role version of internalism). Thus the meaning of an internal state s under some configuration of the system is fixed by some inner context, such as a cortical map, whereby s is related to lots of other states of a similar kind.
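To make the holistic picture concrete, here is a minimal illustrative sketch (my own, not Omohundro’s or Churchland’s; the tokens and contexts are invented for the example): the same internal token receives different interpretations depending on the web of relations it sits in, just as the metro icon means a station only in virtue of its place in the map.

```python
# Illustrative toy only: an internal token has no intrinsic meaning; it is
# interpreted solely through its relations to other tokens in an "inner context".

# Two hypothetical inner contexts: each maps a token to the set of tokens it is
# related to (a crude stand-in for a cortical map or prototype space).
context_A = {"s": {"t1", "t2"}, "t1": {"s"}, "t2": {"s"}}
context_B = {"s": {"t3"}, "t3": {"s"}, "t1": set(), "t2": set()}

def meaning(token, context):
    """On a holistic internalism, the 'meaning' of a token is exhausted by
    its relational profile within the context in which it occurs."""
    return frozenset(context[token])

# Same token, different inner context, different meaning:
print(meaning("s", context_A) == meaning("s", context_B))  # False
```

Nothing hangs on the details of the sketch; the point is only that s, taken alone, fixes nothing.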
But relationships between states of the self-modifying AI system are assumed to be extremely plastic, because each system will have an excellent model of its own hardware and software and the technological means to modify them (hyperplasticity). If these relationships are modifiable then any given state could exist in alternative configurations – in Derrideanese it will be “iterable” through different articulations of the system (Derrida 1988). For a machine (or any being) to interpret an internal system state s as meaning the value v* exclusively, then, it must have decided that contexts in which s means v* are privileged. It must then clamp itself to those contexts to avoid s assuming v** or v***, etc.
So to clamp s at v*, the system will need to decide to stay only within the stack of inner contexts C in which s retains that meaning. But how does it know which contexts to assign to the “permissible” stack?
An inner context in which s means v* and not v** is just another wider system state that could also be included in other possible incarnations of the wider system in which it occurs. These need not be permutations of the system’s actual states at any time, since we suppose that the system is hyperplastic and can add components to itself without restriction.
So to clamp s at v*, the AI will need to have found all the members of C. It will need to consider all its possible system states (including all possible nonactual states) and select which wider states keep s at v*. The problem that arises here is that each wider system state is just a system state. Its meaning (e.g. its effect on s) may vary between its possible contexts. And that is true of any context that the machine can consider. So every context raises the same problem that originally arose for the state s.
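The regress can be caricatured in a few lines of code (again a hypothetical sketch, not anyone’s proposed architecture): reading s as v* requires fixing a context, but the context is itself just another state whose bearing on s must be fixed by a further context, so the interpretation never bottoms out unless a cut-off is imposed from outside.

```python
# Illustrative toy only: the attempt to make every context legible regresses.

def interpret(state, context, depth=0, max_depth=5):
    """Read off the value of `state`, but only once the enclosing context has
    itself been interpreted, which requires a further, wider context."""
    if depth > max_depth:
        # There is no non-arbitrary base case; the cut-off has to be imposed,
        # not discovered, which is the point of the regress argument.
        raise RecursionError("the stack of contexts never bottoms out")
    wider_context = {"fixes": context}   # a context is just another system state
    interpret(context, wider_context, depth + 1, max_depth)
    return "v*"                          # never reached without an imposed cut-off

try:
    interpret("s", {"fixes": "s"})
except RecursionError as err:
    print(err)
```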
Thus even allowing the truth of an ideal internalist account of meaning, the legibility of a state signifying a value presupposes a context that cannot be made legible on pain of infinite regress.
Thus Robo-Existentialism! Even a hyperplastic AI capable of freely modifying its own hardware and software will always already have “taken a stand” on “embodied” values that it has not chosen in order to read its own system states (so long as we assume that it lacks some weird super-Turing powers that might allow it to complete an infinite series of computations).
Block, Ned. 1986. “Advertisement for a Semantics for Psychology.” Midwest Studies in Philosophy 10 (1): 615–678.
Churchland, Paul. 2012. Plato’s Camera: How the Physical Brain Captures a Landscape of Abstract Universals. Cambridge, MA: MIT Press.
Derrida, Jacques. 1988. Limited Inc. Evanston, IL: Northwestern University Press.
Omohundro, Stephen M. 2008. “The Basic AI Drives.” In Artificial General Intelligence 2008: Proceedings of the First AGI Conference, edited by Pei Wang, Ben Goertzel, and Stan Franklin. Amsterdam: IOS Press.
Socrates (AKA Nikola Danaylov) has a rare interview with mathematician, science fiction writer and speculative futurist Vernor Vinge. Vinge articulates the difference between the Singularity and previous technological change thus: you could explain the internet or intercontinental jet travel to someone from an earlier phase of technological history (Mark Twain, say, or Genghis Khan). Explaining the post-singularity dispensation to a ‘human’ human would be like explaining typewriters to a goldfish.
Here’s a link to an intriguing blog post and paper by James Boyle, Professor of Law at Duke University, written for the Brookings Institution, on the implications of prospective developments in AI and biotechnology for our legal conceptions of personhood. The paper opens by considering the challenges posed by the prospect of Turing-capable artificial intelligences and genetic chimeras.