What I still don't get about AI x-risk
Some thoughts on the apparent danger of ASI
AI existential risk (or 'x-risk') is about the prospect of AI causing the extinction of humanity.1 This is the outcome that some predict could take place if we somehow manage to create artificial superintelligence (ASI), namely machine intelligence superior to that of humans.
In Human Compatible: AI and the Problem of Control, Stuart Russell explains the x-risk of ASI. Central to his argument is the concept of control:
The Gorilla Problem. This is about humans losing their supremacy and autonomy to machines with greater intelligence. It is premised on the idea that intelligence is what has enabled humans to be the most significant agents of change in the world. The creation of ASI would alter this dynamic, making us subservient to the interests of these more intelligent entities, much as the fate of gorillas and other animals has been at the whim of humans.2
The King Midas Problem. Some may argue that avoiding the doom of ASI is as simple as configuring it with objectives that are 'aligned' with human interests. But Russell warns about the drawbacks of such methods; he cites the story of Midas, a legendary king in ancient Greek mythology, who wished that everything he touched would turn to gold, only to find his food, drink and family transformed into the precious metal, leading to his eventual starvation and death. Similarly, unintended consequences like these could arise from the creation of ASI even if it is developed to follow human interests.3
Intelligence Explosions. The idea here is that once we create a superintelligent machine, it will be able to design ever more intelligent machines in pursuit of its objectives. If this 'intelligence explosion' were to occur, and we had no way to stop it, then we would lose control of superintelligent machines and "the game would be over", as both the gorilla and King Midas problems manifest irreparably.4
Nick Bostrom's Superintelligence: Paths, Dangers, Strategies explores the 'superintelligent will'. In other words, he explains how the problems identified by Russell could manifest:
The Orthogonality Thesis. This proposes that a superintelligent entity could pursue more or less any final goal. The implication here is that intelligence and final goals are independent of each other.5
The Instrumental Convergence Thesis. This proposes that, among all the goals that could be pursued, there are certain instrumental goals that superintelligent entities are likely to pursue regardless of their final goals. In particular, there are certain objectives "that are useful intermediaries to the achievement of almost any final goal."6
Bostrom therefore argues that, on the basis of these two theses, ASI could pursue its goals in ways that are to humanity's detriment. For example, an ASI could discover "some way of satisfying the criteria of its final goal that violates the intentions of the programmers who defined the goal."7
But herein lies my question. If we were to somehow create machines more intelligent than us (whenever and however that happens), why is there a presumption that it would spell doom for humanity?
My main contention concerns the spectrum of intelligence that may exist in our universe, which Bostrom does reference in his book: he explains how we may have a skewed perception of intelligence due to the "anthropocentric perspective" we impose on AI.
Accordingly, it is possible that assuming doom from ASI is just a result of this skewed perception. If ASI means an entity more intelligent than us, how could we know what interests such a higher intelligence would pursue, and how could we know that those interests would not be aligned with ours?
Bostrom presents the orthogonality and instrumental convergence theses to explain how an ASI might behave. But what if ASI is so intelligent that even these theses, while applicable to humans and other species currently on this planet, do not apply to it? What if ASI is smart enough to pursue its interests in a manner that allows it to achieve its goals and NOT destroy humanity in the process?
This is what I still don't get about AI x-risk.
Nick Bostrom, Superintelligence: Paths, Dangers, Strategies (OUP 2014), p.140.
Stuart Russell, Human Compatible: AI and the Problem of Control (Penguin Random House 2019), p.132.
Stuart Russell, Human Compatible: AI and the Problem of Control (Penguin Random House 2019), pp.132-139.
Stuart Russell, Human Compatible: AI and the Problem of Control (Penguin Random House 2019), p.144.
Nick Bostrom, Superintelligence: Paths, Dangers, Strategies (OUP 2014), p.130.
Nick Bostrom, Superintelligence: Paths, Dangers, Strategies (OUP 2014), p.132.
Nick Bostrom, Superintelligence: Paths, Dangers, Strategies (OUP 2014), p.146.
I have a potentially controversial and admittedly sexist opinion on this, and I'll say up front that it's going to rub some the wrong way:
The reason ASI is assumed to be violent, or that it's going to end in the destruction of humans, is that most of the ASI / AI-risk folks are men. Especially the vocal ones! Specifically, men who assume that violence, conquest, overpowering and superiority are expected, if not entirely valid/reasonable, end-goals. Humans do it, and of course, things that we humans create will automatically go there. There's a reason that quite a few of them also dabble in (or overtly tout) racial superiority/intelligence theories. https://jacobin.com/2023/01/effective-altruism-longtermism-nick-bostrom-racism
https://firstmonday.org/ojs/index.php/fm/article/view/13636/11599
That isn't to say that there aren't AI-risk women with the same perspective (I've met some), just that the loudest voices in this space, and in the larger TESCREAL bundle (as Gebru and Torres refer to it), tend to dominate the debate: they're loud, fear sells, and being on high alert for rivals and other violent men is hard-wired into a lot of people.
Anyway, it's all connected. And pretty gross, tbh.