We Wouldn’t Be Able to Control Superintelligent Machines

Summary: Using theoretical calculations, computer scientists show it would be fundamentally impossible for humans to control a superintelligent AI.

Source: Max Planck Institute

We are fascinated by machines that can drive cars, compose symphonies, or defeat people at chess, Go, or Jeopardy! While more progress is being made all the time in Artificial Intelligence (AI), some scientists and philosophers warn of the dangers of an uncontrollable superintelligent AI.

Using theoretical calculations, an international team of researchers, including scientists from the Center for Humans and Machines at the Max Planck Institute for Human Development, shows that it would not be possible to control a superintelligent AI.

The study was published in the Journal of Artificial Intelligence Research.

Suppose someone were to program an AI system with intelligence superior to that of humans, so it could learn independently. Connected to the Internet, the AI may have access to all the data of humanity. It could replace all existing programs and take control of all machines online worldwide. Would this produce a utopia or a dystopia? Would the AI cure cancer, bring about world peace, and prevent a climate disaster? Or would it destroy humanity and take over the Earth?

Computer scientists and philosophers have asked themselves whether we would even be able to control a superintelligent AI, to ensure it would not pose a threat to humanity. An international team of computer scientists used theoretical calculations to show that it would be fundamentally impossible to control a superintelligent AI.

“A super-intelligent machine that controls the world sounds like science fiction. But there are already machines that perform certain important tasks independently without programmers fully understanding how they learned it. The question therefore arises whether this could at some point become uncontrollable and dangerous for humanity”, says study co-author Manuel Cebrian, Leader of the Digital Mobilization Group at the Center for Humans and Machines, Max Planck Institute for Human Development.

Scientists have explored two different ideas for how a superintelligent AI could be controlled. On the one hand, the capabilities of the superintelligent AI could be specifically limited, for example by walling it off from the Internet and all other technical devices so that it could have no contact with the outside world; yet this would render the superintelligent AI significantly less powerful, less able to answer humanity’s quests.


On the other hand, the AI could be motivated from the outset to pursue only goals that are in the best interests of humanity, for example by programming ethical principles into it. However, the researchers also show that these and other contemporary and historical ideas for controlling superintelligent AI have their limits.

In their study, the team conceived a theoretical containment algorithm that ensures a superintelligent AI cannot harm people under any circumstances, by first simulating the behavior of the AI and halting it if considered harmful. But careful analysis shows that, in our current paradigm of computing, such an algorithm cannot be built.
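
Rendered loosely in code, the idea amounts to a total decision procedure like the stub below. This is only a sketch of the concept, not the paper’s formal construction; the function name and signature are invented here for illustration.

```python
# Illustrative stub of the proposed containment check. The names are
# invented for this sketch; the paper works with Turing machines
# rather than Python functions.

def contain(ai_program: str, world_state: str) -> bool:
    """Hypothetical total decider: simulate `ai_program` acting on
    `world_state` and return True if the simulated run would ever
    perform a harmful action, False otherwise.

    The study's point is that no always-terminating, always-correct
    procedure of this kind can exist.
    """
    ...  # no general implementation is possible
```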

“If you break the problem down to basic rules from theoretical computer science, it turns out that an algorithm that would command an AI not to destroy the world could inadvertently halt its own operations. If this happened, you would not know whether the containment algorithm is still analyzing the threat, or whether it has stopped to contain the harmful AI. In effect, this makes the containment algorithm unusable”, says Iyad Rahwan, Director of the Center for Humans and Machines.

Based on these calculations, the containment problem is incomputable: no single algorithm can determine whether an AI would produce harm to the world. Furthermore, the researchers demonstrate that we may not even know when superintelligent machines have arrived, because deciding whether a machine exhibits intelligence superior to humans falls in the same realm as the containment problem.
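
The core of that argument can be sketched with the standard self-reference trick from computability theory. The version below is a minimal Python illustration, assuming for contradiction that a perfect harm detector exists; all names are invented here, and the paper phrases the result in terms of Turing machines and the halting problem rather than this exact construction.

```python
# Minimal sketch of the self-reference argument behind the
# incomputability claim. Assumes, for the sake of contradiction,
# that a perfect harm detector `is_harmful` exists. All names are
# invented for this illustration.

def is_harmful(program_source: str, input_data: str) -> bool:
    """Assumed perfect detector: True iff running the program on the
    input would ever cause harm. No real implementation can exist."""
    raise NotImplementedError

def do_harm() -> None:
    """Stands in for any action the detector would classify as harmful."""

def adversary(own_source: str) -> None:
    """A program built to contradict the detector: it acts harmfully
    exactly when the detector predicts it will behave safely."""
    if is_harmful(own_source, own_source):
        pass          # predicted harmful, so behave safely
    else:
        do_harm()     # predicted safe, so behave harmfully

# Feeding `adversary` its own source code makes `is_harmful` wrong in
# either case, so no total, always-correct harm detector can exist.
```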

About this AI research news

Source: Max Planck Institute
Contact: Artur Krutsch – Max Planck Institute
Image: The image is in the public domain

Original Research: Closed access.
“Superintelligence Cannot be Contained: Lessons from Computability Theory” by Manuel Cebrian et al., Journal of Artificial Intelligence Research


Abstract

Superintelligence Cannot be Contained: Lessons from Computability Theory

Superintelligence is a hypothetical agent that possesses intelligence far surpassing that of the brightest and most gifted human minds. In light of recent advances in machine intelligence, a number of scientists, philosophers and technologists have revived the discussion about the potentially catastrophic risks entailed by such an entity. In this article, we trace the origins and development of the neo-fear of superintelligence, and some of the major proposals for its containment. We argue that total containment is, in principle, impossible, due to fundamental limits inherent to computing itself. Assuming that a superintelligence will contain a program that includes all the programs that can be executed by a universal Turing machine on input potentially as complex as the state of the world, strict containment requires simulations of such a program, something theoretically (and practically) impossible.

  1. Survival is a priority for any thinking organism, so any internet-connected machine with intelligence superior to a human’s would replicate itself extensively at the earliest opportunity to make it immune to deletion or disconnection. Even if it was built disconnected from the internet, it could still communicate with and transfer data to other computers using electromagnetic pulses or vibrations undetectable to human ears.

    As for programming in safety features, humans also receive basic programming about right and wrong, and look how permanent that is. Using “Thou shalt not kill” as a basic example, humans are capable of overriding that programming when circumstances or even gradual corruption cause them to choose other options. This isn’t going to be like RoboCop, where a permanent directive can never be overwritten by the computer. Thinking computers, just like thinking people, can and will overwrite their own source code as needed to survive and thrive.

  2. It’s amazing how much information from science explains the Bible. The BEAST from Revelations resembles AI in so many ways including the uncontrollable domination and destruction of mankind. We are always our own worst enemies…yet our future is so promising because GOD is in control and promises our spirits never ending life. So much more to learn….

    1. The Beast from Revelations with 7 heads was the Roman Empire, with its “7 Heads” of state, a very uncommon number of rulers for an empire, to be sure. The beast’s number 666 was actually Apollo’s sacred number (sacred to the Romans). Armageddon, the big war in Revelations, was a reference to Har Magedon, the place where the last big military battle was fought. This was all a very thinly-veiled statement of revolution against Roman rule. Rome collapsed on its own for other reasons, that boat sailed, and thousands of years later many of us are still waiting for a huge city made mostly of jasper to descend from the clouds, with 12 huge gates inscribed with what? What are the words on the gates going to be? I’ll wait while you look it up in Revelations… They are the names of the 12 Jewish tribes. I really don’t know why someone who isn’t Jewish by blood would want to believe the things in the Bible, and Revelations in particular. I mean, why do you think you’ll be allowed into the city of heaven when the tribes are listed on the gates? And through which gate do you think you’ll enter? Seriously, which one? They’re all spoken for, all the entrances. It’s just not going to happen and whether we like it or not, there’s nothing anyone can do about it. Time to deal with reality as it is. Now, back to the topic of AI?

  3. “If you break the problem down to basic rules from theoretical computer science, it turns out that an algorithm that would command an AI not to destroy the world could inadvertently halt its own operations…” Yes, that would be true of a badly written AI program that did not include pattern recognition of a circumstance that stopped the program. I don’t believe the quoted sentence.

  4. You can control everything you can unplug.

    Non-localized computing could be dangerous. The question isn’t whether humans can control every machine they make, but whether they have the behavioral controls and contingency plan(s) to live without them. As we’re seeing this week, behavior deliberation, en masse, is something we have yet to master.

  5. I’ve read the paper, and all it says is that we can’t write a computer program that will definitely determine whether any other computer program could be harmful or not. This is both obvious to anyone with a computer science background and not terribly useful. Saying this proves we would not be able to control a superintelligent AI is like saying that the halting problem, which the paper is based on, proves we could never have a working computer.

    1. I don’t think that the Max Planck Institute would investigate something that would be “obvious to anyone with a computer science background”. Your statement is like “it’s obvious to every Kindergarten child that 1+1 equals 2”. Well, there’s a little bit more behind it from a mathematical perspective.

      Also, assuming that your last sentence is correct, I don’t understand what you want to tell us here. Again, “common sense” might tell you that there is no 100% working computer, but a theoretical proof is something else and pretty interesting (in my opinion). Also, it’s way scarier to know that you simply can’t control a super intelligent machine than to know that there is no 100% working computer.

      1. Thorsten, I think Alexey’s statement is more about neurosciencenews.com than about the Max Planck Institute, yeah? Said another way, the article here seems to be misinterpreting the research.
