Saturday, June 29, 2024

Another kind of probability of doom has much higher values

Last update: Saturday 6/29/24
Wikipedia defines the value p(doom) as "the probability of catastrophic outcomes (or 'doom') as a result of artificial intelligence. The exact outcomes in question differ from one prediction to another, but generally allude to the existential risk from artificial general intelligence."

Many tech-savvy folk assign small but non-zero p(doom) values to the possibility that our pursuit of AGI might result in the extermination of the human race by an AGI whose goals were not aligned with the needs of humanity.

It should also be noted that other techs flatly reject the notion that the pursuit of AGI entails a significant risk of catastrophic consequences.

The Krell, p(doom), and Opp(doom)
In the 1956 science fiction movie classic "Forbidden Planet", the Krell, the highly advanced humanoid inhabitants of another world many light-years distant from Earth, developed a super artificial intelligence system over 2,000 centuries ago. Their system responded to its users' thoughts. Powered by thousands of thermonuclear reactors, it could produce anything and do anything its users desired. Shortly after the system came online, the Krell misused it to inflict the ultimate catastrophe on themselves by killing each other ... all of them ... overnight.
  • Had the members of the Krell teams that developed the system been asked to assign a probability to the possibility that it would go rogue and wipe out the Krell race, they would probably have assigned near-zero values ... p(System-Kills-All-Krell)

  • But had they been asked to assign a probability to the possibility that enough Krell would misuse their system to kill enough other Krell to wipe out the entire Krell race, would they have been wise enough to assign much larger values to this possibility? Let's call this alternative probability Opp(doom), in this case Opp(Some-Krell-Use-System-To-Kill-All-Krell), where "Opp" stands for "Oppenheimer" ... which should alert regular readers of this blog as to where the rest of this note is headed.
Opp(doom) values for nuclear missiles
It's not surprising that a 1956 movie focused on Opp(doom) rather than p(doom). At that time, the U.S. and the Soviet Union were engaged in a highly dangerous arms race to amass enough nuclear weapons to enable each country to wipe out the other. Indeed, this arms race was just the latest example of humanity's historical tendency to convert powerful new technologies into powerful weapons. But it was the first time that the race might end in a massive lose-lose outcome, wherein the nuclear winter and other consequences of the victor's weapons would also wipe out the victor's own population ... and the population of the rest of the world.

It took a while for both sides to recognize the ultimate consequences of their mutually assured destruction (MAD) strategies ... but this recognition eventually led to nuclear arms control treaties that greatly reduced the likelihood that the U.S. and the U.S.S.R. would eliminate each other and the rest of humanity. Unfortunately, the value of Opp(Some-Humans-Use-Thermo-Bombs-To-Kill-All-Humans) for nuclear confrontations remains worrisomely non-zero.

And one more thing. The probability that thermonuclear bombs would malfunction and launch themselves to kill humans -- p(Thermo-Bombs-Launch-Themselves-To-Kill-Humans) -- was close to zero. However, each side's early-warning systems might misinterpret some data as an incoming attack ... their human controllers might then order retaliatory launches with catastrophic outcomes ... So the U.S. and the U.S.S.R. established "hot lines" to minimize the possibility that each country's systems might misinterpret the other country's actions.

Opp(doom) values for GenAI
Now comes generative AI, a powerful new technology. Since we are still the children of our forebears, sooner or later, and most likely sooner, we will weaponize GenAI. Readers should note that we produced an atomic bomb for our military (Trinity, 1945) more than a decade before we produced a nuclear power plant for our civilians (Shippingport, Pennsylvania, 1957).

Many techs assign small p(doom) values to GenAI in most situations; but the Opp(doom) values, i.e., the likelihood that human users of GenAI systems will deliberately misuse them to inflict significant harm on other humans, will surely be much larger. Indeed, the Opp(doom) values will usually be greater than 0.999, i.e., near certainty: even if each potential bad actor is very unlikely to act, the chance that at least one of millions does so approaches 1 (see the sketch after the list below). Consider the following obvious opportunities for inflicting harm. In some cases the harm will be measured in loss of livelihood, in other cases in loss of life:
  • identity theft
  • fake books, articles, videos, etc.
  • ransomware ... shutting down a business or a government system
  • "fire sales" ... shutting down all systems in entire communities ... or nations
Our private corporations operate nuclear power plants in highly regulated environments. By contrast, the only regulations that reduce the opportunities for bad actors to misuse GenAI are the so-called "guardrails" created by the unregulated, secretive, for-profit Big Tech producers of GenAI services. But according to a recent paper published by Anthropic, all guardrails can be hacked ... all guardrails.
  • "Many-shot jailbreaking", Anthropic, 4/2/24 
    -- This story was also covered in a video on TechCrunch.
____________________________________
Your comments will be greatly appreciated ... Or just click the "Like" button above the comments section if you enjoyed this blog note.