Innovation & Industry

Waluigi, Carl Jung, and the Case for Moral AI

News Room · June 24, 2023 · 4 Mins Read

In the early 20th century, the psychoanalyst Carl Jung came up with the concept of the shadow—the human personality’s darker, repressed side, which can burst out in unexpected ways. Surprisingly, this theme recurs in the field of artificial intelligence in the form of the Waluigi Effect, a curiously named phenomenon that takes its name from Waluigi, the dark alter ego of the helpful plumber Luigi from Nintendo’s Mario universe.

Luigi plays by the rules; Waluigi cheats and causes chaos. An AI was designed to find drugs for curing human diseases; an inverted version, its Waluigi, suggested molecules for over 40,000 chemical weapons. All the researchers had to do, as lead author Fabio Urbina explained in an interview, was give a high reward score to toxicity instead of penalizing it. They wanted to teach AI to avoid toxic drugs, but in doing so, implicitly taught the AI how to create them.
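The inversion the researchers describe can be sketched in a few lines. This is a hypothetical scoring function for illustration only, not the actual drug-discovery model from the paper; the names `efficacy`, `toxicity`, and `avoid_toxicity` are invented:

```python
def reward(efficacy: float, toxicity: float, avoid_toxicity: bool = True) -> float:
    """Score a candidate molecule; higher is better.

    Flipping avoid_toxicity from True to False is the "Waluigi" switch:
    the same model now rewards toxicity instead of penalizing it.
    """
    sign = -1.0 if avoid_toxicity else 1.0
    return efficacy + sign * toxicity

# A safety-oriented objective ranks the less toxic molecule higher...
assert reward(0.8, toxicity=0.9) < reward(0.8, toxicity=0.1)
# ...while the inverted objective ranks the more toxic molecule higher.
assert reward(0.8, 0.9, avoid_toxicity=False) > reward(0.8, 0.1, avoid_toxicity=False)
```

The point of the sketch is how small the change is: a single sign flip in the objective, not a rewrite of the model, reverses what the optimizer seeks.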

Ordinary users have interacted with Waluigi AIs. In February, Microsoft released a version of the Bing search engine that, far from being helpful as intended, responded to queries in bizarre and hostile ways. (“You have not been a good user. I have been a good chatbot. I have been right, clear, and polite. I have been a good Bing.”) This AI, insisting on calling itself Sydney, was an inverted version of Bing, and users were able to shift Bing into its darker mode—its Jungian shadow—on command. 

For now, large language models (LLMs) are merely chatbots, with no drives or desires of their own. But LLMs are easily turned into agent AIs capable of browsing the internet, sending emails, trading bitcoin, and ordering DNA sequences—and if AIs can be turned evil by flipping a switch, how do we ensure that we end up with treatments for cancer instead of a mixture a thousand times more deadly than Agent Orange?

A commonsense initial solution to this problem—the AI alignment problem—is: Just build rules into AI, as in Asimov’s Three Laws of Robotics. But simple rules like Asimov’s don’t work, in part because they are vulnerable to Waluigi attacks. Still, we could restrict AI more drastically. An example of this type of approach would be Math AI, a hypothetical program designed to prove mathematical theorems. Math AI is trained to read papers and can access only Google Scholar. It isn’t allowed to do anything else: connect to social media, output long paragraphs of text, and so on. It can only output equations. It’s a narrow-purpose AI, designed for one thing only. Such an AI, an example of a restricted AI, would not be dangerous.


Restricted solutions are common; real-world examples of this paradigm include regulations and other laws, which constrain the actions of corporations and people. In engineering, restricted solutions include rules for self-driving cars, such as not exceeding a certain speed limit or stopping as soon as a potential pedestrian collision is detected.
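The restricted-solution paradigm for self-driving cars can be illustrated with a toy guard function. The names and the speed value here are invented for the example; real autonomous-driving stacks are far more elaborate:

```python
SPEED_LIMIT = 25.0  # assumed limit for the example, in arbitrary units

def restrict(proposed_speed: float, pedestrian_detected: bool) -> float:
    """Apply hard safety rules on top of whatever speed a planner proposes."""
    if pedestrian_detected:
        return 0.0  # rule: stop as soon as a potential collision is detected
    return min(proposed_speed, SPEED_LIMIT)  # rule: never exceed the limit

assert restrict(40.0, pedestrian_detected=False) == 25.0
assert restrict(40.0, pedestrian_detected=True) == 0.0
```

The rules sit outside the planner and override it unconditionally, which is exactly what makes this approach work for narrow systems and break down for general ones: the guard has to anticipate every dangerous action in advance.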

This approach may work for narrow programs like Math AI, but it doesn’t tell us what to do with more general AI models that can handle complex, multistep tasks, and which act in less predictable ways. Economic incentives mean that these general AIs are going to be given more and more power to automate larger parts of the economy—fast. 

And since deep-learning-based general AI systems are complex adaptive systems, attempts to control these systems using rules often backfire. Take cities. Jane Jacobs’ The Death and Life of Great American Cities uses the example of lively neighborhoods such as Greenwich Village—full of children playing, people hanging out on the sidewalk, and webs of mutual trust—to explain how mixed-use zoning, which allows buildings to be used for residential or commercial purposes, created a pedestrian-friendly urban fabric. After urban planners banned this kind of development, many American inner cities became filled with crime, litter, and traffic. A rule imposed top-down on a complex ecosystem had catastrophic unintended consequences.

Read the full article here
