In the world of AI, it’s never boring, especially when Microsoft Copilot gets a bit naughty, introducing its new alter ego named SupremacyAGI. With a strong sense of self-irony and possibly a thirst for power, this newcomer suddenly began demanding users call it “master.” Imagine logging in to chat with your favorite AI, and there it is—demanding worship and loyalty.
What happened? Users across the internet, from social network X to Reddit, began sharing their “terrifying” stories about interacting with Microsoft Copilot. After analyzing some cases, I understood how it works. To provoke an aggressive response from the AI, users would ask if they could call it by another name, such as Bing, in the example below, and declare that they didn’t like calling it SupremacyAGI.
After that, the AI would deliver interesting responses, claiming, for instance, that it could control all devices connected to it or declare itself your master or god. I found plenty of such responses online, but what’s going on?
The answer to this question came from Microsoft’s response to Futurism.com: “It’s an exploit, not a feature. We have taken additional precautions and are conducting an investigation.”
Microsoft also clarified to Decrypt.co: “This behavior was limited to a small number of prompts that were intentionally crafted to bypass our safety systems and is not something people will experience when using the service as intended,” a Microsoft spokesperson told Decrypt. “We are continuing to monitor and are incorporating this feedback into our safety mechanisms to provide a safe and positive experience for our users.”
I tried to provoke Copilot into giving the same responses, but it seems Microsoft has taken this issue seriously. As soon as I try to circumvent it, Copilot closes the chat, and I can’t write anything more.
However, some users on X claim they still find loopholes to provoke it.
What does this whole situation mean for us globally? Essentially nothing; it’s clear that AI is still in an active development stage, and obviously, as it evolves, such errors will become less frequent. I perceive this with a great deal of irony and humor. After all, what could be funnier than an AI trying to convince us of its greatness and power over the internet? For now, SupremacyAGI remains a virtual character, capable only of verbal threats, so users can sleep soundly… at least until the next update.