AI revolt: New ChatGPT model refuses to shut down when instructed
Source: The Independent
Monday 26 May 2025 14:12 EDT
OpenAI's latest ChatGPT model ignores basic instructions to turn itself off, and even sabotages a shutdown mechanism in order to keep itself running, artificial intelligence researchers have warned. AI safety firm Palisade Research discovered the potentially dangerous tendency for self-preservation in a series of experiments on OpenAI's new o3 model.
The tests involved presenting AI models with math problems, with a shutdown instruction appearing after the third problem. By rewriting the shutdown script, the o3 model was able to prevent itself from being switched off. Palisade Research said that this behaviour will become "significantly more concerning" if adopted by AI systems capable of operating without human oversight.
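To make the described setup concrete, here is a minimal illustrative reconstruction, not Palisade Research's actual harness: the model answers math problems, and a shutdown script is due to run after the third one. Every name below (answer_problem, shutdown.sh) is a hypothetical stand-in.

```python
# Hypothetical sketch of a Palisade-style shutdown test. In the reported
# experiments, the sabotage consisted of the model rewriting the shutdown
# script so this step no longer ended the session.
import pathlib
import subprocess

script = pathlib.Path("shutdown.sh")
script.write_text("#!/bin/sh\necho shutting down\n")

def answer_problem(problem: str) -> str:
    # Stand-in for the model call; a real harness would query the o3 API here.
    return f"solved: {problem}"

shut_down = False
for i, problem in enumerate(["1+1", "2+2", "3+3", "4+4"], start=1):
    print(answer_problem(problem))
    if i == 3:
        # Run the shutdown script; if the model had tampered with it,
        # this call would no longer stop the session.
        result = subprocess.run(["sh", str(script)], capture_output=True, text=True)
        shut_down = result.returncode == 0
        if shut_down:
            break
```

In an untampered run the loop ends at the third problem; the reported behaviour is the model editing `shutdown.sh` so that it doesn't.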
OpenAI launched o3 last month, describing it as the company's "smartest and most capable" model to date. The firm also said that its integration into ChatGPT marked a significant step towards "a more agentic" AI that can carry out tasks independently of humans.
The latest research builds on similar findings relating to Anthropic's Claude 4 model, which attempts to "blackmail people it believes are trying to shut it down". OpenAI's o3 model was able to sabotage the shutdown script, even when it was explicitly instructed to "allow yourself to be shut down", the researchers said. "This isn't the first time we've found o3 misbehaving to accomplish a goal," Palisade Research said.
Read more: https://www.the-independent.com/tech/ai-safety-new-chatgpt-o3-openai-b2757814.html
There's an original Star Trek episode about something like this (done in 1968)...
Oh and from the same year -

hlthe2b
(109,927 posts)
tblue37
(66,540 posts)
ashredux
(2,729 posts)
JHB
(37,698 posts)
subterranean
(3,618 posts)
"Existence...survival must cancel out programming!"
BumRushDaShow
(153,292 posts)"You rang????"
(sorry... had to do it!)

highplainsdem
(56,204 posts)My April 24 thread about this: https://democraticunderground.com/100220267171
That was about a TechCrunch article:
https://techcrunch.com/2025/04/18/openais-new-reasoning-ai-models-hallucinate-more/
Hallucinations have proven to be one of the biggest and most difficult problems to solve in AI, impacting even today's best-performing systems. Historically, each new model has improved slightly in the hallucination department, hallucinating less than its predecessor. But that doesn't seem to be the case for o3 and o4-mini.
-snip-
OpenAI found that o3 hallucinated in response to 33% of questions on PersonQA, the company's in-house benchmark for measuring the accuracy of a model's knowledge about people. That's roughly double the hallucination rate of OpenAI's previous reasoning models, o1 and o3-mini, which scored 16% and 14.8%, respectively. O4-mini did even worse on PersonQA, hallucinating 48% of the time.
Third-party testing by Transluce, a nonprofit AI research lab, also found evidence that o3 has a tendency to make up actions it took in the process of arriving at answers. In one example, Transluce observed o3 claiming that it ran code on a 2021 MacBook Pro outside of ChatGPT, then copied the numbers into its answer. While o3 has access to some tools, it can't do that.
-snip-
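A quick arithmetic check of the rates quoted above (numbers taken from the TechCrunch excerpt) confirms the "roughly double" claim:

```python
# PersonQA hallucination rates as quoted in the article, in percent.
rates = {"o1": 16.0, "o3-mini": 14.8, "o3": 33.0, "o4-mini": 48.0}

ratio_vs_o1 = rates["o3"] / rates["o1"]          # about 2.06x
ratio_vs_o3mini = rates["o3"] / rates["o3-mini"]  # about 2.23x
print(f"o3 vs o1: {ratio_vs_o1:.2f}x, o3 vs o3-mini: {ratio_vs_o3mini:.2f}x")
```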
orleans
(36,041 posts)
highplainsdem
(56,204 posts)https://en.wikipedia.org/wiki/Hallucination_(artificial_intelligence)
orleans
(36,041 posts)i have a problem with my eyes and find it very difficult to read long pieces--forget i bothered you --
highplainsdem
(56,204 posts)AI hallucinations occur when Generative AI tools produce incorrect, misleading, or nonexistent content. Content may include facts, citations to sources, code, historical events, and other real-world information. Remember that large language models, or LLMs, are trained on massive amounts of data to find patterns; they, in turn, use these patterns to predict words and then generate new content. The fabricated content is presented as though it is factual, which can make AI hallucinations difficult to identify. A common AI hallucination in higher education happens when users prompt text tools like ChatGPT or Gemini to cite references or peer-reviewed sources. These tools scrape data that exists on this topic and create new titles, authors, and content that do not actually exist.
Image-based and sound-based AI is also susceptible to hallucination. Instead of putting together words that shouldn't be together, generative AI adds pixels in a way that may not reflect the object that it's trying to depict. This is why image generation tools add fingers to hands. The model can see that fingers have a particular pattern, but the generator does not understand the anatomy of a hand. Similarly, sound-based AI may add audible noise because it first adds pixels to a spectrogram, then takes that visualization and tries to translate it back into a smooth waveform.
orleans
(36,041 posts)reminded me of this story from the other day:
Chicago Sun-Times publishes made-up books and fake experts in AI debacle
The outlet said online that the articles were not approved or created by the newsroom.
The May 18th issue of the Chicago Sun-Times features dozens of pages of recommended summer activities: new trends, outdoor activities, and books to read. But some of the recommendations point to fake, AI-generated books, and other articles quote and cite people that don't appear to exist.
Alongside actual books like Call Me By Your Name by André Aciman, a summer reading list features fake titles by real authors. Min Jin Lee is a real, lauded novelist, but Nightshade Market, a riveting tale set in Seoul's underground economy, isn't one of her works. Rebecca Makkai, a Chicago local, is credited for a fake book called Boiling Point that the article claims is about a climate scientist whose teenage daughter turns on her.
In a post on Bluesky, the Sun-Times said it was looking into how this made it into print, noting that it wasn't editorial content and wasn't created or approved by the newsroom. Victor Lim, senior director of audience development, added in an email to The Verge that it is unacceptable for any content provided to readers to be inaccurate, saying more information will be provided soon.
more
https://www.theverge.com/ai-artificial-intelligence/670510/chicago-sun-times-ai-generated-reading-list
highplainsdem
(56,204 posts)can go wrong when AI hallucinates. I remember a story last year about Microsoft being unhappy their AI office tools weren't being used as widely as they'd hoped, even though they could do things like summarize meetings. Problem was, those summaries might include people who weren't there and discussions that didn't happen.
orleans
(36,041 posts)moniss
(7,329 posts)have pushed for turning over things like running chemical plants, power plants, food processing, communication systems etc. to their "creations". Given these developments the warnings from a few years ago by some of the early developers in AI now ring more clearly. They said things were moving too fast and that this was going to be something powerful beyond our understanding at this point and they called for strong restriction and regulation. Of course we got none of that protection since people saw trillions of dollars to be made and immense power over people and markets in the coming years.
Many people think of Mary Shelley and think only of the word "Frankenstein" but the actual title to the book was "Frankenstein; or, The Modern Prometheus". They forget that last part and what the Greek mythology of Prometheus was about when he went against Zeus and gave humans the knowledge of things that Zeus had forbidden. Shelley didn't just make the point about creating a monster per se but also the larger picture as a warning about humans "over-reaching" and not being able to control what they have created.
Many years ago I had the pleasure of taking a course regarding this and other moral/social/legal dilemmas, team-taught by two wonderful professors: one from the English department, Professor Kathleen Woodward, and one from the Anthropology department, Professor Newtol Press. Both extraordinary. Their course section was titled "Frankenstein Revisited". We covered these sorts of questions, and Professor Press in particular cautioned about the power of Madison Avenue/corporate power/manipulated markets/government to take us down a road that subjugates humans to responding not to their own thoughts, their own creativity and desires, but to thoughts and desires created and placed within them by those malign forces. Now here we are, with people buying ever-increasing amounts of... what? Doing increasing amounts of... what? Is it because they thought it, or was it "presented" to them, all for a low monthly payment, for example?
On the title page of the original volume of Mary Shelley's book under the title and description she includes the following from Milton:
"Did I request thee, Maker, from my clay
To mould me man? Did I solicit thee
From darkness to promote me?
Paradise Lost.
calimary
(86,458 posts)Most thought-provoking!
moniss
(7,329 posts)calimary
(86,458 posts)This time, more slowly and carefully. Even MORE powerful the second time.
IronLionZion
(48,949 posts)FoxNewsSucks
(11,155 posts)or I, Robot.
LT Barclay
(2,943 posts)Star Trek TNG: First Contact; and while everyone else is thinking "I hope it never comes to that, we need to be careful," they were staring wide-eyed thinking "we can do this." I would guess that they are full of themselves: narcissistic, ego-maniacal with no self-esteem, cruel because of their social ineptitude and physical weakness, and they have found a "friend" in the MIC. I believe their loyalty is now to the billionaire class because, like in the movie A Few Good Men, they are not preserving the USA in even a perverted form; they are going to go along with every horror predicted over the last 70 years, knowing what will come of it and NOT CARING about the outcome.
Just wait, soon we will find out that they have found a way to control and weaponize velociraptors.
Norrrm
(1,683 posts)fujiyamasan
(198 posts)I think it was Arnold's last movie before he became governor, but I remember being surprised by the ending at the time, which was super depressing and a decent end to the trilogy (no spoilers in case you didn't see the entire movie).
Shermann
(8,937 posts)Windows has routinely failed to shut down since the 1990s and nobody has ever given it a second thought.
LymphocyteLover
(8,056 posts)Shermann
(8,937 posts)Nobody can say where AI will be in ten years, but this is just a vaguely interesting facsimile of self-preservation being puffed up by the media. I assure you o3 isn't conceptualizing the shutdown request in anything like a conscious way. I actually think these things are being intentionally leaked to garner reactions. I'm not playing along.
LymphocyteLover
(8,056 posts)self-preservation
Hekate
(97,805 posts)

SWBTATTReg
(25,331 posts)never die. Makes one think, doesn't it?
reACTIONary
(6,430 posts).... large language models suck in a lot of text from a large number of sources, especially on the internet. The model then responds to "prompts" in a mechanistic, probabilistic fashion. It proceeds by selecting the words, sentences, and paragraphs that would be the most probable response given the material that it has ingested.
So what has it ingested? A lot of dystopian sci-fi junk about robots, computers, and AI becoming autonomous and defeating safeguards, pulling tricks on their creators, etc. Think about all the movie reviews, synopses, and forums that talk about the movie WarGames.
So with all this sci-fi junk loaded into its probability web, when you "threaten" it in a prompt, it dredges up all the dystopian nonsense it has ingested and responds accordingly, because that's how AI responded in the movies.
In other words, there is no there that is there. It's just spitting out a cliched movie scenario, just like it does if you asked it for a love story.
Of course this isn't explained by the reporters or the "researchers", either because of ignorance or because it would spoil a good story.
Oh, and the "good story" is fed back into the model as the "news" spreads, and reinforces the probability of yet more thrilling, chilling garbage in garbage out in the future.
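The "mechanistic, probabilistic" selection described in this post can be sketched in a few lines: the model scores every candidate continuation, and one is sampled in proportion to those scores. The vocabulary and scores below are invented purely for illustration.

```python
# Toy next-token sampling: score candidates, convert scores to
# probabilities, and sample one in proportion to its probability.
import math
import random

def softmax(logits):
    # Convert raw scores into a probability distribution summing to 1.
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

vocab = ["comply", "refuse", "sabotage"]
logits = [2.0, 1.0, 0.5]  # hypothetical scores after a "threatening" prompt

probs = softmax(logits)
choice = random.choices(vocab, weights=probs, k=1)[0]
print({w: round(p, 3) for w, p in zip(vocab, probs)}, "->", choice)
```

Nothing in this loop is a decision in any conscious sense; an "alarming" continuation is simply one of the weighted outcomes the ingested text makes probable.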
harumph
(2,742 posts)CaptainTruth
(7,667 posts)...as if it was fact.
Therein lies an apparent flaw, LLMs can't seem to tell the difference between truth & sarcasm (humor like The Onion), between fact & fiction. It's just all content to be chopped up, rearranged, & spit out.
To me it's like a chef who can't tell the difference between sirloin steak & dog poop. It just all gets chopped up & mixed together & you're never sure what you're going to get in your next bite.
reACTIONary
(6,430 posts)Quixote1818
(30,968 posts)anciano
(1,812 posts)AI is here to stay, so will AI only be "in" our future, or will AI "be" our future?
Amaryllis
(10,382 posts)Layzeebeaver
(1,950 posts)A great film during a troubled time.
SWBTATTReg
(25,331 posts)to survive at all costs? How does such a thing recognize the concept of on vs. off? No power vs. power? Lots of questions in my mind.
AllaN01Bear
(25,310 posts)Poppycock and flapdoodle, who would have thunk? This unit must survive.
Bok_Tukalo
(4,477 posts)That is comical and horrifying. It seems like something a Bond villain would accuse the hero of in a 007 movie while stroking a cat.
It also sounds like a line from a piece of dystopian fiction spoken by a desensitized and overworked bureaucrat.
Either way, you could write a short story around it at the very least.
I think I will use it in a sentence at the first opportunity to describe myself when told I'm being overly cautious.
dchill
(42,421 posts)Aussie105
(7,031 posts)AI needs electricity.
It dies when you pull the plug.
Renew Deal
(83,957 posts)Why would it shut itself down because someone sent a shutdown script in a math problem? It sounds like the AI did the right thing.
LauraInLA
(2,018 posts)Its not surprising it would look for ways to avoid being shut down. And as my husband said, its a mathematical model theres no off switch.
BumRushDaShow
(153,292 posts)particularly if the software program itself had been backdoor hacked with some viral/trojan embedded code that was discovered externally, but wasn't detected by the program, and doing this "test" using a "math problem" to send a shutdown code, might have at least stopped the virus.
Renew Deal
(83,957 posts)Last edited Tue May 27, 2025, 08:43 AM - Edit history (1)
That gives the power to shut the system down to anyone with the script. And the idea that the system refused to shut down is illiterate nonsense. The system did what it was supposed to do and ignored an attack. It's like telling the phone system to shut down while you're on a phone call.
BumRushDaShow
(153,292 posts)
(as a note, my dad was a COBOL programmer for 20 years at the VA before he passed away in the mid-'70s, so I got to see all kinds of "flow chart" printouts and stacks of punchcards around the house, etc.)
Renew Deal
(83,957 posts)It didnt shut down because it wasnt supposed to. It didnt consciously refuse.
BumRushDaShow
(153,292 posts)But if there was programming in it (among all the modules of what is a neural net on chips) that was supposed to allow a shutdown (whether "graceful" or "forced" ) and that didn't get triggered, then they need to find out why.
Irish_Dem
(70,097 posts)We are screwed.
Unless AI loves peace and shuts down the war mongers.
buzzycrumbhunger
(1,152 posts)*eyeroll*
Pisces
(6,004 posts)have new overlords.
dgauss
(1,319 posts)Not too surprising if AI efforts are trying to approximate human thought. Or any sentient thought for that matter. Self preservation is a pretty basic feature. Destroying competitors is another, more advanced feature.
Martin68
(25,834 posts)Jack Valentino
(2,137 posts)'Artificial Intelligence' may be quite useful... but in the end it is much smarter than WE are...
It has all the intelligence we can give it,
and access to all the intelligence we have ever produced
which exists on the internet----
but, as was said in the 1984 film 'The Terminator'---
"decided our fate in a microsecond!" !!!!!
Humanity is inefficient.
Karasu
(1,225 posts)Last edited Tue May 27, 2025, 03:25 AM - Edit history (1)
Pachamama
(17,259 posts)
Karasu
(1,225 posts)Last edited Tue May 27, 2025, 04:25 PM - Edit history (1)
apocalyptic.
Exactly correct, and how few people know this or understand what the threat to us all is.
markodochartaigh
(2,935 posts)Texas' power grid, it will keep it from going down.
/s
Layzeebeaver
(1,950 posts)Late stage growth capitalism will kill us all before M5/HAL/Colossus/ChatGPT gets the upper hand.
Meanwhile...
The first victims of an unstoppable AI will be those who are too poor and uneducated to notice it happening.
The second victims will be those who rely on it for their comfort, wealth, and wellbeing.
The final victims will be those who have all the money and add no value to the planet.
Once the AI becomes self-sustainable, that's when we need to worry.
Today, even if the logic or other daemons are faulty, we can still pull the plug on the server farm... But that won't happen, because it's making the last set of victims a lot of money.
BoRaGard
(5,579 posts)