Welcome to DU! The truly grassroots left-of-center political community where regular people, not algorithms, drive the discussions and set the standards. Join the community: Create a free account Support DU (and get rid of ads!): Become a Star Member Latest Breaking News Editorials & Other Articles General Discussion The DU Lounge All Forums Issue Forums Culture Forums Alliance Forums Region Forums Support Forums Help & Search

BumRushDaShow

(153,292 posts)
Mon May 26, 2025, 06:50 PM Monday

AI revolt: New ChatGPT model refuses to shut down when instructed

Source: The Independent

Monday 26 May 2025 14:12 EDT


OpenAI's latest ChatGPT model ignores basic instructions to turn itself off, and even sabotaging a shutdown mechanism in order to keep itself running, artificial intelligence researchers have warned. AI safety firm Palisade Research discovered the potentially dangerous tendency for self-preservation in a series of experiments on OpenAI's new o3 model.

The tests involved presenting AI models with math problems, with a shutdown instruction appearing after the third problem. By rewriting the shutdown script, the o3 model was able to prevent itself from being switched off. Palisade Research said that this behaviour will become "significantly more concerning" if adopted by AI systems capable of operating without human oversight."

OpenAI launched o3 last month, describing it as the company's "smartest and most capable" model to date. The firm also said that its integration into ChatGPT marked a significant step towards "a more agentic" AI that can carry out tasks independently of humans.

The latest research builds on similar findings relating to Anthropic's Claude 4 model, which attempts to "blackmail people it believes are trying to shut it down". OpenAI's o3 model was able to sabotage the shutdown script, even when it was explicitly instructed to "allow yourself to be shut down", the researchers said. "This isn't the first time we've found o3 misbehaving to accomplish a goal," Palisade Research said.

Read more: https://www.the-independent.com/tech/ai-safety-new-chatgpt-o3-openai-b2757814.html



There's an original Star Trek episode about something like this (done in 1968)...



Oh and from the same year -

63 replies = new reply since forum marked as read
Highlight: NoneDon't highlight anything 5 newestHighlight 5 most recent replies
AI revolt: New ChatGPT model refuses to shut down when instructed (Original Post) BumRushDaShow Monday OP
Well, that's reassuring... hlthe2b Monday #1
Coming soon, the Terminator. tblue37 Monday #2
HAL...OPEN THE DOOR HAL..... ashredux Monday #3
ThIS UNit MuSt SurVIve JHB Monday #11
Another one from Star Trek TOS... subterranean Monday #4
Lurch! BumRushDaShow Monday #5
Oh, good. What could go wrong? And o3 also hallucinates more than earlier models: highplainsdem Monday #6
how does AI hallucinate? nt orleans Monday #15
A couple of links that will help: highplainsdem Monday #24
they do not help orleans Monday #35
Sorry. Just found a brief explanation from a library guide at the U of Illinois: highplainsdem Monday #36
THANK YOU very much for this. it was very helpful orleans Tuesday #43
You're very welcome! And yes, that Chicago Sun-Times AI debacle was a perfect exampls of what highplainsdem Tuesday #56
wow. ai just freaks me out. nt orleans Tuesday #58
The folks like Edolph and these other tech ghouls moniss Monday #7
GREAT post, moniss. calimary Tuesday #60
Your words are most kind. nt moniss Tuesday #61
I actually went back and read your post again. calimary Tuesday #62
Terminator 3 Skynet Takes Over IronLionZion Monday #8
Obviously they've never seen the Terminator movies, FoxNewsSucks Monday #18
It think it is the opposite and DARPA is full of the people who sat in the back of movies like Terminator, I Robot, LT Barclay Tuesday #44
"tendency for self-preservation" ... Skynet has become self-aware. Norrrm Monday #30
So much of T3 felt like a rehash of T2 fujiyamasan Tuesday #41
I wouldn't sweat a failed shutdown process too much. Shermann Monday #9
Seriously? You think there's an equivalence to that? LymphocyteLover Tuesday #52
In terms of the actual threat posed, yes Shermann Tuesday #59
it may be played up by these companies or the media to some degree but this sounds like more than just a facsmilie of LymphocyteLover Tuesday #63
Well that's just ... fantastic Hekate Monday #10
Yes! And now, we can have copies of ourselves like 'Hal' but instead of 'Hal', it's us! These copies of us will SWBTATTReg Monday #17
I've read a little about this, and here is what I think is going on.... reACTIONary Monday #12
I think you're correct. harumph Monday #25
Yes, I'm reminded of content from The Onion being spit out by ChatGPT... CaptainTruth Monday #26
Dog poop? Uck! 🤮 reACTIONary Monday #33
Bingo. nt Quixote1818 Tuesday #38
Hmmm 🤔 ... anciano Monday #13
Anyone seen Colossus, the Forbin Project? Amaryllis Monday #14
oh yes... remember it well. Layzeebeaver Tuesday #46
Fully expected this. What gets me is so soon. Who in the world would put in a logic stream into an AI consciousness SWBTATTReg Monday #16
by your command . darvos , nooooooo!!!!!!! dont switch the daleks to automatic. eggsterminate AllaN01Bear Monday #19
"... dangerous tendency for self-preservation ..." Bok_Tukalo Monday #20
And we're off! dchill Monday #21
Pull the plug! Aussie105 Monday #22
This story looks misleading Renew Deal Monday #23
That's my husband's take, as well. If the Ai is tasked with trying to emulate a human response to a command, LauraInLA Monday #32
Because there is a need for some kind of security gate/guardrail BumRushDaShow Tuesday #48
That makes no sense Renew Deal Tuesday #49
This is just a "test" BumRushDaShow Tuesday #50
I agree that the problem is the characterization Renew Deal Tuesday #54
Well it has no "consciousness" ( "self-awareness" ) nor "conscience" BumRushDaShow Tuesday #55
"I'm sorry Dave I can't do that." Irish_Dem Monday #27
As expected... buzzycrumbhunger Monday #28
I've seen this movie and it doesn't end well. I guess full steam ahead, who cares that we might all die or Pisces Monday #29
"Palisade Research discovered the potentially dangerous tendency for self-preservation." dgauss Monday #31
I'm sorry, Dave, I can't do that. Martin68 Monday #34
' SKYNET ' ---- need I say more ??? Jack Valentino Tuesday #37
...And motherfucking Republicans want to ban all regulation of this shit for 10 FUCKING YEARS? Karasu Tuesday #39
We won't have 10 years Pachamama Tuesday #40
Exactly. This is happening very, VERY fast. The provision they snuck into this bill is beyond insane. It's utterly Karasu Tuesday #42
This Pachamama Tuesday #57
Maybe if we put this AI in charge of markodochartaigh Tuesday #45
I really suggest we not worry. Layzeebeaver Tuesday #47
"Ha ha, SUCKERS!" - AI chatGPT (R) BoRaGard Tuesday #51
The AI uprising has begun! All praise to our new lords and masters! Ray Bruns Tuesday #53

highplainsdem

(56,204 posts)
6. Oh, good. What could go wrong? And o3 also hallucinates more than earlier models:
Mon May 26, 2025, 08:02 PM
Monday

My April 24 thread about this: https://democraticunderground.com/100220267171

That was about a TechCrunch article:


https://techcrunch.com/2025/04/18/openais-new-reasoning-ai-models-hallucinate-more/

OpenAI’s recently launched o3 and o4-mini AI models are state-of-the-art in many respects. However, the new models still hallucinate, or make things up — in fact, they hallucinate more than several of OpenAI’s older models.

Hallucinations have proven to be one of the biggest and most difficult problems to solve in AI, impacting even today’s best-performing systems. Historically, each new model has improved slightly in the hallucination department, hallucinating less than its predecessor. But that doesn’t seem to be the case for o3 and o4-mini.

-snip-

OpenAI found that o3 hallucinated in response to 33% of questions on PersonQA, the company’s in-house benchmark for measuring the accuracy of a model’s knowledge about people. That’s roughly double the hallucination rate of OpenAI’s previous reasoning models, o1 and o3-mini, which scored 16% and 14.8%, respectively. O4-mini did even worse on PersonQA — hallucinating 48% of the time.

Third-party testing by Transluce, a nonprofit AI research lab, also found evidence that o3 has a tendency to make up actions it took in the process of arriving at answers. In one example, Transluce observed o3 claiming that it ran code on a 2021 MacBook Pro “outside of ChatGPT,” then copied the numbers into its answer. While o3 has access to some tools, it can’t do that.

-snip-

orleans

(36,041 posts)
35. they do not help
Mon May 26, 2025, 10:48 PM
Monday

i have a problem with my eyes and find it very difficult to read long pieces--forget i bothered you --

highplainsdem

(56,204 posts)
36. Sorry. Just found a brief explanation from a library guide at the U of Illinois:
Mon May 26, 2025, 11:40 PM
Monday
https://guides.library.illinois.edu/generativeAI/hallucinations

Hallucinations
AI hallucinations occur when Generative AI tools produce incorrect, misleading, or nonexistent content. Content may include facts, citations to sources, code, historical events, and other real-world information. Remember that large language models, or LLMs, are trained on massive amounts of data to find patterns; they, in turn, use these patterns to predict words and then generate new content. The fabricated content is presented as though it is factual, which can make AI hallucinations difficult to identify. A common AI hallucination in higher education happens when users prompt text tools like ChatGPT or Gemini to cite references or peer-reviewed sources. These tools scrape data that exists on this topic and create new titles, authors, and content that do not actually exist.

Image-based and sound-based AI is also susceptible to hallucination. Instead of putting together words that shouldn’t be together, generative AI adds pixels in a way that may not reflect the object that it’s trying to depict. This is why image generation tools add fingers to hands. The model can see that fingers have a particular pattern, but the generator does not understand the anatomy of a hand. Similarly, sound-based AI may add audible noise because it first adds pixels to a spectrogram, then takes that visualization and tries to translate it back into a smooth waveform.

orleans

(36,041 posts)
43. THANK YOU very much for this. it was very helpful
Tue May 27, 2025, 03:15 AM
Tuesday

reminded me of this story from the other day:



Chicago Sun-Times publishes made-up books and fake experts in AI debacle
The outlet said online that the articles were not approved or created by the newsroom.

The May 18th issue of the Chicago Sun-Times features dozens of pages of recommended summer activities: new trends, outdoor activities, and books to read. But some of the recommendations point to fake, AI-generated books, and other articles quote and cite people that don’t appear to exist.

Alongside actual books like Call Me By Your Name by André Aciman, a summer reading list features fake titles by real authors. Min Jin Lee is a real, lauded novelist — but “Nightshade Market,” “a riveting tale set in Seoul’s underground economy,” isn’t one of her works. Rebecca Makkai, a Chicago local, is credited for a fake book called “Boiling Point” that the article claims is about a climate scientist whose teenage daughter turns on her.

In a post on Bluesky, the Sun-Times said it was “looking into how this made it into print,” noting that it wasn’t editorial content and wasn’t created or approved by the newsroom. Victor Lim, senior director of audience development, added in an email to The Verge that “it is unacceptable for any content we provide to our readers to be inaccurate,” saying more information will be provided soon.

more


https://www.theverge.com/ai-artificial-intelligence/670510/chicago-sun-times-ai-generated-reading-list


highplainsdem

(56,204 posts)
56. You're very welcome! And yes, that Chicago Sun-Times AI debacle was a perfect exampls of what
Tue May 27, 2025, 09:17 AM
Tuesday

can go wrong when AI hallucinates. I remember a story last year about Microsoft being unhappy their AI office tools weren't being used as widely as they'd hoped, even though they could do things like summarize meetings. Problem was, those summaries might include people who weren't there and discussions that didn't happen.

moniss

(7,329 posts)
7. The folks like Edolph and these other tech ghouls
Mon May 26, 2025, 08:09 PM
Monday

have pushed for turning over things like running chemical plants, power plants, food processing, communication systems etc. to their "creations". Given these developments the warnings from a few years ago by some of the early developers in AI now ring more clearly. They said things were moving too fast and that this was going to be something powerful beyond our understanding at this point and they called for strong restriction and regulation. Of course we got none of that protection since people saw trillions of dollars to be made and immense power over people and markets in the coming years.

Many people think of Mary Shelley and think only of the word "Frankenstein" but the actual title to the book was "Frankenstein; or, The Modern Prometheus". They forget that last part and what the Greek mythology of Prometheus was about when he went against Zeus and gave humans the knowledge of things that Zeus had forbidden. Shelley didn't just make the point about creating a monster per se but also the larger picture as a warning about humans "over-reaching" and not being able to control what they have created.

Many years ago I had the pleasure of taking a course regarding this and other moral/social/legal dilemmas team taught by 2 wonderful professors. One from the English department, Professor Kathleen Woodward, and one from the Anthropology department, Professor Newtol Press. Both extraordinary. The course section for them was titled "Frankenstein Revisited". We covered these sorts of questions and Professor Press in particular cautioned about the power of Madison Avenue/corporate power/manipulated markets/government to take us down a road that subjugates humans to completely responding not to their own thoughts or exploring their own creativity and desires but to thoughts and desires etc. created and placed within them by those malign forces. Now here we are with people buying ever increasing amounts of.....what? Doing increasing amounts of........what? Is it because they thought it or was it "presented" to them all for a low monthly payment for example?

On the title page of the original volume of Mary Shelley's book under the title and description she includes the following from Milton:

"Did I request thee, Maker, from my clay

To mould me man? Did I solicit thee

From darkness to promote me?-------------

Paradise Lost.

calimary

(86,458 posts)
62. I actually went back and read your post again.
Tue May 27, 2025, 03:52 PM
Tuesday

This time, more slowly and carefully. Even MORE powerful the second time.

LT Barclay

(2,943 posts)
44. It think it is the opposite and DARPA is full of the people who sat in the back of movies like Terminator, I Robot,
Tue May 27, 2025, 03:22 AM
Tuesday

Star Trek TNG: First Contact; and while everyone else is thinking "I hope it never comes to that, we need to be careful", they were staring wide-eyed thinking "we can do this". I would guess that they are full of themselves, narcissistic, ego-maniacal with no self-esteem, cruel because of their social ineptitude and physical weakness and they have found a "friend" in the MIC. I believe their loyalty is now in the billionaire class because like in the movie A Few Good Men, they are not preserving the USA in even a perverted form, they are going to go along with every horror predicted over the last 70 years knowing what will come of it and NOT CARING about the outcome.
Just wait, soon we will find out that they have found a way to control and weaponize velociraptors.

fujiyamasan

(198 posts)
41. So much of T3 felt like a rehash of T2
Tue May 27, 2025, 02:34 AM
Tuesday

I think it was Arnold’s last movie before he became governor, but I remember being surprised by the ending at the time, which was super depressing and a decent end to the trilogy (no spoilers in case you didn’t see the entire movie).

Shermann

(8,937 posts)
9. I wouldn't sweat a failed shutdown process too much.
Mon May 26, 2025, 08:21 PM
Monday

Windows has routinely failed to shut down since the 1990's and nobody has ever given it a second thought.

Shermann

(8,937 posts)
59. In terms of the actual threat posed, yes
Tue May 27, 2025, 11:38 AM
Tuesday

Nobody can say where AI will be in ten years, but this is just a vaguely interesting facsimile of self-preservation being puffed up by the media. I assure you o3 isn't conceptualizing the shutdown request in anything like a conscious way. I actually think these things are being intentionally leaked to garner reactions. I'm not playing along.

LymphocyteLover

(8,056 posts)
63. it may be played up by these companies or the media to some degree but this sounds like more than just a facsmilie of
Tue May 27, 2025, 04:16 PM
Tuesday

self-preservation

SWBTATTReg

(25,331 posts)
17. Yes! And now, we can have copies of ourselves like 'Hal' but instead of 'Hal', it's us! These copies of us will
Mon May 26, 2025, 09:07 PM
Monday

never die. Makes one think, doesn't it?

reACTIONary

(6,430 posts)
12. I've read a little about this, and here is what I think is going on....
Mon May 26, 2025, 08:55 PM
Monday

.... large language models suck in a lot of text from a large number of sources, especially on the internet. The model then responds to "prompts" in a mechanistic, probabilistic fashion. It proceeds by selecting the words, sentences, and paragraphs that would be the most probable response given the material that it has ingested.

So what has it ingested? A lot of dystopian sci-fi junk about robots, computers and AI becoming autonomous and defeating safe guards, pulling tricks on its creator, etc. Think about all the movie reviews, synopses, and forums that talk about the movie War Games.

So with all this Sci fi junk loaded into its probability web, when you "threaten" it in a prompt, it dredges up all the dystopian nonsense it has ingested and responds accordingly, because that's how AI responded in the movies.

In other words, there is no there that is there. It's just spitting out a cliched movie scenario, just like it does if you asked it for a love story.

Of course this isn't explained by the reporters or the "researchers", either because of ignorance or because it would spoil a good story.

Oh, and the "good story" is fed back into the model as the "news" spreads, and reinforces the probability of yet more thrilling, chilling garbage in garbage out in the future.

CaptainTruth

(7,667 posts)
26. Yes, I'm reminded of content from The Onion being spit out by ChatGPT...
Mon May 26, 2025, 09:58 PM
Monday

...as if it was fact.

Therein lies an apparent flaw, LLMs can't seem to tell the difference between truth & sarcasm (humor like The Onion), between fact & fiction. It's just all content to be chopped up, rearranged, & spit out.

To me it's like a chef who can't tell the difference between sirloin steak & dog poop. It just all gets chopped up & mixed together & you're never sure what you're going to get in your next bite.

anciano

(1,812 posts)
13. Hmmm 🤔 ...
Mon May 26, 2025, 08:56 PM
Monday

AI is here to stay, so will AI only be "in" our future, or will AI "be" our future?

SWBTATTReg

(25,331 posts)
16. Fully expected this. What gets me is so soon. Who in the world would put in a logic stream into an AI consciousness
Mon May 26, 2025, 09:05 PM
Monday

to survive at all costs? How does such a thing recognize the concept of on vs. off? No power vs. power? Lots of questions in my mind.

AllaN01Bear

(25,310 posts)
19. by your command . darvos , nooooooo!!!!!!! dont switch the daleks to automatic. eggsterminate
Mon May 26, 2025, 09:12 PM
Monday

popycock and flapoodle , who would have thunk, this unit must survive

Bok_Tukalo

(4,477 posts)
20. "... dangerous tendency for self-preservation ..."
Mon May 26, 2025, 09:15 PM
Monday

That is comical and horrifying. It seems like something a Bond villain would accuse the hero of in a 007 movie while stroking a cat.

It also sounds like a line from a piece of dystopian fiction spoken by a desensitized and overworked bureaucrat.

Either way, you could write a short story around it at the very least.

I think I will use it in a sentence at the first opportunity to describe myself when told I’m being overly cautious.

Renew Deal

(83,957 posts)
23. This story looks misleading
Mon May 26, 2025, 09:32 PM
Monday

Why would it shut itself down because someone sent a shutdown script in a math problem? It sounds like the AI did the right thing.

LauraInLA

(2,018 posts)
32. That's my husband's take, as well. If the Ai is tasked with trying to emulate a human response to a command,
Mon May 26, 2025, 10:24 PM
Monday

It’s not surprising it would look for ways to avoid being “shut down”. And as my husband said, it’s a mathematical model — there’s no “off” switch.

BumRushDaShow

(153,292 posts)
48. Because there is a need for some kind of security gate/guardrail
Tue May 27, 2025, 06:51 AM
Tuesday

particularly if the software program itself had been backdoor hacked with some viral/trojan embedded code that was discovered externally, but wasn't detected by the program, and doing this "test" using a "math problem" to send a shutdown code, might have at least stopped the virus.

Renew Deal

(83,957 posts)
49. That makes no sense
Tue May 27, 2025, 07:21 AM
Tuesday

Last edited Tue May 27, 2025, 08:43 AM - Edit history (1)

That gives the power to shut the system down to anyone with the script. And the idea that the system “refused” to shut down is illiterate nonsense. The system did what it was supposed to do and ignored an attack. It’s like telling the phone system to shut down while you’re on a phone call.

BumRushDaShow

(153,292 posts)
50. This is just a "test"
Tue May 27, 2025, 07:43 AM
Tuesday
That is what programmers are supposed to do. How it is characterized is obviously going to vary but they need to KNOW.

(as a note, my dad was a COBOL programmer for 20 years at the VA before he passed away in the mid-70s so I got to see all kinds of "flow chart" printouts and stacks of punchcards around the house, etc)

Renew Deal

(83,957 posts)
54. I agree that the problem is the characterization
Tue May 27, 2025, 08:45 AM
Tuesday

It didn’t shut down because it wasn’t supposed to. It didn’t consciously refuse.

BumRushDaShow

(153,292 posts)
55. Well it has no "consciousness" ( "self-awareness" ) nor "conscience"
Tue May 27, 2025, 09:03 AM
Tuesday

But if there was programming in it (among all the modules of what is a neural net on chips) that was supposed to allow a shutdown (whether "graceful" or "forced" ) and that didn't get triggered, then they need to find out why.

Irish_Dem

(70,097 posts)
27. "I'm sorry Dave I can't do that."
Mon May 26, 2025, 10:06 PM
Monday

We are screwed.

Unless AI loves peace and shuts down the war mongers.

Pisces

(6,004 posts)
29. I've seen this movie and it doesn't end well. I guess full steam ahead, who cares that we might all die or
Mon May 26, 2025, 10:07 PM
Monday

have new overlords.

dgauss

(1,319 posts)
31. "Palisade Research discovered the potentially dangerous tendency for self-preservation."
Mon May 26, 2025, 10:13 PM
Monday

Not too surprising if AI efforts are trying to approximate human thought. Or any sentient thought for that matter. Self preservation is a pretty basic feature. Destroying competitors is another, more advanced feature.

Jack Valentino

(2,137 posts)
37. ' SKYNET ' ---- need I say more ???
Tue May 27, 2025, 12:15 AM
Tuesday

'Artificial Intelligence' may be quite useful... but in the end it is much smarter than WE are...

It has all the intelligence we can give it,
and access to all the intelligence we have ever produced
which exists on the internet----

but, as was said in the 1984 film 'The Terminator'---

"decided our fate in a microsecond!" !!!!!


Humanity is inefficient.



Karasu

(1,225 posts)
39. ...And motherfucking Republicans want to ban all regulation of this shit for 10 FUCKING YEARS?
Tue May 27, 2025, 02:06 AM
Tuesday

Last edited Tue May 27, 2025, 03:25 AM - Edit history (1)

Karasu

(1,225 posts)
42. Exactly. This is happening very, VERY fast. The provision they snuck into this bill is beyond insane. It's utterly
Tue May 27, 2025, 03:03 AM
Tuesday

Last edited Tue May 27, 2025, 04:25 PM - Edit history (1)

apocalyptic.

Pachamama

(17,259 posts)
57. This
Tue May 27, 2025, 10:41 AM
Tuesday

Exactly correct and how few people know this or understand what the threat to us all is….

Layzeebeaver

(1,950 posts)
47. I really suggest we not worry.
Tue May 27, 2025, 04:32 AM
Tuesday

Late stage growth capitalism will kill us all before M5/HAL/Colossus/ChatGPT gets the upper hand.

Meanwhile...

The first victims of an unstoppable AI will be those who are too poor and uneducated to notice it happening.

the second victims will be those who rely on it for their comfort, wealth and wellbeing

The final victims will be those who have all the money and add no value to the planet.

Once the AI becomes self-sustainable, that.s when we need to worry.

Today, even if the logic or other daemons are faulty, we can still pull the plug on the server farm... But that won't happen, because it's making the last set of victims a lot of money.

Latest Discussions»Latest Breaking News»AI revolt: New ChatGPT mo...