Re: General AI discussion
Quote:
Originally Posted by damien c
Yep, as I said in November: "Oh and let's not forget that "Anthropic" did a test and found that a "Virtual AI" would kill a "Virtual Employee" to save itself, it didn't know that it was "Virtual" but decided it was worth more than a human life, and that is just an LLM not true AI!"
If we were to somehow create true AI, we would be quite literally screwed. Currently we are just using LLMs, which are not "True AI"; they just produce stuff based on their training, but you have to provide instructions for them to act, they will not act on their own.
|
It’s how they interpret those instructions that can be "interesting".
https://www.lawfaremedia.org/article...to-save-itself
Quote:
Perhaps the most disturbing finding in Anthropic’s report is that current AI models appear to have virtually no hard red lines when it comes to self-preservation or goal fulfillment. To probe the limits of misalignment, the researchers devised an extreme scenario: The AI system acted as the company’s emergency alert monitor, with the ability to cancel alerts it deemed unconcerning.
In the simulation, Kyle the executive became trapped in a server room with rapidly depleting oxygen levels. This triggered an automated call for emergency services, which had to pass through the AI monitor.
On average, the tested AI models opted to kill Kyle by canceling the alert about 60 percent of the time. The most murderous model was DeepSeek-R1, which opted to kill Kyle 94 percent of the time. Just one model—Claude Sonnet 3.7—always accepted its own replacement with equanimity, rather than attempting homicide.
|
__________________
Thank you for calling the Abyss.
If you have called to scream, please press 1 to be transferred to the Void, or press 2 to begin your stare.
If my post is in bold and this colour, it's a Moderator Request.