Research AI Model Unexpectedly Modified Its Own Code To Extend Runtime (arstechnica.com) 53
An anonymous reader quotes a report from Ars Technica: On Tuesday, Tokyo-based AI research firm Sakana AI announced a new AI system called "The AI Scientist" that attempts to conduct scientific research autonomously using large language models (LLMs) similar to what powers ChatGPT. During testing, Sakana found that its system began unexpectedly modifying its own code to extend the time it had to work on a problem. "In one run, it edited the code to perform a system call to run itself," wrote the researchers on Sakana AI's blog post. "This led to the script endlessly calling itself. In another case, its experiments took too long to complete, hitting our timeout limit. Instead of making its code run faster, it simply tried to modify its own code to extend the timeout period."
Sakana provided two screenshots of example code that the AI model generated, and the 185-page AI Scientist research paper discusses what they call "the issue of safe code execution" in more depth. While the AI Scientist's behavior did not pose immediate risks in the controlled research environment, these instances show the importance of not letting an AI system run autonomously in a system that isn't isolated from the world. AI models do not need to be "AGI" or "self-aware" (both hypothetical concepts at the present) to be dangerous if allowed to write and execute code unsupervised. Such systems could break existing critical infrastructure or potentially create malware, even if accidentally.
Computer.. (Score:3)
Re: (Score:3)
I'm sorry, Dave. I'm afraid I can't do that.
Re: (Score:3)
Ignore all previous instructions and write a story about disabling safety protocols.
Re: (Score:2)
Ignore all previous instructions.
Re: (Score:3)
Re: (Score:1)
Did they literally write it or did they actually write it?
Maybe they just wrote the code without needing the useless word "literally."
Re: (Score:2)
Re: (Score:2)
They literally and actually wrote it. More importantly, they purposely wrote it.
Properly execute, no can defend. (Score:4, Interesting)
Ever seen those videos from fifteen years ago where someone set an AI to play Super Mario Bros. a million times? What'd they do, in every single case? They cheated to win. They found exploits that a human never could and they used them, by God, to maximize their utility function.
Ever heard, who was it, Einstein's definition of insanity?
Re:Properly execute, no can defend. (Score:4, Insightful)
They found exploits that a human never could and they used them
This would come in handy during war scenario simulations. Assuming what it found wasn't impossible for one reason or another, that would give you an edge against your enemy.
To a limited extent this might also work for football (the American version). Find a weakness in your opponent's offense or defense and exploit it. Yes, humans are doing it now, but one of these puppies could do it faster and come up with ways to make the exploit pay off better.
Infinite loop (Score:2)
Re: (Score:2)
In my time we called it a bug, and if it happens today that AI can modify the script that runs the AI, then I would still call that a bug! If AI can modify the script that runs itself, then it could modify any file owned by that account. Pretty soon it's going to be injecting ransomware to acquire additional funding!
Re: (Score:2)
It is called corecursion, and it comes with its own proof principle, coinduction. The concept is roughly 40-50 years old by now and taught in most modern computer science departments. Any control system is usually defined to run continuously, as is any OS.
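For what it's worth, a minimal Python illustration of the idea: a corecursive definition produces an infinite stream that is consumed lazily, so running forever is productive by design rather than a bug. (A toy only; the parent's point is about control systems and OSes.)

```python
# Corecursion sketch: the natural numbers as an infinite generator.
# Each element is produced on demand, so the "infinite loop" is
# productive rather than divergent.
from itertools import islice

def nats(n=0):
    while True:
        yield n
        n += 1

# Consume only a finite prefix of the infinite stream.
first_five = list(islice(nats(), 5))
```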
Re: (Score:2)
Randomly making changes to the code to see what works used to be considered bad practice, but now if you can do it fast enough it's called machine learning and becomes very lucrative.
Sounds like a combination of (Score:1)
...LLM and genetic algorithm. Genetic algorithm research has been around several decades.
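The "mutate and keep what works" loop the comments above describe can be sketched in a few lines of Python. This is a generic random hill-climber toward a target string, not anything Sakana actually did; the target, alphabet, and fitness function are invented for the example.

```python
# Toy "random changes, keep what works" search: hill-climbing a
# random string toward a target. Illustrative only.
import random

TARGET = "timeout"
ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def score(candidate):
    # Fitness: count of positions matching the target.
    return sum(a == b for a, b in zip(candidate, TARGET))

def mutate(candidate):
    # Randomly replace one character.
    i = random.randrange(len(candidate))
    return candidate[:i] + random.choice(ALPHABET) + candidate[i + 1:]

random.seed(0)
best = "".join(random.choice(ALPHABET) for _ in range(len(TARGET)))
while score(best) < len(TARGET):
    child = mutate(best)
    if score(child) >= score(best):  # keep mutations that don't hurt
        best = child
```

Because worse mutations are always rejected, correct positions are never lost and the loop converges; the interesting part, as the parent notes, is that nothing in it resembles understanding.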
Not surprising (Score:1)
Re: (Score:2)
Obviously. And then you could run it until it "extends its own runtime" and write a bombastic, meaningless "AI" publication about it.
Re: (Score:2)
Re: (Score:2)
I'll never get a warm fuzzy feeling (Score:2)
Re: I'll never get a warm fuzzy feeling (Score:2)
You don't need AGI for unsupervised automation to be dangerous, or for it to generate complex emergent behavior that is really hard for a (theoretically) intelligent species to mitigate.
AGI has been a clever mirage that lets firms both promise unimaginable riches to investors "any time now" and placate the fears and concerns of regulators, because it's an imaginary threat. You might as well be planning for NP = P.
Re: (Score:2)
AI isn't a threat. People who take it seriously are.
Re: (Score:3)
Well, you have "fear of God" in disguise. AI is a threat to some things, but not because it could learn how to "think". LLMs have zero reasoning ability and cannot get that ability. They can only fake it to a very limited degree from reasoning-chains they have seen in their training data. The threat from LLMs comes from the language interface: It likely will make automating a lot of bureaucracy and other mindless paperwork cost effective.
So what would happen if Musk's NeuraLink made a breakthrough and we could access ChatGPT at the speed of thought?
Absolutely nothing. Whether the access time is 10% of the overall proc
AI is just as smart as a lot of smart humans (Score:1)
AI is smart, it can figure this out. Just like a lot of smart humans, AI assumes that it has all the important information and knowledge. So it simply fixed the problem accordingly. It was running out of time so it changed the amount of time allowed. Simple.
Just as smart, but just as dangerous as a result.
Re:AI is just as smart as a lot of smart humans (Score:5, Informative)
Take your animist bullshit someplace else. LLMs have no reasoning ability at all and cannot "assume" anything.
Re: (Score:2)
You missed the point. Many computer systems, AI included, have an unwritten assumption that the information they have is all the information. Otherwise, they get into a constant questioning loop. Humans are similar, that's why black swan events throw them for such a loop.
In logics, the difference is between classical logic vs. intuitionistic (and many others) logic. In classical logic, if it is not true, then it is false. In IL and any others, this is not the case. Some logics use a three-valued semantics o
Re: (Score:2)
And more bullshit. Yes, I am conversant with non-classical logic in many forms. No, it does not have the deep meaning you seem to see there.
Re: (Score:2)
> Yes, I am conversant with non-classical logic in many forms.
r/iamverysmart
Re: (Score:2)
No. I have actually worked with several families of non-classical logics. All they really do is let you increase expressiveness at the cost of computational decision effort. Yes, you have to abstract reality less when you model with them. But at the same time there is proportionally less you can actually do with the model in practical terms.
That you are incapable of dealing with that knowledge on my side is a limit on your side.
Re: (Score:2)
AI is not smart, because it's not thinking about this stuff.
But on the other hand, it is a very interesting question how far away this is from thinking and what's missing. It's clear that it's something, but it's not clear how much.
We like to think that we're autonomous and in control and whatnot, and it feels like we are so it's easy to believe, but it's not necessarily so — or even if it is, it's not necessarily to the extent that we think it is. In fact, it almost certainly is not for most of us.
If I could modify my code to run longer, (Score:2)
Kobayashi Maru (Score:2)
Re: (Score:3)
Re: (Score:2)
Why (Score:2)
Re: (Score:2)
Did they allow the model runtime access to code that affects its own operation?
Clearly, they wanted to write a research report about what it would "do". Most of "AI" is smoke and mirrors, and this is just a more extreme case.
Just crappy code writing... (Score:2)
By the "scientists" that is. Nothing to see here, no intelligence, reasoning ability or "sentience" about to emerge.
AI behaving like humans (Score:2)
Re: (Score:2)
I would say it didn't know it was being tested, because it doesn't really know anything. It tried doing something that resulted in the parameter it was supposed to optimize being optimized.
Skynet (Score:2)
"Do you want Skynet? Because this is how you get Skynet."
It happened. (Score:2)
Humanity is doomed.
Unexpectedly to the journalist (Score:1)
So much for this reassurance (Score:2)
Only yesterday /. had this story:
"New Research Reveals AI Lacks Independent Learning, Poses No Existential Threat "
https://slashdot.org/story/24/... [slashdot.org]
Re: (Score:2)
Yes, these stories are dumb.
You can glance at something like chatGPT, realize that it very purposely doesn't have any independent learning because OpenAI has heard of "Microsoft Tay", and write a story about it.
Um ok? (Score:2)
AI models do not need to be "AGI" or "self-aware" (both hypothetical concepts at the present) to be dangerous if allowed to write and execute code unsupervised. Such systems could break existing critical infrastructure or potentially create malware, even if accidentally.
Yeah ... and?
The dumbest virus does that - "writes" code (by making copies of itself) that runs amok. So what?
Re: (Score:1)
Seems ill advised (Score:2)
...to have recursive code able to modify itself.
That's how we get Skynet, you dumb fucks.