AI-Generated Code Creates Major Security Risk Through 'Package Hallucinations' (arstechnica.com)

A new study [PDF] reveals AI-generated code frequently references non-existent third-party libraries, creating opportunities for supply-chain attacks. Researchers analyzed 576,000 code samples from 16 popular large language models and found 19.7% of package dependencies -- 440,445 in total -- were "hallucinated."

These non-existent dependencies exacerbate dependency confusion attacks, in which a malicious package bearing the same name as a legitimate or expected one can infiltrate the software supply chain. Open-source models hallucinated nearly 22% of the time, compared with 5% for commercial models. "Once the attacker publishes a package under the hallucinated name, containing some malicious code, they rely on the model suggesting that name to unsuspecting users," said lead researcher Joseph Spracklen. Alarmingly, 43% of hallucinations repeated across multiple queries, making them predictable targets.
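
The paper, as summarized above, describes the attack rather than a defense, but one obvious mitigation is to confirm that every AI-suggested dependency actually resolves on the package registry before installing it. Below is a minimal sketch assuming Python and PyPI's public JSON API; the dependency names are illustrative placeholders, not drawn from the study.

```python
# Minimal sketch: flag declared dependencies that don't resolve on PyPI
# before running `pip install`. Assumes network access to pypi.org.
import urllib.error
import urllib.request

def exists_on_pypi(name: str) -> bool:
    """Return True if PyPI's JSON API recognizes the package name."""
    url = f"https://pypi.org/pypi/{name}/json"
    try:
        with urllib.request.urlopen(url, timeout=10):
            return True
    except urllib.error.HTTPError as err:
        if err.code == 404:  # unknown name -- possibly hallucinated
            return False
        raise  # other HTTP errors are real failures, not verdicts

# Hypothetical dependency list, e.g. parsed from requirements.txt.
deps = ["requests", "totally-made-up-package-xyz"]
for dep in deps:
    if not exists_on_pypi(dep):
        print(f"WARNING: {dep} not found on PyPI -- possible hallucination")
```

Note the limit of such a check: it only catches names that nobody has registered yet. Once an attacker publishes a package under a hallucinated name, as Spracklen describes, an existence check passes, so pinned versions and human review of new dependencies still matter.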


Comments:
  • by Dan East ( 318230 ) on Tuesday April 29, 2025 @03:30PM (#65340715) Journal

    You are not hallucinating - this story is a dupe [slashdot.org]

  • As important as generating code is testing it to ensure that it does what it is supposed to do. Who/what writes these tests? The tests have to be written by people who really understand the problem the code is supposed to address. So, given a set of inputs, what are the expected outputs? [I understand that this is a simplistic description; a minimal sketch of the idea follows below.] I would be wary of using AI to generate test cases: if it hallucinates, then what are you testing?

    Another question: who writes the end-user documentation?

    I am assumi
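
    As a concrete version of the given-inputs/expected-outputs idea above, here is a minimal table-driven test sketch in Python. slugify() is a hypothetical function invented for illustration; the point is that the expected values come from a human who understands the problem, not from the model.

```python
# Table-driven test: each row pins an expected output for a given input.
# slugify() is a toy, hypothetical function under test.
import pytest

def slugify(title: str) -> str:
    """Lowercase a title and join its words with hyphens."""
    return "-".join(title.lower().split())

@pytest.mark.parametrize("given, expected", [
    ("Hello World", "hello-world"),
    ("  extra   spaces  ", "extra-spaces"),
    ("Already-Slugged", "already-slugged"),
])
def test_slugify(given, expected):
    # The expected values encode the human's understanding of the problem.
    assert slugify(given) == expected
```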

    • Sounds like the out-of-branch commit issue. Git had many complaints early on about hallucinations, a.k.a. "itdontwork." Well, you just versioned some software while ignoring the relevant commits...
    • As important as generating code is testing it to ensure that it does what it is supposed to do. Who/what writes these tests?

      From what I've heard talking to people, one of the most common uses of AI is to generate the tests.

      • by micheas ( 231635 )

        As important as generating code is testing it to ensure that it does what it is supposed to do. Who/what writes these tests?

        From what I've heard talking to people, one of the most common uses of AI is to generate the tests.

        That's because in many companies the primary purpose of tests is so that you can tell auditors, and hence customers, that your code has x% test coverage. With AI you can hit the 100% coverage checkbox with meaningless generated tests and still get the auditor's seal of approval for good test coverage.
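
        To make that concrete: the sketch below (Python, with a hypothetical process_order() standing in for real business logic) executes every line of the function, so line-coverage tools report 100%, while verifying almost nothing.

```python
# Hypothetical function standing in for real business logic.
def process_order(order_id: int, items: list) -> dict:
    return {"id": order_id, "items": items, "status": "ok"}

# A "coverage-checkbox" test: it runs every line of process_order(),
# so line coverage hits 100%, yet the assertion would pass for almost
# any return value. Coverage measures execution, not verification.
def test_process_order_runs():
    result = process_order(order_id=42, items=["widget"])
    assert result is not None
```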

    • by TWX ( 665546 )

      As important as generating code is testing it to ensure that it does what it is supposed to do. Who/what writes these tests? The tests have to be written by people who really understand the problem the code is supposed to address. So, given a set of inputs, what are the expected outputs? [I understand that this is a simplistic description.]

      I used to test software for a living, alpha stuff right out of the daily builds.

      It was my experience that it took out-of-the-box thinking to come up with real-world tests that accurately reflected both how the software was intended to be used by its developer and plausible ways that someone could misuse it. I was working on communications protocols, because apparently the company lawyers were afraid of BSD-licensed code and wouldn't let the project adopt existing software. I leveraged

  • So, you're saying that AI code is even shittier than that of first-year programmers straight out of code boot camp? Because I have yet to see one of those who doesn't at least make sure a library exists before referencing it. Even the really, really bad ones.

    I know, I know. AI is gonna take all the programming jobs any day now. And sadly, it'll probably happen, because management would rather have shit code than pay salaries and benefits. The contractor cleanup gigs a few years later will pay nicely, I'm s

  • by Anonymous Coward

    That's what SHE said.

  • I wonder if it's related to this: devs blindly pull in dependencies without ever reading them.
  • If you create a malicious package and advertise it enough, doesn't this happen without AI?

    • "Re: what is stopping the AI from testing dependencies?" -- Who knows why these systems do anything? They are black boxes.
      • Why don't they hallucinate grammar or vocabulary?

        ChatGPT says: "LLMs are much better at plausible surface-level generation than verifiable grounded reference, especially in niche domains like package names or APIs."

        • They sometimes do, but grammar and vocabulary are what LLMs are primarily trained on. They understand words and grammar in a mechanical way, but they have no idea what they really mean. They can string them together into sentences, but that is all they really do.
  • Must be some really good stuff if it makes a computer trip ;-D
  • Hallucinations are random; recidivism is not. A hallucination that repeats across queries from different users implies it isn't really a hallucination, but a systematic interpretation of the data, whether through logical error or intentional poisoning of results.
  • ...this study is already obsolete. No, I am not kidding. AI code generation makes huge improvements in just 2-3 months, whereas the development, peer review, and publication of academic papers takes 6-12 months.

    In short, the models they test in the paper are basically ancient history.
