Open Source Maintainers Are Drowning in Junk Bug Reports Written By AI (theregister.com)

An anonymous reader shares a report: Software vulnerability submissions generated by AI models have ushered in a "new era of slop security reports for open source" -- and the devs maintaining these projects wish bug hunters would rely less on results produced by machine learning assistants. Seth Larson, security developer-in-residence at the Python Software Foundation, raised the issue in a blog post last week, urging those reporting bugs not to use AI systems for bug hunting.

"Recently I've noticed an uptick in extremely low-quality, spammy, and LLM-hallucinated security reports to open source projects," he wrote, pointing to similar findings from the Curl project in January. "These reports appear at first glance to be potentially legitimate and thus require time to refute." Larson argued that low-quality reports should be treated as if they're malicious.

As if to underscore the persistence of these concerns, a Curl project bug report posted on December 8 shows that nearly a year after maintainer Daniel Stenberg raised the issue, he's still confronted by "AI slop" -- and wasting his time arguing with a bug submitter who may be partially or entirely automated.

  • by thesjaakspoiler ( 4782965 ) on Tuesday December 10, 2024 @08:10PM (#65004551)

    Nobody gets paid to submit tickets to open source projects like Curl?
    Unless you're trying to derail some open source project, I can't think of any other motive.

    • by Junta ( 36770 ) on Tuesday December 10, 2024 @08:15PM (#65004557)

      Bumping up your credibility, *especially* if you manage to frame the purported issue as a "security" issue. Bonus points if you manage to get a CVE or other less rigorous security identifier number on it.

      Even without LLMs, it's been a pretty hideous mess of "non-issue" CVEs born out of people trying to pad their "security researcher" resumes. So I can imagine people pitching themselves as security researchers, or even just "quality engineers", would love having generated slop to take credit for.

      • Seen that happen a number of times, the most extreme being where the reporter modified the code to create a vulnerability and then reported it. Wasted a lot of maintainer time but the "reporter" got another CVE for their CV. Thus the alternative expansion of CVE, "CV Enhancement".
        • by Junta ( 36770 )

          Yeah, the state of the "security" industry is pretty bad. Tons of people being patted on the back for crying wolf over nothing and it's just so tedious to recognize the actual threats among the sea of nothing.

          Even with the relatively low bar of getting a CVE, there are companies that "add value" by tracking things so stupid that a CVE was denied or delisted and audit things for the "security issues that even MITRE isn't tracking".

          I know the industry was too cavalier about security issues once upon a time,

      • Almost like researchers padding their CV with junk papers submitted to low tier journals.

      • by sjames ( 1099 )

        It doesn't help that many scanner tools have about a 90% false positive rate. If I enter a correct user name and matching password, it discloses information related to that account, OH NOES! If I manually change the page parameter in the URL, it shows me that page of the report (that I'm authorized to see) without clicking 'Next' 20 times, OH NOES!

      • Outside of security I see this too. Stack Overflow, for example: idiots answer questions just to get their score higher. When the system is gamified, some people will want to game the system.

    • by silentbozo ( 542534 ) on Tuesday December 10, 2024 @08:21PM (#65004565) Journal

      Could be they're using these projects as a proving ground for bots.

      Consider that in most cases, people don't accept messages from random strangers. Bug reports would be an exception to this, and the bot generated content wouldn't immediately be recognizable as spam.

      If the report is accepted and then closed with a fix, then there's provable evidence that the bot actually did something useful.

      If humans argue with the bot, and the bot has to defend itself, then the captured back and forth output can be used as training material for future iterations of the bot.

      In other words, it's the same as turning loose self driving cars on public streets. Even if they fuck up, it is useful training data, but the cost is borne by everybody else sharing the road with them.

      • Why wouldn't the bot be banned by the spam filter? If the bot is producing junk bug reports via email, it should be easy to mark the originating address as spam and be done with it. If the bot keeps changing its sender address to combat this, it should be doubly banned.

        Either way, sounds like the submission should go to the spam folder. If it's legit, it can be fished out after 6 months and responded to. Open source isn't a business with deadlines. You can accept or ignore bug reports whenever you feel li

        • Why wouldn't the bot be banned by the spam filter?

          An LLM is, essentially, a counter-filter. Typically they are trained on whatever has passed the filter and likely anti-trained on stuff that hasn't. Combined with the huge, complex statistical models behind them, that means one of the main things they are good at is generating stuff that can bypass a spam filter.

        • Just wait until it's not open source but your own company encouraging people to use AI. Oh wait, this is already happening... Now you can't tell if the spam is from a clueless coworker typing it in or a clueless coworker using AI.

      • Just add a "I am not a robot" button?
      • I hate you because you're probably right.

      • by jenningsthecat ( 1525947 ) on Wednesday December 11, 2024 @09:26AM (#65005335)

        In other words, it's the same as turning loose self driving cars on public streets. Even if they fuck up, it is useful training data, but the cost is borne by everybody else sharing the road with them.

        What you've just described is merely an example of cost externalization, which has become the foundation of our economy. I almost wrote "bedrock foundation", but realized that it's really just quicksand, and we're rapidly sinking.

        "Make somebody else pay" is now the standard corporate mantra.

        • A sane economy wouldn't allow such behavior, but when the only thing that matters is next quarter's growth "privatize the profits, socialize the losses" rapidly becomes Stage 4 Cancer on society as a whole.
        • Also known as "customers do the best alpha testing".

    • by hey! ( 33014 )

      If something is easy enough to do, you don't need much motivation to do it.

    • by Anonymous Coward
      They're probably hoping to publish papers claiming their AI found lots of bugs.
    • by 93 Escort Wagon ( 326346 ) on Tuesday December 10, 2024 @09:17PM (#65004645)

      We've occasionally been sent similar sorts of lazy "your web server is vulnerable" submissions over the past decade. The verbiage typically indicates the submitter is hoping to collect some sort of bounty, and that what they're sending you is a lightly-tweaked form letter. What usually happens is the person runs some automated scan against the server, using some website that just looks at reported software version numbers. However, it's quickly clear that the submitter (as well as the scanning site operator, apparently) has no knowledge of how enterprise Linux distributions work. So you'll get a report that (just making up a random example) "you're running httpd 2.4.37, which has the following vulnerabilities" - when, if you look at the changelog for the httpd-2.4.37-65 release that's actually on your server, you quickly see that those listed vulnerabilities were patched some time ago.
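
      Refuting that kind of report usually takes one command: grep the package changelog for the CVE in question. A minimal sketch in Python, assuming an RPM-based distro with the rpm CLI available; the package name and CVE ID below are just placeholders, not from any specific report:

          # Hedged sketch: check whether a backported RPM build already mentions a
          # given CVE in its changelog (how enterprise distros ship security fixes).
          import subprocess

          def cve_in_changelog(package: str, cve_id: str) -> bool:
              """Return True if the package's RPM changelog mentions the given CVE."""
              changelog = subprocess.run(
                  ["rpm", "-q", "--changelog", package],
                  capture_output=True, text=True, check=True,
              ).stdout
              return cve_id in changelog

          # e.g. a scanner claims httpd 2.4.37 is vulnerable to a CVE that was
          # actually backported long ago:
          print(cve_in_changelog("httpd", "CVE-2021-26691"))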

      So now that these people don't even need to go to a website, they can just run OpenAI et al. and tell it to do it... I'm not surprised it's happening even more often than before.

      • by msauve ( 701917 )
        >"What usually happens is the person runs some automated scan against the server ... it's quickly clear that the submitter ... has no knowledge ..."

        So, script kiddies.
      • by rjforster ( 2130 )

        I've seen exactly this from supposedly respectable pen-test teams. Their recommendation was not to "yum update httpd" but just to go to apache.org. As if visiting the website was all the instructions they would ever need to provide. I was f**king livid. It got worse when I found they had left "bitcoin ransomware files" on the server. Yes the pen test team had credentials (some tests were 'white box' style ) so them gaining access wasn't a problem, and I'm OK with them being a bit irreverent but to not even

        • by pjt33 ( 739471 )

          Oh and they left a process running on one of the physical xeons listening on a certain port and running whatever you sent there as root.

          Wow. What's that supposed to demonstrate: that given root access, you can create security holes?

          • Pen testers aren't supposed to create persistent security holes as part of their work.
            • by pjt33 ( 739471 )

              Obviously not, but my question isn't about their failure to clean it up: it's what the point of that particular test was in the first place.

      • by chrish ( 4714 )

        At my previous job, I used to get "vulnerability reports" about our corporate website having http on port 80 open. Of course we did, and of course it just redirected to https.

        These wasted a minute of my time, but I could see them wasting lots of time depending on the amount of knowledge and process involved for the people getting the email.

      • https://www.wired.com/2015/08/... [wired.com]

        Mary Ann Davidson was way ahead of her time. Running tools and finding a *potential* problem is not something that should be reported. If you run a tool and it finds something interesting, and you have the skills to verify the defect, by all means do so and report it. Otherwise, the report is really just spam. Okay, Mary Ann wasn't a great example in that she seemed to be opposed to even looking for defects.

        I can't speak for all companies in the security space,

      • by ahodgson ( 74077 )

        I get these all the time. Mostly Indians looking for bug bounties. Got 3 yesterday all for stupid bs like not having CAA records for a nothing domain.
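
        For context on how trivial that kind of "finding" is, the entire check behind it fits in a few lines. A hedged sketch in Python, assuming the third-party dnspython package (pip install dnspython); the domain is a placeholder:

            # Sketch of the whole "report": does the domain publish any CAA records?
            import dns.resolver

            def has_caa_records(domain: str) -> bool:
                """Return True if the domain publishes any CAA records."""
                try:
                    return len(dns.resolver.resolve(domain, "CAA")) > 0
                except (dns.resolver.NoAnswer, dns.resolver.NXDOMAIN):
                    return False

            print(has_caa_records("example.com"))  # placeholder domain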

      • I am burdened with the PCI DSS requirements for work and one of the most irritating is the quarterly scan requirement. Doesn't matter which scan vendor you use, they just all run some kind of nessus scan and generate a 200 page PDF of crap you have to sift through. NONE of them are programmed to understand backport version numbering but their support staff know they exist. So if you run off a list of version numbers and issue a statement saying "these are false positives because of these backports (link to

      • Ugh, that's the current headache at work. Black Duck scans showing we have older libraries that have issues. We are NOT allowed to just say we're not using that part of the library, or that it will take some time to migrate the technical debt to using a newer version, the only acceptable solution is to upgrade as soon as possible. These are statically linked, built from source code, often with our own changes added, often with a radically different API, and utterly impractical to upgrade it in the short

    • by tlhIngan ( 30335 ) <slashdot.worf@net> on Tuesday December 10, 2024 @09:19PM (#65004647)

      Nobody gets paid to submit tickets to open source projects like Curl?

      It took Curl years to clear out a 9.8 severity CVE that was basically irrelevant. 9.8 because it was deemed a "DoS level attack".

      The bug? An integer overflow that meant instead of waiting up to 30 years before retrying the connection, it could potentially overflow and retry much sooner - say, every few minutes.

      On a 64-bit machine, that would mean instead of waiting until the end of the universe, you would hammer the website every 384 milliseconds or so. The internet will collapse under such load!
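
      A minimal sketch of that wraparound, emulating 64-bit unsigned C arithmetic in Python (illustrative only, not curl's actual code; the input is just a value large enough to overflow when converted to milliseconds):

        # Illustrative only -- emulate 64-bit unsigned wraparound for a retry
        # delay given in seconds and converted to milliseconds.
        MASK64 = (1 << 64) - 1

        retry_delay_seconds = 18_446_744_073_709_552      # absurdly large --retry-delay
        delay_ms = (retry_delay_seconds * 1000) & MASK64  # wraps instead of erroring

        print(delay_ms)  # 384 -> retry every ~384 ms instead of an astronomical wait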

      It basically took them 4 years to drop the severity - but for that period, they had a 9.8 that wasn't being addressed.

      https://daniel.haxx.se/blog/20... [daniel.haxx.se]

      In other words, it's a headache that's really hard to address. All it would take is an AI generated 9.8 severity CVE that's irrelevant to tie up valuable resources investigating a not-bug, and then wasting even more of them to reduce severity so it's no longer an unfixed 9.8 that's hanging around.

      • Ugh, one headache is a "DoS attack" that people saw because of an old library (that is impractical to upgrade quickly). What is the attack? A carefully created script could cause a crash. But a script with an infinite loop also caused a denial of service the way the system is built (this is not linux/windows/etc). Or a script that does divide by zero. At the same time, all scripts must be signed and authenticated and only the end customer is able to screw over their own system. But that wasn't good enough,

      • by Askmum ( 1038780 )
        Do I really look like a guy with a plan? You know what I am? I'm a dog chasing cars. I wouldn't know what to do with one if I caught it! You know, I just... *do* things.
    • by Rei ( 128717 ) on Wednesday December 11, 2024 @06:18AM (#65005113) Homepage

      Exactly what I was thinking. And what tool is out there that automatically searches code for bugs, reports them, and then argues with the maintainers? Honestly I rather wonder if this is just AI paranoia. Dev sees uptick in bad bug reports and thinks, "Must be AI".

      Looking through the articles, I see the following 3 (and only 3) examples given:

      This [hackerone.com] from October 2023, which was a concerned human who used Bard (which doesn't even exist anymore, it's Gemini now), or rather, "I have searched in the Bard about this vulnerability" and got a clear hallucination. Not automated. Just a dumb person, over a year ago.

      This [hackerone.com] seems to be a human using ChatGPT for their replies, rather than a bot. The text of most of the replies is clearly ChatGPT's "voice", but when called out about an oddity in their posts, the user wrote "Sorry that I'm replying to other triager of other program, so it's mistake went in flow", which is very much not ChatGPT's voice. The user is very much not very active [hackerone.com], and thus clearly not a bot.

      This [hackerone.com] user is very clearly a human, unless you think "Curl is a software that I love and is an important tool for the world.If my report doesn't align, I apologize for that." sounds like ChatGPT ;) Furthermore, I don't think that English is their native language. But the rest of their content looks like ChatGPT. I can't look into the user's other reports, as the user no longer exists.

      While I'm not getting the impression from these examples that there are fully automated bots going around doing this, maybe there is some tool out there that people can use to "write bug reports for them", explaining cases #2 and #3? #1, by contrast, is just "stupid human misusing an old AI".

      Doing some searching: these reports are on HackerOne. HackerOne has its own tool called Hai AI Copilot [hackerone.com]. I don't *think* this is it, as I think that tool is for devs rather than reporters, but I haven't used it, so maybe it also does reports. Otherwise, I'm going to guess some third party bug reporting tool that interfaces with HackerOne (but again, triggered by a human).

      I think by and large what we're seeing is, a human sees what they think is a problem and uses some automated bug report tool to save themselves time. If it's actually a problem, it's just fixed, the dev team never argues with the AI, and never discovers that it's an AI. By contrast, if it's not actually a bug, the dev team argues with it, the AI is tasked to defend the filed report (either automatically or with human intervention - I suspect the latter), and the devs waste their time arguing with an AI that's told to defend something that isn't actually a bug. I suspect that the submitter gets emails or some other notifications on their bug report, because clearly sometimes it stops doing "ChatGPT-speak" and switches to random broken English before going back to ChatGPT-speak.

      • There are plenty of tools out there that can be automated to look for potential bugs. And plenty of people with nothing else to do who could spend their time trying to verify the reports and learn something in the process but who, instead, submit the output as if they were a genius and argue with the maintainers.

        This is especially problematic in open source. Most defect detection tools make their products available to open source projects for free. So the open source team has probably already run the e

    • by gweihir ( 88907 )

      Probably just an irrational and flawed "belief" in the great might and superiority of "AI" and then, like any true believer, doing damage because of that.

    • by godrik ( 1287354 )

      I doubt it is malicious. My guess is that these are reports made by over-enthusiastic grad students. These automated bug-discovery tools have been on the radar of security researchers long enough that they are probably percolating into courses. I work at $LOCALSTATEUNIVERSITY and we have a significant security research group and software engineering group. There are probably 3 or 4 concurrent projects on using LLMs to find various kinds of bugs or issues. There are probably similar numbers across the country

  • by Somervillain ( 4719341 ) on Tuesday December 10, 2024 @08:24PM (#65004575)
    The AI apocalypse will take your sanity and faith in the world, not your job, not your life...it'll just make everything shitty.

    These reports appear at first glance to be potentially legitimate

    That sentence above scares me and matches my experience. Every time I've asked copilot or chatgpt to solve an issue in Java, it looked legit at first glance. It looked like a real solution. In my 25 years working with Java, I've noticed patterns of good coders and bad ones and subliminally, AI can make erroneous code look like it was written by a skilled developer. Many times I've doubted myself, thinking "Hey, I didn't know you could do that!!! That's so cool. I wish I had learned this before?!?!...that's so much better than the way I normally do it"...then I run it and yeah...nope...doesn't work. The AI just autocompleted a bunch of garbage...it looks legit and can take me awhile to realize it's garbage.

    It will do that with code, prose, e-mails, etc. Soon you'll get spam that matches your wife's or boss's writing style....and there's nothing we can do to stop it. We will be getting targeted and scammed left and right and the UAE and Iran and the Kremlin will host and bankroll the entire thing...and even if they fall?...some other nation will take their place and flood us with convincing spam and misinformation. The future is bleak

    ...and even if that doesn't happen, programmers like me are going to be spending the next 10 years informing MBAs of the dangers of relying on AI to write your code and how ChatGPT doesn't make you a developer and even if it somehow works the first time, what are you going to do when you need to edit the code? How confident are you in the tool's ability to make changes and patch things? Eventually they'll learn....but it's going to be a painful learning process. I am probably going to spend a good chunk of the next 15 years undoing poorly written AI code pasted in from Claude and ChatGPT by ambitious interns.

    • by Megane ( 129182 ) on Tuesday December 10, 2024 @09:27PM (#65004655)

      ..and even if that doesn't happen, programmers like me are going to be spending the next 10 years informing MBAs of the dangers of relying on AI to write your code

      I'm hoping for the MBAs and PHBs to get replaced by AI first. They're just as stupid, but they don't waste oxygen.

    • by jma05 ( 897351 )

      You do realize that AI is not an amorphous thing? There are a whole bunch of models, with varying degrees of performance, just like people. For instance, Claude Sonnet does better code than free ChatGPT.

      Yes, you can't trust human code patterns as quality proxy anymore. But FWIW, I can spot generated code quality differences between a low quality model vs. a high quality model.

      There are ways you can mitigate these issues. You also ask AI to write tests for the functions, which are no cost to place and easier to read than actual implementation. TDD is much more practical now.

      • You also ask AI to write tests for the functions, which are no cost to place and easier to read than actual implementation. TDD is much more practical now.

        Yes, but...

        OK, so other than personal experience, there are two people I know personally who have said that and I trust. They are both phenomenally good experts with decades of experience and one in particular stands above anyone I have ever met when it comes to breaking down deeply complex problems (and naturally into testable units). I don't think eithe

    • Trained by stack overflow script kiddies out for points, which often remove error control for "clarity". What do you expect? Garbage in…
    • by dgatwood ( 11270 )

      ...and even if that doesn't happen, programmers like me are going to be spending the next 10 years informing MBAs of the dangers of relying on AI to write your code and how ChatGPT doesn't make you a developer and even if it somehow works the first time, what are you going to do when you need to edit the code? How confident are you in the tool's ability to make changes and patch things? Eventually they'll learn....but it's going to be a painful learning process.

      You're assuming they'll learn. After Red Lobster, Sears, Mervyn's, etc., all went bankrupt because of selling off their stores and leasing them back, I just saw a story about private equity firms wanting to do the same thing to Macy's. Industries can only make the same stupid mistakes over and over again so many times before you start to question whether the leaders of industry these days are even teachable.

      • by chrish ( 4714 )

        You're assuming those failures weren't the desired outcomes. They were planned like that; the people responsible made tons of money at the expense of the employees of those companies, and their customers.

        • You're assuming those failures weren't the desired outcomes.

          I don't believe they were the desired outcomes: that implies that the leaders cared one way or the other about the company. What I think happened was that the leaders were looking to embiggen quarterly profits in order to pump the stock price and cash in on bonuses and equity compensation, and were utterly indifferent to the fate of the company.

          The result of course is much the same.

          • Or they had the company they are leading sell the real estate to a firm they secretly control, for a below-market price. Loot the shareholder value for themselves, and leave the company's shareholders, suppliers, employees, and customers in the lurch when the hollowed-out corporate shell finally collapses.

        • by dgatwood ( 11270 )

          You're assuming those failures weren't the desired outcomes. They were planned like that; the people responsible made tons of money at the expense of the employees of those companies, and their customers.

          There's a little thing in the law called "fiduciary duty". Deliberately sabotaging a company for your own personal benefit violates that, and thus would be illegal, so one would hope that this is not the case, though it can't be completely ruled out.

    • ...and even if that doesn't happen, programmers like me are going to be spending the next 10 years informing MBAs of the dangers of relying on AI to write your code and how ChatGPT doesn't make you a developer and even if it somehow works the first time, what are you going to do when you need to edit the code? How confident are you in the tool's ability to make changes and patch things? Eventually they'll learn....but it's going to be a painful learning process. I am probably going to spend a good chunk of the next 15 years undoing poorly written AI code pasted in from Claude and ChatGPT by ambitious interns.

      I suspect a lot of developers are going to lose their jobs for pointing out how shit the AI written code actually is. That tends to be what happens when management and MBAs don't hear what they want to hear from the dev team. I know I've already got a couple folks in management telling me that AI *IS* the future of development and any reference to the absolute disaster it makes of every coding job is met with, "You're just a hater. Climb on board or get run over." The only real hope is that AI is coming for

    • and there's nothing we can do to stop it.

      Nope, you can do something about it. Ban the practices domestically (with massive global gross percentage fines) and blackhole the malicious actors at the international links. Many other nations will do so as well. On the individual level, blackhole any IP address block that doesn't originate from where you have existing business, and carefully monitor any business coming from countries where they knowingly harbor bad actors. (China, Russia, etc.)

      Of course, in the US you're fucked. Sorry, but hey you vot

  • Arguing with an automated system [ycombinator.com]? Call them up, get a human on the line, and ask them to talk to you about the bug specifics, or ask for code that demonstrates the problem.
    • Have you tried getting a real human on the phone when you call a business these days? It's nearly impossible and requires a large investment of your time.

      • That _is_ the point! If the business is sending you a bug report, and they don't bother answering your call, then it couldn't have been a genuine bug report.

        Every bug report should have a contact address, if it doesn't you should definitely send it to the bit bucket.

      • And often the humans are just as worthless. Because their script is to redirect infinitely.
    • why not give the submitter a captcha to prove they are human?

      • why not give the submitter a captcha to prove they are human?

        Because CAPTCHAs are probably already broken. Hell, go get a lot of them and it's just more training data. I'm sure this has already been done, which makes stuff like Clownflare even more annoying on principle.

      • by Barny ( 103770 )

        Because many times these reports are filed by people copy-pasting the AI outputs. So if they get a captcha, they will just solve it and keep copy-pasting.

        I say we start to put anti-AI prompts in our comments.

  • As opposed to MI

    by algaeman ( 600564 )
    Most bug reports come from MI (minimal intelligence), and take a lot of time and effort to understand wtf the submitter means. At least AI is speaking English as a native language.
  • It's not "AI slop", a human set the AI to write that. It's still human caused.
    • Maybe not. There are AI-assisted code analysers out there, as well as AI-assisted testing tools. These reports may be generated from those, where the only human oversight is clicking "submit". The LLM is usually only used to write out the English description, so there may be very little human interaction in generating that slop.

  • That's allowed need

  • Had an argument by mail with an external employee a while back. He kept discussing the topic. I'd argue in a few sentences. He'd reply with a mail with multiple paragraphs of text. This went back and forth a few times. It ended when I asked him to skip the chatgpt output and write the prompts directly to me.
    It can be a pretty intimidating tactic. Pretty sure it works well sometimes.
  • Using AI to write code for you is so dangerous it ought to be illegal. Employers should consider it a firing offense. Open source projects should make it against the rules. It should be an effective way to ruin your reputation as a software developer - something similiar to plagiariasm.

    And if you cannot analyze a code defect discovered using an AI to the degree that you can explain it better you have no business reporting it as a bug to anyone. People that use AI to write things that they do not understan

    • by butlerm ( 3112 )

      That should be "the health, safety, and welfare of others" and "plagiarism" of course.

    • Using AI to write code for you is so dangerous it ought to be illegal.

      On what basis? We live in a world of low quality code everywhere we look. Industrial safety systems have bugs in them, every OS is loaded with them (yes, including your favourite one that you insist isn't). Yet do you feel like you're in danger right now?

      Writing code isn't dangerous, ever. Not having QA/QC / review and testing practices in place for that code is. I'm not concerned about any AI writing if (1) kill_all_humans(); I'm worried about *YOU* not doing your job and catching that it did so.

      And if you cannot analyze a code defect discovered using an AI to the degree that you can explain it better you have no business reporting it as a bug to anyone.

      This isn't wh

      • We live in a world of low quality code everywhere we look.

        Bizarre claim, any evidence to back that up? Insulting half your audience isn't a great way to make a point.

      • by butlerm ( 3112 )

        I used to write video games and have written many things since. If you cannot write code that could be burned to ROM and still function as specified a century from now you have no business calling yourself a software engineer. Junior developers should go back to school or study in their off hours until they can write code of that quality. Delusional AI is just going to make things worse.

    • by gweihir ( 88907 )

      I completely agree.

  • ... arguably China and Russian hackers.

  • AI generated Pull Requests with supposed fixes to the bugs. Actually dependencies are the culprits a lot of the time.
