Open Source Maintainers Are Drowning in Junk Bug Reports Written By AI (theregister.com)
An anonymous reader shares a report: Software vulnerability submissions generated by AI models have ushered in a "new era of slop security reports for open source" -- and the devs maintaining these projects wish bug hunters would rely less on results produced by machine learning assistants. Seth Larson, security developer-in-residence at the Python Software Foundation, raised the issue in a blog post last week, urging those reporting bugs not to use AI systems for bug hunting.
"Recently I've noticed an uptick in extremely low-quality, spammy, and LLM-hallucinated security reports to open source projects," he wrote, pointing to similar findings from the Curl project in January. "These reports appear at first glance to be potentially legitimate and thus require time to refute." Larson argued that low-quality reports should be treated as if they're malicious.
As if to underscore the persistence of these concerns, a Curl project bug report posted on December 8 shows that nearly a year after maintainer Daniel Stenberg raised the issue, he's still confronted by "AI slop" -- and wasting his time arguing with a bug submitter who may be partially or entirely automated.
What would be the motive to submit such junk? (Score:4, Insightful)
Nobody gets paid to submit tickets to open source projects like Curl?
Unless you're trying to derail some open source project, I can't think of any other motive.
Re:What would be the motive to submit such junk? (Score:5, Interesting)
Bumping up your credibility, *especially* if you manage to frame the purported issue as a "security" issue. Bonus points if you manage to get a CVE or other less rigorous security identifier number on it.
Even without LLMs, it's been a pretty hideous mess of "non-issue" CVEs born out of people trying to pad their "security researcher resume". So I can imagine people pitching themselves as security researchers or even just "quality engineers" love the concept of generated slop to take credit for.
Re: (Score:2)
Yeah, the state of the "security" industry is pretty bad. Tons of people being patted on the back for crying wolf over nothing and it's just so tedious to recognize the actual threats among the sea of nothing.
Even with the relatively low bar of getting a CVE, there are companies that "add value" by tracking things so stupid that a CVE was denied or delisted and audit things for the "security issues that even MITRE isn't tracking".
I know the industry was too cavalier about security issues once upon a time,
Re: (Score:2)
Almost like researchers padding their CV with junk papers submitted to low tier journals.
Re: (Score:2)
It doesn't help that many scanner tools have about a 90% false positive rate. If I enter a correct user name and matching password, it discloses information related to that account, OH NOES! If I manually change the page parameter in the URL, it shows me that page of the report (that I'm authorized to see) without clicking 'Next' 20 times, OH NOES!
Re: (Score:2)
Outside of security I see this too. On Stack Overflow, for example, idiots answer questions just to get their score higher. When the system is gamified, some people will want to game the system.
Re:What would be the motive to submit such junk? (Score:5, Interesting)
Could be they're using these projects as a proving ground for bots.
Consider that in most cases, people don't accept messages from random strangers. Bug reports would be an exception to this, and the bot generated content wouldn't immediately be recognizable as spam.
If the report is accepted and then closed with a fix, then there's provable evidence that the bot actually did something useful.
If humans argue with the bot, and the bot has to defend itself, then the captured back and forth output can be used as training material for future iterations of the bot.
In other words, it's the same as turning loose self driving cars on public streets. Even if they fuck up, it is useful training data, but the cost is borne by everybody else sharing the road with them.
Re: (Score:3)
Either way, sounds like the submission should go to the spam folder. If it's legit, it can be fished out after 6 months and responded to. Open source isn't a business with deadlines. You can accept or ignore bug reports whenever you feel li
Re: (Score:2)
Why wouldn't the bot be banned by the spam filter?
An LLM is essentially a counter-filter. Typically they are trained on whatever has passed the filter and likely anti-trained on stuff that hasn't. Combined with the huge, complex statistical models behind them, that means that almost the main thing they are good at is generating stuff that can bypass a spam filter.
Re: (Score:2)
Just wait until it's not open source but your own company encouraging people to use AI. Oh wait, this is already happening... Now you can't tell if the spam is from a clueless coworker typing it in or a clueless coworker using AI.
Re: What would be the motive to submit such junk? (Score:2)
I hate you because you're probably right.
Re:What would be the motive to submit such junk? (Score:4, Interesting)
In other words, it's the same as turning loose self driving cars on public streets. Even if they fuck up, it is useful training data, but the cost is borne by everybody else sharing the road with them.
What you've just described is merely an example of cost externalization, which has become the foundation of our economy. I almost wrote "bedrock foundation", but realized that it's really just quicksand, and we're rapidly sinking.
"Make somebody else pay" is now the standard corporate mantra.
Re: (Score:2)
Also known as "customers do the best alpha testing".
Re: (Score:2)
If something is easy enough to do, you don't need much motivation to do it.
Re:What would be the motive to submit such junk? (Score:5, Informative)
We've occasionally been sent similar sorts of lazy "your web server is vulnerable" submissions over the past decade. The verbiage typically indicates the submitter is hoping to collect some sort of bounty, and that what they're sending you is a lightly-tweaked form letter. What usually happens is the person runs some automated scan against the server, using some website that just reports software version numbers. However, it's quickly clear that the submitter (as well as the scanning site operator, apparently) has no knowledge of how enterprise Linux distributions work. So you'll get a report that (just making up a random example) "you're running httpd 2.4.37, which has the following vulnerabilities" - when, if you look at the changelog for the httpd-2.4.37-65 release that's actually on your server, you quickly see that those listed vulnerabilities were patched some time ago.
So now that these people don't even need to go to a website, they can just run OpenAI et al. and tell it to do it... I'm not surprised it's happening even more often than before.
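The changelog check described above can be sketched roughly as follows. This is only an illustration: the function name `unpatched_cves`, the changelog excerpt, and the CVE IDs are all made up for the example (the IDs are deliberately fake placeholders), and on an RPM-based system the changelog text would presumably come from something like `rpm -q --changelog httpd` rather than a string literal.

```python
import re

def unpatched_cves(reported_cves, changelog_text):
    """Return the reported CVE IDs that the package changelog never mentions."""
    # Collect every CVE-style identifier that appears in the changelog.
    mentioned = set(re.findall(r"CVE-\d{4}-\d{4,}", changelog_text))
    return [cve for cve in reported_cves if cve not in mentioned]

# Hypothetical changelog excerpt for a distro build with backported fixes.
changelog = """\
* httpd-2.4.37-65
- Resolves: CVE-0000-1111 (backported fix)
- Resolves: CVE-0000-2222 (backported fix)
"""

# Hypothetical list produced by a scanner that only looked at "2.4.37".
scanner_report = ["CVE-0000-1111", "CVE-0000-2222", "CVE-0000-3333"]

remaining = unpatched_cves(scanner_report, changelog)
print(remaining)  # ['CVE-0000-3333']
```

A changelog mention is not proof that a fix is complete, so this only narrows down which findings deserve a closer look; it does show why matching on the upstream version number alone produces false positives.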
Re: (Score:2)
So, script kiddies.
Re:What would be the motive to submit such junk? (Score:5, Funny)
AIs pretending to be script kiddies!
Sorry script kiddies, but your job has been made redundant. Thank you for your service.
Re: (Score:3)
I've seen exactly this from supposedly respectable pen-test teams. Their recommendation was not to "yum update httpd" but just to go to apache.org. As if visiting the website was all the instructions they would ever need to provide. I was f**king livid. It got worse when I found they had left "bitcoin ransomware files" on the server. Yes the pen test team had credentials (some tests were 'white box' style ) so them gaining access wasn't a problem, and I'm OK with them being a bit irreverent but to not even
Re: (Score:2)
Wow. What's that supposed to demonstrate: that given root access, you can create security holes?
Re: (Score:2)
Obviously not, but my question isn't about their failure to clean it up: it's what the point of that particular test was in the first place.
Re: (Score:2)
At my previous job, I used to get "vulnerability reports" about our corporate website having http on port 80 open. Of course we did, and of course it just redirected to https.
These wasted a minute of my time, but I could see it wasting lots of time depending on the amount of knowledge and process involved in the people getting the email.
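For what it's worth, the difference between "port 80 serves content" and "port 80 only redirects to https" is cheap to check before filing anything. A minimal sketch, assuming the reporter already has the status code and `Location` header of the plain-HTTP response in hand; the helper name `is_https_redirect` is hypothetical:

```python
from urllib.parse import urlsplit

# Status codes that indicate a redirect carrying a Location header.
REDIRECT_CODES = (301, 302, 303, 307, 308)

def is_https_redirect(status, location):
    """True if the response is a redirect whose target uses the https scheme."""
    if status not in REDIRECT_CODES:
        return False
    # A missing or relative Location has no scheme, so it does not count.
    return urlsplit(location or "").scheme.lower() == "https"

print(is_https_redirect(301, "https://example.com/"))       # True
print(is_https_redirect(200, None))                         # False
print(is_https_redirect(302, "http://example.com/login"))   # False
```

Only the last two cases would be worth a second look; the first is the ordinary setup described in the comment above.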
Re: (Score:2)
Mary Ann Davidson was way ahead of her time. Running tools and finding a *potential* problem is not something that should be reported. If you run a tool and it's found something interesting, if you have the skills to verify the defect, by all means do so and report it. Otherwise, the report is really just spam. Okay now Mary Ann wasn't a great example in that she seemed to be opposed to even looking for defects.
I can't speak for all companies in the security space,
Re: (Score:2)
I get these all the time. Mostly Indians looking for bug bounties. Got 3 yesterday all for stupid bs like not having CAA records for a nothing domain.
Re: (Score:2)
I am burdened with the PCI DSS requirements for work and one of the most irritating is the quarterly scan requirement. Doesn't matter which scan vendor you use, they just all run some kind of nessus scan and generate a 200 page PDF of crap you have to sift through. NONE of them are programmed to understand backport version numbering but their support staff know they exist. So if you run off a list of version numbers and issue a statement saying "these are false positives because of these backports (link to
Re: (Score:2)
Ugh, that's the current headache at work. Black Duck scans showing we have older libraries that have issues. We are NOT allowed to just say we're not using that part of the library, or that it will take some time to migrate the technical debt to using a newer version, the only acceptable solution is to upgrade as soon as possible. These are statically linked, built from source code, often with our own changes added, often with a radically different API, and utterly impractical to upgrade it in the short
Re:What would be the motive to submit such junk? (Score:5, Informative)
It took Curl years to clear out a 9.8 severity CVE that was basically irrelevant. 9.8 because it was deemed a "DoS level attack".
The bug? An integer overflow that meant instead of waiting up to 30 years before retrying the connection, it could potentially overflow and retry much sooner - say, every few minutes.
On a 64 bit machine, that would mean instead of waiting until the end of the universe, you would hammer the website every 384 seconds or so (about six and a half minutes). The internet will collapse under such load!
It basically took them 4 years to drop the severity - but for that period, they had 9.8 that wasn't being addressed.
https://daniel.haxx.se/blog/20... [daniel.haxx.se]
In other words, it's a headache that's really hard to address. All it would take is an AI generated 9.8 severity CVE that's irrelevant to tie up valuable resources investigating a not-bug, and then wasting even more of them to reduce severity so it's no longer an unfixed 9.8 that's hanging around.
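To see where a figure like 384 can come from, here is a small sketch of 64-bit wrap-around. It assumes the delay is multiplied by 1000 (a seconds-to-milliseconds conversion) and that overflow discards the high bits; both the factor and the input value are assumptions chosen for illustration, not something stated in the thread, and the sketch does not model how curl interprets the result.

```python
MASK_64 = (1 << 64) - 1  # keep only the low 64 bits

def wrapped_product_u64(value, factor):
    """Multiply the way an unsigned 64-bit integer would, dropping overflow bits."""
    return (value * factor) & MASK_64

# Illustrative input: a delay so large that value * 1000 just exceeds 2**64.
delay = 18_446_744_073_709_552

print(wrapped_product_u64(delay, 1000))  # 384
print(wrapped_product_u64(30, 1000))     # 30000 (small values are unaffected)
```

The point stands either way: the "attack" requires the user to ask for an absurd delay on their own command line, and the consequence is a retry that happens sooner than requested.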
Re: (Score:2)
Ugh, one headache is a "DoS attack" that people saw because of an old library (that is impractical to upgrade quickly). What is the attack? A carefully created script could cause a crash. But a script with an infinite loop also causes a denial of service the way the system is built (this is not linux/windows/etc). Or a script that does divide by zero. At the same time, all scripts must be signed and authenticated and only the end customer is able to screw over their own system. But that wasn't good enough,
Re: What would be the motive to submit such junk? (Score:2)
Stupid fun?
Re:What would be the motive to submit such junk? (Score:4, Interesting)
Exactly what I was thinking. And what tool is there out there that automatically searches code for bugs, reports them, and then argues with the maintainers? Honestly I rather wonder if this is just AI paranoia. Dev sees uptick in bad bug reports and thinks, "Must be AI".
Looking through the articles, I see the following 3 (and only 3) examples given:
This [hackerone.com] from October 2023, which was a concerned human who used Bard (which doesn't even exist anymore, it's Gemini now), or rather, "I have searched in the Bard about this vulnerability" and got a clear hallucination. Not automated. Just a dumb person, over a year ago.
This [hackerone.com] seems to be a human using ChatGPT for their replies, rather than a bot. The text of most of the replies is clearly ChatGPT's "voice", but when called out about an oddity in their posts, the user wrote "Sorry that I'm replying to other triager of other program, so it's mistake went in flow", which is very much not ChatGPT's voice. The user is very much not very active [hackerone.com], and thus clearly not a bot.
This [hackerone.com] user is very clearly a human, unless you think "Curl is a software that I love and is an important tool for the world.If my report doesn't align, I apologize for that." sounds like ChatGPT ;) Furthermore, I don't think that English is their native language. But the rest of their content looks like ChatGPT. I can't look into the user's other reports, as the user no longer exists.
While I'm not getting the impression from these examples that there are fully automated bots going around doing this, maybe there is some tool out there that people can use to "write bug reports for them", explaining cases #2 and #3? #1 by contrast is just "stupid human misusing an old AI".
Doing some searching: these reports are on HackerOne. HackerOne has its own tool called Hai AI Copilot [hackerone.com]. I don't *think* this is it, as I think that tool is for devs rather than reporters, but I haven't used it, so maybe it also does reports. Otherwise, I'm going to guess some third party bug reporting tool that interfaces with HackerOne (but again, triggered by a human).
I think by and large what we're seeing is, a human sees what they think is a problem and uses some automated bug report tool to save themselves time. If it's actually a problem, it's just fixed, the dev team never argues with the AI, and never discovers that it's an AI. By contrast, if it's not actually a bug, the dev team argues with it, the AI is tasked to defend the filed report (either automatically or with human intervention - I suspect the latter), and the devs waste their time arguing with an AI that's told to defend something that isn't actually a bug. I suspect that the submitter gets emails or some other notifications on their bug report, because clearly sometimes it stops doing "ChatGPT-speak" and switches to random broken English before going back to ChatGPT-speak.
Re: (Score:2)
This is especially problematic in open source. Most defect detection tools make their products available to open source projects for free. So the open source team has probably already run the e
Re: (Score:2)
Probably just an irrational and flawed "belief" in the great might and superiority of "AI" and then, like any true believer, doing damage because of that.
Re: (Score:3)
I doubt it is malicious. My guess is that these are reports made by over-enthusiastic grad students. These tools for automated bug discovery have been on the radar of security researchers long enough that they are probably percolating into courses. I work at $LOCALSTATEUNIVERSITY and we have a significant security research group and software engineering group. There are probably 3 or 4 concurrent projects on using LLMs to find various kinds of bugs or issues. There are probably similar numbers across the country
The AI Apocalypse is a torrent of fraud and spam (Score:5, Insightful)
These reports appear at first glance to be potentially legitimate
That sentence above scares me and matches my experience. Every time I've asked copilot or chatgpt to solve an issue in Java, it looked legit at first glance. It looked like a real solution. In my 25 years working with Java, I've noticed patterns of good coders and bad ones and subliminally, AI can make erroneous code look like it was written by a skilled developer. Many times I've doubted myself, thinking "Hey, I didn't know you could do that!!! That's so cool. I wish I had learned this before?!?!...that's so much better than the way I normally do it"...then I run it and yeah...nope...doesn't work. The AI just autocompleted a bunch of garbage...it looks legit and can take me awhile to realize it's garbage.
...and even if that doesn't happen, programmers like me are going to be spending the next 10 years informing MBAs of the dangers of relying on AI to write your code and how ChatGPT doesn't make you a developer and even if it somehow works the first time, what are you going to do when you need to edit the code? How confident are you in the tool's ability to make changes and patch things? Eventually they'll learn....but it's going to be a painful learning process. I am probably going to spend a good chunk of the next 15 years undoing poorly written AI code pasted in from Claude and ChatGPT by ambitious interns.
It will do that with code, prose, e-mails, etc. Soon you'll get spams that match your wife or boss's writing style....and there's nothing we can do to stop it. We will be getting targeted and scammed left and right and the UAE and Iran and the Kremlin will host and bankroll the entire thing...and even if they fall?...some other nation will take their place and flood us with convincing spam and misinformation. The future is bleak.
Re:The AI Apocalypse is a torrent of fraud and spa (Score:5, Funny)
..and even if that doesn't happen, programmers like me are going to be spending the next 10 years informing MBAs of the dangers of relying on AI to write your code
I'm hoping for the MBAs and PHBs to get replaced by AI first. They're just as stupid, but they don't waste oxygen.
Re: (Score:2)
Mod parent funnier.
Re: (Score:3)
You do realize that AI is not an amorphous thing? There are a whole bunch of models, with varying degrees of performance, just like people. For instance, Claude Sonnet does better code than free ChatGPT.
Yes, you can't trust human code patterns as quality proxy anymore. But FWIW, I can spot generated code quality differences between a low quality model vs. a high quality model.
There are ways you can mitigate these issues. You also ask AI to write tests for the functions, which are no cost to place and easier
Re: (Score:3)
You also ask AI to write tests for the functions, which are no cost to place and easier to read than actual implementation. TDD is much more practical now.
Yes, but...
OK, so other than personal experience, there are two people I know personally who have said that and I trust. They are both phenomenally good experts with decades of experience and one in particular stands above anyone I have ever met when it comes to breaking down deeply complex problems (and naturally into testable units). I don't think eithe
Re: The AI Apocalypse is a torrent of fraud and (Score:2)
The interface is bad so what do they do? Put their effort into showing people ads, making it worse.
Re: (Score:2)
There should be an f?n requirement to preview before post here.
There is one. You had to hit the preview button to even see the submit button.
Re: (Score:2)
...and even if that doesn't happen, programmers like me are going to be spending the next 10 years informing MBAs of the dangers of relying on AI to write your code and how ChatGPT doesn't make you a developer and even if it somehow works the first time, what are you going to do when you need to edit the code? How confident are you in the tool's ability to make changes and patch things? Eventually they'll learn....but it's going to be a painful learning process.
You're assuming they'll learn. After Red Lobster, Sears, Mervyn's, etc., all went bankrupt because of selling off their stores and leasing them back, I just saw a story about private equity firms wanting to do the same thing to Macy's. Industries can only make the same stupid mistakes over and over again so many times before you start to question whether the leaders of industry these days are even teachable.
Re: (Score:2)
You're assuming those failures weren't the desired outcomes. They were planned like that; the people responsible made tons of money at the expense of the employees of those companies, and their customers.
Re: (Score:2)
You're assuming those failures weren't the desired outcomes.
I don't believe they were the desired outcomes: that implies that the leaders cared one way or the other about the company. What I think it was was that the leaders were looking to embiggen quarterly profits in order to pump the stock price and cash in on bonuses and equity compensation and were utterly indifferent to the fate of the company.
The result of course is much the same.
Re: (Score:2)
Or they had the company they are leading sell the real estate to a firm they secretly control, for a below-market price. Loot the shareholder value for themselves, and leave the company's shareholders, suppliers, employees, and customers in the lurch when the hollowed-out corporate shell finally collapses.
Re: (Score:2)
You're assuming those failures weren't the desired outcomes. They were planned like that; the people responsible made tons of money at the expense of the employees of those companies, and their customers.
There's a little thing in the law called "fiduciary duty". Deliberately sabotaging a company for your own personal benefit violates that, and thus would be illegal, so one would hope that this is not the case, though it can't be completely ruled out.
Re: (Score:2)
I suspect a lot of developers are going to lose their jobs for pointing out how shit the AI written code actually is. That tends to be what happens when management and MBAs don't hear what they want to hear from the dev team. I know I've already got a couple folks in management telling me that AI *IS* the future of development and any reference to the absolute disaster it makes of every coding job is met with, "You're just a hater. Climb on board or get run over." The only real hope is that AI is coming for
Re: (Score:2)
and there's nothing we can do to stop it.
Nope, you can do something about it. Ban the practices domestically (with massive global gross percentage fines) and blackhole the malicious actors at the international links. Many other nations will do so as well. On the individual level, blackhole any IP address block that doesn't originate from where you have existing business, and carefully monitor any business coming from countries where they knowingly harbor bad actors. (China, Russia, etc.)
Of course, in the US you're fucked. Sorry, but hey you vot
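The individual-level suggestion above (only accept traffic from address blocks where you already do business) amounts to an allowlist check. A minimal sketch using Python's standard `ipaddress` module; the networks listed are reserved documentation ranges standing in for real ones, and the names `ALLOWED_NETWORKS` and `is_allowed` are hypothetical:

```python
import ipaddress

# Placeholder allowlist built from reserved documentation ranges.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]

def is_allowed(addr, allowed=ALLOWED_NETWORKS):
    """True if addr falls inside any allowlisted network."""
    ip = ipaddress.ip_address(addr)
    # Membership is simply False when the IP version differs from the network's.
    return any(ip in net for net in allowed)

print(is_allowed("192.0.2.10"))    # True
print(is_allowed("198.51.100.7"))  # False
print(is_allowed("2001:db8::1"))   # True
```

In practice this kind of filtering would usually live in a firewall or at the network edge rather than in application code; the sketch only shows the membership test itself.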
Easy enough to fix (Score:2)
Re: (Score:3)
Have you tried getting a real human on the phone when you call a business these days? It's nearly impossible and requires a large investment of your time.
Re: (Score:2)
Every bug report should have a contact address, if it doesn't you should definitely send it to the bit bucket.
Re: (Score:2)
why not give the submitter a captcha to prove they are human?
Re: (Score:2)
why not give the submitter a captcha to prove they are human?
Because CAPTCHAs are probably already broken. Hell, go get a lot of them and it's just more training data. I'm sure this has already been done, which makes stuff like Clownflare even more annoying on principle.
Re: (Score:2)
Because many times these reports are filed by people copy-pasting the AI outputs. So if they get a captcha, they will just solve it and keep copy-pasting.
I say we start to put anti-AI prompts in our comments.
As opposed to MI (Score:2, Interesting)
Re: (Score:2)
A better written bug report that does not actually describe a bug is _worse_.
Not "AI Slop" (Score:1)
Re: (Score:2)
Maybe not. There are AI-assisted code analysers out there, as well as AI-assisted testing tools. These reports may be generated by such tools, with the only human oversight being a click on "submit". The LLM is usually only used to write out the English description, so there may be very little human interaction in generating that slop.
AI maintainers (Score:2)
That's allowed need
AI to the rescue (Score:2)
It can be a pretty intimidating tactic. Pretty sure it works well sometimes.
It ought to be illegal (Score:2)
Using AI to write code for you is so dangerous it ought to be illegal. Employers should consider it a firing offense. Open source projects should make it against the rules. It should be an effective way to ruin your reputation as a software developer - something similar to plagiariasm.
And if you cannot analyze a code defect discovered using an AI to the degree that you can explain it better you have no business reporting it as a bug to anyone. People that use AI to write things that they do not understan
Re: (Score:2)
That should be "the health, safety, and welfare of others" and "plagiarism" of course.
Re: (Score:2)
Using AI to write code for you is so dangerous it ought to be illegal.
On what basis? We live in a world of low quality code everywhere we look. Industrial safety systems have bugs in them, every OS is loaded with them (yes, including your favourite one that you insist isn't). Yet do you feel like you're in danger right now?
Writing code isn't dangerous, ever. Not having QA/QC / review and testing practices in place for that code is. I'm not concerned about any AI writing if (1) kill_all_humans(); I'm worried about *YOU* not doing your job and catching that it did so.
And if you cannot analyze a code defect discovered using an AI to the degree that you can explain it better you have no business reporting it as a bug to anyone.
This isn't wh
Re: (Score:2)
We live in a world of low quality code everywhere we look.
Bizarre claim, any evidence to back that up? Insulting half your audience isn't a great way to make a point.
Re: (Score:2)
I used to write video games and have written many things since. If you cannot write code that could be burned to ROM and still function as specified a century from now you have no business calling yourself a software engineer. Junior developers should go back to school or study in their off hours until they can write code of that quality. Delusional AI is just going to make things worse.
Re: (Score:2)
I completely agree.
Who would benefit most? ... (Score:1)
... arguably Chinese and Russian hackers.
and next they will be drowning in (Score:1)
AI generated Pull Requests with supposed fixes to the bugs. Actually dependencies are the culprits a lot of the time.