
Stack Overflow Data Reveals the Hidden Productivity Tax of 'Almost Right' AI Code (venturebeat.com) 55
Developers are growing increasingly frustrated with AI coding tools that produce deceptively flawed solutions, according to Stack Overflow's latest survey of over 49,000 programmers worldwide. The 2025 survey exposes a widening gap between AI adoption and satisfaction: while 84% of developers now use or plan to use AI tools, their trust has cratered.
Only 33% trust AI accuracy today, down from 43% last year. The core problem isn't broken code that developers can easily spot and discard. Instead, two-thirds report wrestling with AI solutions that appear correct but contain subtle errors requiring significant debugging time. Nearly half say fixing AI-generated code takes longer than expected, undermining the productivity gains these tools promise to deliver.
AI does scut work well. (Score:5, Insightful)
But it does everything else horribly.
Think of it as a partly trained intern. Tell it to do something it has done before or something really simple and it does a good job. Then you start to trust it and think it is smart, so you give it more and more.
When it fails, it does not come forward and ask for help. Instead it panics, lies, makes up crap, and covers up its failures.
Re: AI does scut work well. (Score:3)
I literally today got code for a parser from Copilot that did "if line begins with '%' ... else if line begins with '%%' ...".
Ship It! (Score:4, Funny)
I literally today got code for a parser from Copilot that did "if line begins with '%' ... else if line begins with '%%' ...".
It compiles. Ship it!
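And that is exactly the "almost right" problem: it compiles and runs, the bug is purely in branch order. Any line that starts with "%%" also starts with "%", so the second branch is unreachable. A minimal sketch in Python (the branch labels are made up for illustration, not the actual Copilot output):

    def classify_buggy(line: str) -> str:
        # Bug: every line starting with "%%" also starts with "%",
        # so the elif branch can never be reached.
        if line.startswith("%"):
            return "single-percent"
        elif line.startswith("%%"):
            return "double-percent"
        return "other"

    def classify_fixed(line: str) -> str:
        # Fix: test the longer, more specific prefix first.
        if line.startswith("%%"):
            return "double-percent"
        elif line.startswith("%"):
            return "single-percent"
        return "other"

    assert classify_buggy("%% header") == "single-percent"   # compiles, runs, wrong
    assert classify_fixed("%% header") == "double-percent"   # intended behaviour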
Re: (Score:3, Insightful)
It compiles. Ship it!
Hasn't that been the way Microsoft has worked since 1975?
Re: (Score:3)
Monopolies are profitable. Invest in monopolies.
It does though (Score:1, Interesting)
Well, if you use something like Cursor or its competitors, you can perfectly well include comprehensive test writing, compilation, linting and loops until the code passes your every requirement.
It can still get stuck in some deeply worrying loops, go tumbling down a rabbit hole, or make a hundred additional classes and functions to solve a simple problem, but if your prompts and rules are good enough, you'll get your grunt work done exceedingly quickly.
TDD is king with AI tooling. Ensure you read and understand ...
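One way that test-first loop looks in practice: the human writes the acceptance tests up front and the agent iterates until they pass. A minimal sketch in Python; the slugify module here is a hypothetical target the AI is being asked to implement, not anything from the parent post.

    # test_slugify.py -- written by the human before any AI-generated code exists.
    from slugify import slugify  # hypothetical module the agent must produce

    def test_basic():
        assert slugify("Hello, World!") == "hello-world"

    def test_collapses_whitespace_and_dashes():
        assert slugify("  a  b--c ") == "a-b-c"

    def test_empty_input():
        assert slugify("") == ""

The agent is then told to run the test suite and keep editing until all three pass; the human only reviews code that already satisfies the spec.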
Re: (Score:2)
Also: very important to leverage instructions. Ensure that every chat/agent session starts with the AI reading a file containing all your instructions on how it should behave. It makes the world of difference. Write Once, Read Many.
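Roughly what that looks like if you wire it up yourself rather than relying on a tool's built-in rules file. The file name, model name, and use of the OpenAI Python client are illustrative assumptions, not a recommendation of any particular stack.

    from pathlib import Path
    from openai import OpenAI

    RULES = Path("AGENT_INSTRUCTIONS.md").read_text()  # one canonical rules file
    client = OpenAI()

    def ask(prompt: str) -> str:
        # Every session starts by feeding the same instructions back in.
        resp = client.chat.completions.create(
            model="gpt-4o",  # placeholder model name
            messages=[
                {"role": "system", "content": RULES},
                {"role": "user", "content": prompt},
            ],
        )
        return resp.choices[0].message.content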
Re: (Score:2)
I ask it to show all the code after each query, and I paste the code and try to run it in between, of course. It will give many hel
Re: (Score:1)
It entirely depends on what language you are working in, which is expected given that its "intelligence" comes from scraping the internet. It seems to be very good at Python.
Re: AI does scut work well. (Score:4, Interesting)
It delivered a working function. It gets all files in an S3 bucket, but only if the bucket contains fewer than 1000 objects. Fortunately I already knew how this API works, and this was my first test of the AI to see what it would deliver. It failed.
The S3 list-objects call is paginated. The function the "AI" wrote only delivers the first 1000 objects in the bucket, so if the bucket has more than 1000 objects this is a pretty bad bug. And it would totally ship unless the human in charge knows the specifics of the API. A junior using the "AI" isn't going to realize this. They will test it with a bucket that has a few objects, and it will be seen to be working. They'll submit the PR, and someone may approve it, because they probably aren't going to be reading the API docs when they review the PR and so won't know the API is paged. Then, when a production bucket finally has more than 1000 items, a customer will probably find the bug.
Sure, if I ask the "AI" to write a function that uses paging and is recursive, the "AI" will deliver that result, because someone has already written it and the "AI" has vacuumed up that code to spit it out again. But you have to be so specific about what you want that you could have just written it yourself, with certainty that you are getting the result you need without creating a subtle time-bomb of a bug.
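For anyone who hasn't hit this: roughly the shape of the bug described above, sketched in Python with boto3 (bucket name is a placeholder; this is an illustration, not the commenter's actual code).

    import boto3

    s3 = boto3.client("s3")

    def list_keys_naive(bucket: str) -> list[str]:
        # The failure mode: a single ListObjectsV2 call returns at most
        # 1000 keys, so everything past the first page is silently dropped.
        resp = s3.list_objects_v2(Bucket=bucket)
        return [obj["Key"] for obj in resp.get("Contents", [])]

    def list_keys_paged(bucket: str) -> list[str]:
        # The paginated version: follow the continuation token until every
        # page has been consumed.
        keys: list[str] = []
        for page in s3.get_paginator("list_objects_v2").paginate(Bucket=bucket):
            keys.extend(obj["Key"] for obj in page.get("Contents", []))
        return keys

Both versions pass a test against a bucket with a handful of objects, which is exactly why the bug survives review.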
How is "AI" really helping anyone with outputs like this? It's hurting more than helping in many situations.
Just like everything else in the world, now programming is in a race to the bottom.
Re: (Score:2)
Compare that with having to ask some young human who is trying to learn and grow; who wants to deal with those creatures? It's gross, I suppose.
So anyway, as an expert, totally capable of writing the code yourself, you can be
Re:AI does scut work well. (Score:5, Insightful)
A partly trained intern would learn and get better. An AI model keeps making the same mistakes, again and again and again.
Re: (Score:2)
The intern also likely burns a lot less resources.
Re: (Score:2)
The 100W intern includes the inefficient energy conversion from renewable fuel. The GPU uses 200W after the conversion, so factor in power plant inefficiencies and transmission losses. The GPU may or may not be using renewable fuel.
Perhaps what we really need is to find a way to get human cells to produce chloroplasts so Intern 2.0 can be direct solar powered.
In addition, on the other end of the operation, GPUs add to the growing e-waste problem while interns are biodegradable.
Re: (Score:2)
The "human body" can get the "100W" (citation?) of energy from any number of clean sources, or our modern unhealthy food. It can also self-replicate without trillions of dollars of infrastructure to do so, and has no trouble replicating in small single-unit batches.
Re: (Score:2)
Yup, a first-semester intern that needs absolutely explicit instructions. The only immediate benefit is that they type really, really fast.
Re: (Score:2)
I would not say that AI does scut work well. I have a case that is trivial to do (you could even hire a first grader to do it), but AI does it with 90% accuracy, when 100% accuracy is needed.
Instead, what AI does really well is work where accuracy does not matter. AI is a good solution when 90% accuracy is good enough for you, but if you don't want any mistakes in your data, you should not use AI to produce it. A good example of such work is writing proof-of-concept code, something you use to test your i
Such a surprise (Score:5, Insightful)
Nobody saw that one coming...
Re:Such a surprise (Score:4, Informative)
Programmers have been saying it for years: it takes far more time to review and debug code than it does to write it in the first place.
Why this would be a surprise to anyone, I can't even imagine.
Re: (Score:2)
It's a variation on the old truism about project management:
80% of the project takes 80% of the time. The remaining 20% also takes 80% of the time.
Re: (Score:2)
Well, yeah. I've seen AI actually write decent code... without error conditions or bounds checking, but following the happy path, it really does work. It still needs to learn how people with bad intentions get in, but, you know, the happy-path code is serviceable. It can save me a few hours, but I still need to program in the sad paths. I don't know if that is more time or less time, really, but it kind of negates itself: I still need to code review and fix shit.
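A trivial illustration of that gap, with a made-up config loader: the first version is what the demo shows, the second is the part that still has to be written by hand.

    import json
    from pathlib import Path

    def load_config_happy(path: str) -> dict:
        # Works whenever the file exists, is readable, and is valid JSON.
        return json.loads(Path(path).read_text())

    def load_config_sad(path: str) -> dict:
        # The "sad paths": missing file, bad permissions, malformed JSON,
        # and a top-level value that isn't an object at all.
        try:
            raw = Path(path).read_text()
        except OSError as exc:
            raise RuntimeError(f"cannot read config {path!r}: {exc}") from exc
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as exc:
            raise RuntimeError(f"config {path!r} is not valid JSON: {exc}") from exc
        if not isinstance(data, dict):
            raise RuntimeError(f"config {path!r} must be a JSON object")
        return data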
Re: (Score:2)
Look at the CrowdStrike disaster to see what "following the happy path" does. Incidentally, the happy path is 10% of the work, and it is the easiest part.
Re: (Score:2)
Nope. Pair programming works for simple things, but completely fails for complex ones, because the second person gets pulled in and loses mental independence.
Re: (Score:2)
I did not. But you missed what I was actually saying.
Re:Such a surprise (Score:4, Insightful)
It plays into the leisure class's lifelong dream of being able to jettison the unwashed masses and keep the money for themselves without having to resort to learning how to do things or (God forbid) doing things themselves.
Most of us "unwashed masses" understand that things that sound too good to be true probably are, but that's because we haven't grown up in a world where we give orders, shuffle a couple of decimal points, and then sign our names to take credit for the hard work of thousands of people.
Re: (Score:3)
We'll never find out. Tech magazines don't interview people living under a bridge fighting raccoons for food scraps behind the McDonalds...
Re: (Score:2)
I guess we can put off those unemployment applications a little while longer.
Personal anecdote... (Score:2)
I told ChatGPT I wanted it to implement a particular open source Java interface using a specific major version of a dependency. It mixed imports from the previous and the current major versions. Obviously... that's a problem when the major release is a major rewrite of the public API.
I asked it specifically to "restrict to version X.Y.Z"; it confirmed it was going to do that, then went right back to generating mixed-major-version code.
Wasn't a problem for me. It took 5 minutes to debug with IntelliJ's decompiler.
"undermining the productivity gains" (Score:2)
You mean I can't get something for nothing? Gee, what good is it then?
Claude Code is a Slot Machine (Score:5, Interesting)
"Claude Code is a Slot Machine" [rgoldfinger.com]: "I'm guessing that part of why AI coding tools are so popular is the slot machine effect. Intermittent rewards, lots of waiting that fractures your attention, and inherent laziness keeping you trying with yet another prompt in hopes that you don't have to actually turn on your brain after so many hours of being told not to. The exhilarating power of creation. Just insert a few more cents, and you'll get another shot at making your dreams a reality."
This has been my experience exactly (Score:5, Interesting)
I am not a developer but I do write Perl and PowerShell scripts when the need arises.
I usually learn as I go and enjoy the process of figuring out how best to turn my problem into a well-functioning automation.
Sometimes, though, I would just like to get the LLM to output something really easy, like getting a list of all users in my company and running a simple process on them. It's something I could figure out and write in maybe an hour of web searching and document reading.
What I get, though, is broken code with invalid cmdlet parameters, or unoptimized code that relies on client-side filtering instead of the built-in server-side filtering.
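The filtering point is the same in any stack: push the filter to the server (e.g. Get-ADUser -Filter) instead of pulling every account down and piping it through Where-Object. Sketched here in Python with the ldap3 library, with placeholder directory details, since the principle does not depend on the shell:

    from ldap3 import Server, Connection, SUBTREE

    conn = Connection(Server("ldap.example.com"),
                      user="cn=svc,dc=example,dc=com", password="...",
                      auto_bind=True)

    # Client-side filtering: fetch every person, then throw most of them away.
    conn.search("dc=example,dc=com", "(objectClass=person)",
                search_scope=SUBTREE, attributes=["cn", "department"])
    sales_slow = [e for e in conn.entries if str(e.department) == "Sales"]

    # Server-side filtering: the directory only returns matching entries.
    conn.search("dc=example,dc=com",
                "(&(objectClass=person)(department=Sales))",
                search_scope=SUBTREE, attributes=["cn", "department"])
    sales_fast = list(conn.entries)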
I am spending more time figuring out what is broken and doing the research to do things properly anyway. So, in the end, I haven't saved any time.
Maybe I am just not good at writing the prompts in the first place.
Re: (Score:3)
For such scripts it is less about prompt skill (heck, the whole "prompting skills" thing is overrated) and more about giving the spec correctly and completely, and using a model that knows the programming language you want well. Your script sounds like something ChatGPT could do a year ago without too much work on the prompts, at least in Python and Bash; no idea about PowerShell.
What's the issue? (Score:5, Insightful)
This is how humans operate. They can produce code, but there are subtle flaws which are revealed only through debugging.
AI is being trained on stuff produced by humans. Why would you expect it to be any different?
Re: What's the issue? (Score:3)
I have wondered if we should be building a "thoroughly debugged code" repository, where lots of programmers vet the code, and then see what happens with an LLM trained only on that.
Re: (Score:1)
Because billion/trillion-dollar companies are telling us it is different. The news cycles are constantly banging on about how AI is going to revolutionize everything, and put half of society, or more, out of a job. We've been hearing it constantly for over a year now. You're telling me it's my fault for believing any of it?
Re: (Score:2)
AI code is worse than human dumbness. I'll tell Copilot to refactor a function in a specific way, and it does it, but then embeds the new function in the old one, rather than replacing it. That's not stupidity that a human would produce.
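A contrived Python illustration of that failure mode (names made up): the "refactored" helper ends up nested inside the old function and is never called, so the old behaviour still runs.

    # Before: the function the AI was asked to refactor (say, to also strip whitespace).
    def normalize_before(name: str) -> str:
        return name.lower()

    # After: the refactored logic exists, but only as an unused nested function;
    # the old body is still what executes.
    def normalize_after(name: str) -> str:
        def normalize_refactored(n: str) -> str:
            return n.strip().lower()
        return name.lower()  # old body left in place, nested helper never called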
But "the issue" is that a lot of people bought the hype that AI was the new master of the universe, and would take over all our jobs in short order. I think we still have some breathing room.
As expected (Score:4, Interesting)
There is a big difference between creating a cool, simple demo for YouTube and writing solid, bulletproof code.
I use AI tools to guide me through complex and confusing documentation, but I always check the docs to make sure.
I use AI tools to create sample code that I study, and then I write my own version once I understand the sample.
The fiction that non-programmers can effortlessly "vibe code" complex systems is dangerous.
Same as it ever was (Score:1)
Vetting and maintenance have always been the bottlenecks of software development, not code creation. RAD (rapid application development) pushers keep selling clueless bosses on the creation part, and they have been around for more than five decades.
(RAD can be done right, and reasonably flexible, but one has to accept certain conventions. They may be good conventions, but people are spoiled and want it their way.)
Consider the source (Score:2)
My experience with LLMs is spectacular, but I work within their capabilities and don't expect them to do my job for me.
Almost Right AI code better than ... (Score:2)
Ya, but I'm guessing it's better than Really Right AI code from MechaHitler [rollingstone.com]. :-)
Progress... Two steps forward, one goosestep back. :-)
Have computers ever really made anything easier? (Score:2)
But sure, AI will save the world, just like computers turned a 100-person company into a 10,000-person company... to do the same tasks, but those initial 100 people now have it easy.
Selection Bias (Score:1)
Developers WHO USE STACKOVERFLOW are growing increasingly frustrated with AI coding tools
Maybe the ones who are proficient using AI tools don't visit StackOverflow anymore.
Garbage In / Garbage Out (Score:2)
Meanwhile, back in March- (Score:3)
DOGE To Rewrite SSA Codebase In 'Months' (wired.com)
https://developers.slashdot.or... [slashdot.org]