Amazon and AWS Developers May Not Want To Invite Their CEOs To Java Code Reviews
theodp writes: Typos happen to the best of us, but spelling still counts when it comes to software development. So, it's kind of surprising to see that both Amazon CEO Andy Jassy and former AWS CEO Adam Selipsky failed to notice an embarrassing typo in a demo video they offered to their millions of followers on social media as evidence of Amazon Q AI's Java upgrade capabilities, which Amazon has been trumpeting for months in SEC filings, shareholder communication, and Amazon's latest earnings call with Wall Street analysts.
Just 37 seconds into the demo of the software that Amazon says saved it 4,500 developer-years of work and provided an additional $260M in annualized efficiency gains, Amazon Q kicks off the Java upgrade conversation by saying, "I can help you upgrade your Jave [sic] 8 and 11 codebases to Java 17." The embarrassing misspelling did prompt Twitter user @archo5dev to alert Jassy to the typo, but there's been no response yet from Jassy, who boasted that Amazon developers were unable to find any mistakes in Q's work in "79% of the auto-generated code reviews."
It's probably worth noting that both Jassy and Selipsky opted to showcase a drop-dead simple demo of Amazon Q Code Transformation rather than some of the lengthier and less-magical demos of the product.
What's the big deal? (Score:5, Funny)
Developers misspell words all the time. In fact, trying to read what they write is like trying to read another language.
Re: (Score:2)
The big deal is CEOs being too dumb to have some experts review important slides.
Re: (Score:2)
Probably a Harfurd graduate.
Re: (Score:2)
The guy in Boston with 12 items in the "10 items or less" grocery lane. The clerk says, "Either you're from MIT and can't read, or Harvard and can't count, but you're in the wrong line."
Re: (Score:2)
If you're from MIT, you use the self-checkout to avoid talking to a human.
Re: (Score:2)
It's an old joke, before self checkout. Which I hate using.
Re: (Score:2)
I do not see the point of it. All it does is give them my shopping history at a laughable discount. And it is slower in addition.
Re: (Score:2)
Or slides made on a plane on the way over. I actually was on a plane and after it got up to cruising altitude, the two guys in front of me pulled out their laptops and one said "Alright, let's get started on that presentation..."
Re: (Score:2)
I have done that as well. But it was on a transatlantic flight and I already had a sketch.
Re: (Score:2)
Would I bug a developer about it? No, probably not, but this isn't a human developer, it's a system that needs to be better, more efficient, and correct. If we're going to trust AI, it can't make simple mistakes like this. Yet another AI system that has to go back to the drawing board / training.
Re: (Score:2)
Just how does the AI get called into the manager's office for the semi-annual performance review? What about the bi-weekly scrum retrospective? "Eliza, it looks like you're making a lot of simple errors again."
Re: What's the big deal? (Score:2)
The worst part is that this is precisely the sort of "simple, but repetitive and mind-numbingly boring" review task that is error-prone for humans, and that we have decades of experience handling with traditional software.
This is less informative about how good the LLM is at doing LLM-appropriate things. But if you can't be bothered to do anything but wrap the model as a black box, I cannot be bothered to use and test that output.
Re: (Score:2)
And the AI's training data includes typos! And bugs, and design flaws, etc. The AI doesn't know how to program, it does not have logic, it's just doing a very good job of pattern matching based upon the TRAINING DATA. Garbage in, garbage out. Where does the training data come from? Well, they're stealing other people's code - you stick it in the cloud and I suspect some of that gets looked at. There are a lot of public code repositories (GitHub, GitLab, etc.) that they get code from. But is that code the b
Re: (Score:2)
The big deal is that this illustrates that AI is not intelligent, but too many people have been fooled into thinking it is. The software simply did a statistical analysis of its data set, encountered "Jave 8" and "Jave 11" enough times to pass its heuristics tests, and included them for no other reason than they satisfied an equation or an inequality. Relying on these systems to be accurate is extremely foolish, but all too many people in high places haven't realized this yet.
This is the state of AI, and it'
Re: (Score:2)
The big deal is that this illustrates that AI is not intelligent, but too many people have been fooled into thinking it is. The software simply did a statistical analysis of its data set, encountered "Jave 8" and "Jave 11" enough times to pass its heuristics tests, and included them for no other reason than they satisfied an equation or an inequality. Relying on these systems to be accurate is extremely foolish, but all too many people in high places haven't realized this yet.
You must not work with humans much. Humans are not intelligent. Their brains just encountered more "statistical analysis of its data set"s. Give it 20 years to cook and it will appear more intelligent than "human machine" learning models, and be infinitely cloneable.
Frist (Score:2)
Strange and not strange. (Score:2)
Re: (Score:2)
Microsoft has had live demos crash, and Apple have faked "live demos" of products they'd not built yet.
The public get a good giggle, because the public don't actually think far enough ahead to realise that these sorts of failures are because the products being sold are defective.
Comment removed (Score:4, Insightful)
Re: (Score:2)
It seems bizarre and inexplicable (unless there really are just that many true believers in the approval path) that someone's carefully stage-managed hype-demo would go out with an obvious spelling mistake
Honestly, this typo makes it more believable. It really looks like an actual example of code, not a stage-managed fake demo.
Re: (Score:2)
Well if it's in the training data... LLM is just pattern matching for the next word, phrase, concept to go next in the sequence. You know somewhere out there is some comment in it with "jave".
I see this with humans too, when they're too reliant on IDE tools or such. The same variable misspelled one hundred times, because they keep accepting the wrong autocomplete. J, A, V, I know one guy that, when he has a very short unix prompt and he has misspelled a word in it, even a short 4 letter word, he w
Scary (Score:3)
Anyone who thinks that such tools are anything more than gimmicks at this stage, when we know that they tend to leave massive security holes, is waving a red flag at the black hat community. And I don't need to tell anyone here that Amazon is vulnerable in three distinct areas - their online shopping, their automated warehouses, and their cloud.
I'm not saying it's inevitable, by any means, but if Russia or North Korea successfully damage the credibility of any one of those three, the impact across the economy won't be insignificant.
They're highly vulnerable targets and this is an exceptionally dangerous time (what with a war in and around Russia and an election in the US).
Now is when they should be doing a Manhattan Project Meets OpenBSD Strategy and nailing every last byte firmly to the ground.
But, no, they want to show off Nice Shiny Toys that can't actually work.
Slow day. (Score:5, Insightful)
Re: (Score:2)
It's news when it's about a tool that Amazon bragged was so good that developers shipped "79% of the auto-generated code reviews without any additional changes." Maybe the reviewers didn't notice when things were misspelled?
Not wrong (Score:2)
It's a perfectly plausible plural of Java.
Re: (Score:2)
"It's a perfectly plausible plural of Java."
In what language?
Provide other examples of a word ending in -a that is pluralized by replacing the a with an e, please.
Re: Not wrong (Score:2)
But java plural would perhaps be javae?
Person -> People ? Hmmm...
Re: (Score:2)
Re: (Score:2)
*Adding* an e can kinda work; first declension Latin nouns work that way. *Replacing* the a with an e, not so much.
"Elvae"? I've always heard "Elvii". Neither one really works; it really should be "Elves", with the second e long, not silent.
Prepare for a tsunami... (Score:2)
...of awful, half-baked AI crap as CEOs rush to jump on the hype wagon
I'm optimistic that AI will eventually be useful, but the first wave of implementations will suck mightily
Oh, stewardess! (Score:4, Funny)
You Don't Understand (Score:1)
If asked, the man will say "so?" and then go on to destroy the lives of thousands more.
Of course it's unjust. That's the whole point.
Code typos (Score:3)
Jave Talkin' (Score:2)
With apologies to the Bee Gees [genius.com]: Jave talkin', you're telling me lies, yeah / Good lovin' still gets in my eyes / Nobody believe what you say / It's just your Jave talkin' that gets in the way
What is the upgrade? (Score:3)
It remains unclear: what exactly is the upgrade? You can run Java 8 code just fine under Java 17, so "do nothing" would work. Are they adding generics? Lambdas? Streams? Or none of that?
If it's a trivial search-and-replace of deprecated functions, then no one should be impressed. If it's a major code rewrite, well, no, that's not believable...
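To make the question concrete, here is a minimal sketch of the two ends of that spectrum. It is not taken from the Amazon Q demo and says nothing about what Q actually changes; the class, method, and record names are all made up for illustration. The first method is plain Java 8 style that also compiles unchanged on Java 17; the second does the same thing with newer language features (a switch expression, a record, `var`).

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical example class; none of these names come from the demo.
public class UpgradeSketch {

    // Java 8 style switch statement. This still compiles and runs on Java 17.
    static String labelJava8(int version) {
        String label;
        switch (version) {
            case 8:
            case 11:
                label = "legacy";
                break;
            case 17:
                label = "current";
                break;
            default:
                label = "unknown";
        }
        return label;
    }

    // Same logic as a switch expression (standard since Java 14).
    static String labelJava17(int version) {
        return switch (version) {
            case 8, 11 -> "legacy";
            case 17 -> "current";
            default -> "unknown";
        };
    }

    // A record (standard since Java 16) in place of a hand-written value class.
    record Release(int version, String label) {}

    // Streams and lambdas already existed in Java 8; only the record is new here.
    static List<Release> describe(List<Integer> versions) {
        return versions.stream()
                .map(v -> new Release(v, labelJava17(v)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // `var` for local variables arrived in Java 10.
        for (var r : describe(List.of(8, 11, 17))) {
            System.out.println(r.version() + " -> " + r.label());
        }
    }
}
```

Both methods return the same labels, which is the point: a tool could "upgrade" by leaving the first form alone or by rewriting it into the second, and the demo doesn't make clear which it does.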
Re: (Score:2)
Maybe this is why so many of the code reviews required no changes! They just had a commit message that said "Upgraded from Java 8 to 17." Excellent, another flawless code review complete!
Re: (Score:2)
We have used it to upgrade some old code with a bunch of dependencies as a trial and it did a good job, but mostly in the sense of doing something dull and uninventive quickly and well. Doing in an hour or so what would have probably taken a few days of boring iterative work for a developer otherwise.
Am I missing some context? (Score:2)
The AI said "Jave" instead of "Java" so the whole thing should come tumbling down? Fuck off. In the pantheon of demo errors, this doesn't even deserve a mention.
This article isn't just unnecessary... it's stupid. Just like any potential user who walks away from the tech simply because of this one error.
Oh thats cute (Score:2)
At first I thought the reviews were 'auto' (Score:2)
The title felt like it was saying something else at first (that someone invited them accidentally). Then I thought the AI was doing code reviews on its own code.
I just wish we could make a tool to summarize better for us. Meaning read, understand, then rewrite (possibly customized for the reader/consumer of said content). Of course there goes a lot of wasted, paid for work. I'm thinking business and government jobs, not just 'journalists' that have become bloggers in reality. And don't get me started o
79% success rate (Score:2)
"Amazon developers were unable to find any mistakes in Q's work in '79% of the auto-generated code reviews.'"
Uhhhh, so 21% of the code reviews had mistakes? What kind of mistakes? Is that good?
I always enjoy some good old fashioned "this is a good number" statistics. For all we know those code reviews, if trusted, will lead to worse results than before.