Morgan Stanley Says Its AI Tool Processed 9 Million Lines of Legacy Code This Year And Saved 280,000 Developer Hours (msn.com)

Morgan Stanley has deployed an in-house AI tool called DevGen.AI that has reviewed nine million lines of legacy code this year, saving the investment bank's developers an estimated 280,000 hours by translating outdated programming languages into plain English specifications that can be rewritten in modern code.

The tool, built on OpenAI's GPT models and launched in January, addresses what Mike Pizzi, the company's global head of technology and operations, calls one of enterprise software's biggest pain points -- modernizing decades-old code that weakens security and slows new technology adoption. While commercial AI coding tools excel at writing new code, they lack expertise in older or company-specific programming languages like Cobol, prompting Morgan Stanley to train its own system on its proprietary codebase.

The tool's primary strength, the bank said, lies in creating English specifications that map what legacy code does, enabling any of the company's 15,000 developers worldwide to rewrite it in modern programming languages rather than relying on a dwindling pool of specialists familiar with antiquated coding systems.

Comments Filter:
  • by DjangoShagnasty ( 453677 ) on Wednesday June 04, 2025 @09:06AM (#65426857)
    There is going to be chaos.
    • Do you think someone could post lots of garbage code on the internet, the AI bot would Hoover it up to train its model, and then the AI would emit garbage?

      • I think they already did and the AI already does.
      • by Hadlock ( 143607 )

        Evaluation is a lot less computationally complex than generation; I would expect anything that gets scraped by a larger AI company goes through at least a two pass system 1) is this even relevant/compiling code? 2) does this meet a minimum standard? and probably for largest companies building a library of quality content 3) is this high enough quality to include in the library for future training? Steps 1/2 can probably be done with something like a 30b or 70b model fairly quickly and step 3 is probably eva
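A minimal sketch of that two-pass idea in Python. The parse check is real; the quality score is a placeholder heuristic standing in for the "30b or 70b model" call the post speculates about, so treat it as illustrative only:

```python
import ast

def passes_filter(snippet: str) -> bool:
    """Pass 1: does the snippet even parse as code?"""
    try:
        ast.parse(snippet)
    except SyntaxError:
        return False
    return True

def quality_score(snippet: str) -> float:
    """Pass 2: hypothetical stand-in for a small-model quality check.
    A real pipeline would query a model here; this heuristic just
    rewards the fraction of commented lines as a placeholder."""
    lines = snippet.splitlines()
    if not lines:
        return 0.0
    commented = sum(1 for ln in lines if ln.strip().startswith("#"))
    return commented / len(lines)

def keep_for_training(snippet: str, threshold: float = 0.1) -> bool:
    """Combine both passes: must parse AND clear the quality bar."""
    return passes_filter(snippet) and quality_score(snippet) >= threshold
```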

  • by ardmhacha ( 192482 ) on Wednesday June 04, 2025 @09:06AM (#65426859)

    "plain English specifications that can be rewritten in modern code."

    So just the easy bit left to do :)

    • by ElimGarak000 ( 9327375 ) on Wednesday June 04, 2025 @09:19AM (#65426901)
      At least they didn't trust the AI to write the code and screw it up. This actually seems like a relatively careful, appropriate use of AI, of which there seems to be fairly little.
    • by rsilvergun ( 571051 ) on Wednesday June 04, 2025 @09:52AM (#65426971)
      Seeing this much grunt work eliminated is a huge deal.

      Remember they don't have to replace all of us, if they replace 20% of us it will create massive amounts of unemployment and cause wages to plummet because of labor oversupply.

      We need to pull our heads out of the sand right now. We need to get over the shit we learned when we were 12. This is happening and it's going to affect us badly.

      And there is no economic solution to this. We can't all just go start new businesses because nobody's going to give us any capital and even if they did bigger companies would just run us out of business.

      And we can't all be plumbers and welders and HVAC techs. The world just doesn't need that many of them, and beyond that we need white collar guys hiring those blue collar guys to fix stuff.

      You can't take 20%, maybe even 30%, of white collar employees out of the equation and still have a functioning economy in 2025. Keep in mind we spent the last 50 years automating every factory we could, so there are a hell of a lot fewer blue collar jobs than there used to be.
      • Yes, a lot of programming jobs are probably on the line. But that doesn't mean that 20% of ALL white collar jobs are. Think doctors, accountants, lawyers, teachers, engineers (as in designers) etc, etc.

        A friend in accountancy reckons that the quality of accountants has fallen in recent years - probably because the possible candidates have become IT staff. If AI does what it promises, a lot of intelligent people will be released to start serving their communities in new ways. Ok, that's the optimistic spin, an

        • by jp10558 ( 748604 )

          Not to sound really stupid here, but what do accountants do that traditional programs haven't already automated away, and that LLMs capable of replacing coders can't?

          I have limited understanding of being an accountant, but it seems like there's the already replaced manual version of double entry accounting, and then there's reading and applying rules to digital data. Then there's forensic accounting, looking for patterns in data and the LLMs claim to be real good at that too.

          • Most of what can be automated in accountancy has been, though you might be right about some parts of forensic. However the auditing role, creating management accounts and projections, finding optimal tax strategies, and, at its most basic, checking expenses claims, are not easily automated. Auditing is especially hard to automate; it's about checking physical stock are correct, claims for income, invoices etc are conforming to the rules, and just asking management hard questions about obscure corners.

      • You can't take 20%, maybe even 30%, of white collar employees out of the equation and still have a functioning economy in 2025. Keep in mind we spent the last 50 years automating every factory we could, so there are a hell of a lot fewer blue collar jobs than there used to be.

        I think that for the parasite class that's a feature, not a bug.

        Everything the current administration is doing seems directed at both reducing the population and more tightly controlling average citizens:
        - Cutbacks to medicare and medicaid
        - Cutbacks to Social Security, while simultaneously making it much harder to actually collect
        - Cutbacks at NOAA such that after-hours severe-weather warnings don't go out
        - Putting Kennedy in charge of health
        All of the above, and about a zillion other government action

    • "plain English specifications that can be rewritten in modern code."

      So just the easy bit left to do :)

      That's an important bit. When I have to work on a file of old spaghetti code, I open Github Copilot and type "Explain this code", and it gives me a well written summary of everything going on in the file. That is very useful. Anybody not using AI coding tools is really missing out. I feel like Copilot has doubled my productivity. I can paste a screenshot from a requirements document and enter, "Make an Angular component that looks like this", and it does a remarkable job. It's not perfect, but what it

    • round down all Fractions of an penny and place them in my account.

    • Existing code: i = 3. Translation: Assign the value 3 to the variable i. Programmer: Wow!
    • Can somebody please post the tree swing comic with AI translating each individual frame to a plain English specification? Because it looks like the code you have is also what it should have been doing.

    • Wasn't COBOL written to make it such that a manager could read the code as plain English and understand what it does?

      In any case, the issue isn't understanding what the old COBOL code did. It is that when the senior executive's job is on the line, can you answer the question: Does the new code work the same way as the old code? How do you know?

      You're putting a lot of faith in an LLM and its ability to write a plain English explanation. Not to mention that (a) the code fragment being analyzed may not actua

    • Yeah exactly! Come back when it can write quality unit and integration tests that can be used to prove that the new code works the same way as the old.
    • This is step one : analyze the existing architecture. If you can automate this you are MILES ahead. It is a very fatiguing process. A real grind.

      Step two is to interview users and determine their pain points (manual work-arounds for system limitations, or seemingly redundant or repetitive steps that can be merged).

      Step three is to design a replacement system.

      Then you can begin building and testing the new system.

  • by Shemmie ( 909181 ) on Wednesday June 04, 2025 @09:08AM (#65426863)

    "No fucking clue what this code block does, but without it the program runs backwards."

  • Sure. (Score:5, Insightful)

    by serviscope_minor ( 664417 ) on Wednesday June 04, 2025 @09:09AM (#65426869) Journal

    Sure. Sounds plausible, and there's nothing missing from the story.

    Like for example why they transcribed it into an ambiguous human language needing further reinterpretation as opposed to translating it directly into code in the target language.

    • by Junta ( 36770 )

      I presume the thought is to force human review of business critical logic.

      In terms of the relative value, it might be helpful, but it probably would have worked just as well to submit snippets of the original code to the LLM as needed for clarification.

      I wager that the human doing the final implementation will have both the original source and the 'plain english' specification. I wouldn't be too shocked if they tend to look at the original source more than the specification.

      So I'm not thinking this is necessarily a t

      • by HiThere ( 15173 )

        That process sounds like it would be extremely useful if translating, say, COBOL into Java. (I hate reading COBOL, even without bothering to understand it.)

        • by Junta ( 36770 )

          Perhaps useful, but perhaps a bit wasteful compared to saying "developer can submit code they are trying to port to the LLM on-demand".

          The step of proactively submitting 9 million lines of code and getting a bunch of text that mostly has a very high chance of never being referenced smells of management doing something to look impressive at significant cost to highlight how "in tune" they are with the AI hype in the face of a broader reluctant organization that wants to keep them from doing anything that might carry a whiff of actual risk.

          • The step of proactively submitting 9 million lines of code and getting a bunch of text that mostly has a very high chance of never being referenced smells of management doing something to look impressive at significant cost to highlight how "in tune" they are with the AI hype in the face of a broader reluctant organization that wants to keep them from doing anything that might carry a whiff of actual risk.

            They probably also spent a pile of money on their "AI" tools and need to justify the expense. Hell, some of the management probably have a stake in the "AI" company that supplied the tools. An endorsement from a big company like Morgan Stanley would increase the "value" of those tools, netting even more money for management.

    • by dfghjk ( 711126 )

      x = y; ===> "Set X equal to Y"

      There, saved an hour of developer time.

      The beauty here is that the output does nothing and must be reviewed completely by programmers for it to be useful in any capacity. It may have taken developers enormous effort to do that work, but so what? It's not clear that anything of value was produced, and those 9 million lines must still be available to developers to complete the translations.

      "...as opposed to translating it directly into code in the target language."

      Yes, and th

    • by Zocalo ( 252965 )
      TFA is vague as hell about what they actually did here, since there are any number of ways of interpreting "Take some legacy code, like COBOL or Fortran, and convert it into a human-readable specification". Given what Morgan Stanley does, I'd assume when they say "specification" that they actually mean it, so I'm hoping it's not just converting legacy code on a line-by-line basis but actually producing a usable specification for entire functions that defines the expected inputs and outputs and leaves it up to
    • Legacy systems, especially if they are more than like 20 years old, typically don't have a spec, or don't match their spec because of all the tweaks and modifications that have gone into them over the years. If you have the LLM make a spec, you can browbeat the business with it. Otherwise they will just say "make it do what the old one did, only make it secure and standards compliant", which is not helpful at all, and LLMs are not yet even close to the level where they could do a task as large and vague

    • I just did. Not even a major one just a little web app written in Python with some JavaScript.

      Most of the time was spent figuring out what individual functions and libraries did. The actual number of lines of code I had to change for the update was probably less than 100.

      If I had something that did all that grunt work for me up front I could then go in and pick out the hundred or so lines that needed changing and the whole thing would have taken a couple hours instead of a couple weeks.

      That is
      • by munehiro ( 63206 )

        The problem is that you still need to review and check the thousands and thousands of LOC that the bot has generated. Are you going to push the whole bot vomit in production without a second look? And to take that second look, it takes a lot of time anyway.

        • The problem is that you still need to review and check the thousands and thousands of LOC that the bot has generated.

          False. In this scenario the bot has written 0 lines of code.

          • The goal isn't to get the bot to write code. The goal is to get the bot to point me where I need to make my code changes.

            The bot is still doing a lot of work it's just doing analysis work instead of writing code. But that analysis is a huge part of programming.
            • Yeah, I obviously got that by the way I was able to explain the same point to the guy who replied to you, thanks

        • The bot directs me or the programmer in question to where they need to make the changes and then the programmer makes those changes.

          That means instead of going through hundreds and hundreds of lines of code and understanding it all so I can make that one little change, I just get pointed where I need to go by the AI.

          Then I can do some testing and if it doesn't work I can double check the AI work but I can do that in a test mode.

          You need to stop thinking in black and white terms. That's the
    • I'm guessing that what they claim they're doing is vastly exaggerated. AI can't possibly magically provide an accurate 'decompile' of code back into specs of what this code is supposed to do. If nothing else, the code may have bugs and oversights that mean that original specs weren't correctly implemented in the code - in which case there may be absolutely no way to figure out what the code was supposed to do as opposed to what it actually does. More likely, the spec is 'make X do Y', and the programmer wen

      • Even more likely, some bright corporate spark decided that Cobol programmers that can maintain the old code are too rare and too expensive. So instead they can use the magic of AI to give them a complete translation of what the old code does, then hire a bunch of much cheaper Javascript programmers, give them the AI-generated specs and boom, everything works just like before, but is now much cheaper to maintain. Profit! Some way down the line they'll realise that nothing actually works, but they probably wo

  • by ZipNada ( 10152669 ) on Wednesday June 04, 2025 @09:23AM (#65426909)

    AI is great at analyzing existing code and explaining it, and should also be able to generate unit tests for that code. Then someone can review the explanations and tweak them if necessary. Feed them back in as prompts for new code generation, have the AI generate unit tests for it, and compare the results. A significant time saver.
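As a toy illustration of that compare step: replay the same inputs through the legacy routine and its rewrite, and diff the outputs. Both functions here are hypothetical stand-ins, not anything from Morgan Stanley's codebase:

```python
# Hypothetical legacy routine: drop the fractional-dollar cents.
def legacy_round(amount_cents: int) -> int:
    return amount_cents // 100 * 100

# The rewrite (here trivially equivalent) that must be proven to match.
def modern_round(amount_cents: int) -> int:
    return (amount_cents // 100) * 100

def behaviors_match(old, new, cases) -> bool:
    """Run every test input through both versions and compare outputs."""
    return all(old(c) == new(c) for c in cases)
```

The hard part, as other posters note, is choosing the test cases so they actually exercise the legacy system's edge cases rather than just its happy path.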

    • It also gets things very wrong sometimes. My favorite part of LLMs is asking one about code it wrote itself - sometimes it cannot justify why it did something or explain what the code does.

      • >> sometimes it cannot justify why they did

        The AI I'm using (GPT-4.1) gives a very detailed explanation of what it did and why. Not unusual for the explanation to be longer than the code it generated.

    • The traditional description of this is "circling round the toilet".
      • Another description would be "incremental development", not unlike what you would be doing on your own.

  • The thing you use to justify, rationalize, and give credence to an agenda you already wanted to push through.

    The expensive thing you pay to do the same thing you already have staff for.

    The thing that implements the hot new business school buzzword to show you're a bold leader, not afraid of the cutting edge.

    The thing you can take credit for the successes of without worrying about shame.

    The thing you can assign blame to if it doesn't work out.

  • by geekmux ( 1040042 ) on Wednesday June 04, 2025 @09:28AM (#65426917)

    ..has reviewed nine million lines of legacy code this year, saving the investment bank's developers an estimated 280,000 hours..

    This reads like those drug bust headlines where they confiscate a fanny pack full of coke, and then claim they saved 517 million lives with it. With a street value of eleventy-seven billion.

    • Gotta hype that AI :| .. 280,000 hours is 134 years for 1 person working 40-hour weeks, nonstop. If that were a team of 20, that'd be about 6 years ... Just to UNDERSTAND the code! Where the hell did they get that metric from? And why is that code so horrible that it would take 134 years for 1 person just to comprehend!?
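The arithmetic in that comment checks out, assuming a 2,080-hour work year and no overhead:

```python
hours_saved = 280_000
hours_per_person_year = 40 * 52      # 2,080 hours: one dev, no vacation
solo_years = hours_saved / hours_per_person_year

print(int(solo_years))       # 134 years for one person
print(int(solo_years / 20))  # 6 years for a team of 20
```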
  • Bullshit (Score:3, Insightful)

    by RobinH ( 124750 ) on Wednesday June 04, 2025 @09:32AM (#65426923) Homepage
    There's absolutely no way that what they got out of the LLM was accurate. As humans we've been conditioned to view any output of a computer as infallible. I see that attitude in the people around me using LLMs right now in their daily lives. "It's really accurate," they say. But how do you know? Did you verify it yourself? Obviously not, because the entire point of using an LLM is that you didn't have time to do it yourself, so you clearly don't have time to painstakingly check it for mistakes. I suspect this is an investment company that's heavily invested in AI startups who is trying to prop up stock prices.
    • Software Engineers aren't humans/people.
    • by Junta ( 36770 )

      The thing is that it actually isn't important whether what they got out of the LLM was accurate, it was enough text for some manager to take credit for generating a huge specification that purports to describe the business logic of all their code. That's some fine job meeting some big KPIs there.

      It could be that the specification is wildly inaccurate and utterly useless, but on the other hand at least it's in good company with almost every single "English Specification" I have ever seen made by a human.

    • ... this is an investment company that's heavily invested in AI startups who is trying to prop up stock prices.

      You are the real Sherlock Holmes, and I claim my $5!

  • by WaffleMonster ( 969671 ) on Wednesday June 04, 2025 @09:32AM (#65426925)

    "It can technically rewrite code from an old language like Perl in a new one like Python".

    Both languages are from the same vintage. Python is from the early 90s and Perl late 80s. Reminiscent of persistent belief JSON is new yet XML is old.

    • by Pembers ( 250842 )

      "It can technically rewrite code from an old language like Perl in a new one like Python".

      Both languages are from the same vintage. Python is from the early 90s and Perl late 80s. Reminiscent of persistent belief JSON is new yet XML is old.

      True, but Perl isn't used for many new projects these days. Python developers are much easier to find than Perl developers, and probably cheaper, which is what this exercise is really about.

    • Ten years are like a hundred in computing. More importantly though, the vintage isn't actually relevant. Python is literally based around a concept from the seventies - significant indentation, from SASL. (Punch cards had indentation too, but it was to specific columns, and not for control flow.)

      • by Tablizer ( 95088 )

        > Ten years are like a hundred in computing.

        Only because most "innovation" is just recycled and repackaged ideas from the past. The fad cycle is getting faster, like seeing bell-bottom jeans every 4 years instead of every 15.

    • by Tablizer ( 95088 )

      Python is probably the wrong language. A compiled language like Java or C# seems a better fit for a staid company.

  • my money in the hands of a bot created slop? no thanks.

  • How much value that AI added to the processed code?

  • by blastard ( 816262 ) on Wednesday June 04, 2025 @10:20AM (#65427041)

    One reason COBOL exists at all today is the massive amount of legacy code, mostly in the financial arena. After all, the BOL is Business Oriented Language.
    Although I'm sad to see a language go, I don't feel too bad about this one. I learned COBOL using card punch and batch processing. The COBOL decks were bigger than equivalent programs in Fortran.

    I think that this is a pretty good use of AI. Within the limited world of COBOL coding it can be good at figuring out the routine and translating it. I'd love to see the output from this AI. Sadly the article doesn't link to any of it.

    Back in the 90s, my then-girlfriend was working in MIS at the headquarters of a large insurance company. They were trying to migrate legacy data that was packed into single lines of variable length, with possible middle initials and other variances. They were used to fixed formatting. Apparently they had never done string parsing. I wrote out how to go character by character and make decisions based on what they had. She took that to them and they were able to translate the data. (They were MIS people, not CS people.)
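That kind of scan can be sketched in a few lines. This hypothetical `split_name` is not the original logic; it assumes space-separated tokens and treats everything between the first and last token as middle names, which real insurance data would complicate considerably:

```python
def split_name(raw: str):
    """Split a free-form packed name field into (first, middles, last).
    Periods are stripped from middle tokens so 'Q.' becomes 'Q'.
    Returns None when there are too few tokens to split."""
    parts = raw.split()
    if len(parts) < 2:
        return None  # can't separate first and last from one token
    first, last = parts[0], parts[-1]
    middles = [p.rstrip(".") for p in parts[1:-1]]
    return first, middles, last
```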

  • Ahh, the "Mythical Man Three Decades"

  • ... and it has nothing to do with AI per se. My experience with legacy code that there is stuff there that is absolutely impossible to figure out by just looking at the code alone. These parts usually encode weird edge cases of external systems, workarounds for even earlier data, "temporary" hotfixes, etc. Even well designed systems have a few corners like these. And even if there are comments, comments tend to assume a certain shared context. I find it unlikely that an AI or even a Homo Sapiens can properl

  • On the belief that the AI specifications will be wrong in some crucial detail that causes the whole bank to fail?

  • COBOL was designed to allow non-programmers to write code that maps closely to the English language.

    So now we let AI take that code and translate it to English.

    Hm, that's probably something that could have been done without AI at all. But without having actually seen the generated specifications and the code they were generated from, that's difficult to say.

  • "translating outdated programming languages into plain English specifications"

    This is insane and impossible and only a non-technical manager could believe it. English, or any other natural language, is nowhere near precise enough to capture every detail of complex software. At best, it can represent an approximate summary.

    I can imagine an AI that analyzes old code and builds a model of its function that could be used to find bugs and translate into another programming language, but translating into English

  • is always a bad idea. That old code does all kinds of things, both good and bad, both current and obsolete. If you don't have experts in your company that know what the code _should_ do, you are screwed. "Whatever the old code does" is never a good target.

    • Good point. Not everybody knows that... but that "could" make less work for some people, this has the whiff of empire building. No no no... the secret sauce is now conveniently just out of reach in another abstract way. But eeeeyyyyyeeee know how to do it .. Secret Sauce costs a lot lot of money. I don't think these guys really know what they are doing. I'd be keeping an eye on that. They need a strong understanding of their data, and write new stuff with their whizz bang AI. This reeks of consultants.
      • Indeed, lots of consultants. My bet is that it won't result in a net savings, rather, an apparent savings on the requirements gathering side, and a much inflated development cost built on the inflated AI-generated requirements, where 80% of the requirements are meaningless or obsolete or irrelevant.

        • ha ha ...Consultants can smell other consultants a mile away.... <yyeeeech those guys did a terrible job, trust us, we'll redesign and rebuild EVERYTHING> It's not cheap though!!
  • Sure, this AI did some work, and spit out some specs. Would humans have done this work at all? Are the specs that were generated worth the bytes they occupy?

  • Everyone yelling the sky is falling about AI isn't realizing the goal posts will just move. Things that were not possible before will now be possible. Humans will still be required. Organizations will take on bigger projects/risks because AI will allow them to.
  • by gavron ( 1300111 ) on Wednesday June 04, 2025 @03:57PM (#65427934)

    15,000 devs working on 9M lines of code for one company that does nothing different than the other bazillion investment companies.

    But CHATGPT SAVED THEM 280,000 hours. My division skillz suggests that is two days' work for those "15,000" supposed devs.

    Wow, it saved two days.

    Quick write up a PR puff piece.
    Headline: "ChatGPT saved two workdays."
    or
    Headline "Devs told to take Saturday AND Sunday off. ChatGPT got this one, bro." /smdh

  • Good luck on that one!
