AI Google Programming

Google Reports Halving Code Migration Time With AI Help

Google computer scientists have been using LLMs to streamline internal code migrations, achieving significant time savings of up to 89% in some cases. The findings appear in a pre-print paper titled "How is Google using AI for internal code migrations?" The Register reports: Their focus is on bespoke AI tools developed for specific product areas, such as Ads, Search, Workspace and YouTube, instead of generic AI tools that provide broadly applicable services like code completion, code review, and question answering. Google's code migrations involved: changing 32-bit IDs in the 500-plus-million-line codebase for Google Ads to 64-bit IDs; converting its old JUnit3 testing library to JUnit4; and replacing the Joda time library with Java's standard java.time package. The int32 to int64 migration, the Googlers explain, was not trivial as the IDs were often generically defined (int32_t in C++ or Integer in Java) and were not easily searchable. They existed in tens of thousands of code locations across thousands of files. Changes had to be tracked across multiple teams and changes to class interfaces had to be considered across multiple files. "The full effort, if done manually, was expected to require hundreds of software engineering years and complex crossteam coordination," the authors explain.
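To illustrate why a 32-bit ID is a correctness problem rather than a cosmetic one, here is a minimal Java sketch. The class and method names (`IdWidthDemo`, `narrowTo32`, `fitsIn32`) are hypothetical and do not come from the paper; the sketch only shows the standard Java behavior when a value that needs 64 bits is forced into an `int`.

```java
public class IdWidthDemo {
    // Hypothetical pre-migration shape: an ID squeezed into 32 bits.
    // The cast compiles cleanly and wraps silently on overflow.
    public static int narrowTo32(long id) {
        return (int) id;
    }

    // Range check a migration tool or reviewer might want before narrowing.
    public static boolean fitsIn32(long id) {
        return id >= Integer.MIN_VALUE && id <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        long lastSafeId = 2_147_483_647L;  // Integer.MAX_VALUE
        long firstUnsafeId = 2_147_483_648L;

        System.out.println(narrowTo32(lastSafeId));    // 2147483647
        System.out.println(narrowTo32(firstUnsafeId)); // -2147483648

        // Math.toIntExact is the checked alternative: it throws instead of wrapping.
        try {
            Math.toIntExact(firstUnsafeId);
        } catch (ArithmeticException e) {
            System.out.println("overflow detected");
        }
    }
}
```

Because the narrowing cast is legal Java, the compiler gives no warning at the call site, which is consistent with the article's point that such IDs were hard to find by search alone.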

For their LLM-based workflow, Google's software engineers implemented the following process. An engineer from Ads would identify an ID in need of migration using a combination of code search, Kythe, and custom scripts. Then an LLM-based migration toolkit, triggered by someone knowledgeable in the art, was run to generate verified changes containing code that passed unit tests. Those changes would be manually checked by the same engineer and potentially corrected. Thereafter, the code changes would be sent to multiple reviewers who are responsible for the portion of the codebase affected by the changes. The result was that 80 percent of the code modifications in the change lists (CLs) were purely the product of AI; the remainder were either human-authored or human-edited AI suggestions.
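The generate-then-verify step described above can be sketched roughly as follows. Everything here is an assumption made for illustration: `MigrationPipelineSketch`, `verifiedChanges`, and the two lambdas are hypothetical stand-ins, not Google's toolkit. The string replacement stands in for the LLM and the predicate stands in for the unit tests.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

public class MigrationPipelineSketch {
    // For each candidate location, ask the generator for a change and keep it
    // only if verification passes. Survivors go on to human review.
    public static List<String> verifiedChanges(List<String> locations,
                                               Function<String, String> proposeChange,
                                               Predicate<String> testsPass) {
        List<String> forReview = new ArrayList<>();
        for (String location : locations) {
            String candidate = proposeChange.apply(location);
            if (testsPass.test(candidate)) {
                forReview.add(candidate);
            }
        }
        return forReview;
    }

    public static void main(String[] args) {
        // Hypothetical declarations found by code search.
        List<String> locations = List.of("int32_t campaign_id", "int32_t retry_count");

        // Stand-in for the LLM: widen every 32-bit declaration it is shown.
        Function<String, String> propose = loc -> loc.replace("int32_t", "int64_t");

        // Stand-in for the unit tests: only accept changes to real IDs.
        Predicate<String> tests = code -> code.endsWith("_id");

        System.out.println(verifiedChanges(locations, propose, tests));
        // [int64_t campaign_id]
    }
}
```

The point of the shape is that the generator is allowed to be wrong: an unnecessary change (here, widening `retry_count`) is filtered out before a human reviewer sees it.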

"We discovered that in most cases, the human needed to revert at least some changes the model made that were either incorrect or not necessary," the authors observe. "Given the complexity and sensitive nature of the modified code, effort has to be spent in carefully rolling out each change to users." Based on this, Google undertook further work on LLM-driven verification to reduce the need for detailed review. Even with the need to double-check the LLM's work, the authors estimate that the time required to complete the migration was reduced by 50 percent. With LLM assistance, it took just three months to migrate 5,359 files and modify 149,000 lines of code to complete the JUnit3-JUnit4 transition. Approximately 87 percent of the code generated by AI ended up being committed with no changes. As for the Joda-Java time framework switch, the authors estimate a time saving of 89 percent compared to the projected manual change time, though no specifics were provided to support that assertion.
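For a sense of what a Joda-to-java.time change looks like at the call site, here is a small sketch using only the standard `java.time` package, with the Joda-Time counterpart of each call noted in a comment. The class name `JavaTimeSketch` and the chosen date are arbitrary illustrations, not examples from the paper.

```java
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.temporal.ChronoUnit;

public class JavaTimeSketch {
    // Joda-Time equivalent: new DateTime(2025, 1, 17, 19, 12, DateTimeZone.UTC)
    public static ZonedDateTime start() {
        return ZonedDateTime.of(2025, 1, 17, 19, 12, 0, 0, ZoneOffset.UTC);
    }

    // Joda-Time equivalent: Days.daysBetween(a, b).getDays(), which returns int.
    // The java.time call returns long, so callers may need a type change too.
    public static long daysBetween(ZonedDateTime a, ZonedDateTime b) {
        return ChronoUnit.DAYS.between(a, b);
    }

    public static void main(String[] args) {
        ZonedDateTime begin = start();
        ZonedDateTime end = begin.plusMonths(3); // same method name in Joda-Time

        System.out.println(daysBetween(begin, end)); // 90
    }
}
```

Some calls carry over by name (`plusMonths`), while others change both shape and return type, which is the kind of detail that makes a purely textual search-and-replace insufficient.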


  • by quonset ( 4839537 ) on Friday January 17, 2025 @07:12PM (#65097873)

    Same story [slashdot.org], different wording, from yesterday.

  • by Somervillain ( 4719341 ) on Friday January 17, 2025 @07:47PM (#65097967)
    Hmm...if you haven't updated your testing code to modern specs in 19 years....one has to wonder what you're actually doing? I haven't seen JUnit 3 in decades. It's OLD!!!! I know legacy stuff often lives longer than expected, but if you're running JUnit 3 today, that's technological malpractice. Production code can go decades without being touched, but in an ideal world, unit and integration tests should be tuned and updated often...especially in mature code.

    I just am not impressed and am skeptical that a company as advanced and wealthy as Google would keep so much obsolete code in the first place. What did they do?...go into some CVS archive from the early 2000s? AI needs to solve real problems...not made-up ones.

    I can think of hundreds of better uses for AI in code:

    1. how about an optimized compiler/JVM that reads spaghetti code and tunes it to have top notch performance?
    2. how about an AI that detects redundant tests in your unit test suite and removes them?
    3. how about porting your code to JUnit 5...something actually useful?

    I suspect they're picking these stupid problems because the code is not likely real code. They won't trust Gemini on any of their products that actually make money. They know the bug rate is too high to have it write core Android or Google Search code.
    • JIT hallucination!

    • "but in an ideal world, unit and integration tests should be tuned and updated often...especially in mature code."
      It definitely should not.

      It is running code. Unless you touch the implementation, there is no reason to change the test. Or test framework.
      The tests are only run during build on changed code anyway.

      No one cares if a test framework is 20 years old unless the code under test changes today. And by then you already decided, a few weeks ago: do we update the test framework for this change? Yes/No?

      Suppose

    • A request for modifying a large system gets to the budgeting phase and the business decision makers are much more likely to approve code changes for business reasons and less likely to approve non-production code changes to the large set of unit tests.

  • A statement like "halving code migration time" implies they did a significant amount of the porting without "AI help", so they have a reference number?

    During Y2K reengineering, we simply used "standard parsers", picked some random places where we knew there were dates, and the parsers did dataflow analysis, so everything touched was marked as a date. The newly found dates were examined (as some were so-called "buffer variables", a long string that got reused and reused to be printed or written to a file, and f

  • AI company says AI is good!

    News at 11.

