AI Programming

FreeBSD Project Isn't Ready To Let AI Commit Code Just Yet (theregister.com)

The latest status report from the FreeBSD Project says no thanks to code generated by LLM-based assistants. From a report: The FreeBSD Project's Status Report for the second quarter of 2025 contains updates from various sub-teams that are working on improving the FreeBSD OS, including separate sub-projects such as enabling FreeBSD apps to run on Linux, Chinese translation efforts, support for Solaris-style Extended Attributes, and for Apple's legacy HFS+ file system.

The thing that stood out to us, though, was that the core team is working on what it terms a "Policy on generative AI created code and documentation." The relevant paragraph says: "Core is investigating setting up a policy for LLM/AI usage (including but not limited to generating code). The result will be added to the Contributors Guide in the doc repository. AI can be useful for translations (which seems faster than doing the work manually), explaining long/obscure documents, tracking down bugs, or helping to understand large code bases. We currently tend to not use it to generate code because of license concerns. The discussion continues at the core session at BSDCan 2025 developer summit, and core is still collecting feedback and working on the policy."


Comments:
  • by Locke2005 ( 849178 ) on Wednesday September 03, 2025 @01:03PM (#65636264)
    The MerlinBot used at Microsoft to review controller firmware has actually found bugs in code I checked in. Just remember: if you farm the review out to ChatGPT, your proprietary code gets sent to servers you don't control. Microsoft uses its in-house AI to review code, so presumably it can be trusted not to leak trade secrets.
    • by mysidia ( 191772 )

      That's not AI generating code or documentation; it's AI generating review text based on code and a prompt.

      Perhaps that is harmless, so long as it does not happen before proper peer review by an actual human being.
      If done too early, the AI might taint the review process by causing people to believe the machine when they shouldn't, or by leading reviewers to look no further than what the machine said (that is, if the AI has already reviewed the code, human reviewers might not analyze it as diligently).

      • It adds comments and suggestions to be resolved by the author. It cannot approve a pull request, so it does not replace human peer review. AI should not replace human peer review!
  • Translation (Score:2, Interesting)

    by packrat0x ( 798359 )

    TFS: "explaining long/obscure documents"
    How about translating help/man/info pages?
    It's a cliché that programmers dislike documentation, so I guess translating documentation is an even lower priority.

    • by allo ( 1728082 )

      In the best case, you have community members who can translate into their native language and would rather do that than write code. Not everyone needs to be a programmer to help a project.

  • by williamyf ( 227051 ) on Wednesday September 03, 2025 @01:21PM (#65636322)

    Train a coding AI model only on BSD-, MIT-, ISC-, Apache-, WTFPL-, CC0-, and compatibly licensed code, and only accept code generated by that model.

    Easy peasy.
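The corpus-filtering step this implies could be sketched as follows. This is only a toy Python sketch: the marker list and the header-only scan are hypothetical simplifications, and real license detection (e.g. SPDX-aware tooling) is far more involved.

```python
# Hypothetical permissive-license markers; a real scanner would match
# full license texts and SPDX identifiers, not keywords.
PERMISSIVE_MARKERS = [
    "bsd license", "mit license", "isc license",
    "apache license", "wtfpl", "cc0",
]

def is_permissive(source_text: str) -> bool:
    """Crude check: does the file header mention a permissive license?"""
    header = source_text[:2000].lower()
    return any(marker in header for marker in PERMISSIVE_MARKERS)

def filter_corpus(files: dict) -> dict:
    """Keep only files whose headers look permissively licensed."""
    return {path: text for path, text in files.items() if is_permissive(text)}
```

Anything ambiguous would simply be dropped from the training set; even then, permissive licenses still carry attribution conditions, so filtering alone does not settle the licensing question.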

    • by unrtst ( 777550 )

      Dunno if that's the ticket (those are different, though similar, licenses), but I love the idea of coding LLMs trained only on a targeted codebase. For example, train one only on the Linux kernel source for use in working with Linux kernel code... I imagine the code style would be a better fit, and that codebase is big enough to learn a lot from.

      As a counter-example, if a coding assistant was trained with a lot of obfuscated C, I wouldn't want the results going into my production codebase.

      • by allo ( 1728082 )

        You could try to train an LLM to de-obfuscate code, though. I am still waiting for the JavaScript un-minifier. You can format minified code nicely and undo some of the optimizations, like scientific notation for (smallish) integers, but an LLM could also infer readable variable names.
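The mechanical half of that un-minifying (undoing tricks like scientific notation for small integers) needs no LLM at all; a minimal Python sketch, with the readability cutoff chosen arbitrarily for illustration:

```python
import re

def expand_small_sci_ints(js_source: str) -> str:
    """Rewrite integer literals like 1e3 as 1000, leaving huge values alone."""
    def repl(match: re.Match) -> str:
        value = int(match.group(1)) * 10 ** int(match.group(2))
        # 1e21 is more readable than its 22-digit expansion; keep it as-is.
        return str(value) if value < 10**7 else match.group(0)
    return re.sub(r"\b(\d+)e\+?(\d+)\b", repl, js_source)

print(expand_small_sci_ints("var ms = 1e3 * 60;"))  # var ms = 1000 * 60;
```

Recovering meaningful variable names from minified identifiers is the part that genuinely calls for a model.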

    • by mysidia ( 191772 ) on Wednesday September 03, 2025 @03:55PM (#65636882)

      Those licenses still require you to reproduce the correct terms when you redistribute the code.

      You cannot ship code and state that the terms are the BSD license when that code is under the MIT license, etc.
      These licenses also require including the author's copyright statement, so your redistribution can be infringing if you omit it.

      Thus training only on that group of licenses does not give you a free pass. You would have to train only on code whose author has provided a distribution license that allows the distribution method you are planning.

    • by allo ( 1728082 )

      If you really want to follow the path of honoring the licenses of the training code, you still have unmet conditions, like naming the contributors.

      On the other hand, when you take two lines of a program and put them into your own program, you usually don't need a license, as it is too trivial. I guess there is no clear line, but for copyright to be relevant, you need a non-trivial amount of code. Even if one were to build an AI that just remixes originals, each work would probably contain only a few keywords of each author.

      • by mysidia ( 191772 )

        for copyright to be relevant, you need a non-trivial amount of code.

        That is not necessarily true; the importance of the portion used is a major factor.

        You are perhaps thinking of copyright in terms of the number of identical lines of code, but copyright on computer software does not work exclusively that way.

        each work would probably contain only a few keywords of each author

        Copyright over software does not look solely at direct 1:1 copies. The keywords can be different and still infringing. It is referred

        • by allo ( 1728082 )

          I'm thinking of copyright more or less in terms of entropy.

          If you copy three lines that you could have written yourself without knowing the other code, it is probably under the threshold of copyrightability.

          For more code it can become difficult. You may have a full algorithm that is just the pseudocode from a textbook put into, say, Python. Now there are only a few ways (modulo different variable names) to put that code into Python verbatim, and only a few variations that make sense. There is little to copyright there, be

          • by mysidia ( 191772 )

            but your implementation of quicksort is probably not as unique as you may think

            It may well not be unique, but copyright applies based on originality, not novelty.
            You may be contemplating an issue here that copyright technically does not even have.

            If two or three or four people happen to write the exact same program, it is perfectly fine with copyright law, so long as they did not actually have access to each other's works or copy from one another. They are then in fact all entitled to copyright p

  • Big Tech may have finally found a way to destroy open source! Sounds like a great place to employ AI, Overwhelm and Destroy!
  • WTF? Is everyone buckling under the pressure of our AI overlords?
  • Indeed (Score:5, Interesting)

    by MBGMorden ( 803437 ) on Wednesday September 03, 2025 @03:22PM (#65636734)

    I've tried some of the AI coding tools. They work OK for some really basic stuff. If you need a quick 10-line function that does something very specific, and you can describe it fairly accurately, they're good. Anything remotely complex, though, and they tend to confidently spit out code full of bugs, or even code that won't compile.

    Sometimes they even make up calls to library functions that don't exist (my only guess is that somewhere the model parsed someone talking about trying to call a function they assumed existed, and that worked its way into the training data as a real function call).

    Overall, it can be OK for some basic stuff, but it's far from ready to be turned loose on anything of value.

    • by evanh ( 627108 )

      The non-existent library functions probably do exist in the sources the LLM is copying from.
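Either way, hallucinated calls like these are cheap to catch before human review; a minimal Python sketch that checks whether suggested names actually exist on a module (the suggested list here is invented for illustration):

```python
import math

def undefined_calls(module, names):
    """Return the names the module does not actually provide --
    a cheap guard against LLM-invented functions."""
    return [n for n in names if not hasattr(module, n)]

# Names an assistant might emit; fast_inverse_sqrt is made up.
suggested = ["sqrt", "isqrt", "fast_inverse_sqrt"]
print(undefined_calls(math, suggested))  # ['fast_inverse_sqrt']
```

The same idea generalizes: lint generated code against the real API surface before it ever reaches a reviewer.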

  • by Z80a ( 971949 ) on Wednesday September 03, 2025 @04:40PM (#65637016)

    Please generate text that looks as close as possible to the text I actually need, to the point that if it's wrong, I won't be able to spot it unless I manually check it with my advanced debugging skills.

  • ... never will. Human control 4 ever.
