Free Software Foundation Will Fund Papers on Issues Around Microsoft's 'GitHub Copilot' (fsf.org) 111
GitHub's new "Copilot" tool (created by Microsoft and OpenAI) shares the autocompletion suggestions of an AI trained on code repositories. But can that violate the original coder's license? Now the Free Software Foundation (FSF) is calling for a closer look at these and many other issues...
"We already know that Copilot as it stands is unacceptable and unjust, from our perspective," they wrote in a blog post this week, arguing that Copilot "requires running software that is not free/libre (Visual Studio, or parts of Visual Studio Code), and Copilot is Service as a Software Substitute. These are settled questions as far as we are concerned."
"However, Copilot raises many other questions which require deeper examination..." The Free Software Foundation has received numerous inquiries about our position on these questions. We can see that Copilot's use of freely licensed software has many implications for an incredibly large portion of the free software community. Developers want to know whether training a neural network on their software can really be considered fair use. Others who may be interested in using Copilot wonder if the code snippets and other elements copied from GitHub-hosted repositories could result in copyright infringement. And even if everything might be legally copacetic, activists wonder if there isn't something fundamentally unfair about a proprietary software company building a service off their work.
With all these questions, many of them carrying legal implications that appear never to have been tested in a court of law, there aren't many simple answers. To get the answers the community needs, and to identify the best opportunities for defending user freedom in this space, the FSF is announcing a funded call for white papers to address Copilot, copyright, machine learning, and free software.
We will read the submitted white papers, and we will publish ones that we think help elucidate the problem. We will provide a monetary reward of $500 for the papers we publish.
They add that the following questions are of particular interest:
- Is Copilot's training on public repositories infringing copyright? Is it fair use?
- How likely is the output of Copilot to generate actionable claims of violations on GPL-licensed works?
- How can developers ensure that any code to which they hold the copyright is protected against violations generated by Copilot?
- Is there a way for developers using Copilot to comply with free software licenses like the GPL?
- If Copilot learns from AGPL-covered code, is Copilot infringing the AGPL?
- If Copilot generates code which does give rise to a violation of a free software licensed work, how can this violation be discovered by the copyright holder on the underlying work?
- Is a trained artificial intelligence (AI) / machine learning (ML) model resulting from machine learning a compiled version of the training data, or is it something else, like source code that users can modify by doing further training?
- Is the Copilot trained AI/ML model copyrighted? If so, who holds that copyright?
- Should ethical advocacy organizations like the FSF argue for change in copyright law relevant to these questions?
Restricted Software Foundation (Score:3, Insightful)
The Free Software Foundation should focus on making software free rather than looking for new ways to restrict it.
Re: (Score:2)
There has to be a way to shoehorn a GNU Hurd joke in here...
Re: (Score:1, Insightful)
Seems they are trying to restrict a tool that helps people write better code faster.
Re:Restricted Software Foundation (Score:5, Insightful)
You're kidding, right? The tool's a joke and practically useless.
There are plenty of examples where its results are less than usable. Some of the more egregious ones include regurgitating GPL code into your project [reddit.com] (to be fair, it did include the license, but if your project isn't GPL-compatible, that may be a problem).
Or such brilliant things as storing currency values as a float [twitter.com], which is supposed to be a "premier example" of good coding.
Honestly, TheDailyWTF [thedailywtf.com] probably will have to include a special section for code created by Copilot.
The problem is the training set they used for it just isn't very good. And with things like what we've seen, I'm not entirely sure it's that useful. I mean, the inclusion of GPL code is particularly egregious - if your code is incompatible with the GPL, the last thing you want is to have GPL code tossed in by a tool.
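The float-for-currency complaint above is easy to demonstrate. A minimal Python sketch of why it's bad practice (illustrative only, not taken from any actual Copilot output):

```python
from decimal import Decimal

# Binary floats cannot represent most decimal fractions exactly,
# so repeatedly adding 0.10 drifts away from the true total:
total = 0.0
for _ in range(10):
    total += 0.10
print(total == 1.0)  # False
print(total)         # 0.9999999999999999

# Decimal (or integer cents) keeps decimal arithmetic exact,
# which is why it's the usual choice for currency:
exact = Decimal("0")
for _ in range(10):
    exact += Decimal("0.10")
print(exact == Decimal("1.00"))  # True
```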
Re: (Score:2)
I don't think the tool's a joke or practically useless. It can help with quite a few mundane tasks, and remember that it's in beta; it's also the first version of this project. Applications of GPT-3 in general (which I'm pretty sure GitHub Copilot is built on) can assist with all sorts of things. Are the results perfect? Far from it. But as a code completion tool, right now, it's astonishing; it feels good enough to be like a junior programmer
Re: (Score:2)
Almost right. Copyright protects creative expressions of ideas, not the ideas themselves.
If the intent of open code is for learning, it shouldn’t matter if I teach a machine instead of a person.
If your “copyrighted code” comes out of an AI, I would suggest your expression isn’t particularly creative to begin with. Most folks can’t wrap their heads around the idea that the vast majority of software is purely functional expression and thus isn’t particularly copyrightable.
Re: (Score:2)
If your “copyrighted code” comes out of an AI, I would suggest your expression isn’t particularly creative to begin with.
That's not the problem.
Copilot is not *generating* code that happens to be the same. Copilot is "coughing up" pieces of the source code it supposedly analyzed.
This is no different from taking samples of someone else's audio recording and including them in your own. The courts have affirmed that including samples without permission is infringing copyright.
Copying and pasting samples of someone else's code into your own is no different from copying and pasting samples of someone else's music into your own. If y
Re: (Score:2)
No, there is lots of Apache 2 and BSD licensed code to learn from. That's what "open source" means. That's why the term "FOSS" exists; "free software" isn't "open source."
It's partially open source, just the same as those "community license" things.
Re: (Score:3)
"Free as in Speech", not "Free as in Beer". The distinction is fundamental and at the core of the _free speech_ issues that the Free Software Foundation advocates, teaches, and publishes software for. They've consistently protected the right of people to see, use, and modify software. Where people in favor of "open source" have disagreed has normally been when individuals or companies seek to proprietize a project, to seal away parts of the software to sell or seal away from public view. The most blatant e
Re: (Score:2)
Free speech isn't sticky; if you say something, and I hear you, and I say the same thing, that's the power of free speech, the reason for free speech, the ebb and flow of the marketplace of ideas.
Free as in speech is a lie; free as in "defended from capitalism."
You have to point to lawsuits designed to stop certain speech to find examples of freedom to speak? With the Apache 2 license nobody gets to sue anybody, and everybody gets to copy the code. That's freedom.
Re: (Score:2)
> Free speech isn't sticky; if you say something, and I hear you, and I say the same thing, that's the power of free speech,
The power of copyright is a distinct though related right. Let us not confuse them. Copyright is the power to control who may repeat your exact words, especially written words. It's designed to protect the authors and the publishers from wholesale plagiarism, and was developed in response to the invention of the printing press.
Part of the difficulty which the Free Software Foundatio
Re: (Score:2)
Part of the difficulty which the Free Software Foundation addresses, successfully, is the popular tendency to copy someone else's words and claim them as your own, then to use copyright against others.
lol they don't help with that at all! Copyright itself gives you that power, and your ability to enforce it depends entirely on your access to lawyers.
And if you give your copyright over to the FSF, they do not actually enforce it at all; they use it to extort from large violators, and in the end never implement the promised terms. It is all lies, and horse shit.
With the Apache 2 license, I have the exact same protections under the Copyright Act, and yet, no lies, no bullshit, no need for lawyers unless indeed so
Re: (Score:2)
> lol they don't help with that at all! Copyright itself gives you that power, and your ability to enforce it depends entirely on your access to lawyers.
When the source code is kept secret, it's much more difficult to prove the copyright violation. Have you ever tried to enforce a software copyright for anything you published?
Re: (Score:2)
This was part of the difficulty with the SCO versus Linux users lawsuit. SCO refused to display the source code they claimed was infringed, for years. By compelling publishers of software to include access to the source code for those clients, it makes it much easier to trace a violation.
The GPL also blocked Sourceforge from bundling spamware into GIMP: Sourceforge took over the idle source code repository for Windows compatible GIMP, inserted various adware and spew, and published it as a Windows GIMP pack
Re: (Score:2)
Did you know they make these things called "calendars?"
Take a look at what year it is. Then look up the SCO lawsuit. Then realize that's your best, most recent example.
It proves my point. Look at my user id. I know about the SCO lawsuit. I followed it here on slashdot.
Re: (Score:2)
I'm old. The most infamous examples and public examples are not recent, but they are compelling. For me, I need to keep such discovered abuses more private and resolve them more discreetly, so I cannot post them here.
Re: (Score:2)
The FSF is focused on freeing the software itself from being locked away by malicious actors, because this makes it free for the users. And the users are the ones who matter. The code doesn't have feelings. The developers are in the minority, and if they aren't there to serve the users, fuck them anyway.
Re: (Score:2)
How do you "lock away" BSD? How would you even try? It is just a lie that the FSF tells, it isn't an actual thing they're doing.
Re: (Score:2)
Because it's easy to make things sound more nefarious if you are disingenuous. While I do support some of the FSF's goals they often get ridiculous when they do things like comparing their cause to slavery, as if slaves had a choice about when and whether to be slaves or not like computer users do over when and whether to use free or non-free software. But hey, it makes non-free software sound so much more evil!
Re: (Score:2)
As a developer my favorite thing about the Apache 2 license is that I have no control at all once I give it away. I have no temptation to outrage or manipulation. I can share future updates, or not, and that is it. Just so simple, and free. Use it or don't. Use it for what I thought of, or for something else. I don't have to care; I'm not the one using it for that! No calculation, no hyperbole, no temptation to a cause. Just some code.
Re: (Score:2)
Well, it is whatever you want it to be.
You give it away, some company uses it, they want paid support, they might want to hire you.
You apply for some job, you want to talk about your open source in the interview, who are your users? A bunch of FSF neckbeards, or are companies using your code in their products? Which has more commercial value?
Or if you don't care about any of that, maybe it was just "altruism," also known as, that warm fuzzy feeling when you do something Virtuous.
Or maybe you just want there
Re: (Score:2)
The FSF is focused on freeing the software itself from being locked away by malicious actors, because this makes it free for the users.
You mean preventing non-free derivative works. In this instance this "Copilot" thing is facilitating sharing of code, the exact thing the FSF claims to be in support of.
Re: (Score:2)
Restriction is what they mean by "freedom." When they say "software freedom," they do not mean the word software combined with the word freedom. They mean instead, freedom from unapproved choices.
For end users this isn't really that noticeable, because people just want to download and use stuff without paying anything, which they can do.
For developers... well, the vast majority of new open source projects that have users are Apache 2 licensed, or BSD.
Remember, the "free software" people consider "open sourc
Is this really new? (Score:1)
Microsoft buys things to kill them... (Score:1)
How many products have Microsoft bought to kill them?
Re:Microsoft buys things to kill them... (Score:5, Insightful)
Did anyone here really think that Microsoft acquired GitHub for anything besides capitalist purposes? This is Microsoft we are talking about: they buy things to make money off of them, not to improve what they bought.
How many products have Microsoft bought to kill them?
You don't really know what you're talking about, do you? Take a look here [wikipedia.org], for more information. I picked just a few counter-examples to your assertion:
Forethought: purchased in 1987; its product became PowerPoint, still available 35 years later.
SQL Server: licensed originally from Sybase in 1989; still going strong 32 years later.
Flight Simulator: originally from Sublogic, 1982; newest release in 2020, 38 years after.
LinkedIn: purchased by Microsoft in 2016; still up and running 6 years later.
Navision: purchased in 2002; available in 2021 as Microsoft Dynamics, 19 years later.
Visio: bought in 2000; still available in 2021.
There are some companies that MS bought then dropped (Nokia is probably the worst example, and Skype too), but comparing this to Google's approach is just silly.
Re:Microsoft buys things to kill them... (Score:5, Funny)
It's debatable as to whether this one is a good thing. :-)
Re: Microsoft buys things to kill them... (Score:2)
Well, if you are tired of walking in a circle, just nail down your other foot, too!
Re: (Score:3)
It's also worth noting that after buying GitHub, Microsoft went and started replacing everything with Git. Azure DevOps (the successor to TFS and VSTS) uses Git via GitHub's libgit2 as its primary source control system (TFSVC is still supported but Git is the default).
Re: (Score:3)
It started before github.
I know a few Microsofties. What's well known is that Microsoft has a pretty strong dogfooding policy. What's moderately well known is that they also had two of their own version control systems, both of which are terrible. There was a long-running and large internal fight about using git internally, which had been rumbling on for at least 5 years, maybe longer.
Eventually team git won.
Re: (Score:2)
Skype wasn't dropped. I pay $14/m for an unlimited outbound bridge to the phone system in Thailand. Before that I was buying those shitty "phone cards" all the time.
Re: Microsoft buys things to kill them... (Score:1)
Not dropped yet. Seems imminent...
Re: (Score:3)
How many products have Microsoft bought to kill them?
I think Microsoft is more like Adobe - they buy companies to prevent competition, then gradually turn their products into crapware.
Re: (Score:2)
Microsoft has actually purchased many companies and products specifically with the intent of adding the functionality to Windows. WLBS used to be Wolfpack; many people said it was better when it was, but the point is they didn't just throw it away. It became a Windows feature. It might have been superseded by now; I haven't kept up with clustering on Windows.
Re: (Score:2)
Of course they make acquisitions that they think will make them money. That is what companies _should_ do.
Killing GitHub would _not_ make Microsoft money. It's the complete opposite, they make money by making it better. Consider some of the changes MS made since they bought Github:
* They added GitHub actions which became the #1 CI/CD tool practically overnight.
* Unlimited private repos for free
* Fixed lots of usability issues [github.blog]
* Github Codespaces [github.com]
* Automatically found and fixed millions of security issue
What about us? (Score:5, Insightful)
When I read a book, read source code, etc. and train my electrochemical neural network brain, and then create code later on using that neural network, does that constitute a copyright issue?
No. No it doesn't. Unless the AI is spitting out verbatim copyrighted code, or something very close to it, then I don't see how this could be an issue.
Re:What about us? (Score:5, Interesting)
I'd argue that computer neural networks do not currently work similarly enough to biological ones to count as creating novel works.
Computer memory is much more precise than biological memory. Even if the code has been taken apart and stored in an unrecognizable fashion, the computer neural network is still likely to reconstitute the code in nearly verbatim chunks. This system almost certainly has no "understanding" of the problems it's solving. It's just matching patterns it has stored against what someone has typed, then mechanically reassembling the patterns.
Your argument will probably hold if they ever develop AI to the point where it actually understands the problem statement and produces code that addresses requirements of the project just from that. Unfortunately, if that ever happens, software developers will all be out of a job anyway.
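The "nearly verbatim chunks" claim is at least mechanically checkable. A hypothetical sketch of how one could flag verbatim regurgitation by comparing token n-grams of a generated snippet against a known corpus; the function names, the n=8 window, and the toy corpus are all illustrative assumptions, not anything Copilot actually does internally:

```python
def token_ngrams(text, n=8):
    """All length-n runs of whitespace-separated tokens in text."""
    tokens = text.split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def verbatim_overlap(generated, corpus_files, n=8):
    """Fraction of the generated snippet's n-grams found verbatim
    in the corpus; 1.0 means every long token run is a copy."""
    gen = token_ngrams(generated, n)
    if not gen:
        return 0.0
    corpus = set()
    for src in corpus_files:
        corpus |= token_ngrams(src, n)
    return len(gen & corpus) / len(gen)

# Toy "training corpus" of one tokenized source file:
corpus = ["def fib ( n ) : return n if n < 2 else fib ( n - 1 ) + fib ( n - 2 )"]
print(verbatim_overlap(corpus[0], corpus))  # 1.0 -- a pure copy
```

A high overlap score would indicate reassembled training data rather than novel output, which is exactly the distinction the thread is arguing about.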
Re: (Score:2)
Unless the AI is spitting out verbatim copyrighted code, or something very close to it, then I don't see how this could be an issue.
Well... in many cases it does spit out existing code verbatim...
Comments, authorship, and license included.
That's why they are concerned. It seems like a pretty clear case of infringement, even in more moderate cases.
Re: (Score:2)
How's that any different than the current practice of copying code out of github? Does "done with an ai..." somehow change the practice, let alone the obligations?
Re: (Score:3)
Copying code out of github in many cases IS copyright infringement. And FSF does fight it and help the copyright owner fight it in those cases too.
What changes with copilot is that the legal status of the generated code is very unclear. To me, it seems that the machine learning model is derived work from the code and therefore should be GPLed and the code derived from the model is also derived work and should be GPLed too.
Though, I am a random Slashdotter; what do I know...
Re: (Score:2)
To me, it seems that the machine learning model is derived work from the code and therefore should be GPLed and the code derived from the model is also derived work and should be GPLed too.
Luckily this is not the definition of "derived work".
Re: (Score:2)
Isn't it?
If I print a GPL code it is derived work. And then if I scan the printed version, it is still derived work of the original work. And it is subject to the GPL.
If I zip a piece of code and unzip it, it does not magically become not subject to the GPL.
Provided that when you run the code through their machine learning model it is still able to reproduce the code exactly as is, including variable names, comments, license, and authorship, it seems pretty clear to me that their model just transforms the code to
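The zip/unzip point above is trivially verifiable: lossless compression returns the exact original bytes, so the transformation changes nothing about what the work is. A quick Python check (the sample bytes are just an illustration):

```python
import gzip

# A lossless round trip returns the exact original bytes, which is
# the commenter's point: a mere change of encoding is not a new work.
source = b"int main(void) { return 0; }\n"
round_tripped = gzip.decompress(gzip.compress(source))
print(round_tripped == source)  # True
```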
Re: (Score:2)
Isn't it?
If I print a GPL code it is derived work. And then if I scan the printed version, it is still derived work of the original work.
No, if you print it is a copy of the original work, and if you scan that, it is still a copy of the original work.
Re: (Score:2)
I suggest you simply read the relevant law.
"Derived work" is a legal term, and it is defined quite clearly in said law.
Re: What about us? (Score:1)
True, but bear in mind that pushing it into a git repo that is not on your own machine and which others can pull from probably counts as distributing, so even private companies using distributed source control for internal projects may, technically, be distributing the source code.
I mean, it would be hard to argue that distributed source control doesn't contain distributed code.
Re: (Score:2)
The difficulty for copyright is when you copy large portions verbatim. This happened to Helen Keller, quite accidentally by all accounts. It happens to new writers when editors pass submissions to different authors to see what they can do with the story, and they leave in too much of the original accidentally. The AI would have to be written cautiously to avoid just this problem, and it can be difficult for even a lawyer, judge, or editor to judge consistently.
Re: (Score:2)
Unless the AI is spitting out verbatim copyrighted code, or something very close to it, then I don't see how this could be an issue.
That's exactly what it's doing, right down to reproducing the copyright notice comments.
Re: (Score:2)
In Europe it would not be a problem.
Work solely created by a computer/algorithm is not copyrightable. Not even by the author of that algorithm.
So if you want to nitpick: it is a combined work of the programmer - copyrighted by him - and uncopyrighted snippets contributed by the AI.
But who would care?
A better programmer would have written the same, or better, code by himself.
And arriving at the same solution to the same problem is hardly a copyright issue.
Re: (Score:2)
Sounds just like the monkey selfie case.
It seems to me .... (Score:3, Insightful)
... that training the CoPilot expert system on freely-available public repositories absolutely qualifies as fair use under the current definition of the term.
The other questions are mostly harder to answer definitively. For instance, the question of whether Copilot suggesting verbatim re-use of code copyrighted under free software licenses (e.g., the GPL) constitutes violation of that copyright depends on whether the programmer who incorporates that code gives credit to the author of the copied material, and on whether he/she makes the resultant program free and open.
Disclaimer: I am not a programmer. I am not YOUR programmer. If you need a programmer's services, hire one ...
(Posted anonymously only so as not to undo positive mods to previous comments on this story.)
--
Check out my novel [amazon.com].
Re: (Score:3)
Seems to me GitHub Copilot gets to the heart of the question: can one copyright ideas? It's not giving snippets of code from code repositories, but the idea that goes with the intent the programmer is aiming for.
Also the tool is going through growing pains [www.fast.ai] so all this may be premature.
Reminds me of an old question... (Score:5, Interesting)
Once upon a time, AT&T said that if I read their proprietary sources to learn how Unix worked, my brain had been infected and I couldn't pass along that knowledge. That position lost in court. It sounds a bit like what FSF is saying here: Read GPL code, then the code you produce must be GPL'd.
Re:Reminds me of an old question... (Score:5, Informative)
More like someone with perfect photographic memory reads the code and then re-types it verbatim.
Hackers Love It (Score:1)
GPL v4 (Score:4, Insightful)
If they are so mad about it, the FSF ought to add an AI-training exclusion in the next version of the GPL. Meanwhile, it was never stated in a GPL license that people can't use the code to train their AI or use snippets of it. As long as Microsoft discloses which code it used to train the AI, they are in the clear.
Worst case (Score:2)
I can see an argument for making it so that all code generated by Copilot must be released under the GPL, and that too seems shaky to me.
I don't see any cause for there being a copyright violation to train an AI with GPL code. FSF is being a bitch.
Re: (Score:2)
Being a bitch? Grow up. You're just engaging in reflexive anti-FSF rhetoric before engaging your brain.
This is a huge unaddressed question across the whole industry. No one knows if using copyright data to train a network violates copyright. And no one knows how close you have to get to the original work before copyright is violated.
You're not a lawyer and you don't know. In fact no lawyers do either, though they can make educated guesses. The only person who will ultimately get to decide this is a judge, o
Is the onus on the user to do due diligence? (Score:2)
If folk are using copilot, then perhaps the onus should be on them to do due diligence on the code it generates, to find the originating source and associated license?
I mean, it's fairly easy to tell the difference between a fairly common algorithm and a big chunk of code that a coder could easily be suspicious about: "wow, that's a pretty complex bit of auto-generated code!"
However, if Copilot isn't revealing *where* that code originated, in which repository, I guess that makes a user's decision m
Let's ask the RIAA (Score:2)
When a song includes even a minimal sample from another song, it's considered a copyright violation.
Even if two songs are not digital clones of each other, reproducing the same patterns (as a direct cover or as a very similar melody over very similar chords) is again a copyright violation.
Isn't the above exactly what Copilot does?
(The AI process is irrelevant - if AI was used to help composers and producers write music and added for good taste a sample of U2 in a song or reproduced the melody of a Metallica son
Re: (Score:3)
There's only so many ways to write a function to calculate the nth Fibonacci number. What you have in the music industry would be akin to the first person to copyright such a function would be able to prevent everyone else from publishing a function using the same algorithm.
Programming is not art. A function is not an artistic expression. Give a number of programmers the same problem, and you will often find that two or more of them independently arrive at very similar solutions.
In software we often emphasi
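The independent-convergence point is easy to see with the Fibonacci example itself: two programmers writing the obvious iterative version produce code that is nearly token-identical apart from variable names. Both versions below are written purely as illustrations:

```python
def fib_a(n):
    # One programmer's "obvious" iterative solution.
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def fib_b(n):
    # Another's independent attempt: same shape, different names.
    prev, cur = 0, 1
    for _ in range(n):
        prev, cur = cur, prev + cur
    return prev

print([fib_a(i) for i in range(8)])  # [0, 1, 1, 2, 3, 5, 8, 13]
print(fib_a(20) == fib_b(20))        # True
```

If these two were submitted by different people, no copyright regime could sensibly treat one as infringing the other, which is the commenter's contrast with how the music industry treats similar melodies.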
Re: (Score:2)
It seems to me that we need to establish when it becomes too "inspired" and when you could reasonably have arrived at the same formulation yourself.
It seems to me that we need to do away with software patents so that software can be of the highest quality possible without having to worry about whether one has reinvented a wheel.
One might argue that patents in general are holding back progress today, but software patents are clearly bananas. The same terms don't make sense as for physical inventions.
Re: (Score:2)
it becomes a burden to prove copyright infringement.
Even accidentally copying bunches of code outside of a 'cleanroom implementation' is not a copyright violation, but an independent development.
Re: (Score:2)
The RIAA most likely would not care, as it would be an issue between Metallica and U2.
Arrrr, matey! (Score:2)
If Microsoft is allowed to get away with this, then I hope a lot of people take to the high seas to make them aware of just how hard it can be to fuck people over, even with all the tools a mega-corporation with the morals and ethics of a serial killer can bring to bear.
GNU Ghost says Hello World (Score:1)
Hey ShanghaiBill .. (Score:1)
Re:hmmm (Score:5, Informative)
Pretty sure "open source" does not mean "our source code is open for you to freely reproduce protected expressions from it". The problem with Copilot is not that it mines source code for ideas which it then reproduces in novel expressions, but that it reproduces protected expressions verbatim, thus violating the licenses of the programs it mined them from. Copilot is infringing and there's no doubt about it. I don't need an FSF-funded study to know that.
Re: (Score:3)
Actually, it does mean exactly that: you are free to use it to train and inform, and definitely to use it for ideas. What you are not allowed to do is reproduce it outside the rules of the license.
Our problem here is that we are in the middle of an AI hype cycle and people have lost track of what "train" means in a "deep learning" system. If you train a human being then, after some time working in the medium and copying recipes, they begin to develop what we call "understanding". This is pretty easy to see because you can ask them to explain why they do things and they will give an explanation. Even if that explanation is wrong, it means that when they generate new things, they can do it based on t
Re: (Score:2)
If you train a human being then, after some time working in the medium and copying recipes, they begin to develop what we call "understanding". This is pretty easy to see ...
A deep learning network is different. It's basically complex, disassociated pattern storage and matching
Philosophers have been trying, and failing, to prove this for thousands of years. That you think it is easy just demonstrates the Dunning-Kruger effect.
You claim to think, therefore you claim you are. But you can't even prove that!
Re: (Score:2)
Philosophers have been trying, and failing, to prove this for thousands of years. That you think it is easy just demonstrates the Dunning-Kruger effect.
You claim to think, therefore you claim you are. But you can't even prove that!
I'm not a philosopher, I'm an experimentalist. My idea of proof is different from yours. Sure, we might all be a great big simulation and I can't "disprove" it, but that's not a useful statement and I just don't care.
We can go through hours of weird arguments about "Chinese" rooms and never agree. Alternately we can just notice that, despite years of trying and billions of hours of training, deep learning AI completely fails to provide proper independent self driving cars and those cars very much fail at
Re: (Score:2)
You claim to think, therefore you claim you are. But you can't even prove that!
Oh, and, separately I definitely don't claim that I think. I might claim that "there is a thought" and therefore "something is". Let's see you disprove that.
Re: (Score:2)
But isn't that the question?
If an AI grabs a snippet of code as a representation of how to solve a problem is it an actual novel expression or a de minimis amount of code?
Personally, I can easily see how our copyright law could be interpreted either way depending on which case law you look to for inspiration. With literature, even a quite small amount of copying can be declared an infringement of a protected work; movies, on the other hand, contain huge amounts of copying that are not protected. And really, I'm not sure that i
Re: (Score:3)
I think this is precisely the issue, and I can see it going to court in a few more years. Having an AI generating code could upend entire sections of patent and copyright law. How would the Google / Oracle lawsuit turn out if some AI copied minor bits of code?
Is the inventor of the AI liable for copyright infringement if someone uses its output to generate copyrig
Re: (Score:2)
What happens if Microsoft Word starts including automatic sentence completion? How would this affect school students?
Most virtual keyboards already include this. Google Docs already has it in the form of Smart Compose.
Re: (Score:2)
Copying short sequences is usually not a copyright infringement.
So you, personally, most definitely need such a study to fix your mistaken ideas about copyright.
Re: (Score:2)
Copying short sequences is usually not a copyright infringement.
This is not just a few short sequences; this is mass collection of sequences.
And, technically, "short sequences" is only a defense. The plaintiff can still take the matter to court; it's just very likely that a reasonable judge will toss out the case.
Re:hmmm (Score:5, Insightful)
Seems FSF are doing their absolute best to ensure open source is anything but free and open.
I'm sure that the FSF would be perfectly fine with this tool as long as the output of this tool, trained with free and open code, could in turn only be released as free and open code.
As always, if you think that their efforts to keep their code free and open are interfering with your proprietary project, you are welcome to write your own damned code.
Re: (Score:1)
Re: (Score:3)
There may be an intermediate range in between the ends, but it isn't clear how laundering a bunch of code through a neural network avoids creating derivative works. I'm not saying it is an impossible problem, but something that sometimes blindly spews out copies of code it has seen, complete with matching comments, almost certainly doesn't achieve that bar.
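For what it's worth, the "blindly spews out copies" claim is at least mechanically testable. Here is a minimal sketch (my own illustration, not anything the thread or Copilot actually describes) of a naive check for verbatim overlap between generated code and a known corpus, by looking for shared n-line windows; real provenance analysis would need normalization, token-level matching, and license attribution on top of this:

```python
def shared_windows(generated: str, corpus: str, n: int = 4):
    """Return n-line windows of `generated` that appear verbatim in `corpus`.

    Illustrative only: a crude substring check over whitespace-stripped
    lines, flagging exact n-line reproductions.
    """
    # Normalize both sides the same way: drop blank lines, strip indentation.
    gen_lines = [ln.strip() for ln in generated.splitlines() if ln.strip()]
    corpus_norm = "\n".join(ln.strip() for ln in corpus.splitlines() if ln.strip())
    hits = []
    # Slide an n-line window over the generated code and look for it verbatim.
    for i in range(len(gen_lines) - n + 1):
        window = "\n".join(gen_lines[i:i + n])
        if window in corpus_norm:
            hits.append(window)
    return hits
```

A check like this only detects exact copies; it says nothing about the harder legal question of when a close-but-not-verbatim reproduction becomes a derivative work.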
Re: (Score:2)
I'm sure that the FSF would be perfectly fine with this tool as long as the output of this tool, trained with free and open code, could in turn only be released as free and open code.
I doubt it. GCC itself could have been licensed such that its output had to be distributed under GPL-compatible terms, but it wasn't. The FSF needed the support and help of developers who didn't necessarily share their ideology (permissively licensed free software, open source, proprietary software, etc.) in order to succeed. Nowadays GCC is one of the biggest enablers of non-free and permissively licensed software.
Re: (Score:2)
Free and Open are different and often incompatible ideals.
Long before the FSF was even imagined, the phrase "Open Source" was being used by many people (myself included, which is how I got involved in this battle in the first place) to mean that you could get your hands on the sources. But it didn't specify the license at all. Let me repeat: AT ALL.
The Unix community, which was at the time dominated by corporations, used the term "Open" to mean documented and interoperable. "Open Standards" meant just this
Re: (Score:2)
Same for me.
When I heard the debate about who coined the term "open source" somewhere in the late 1990s, I could only shake my head.
I worked as a programmer/sysadmin at the university, and I installed hundreds of "open source" packages long before the OSI existed.
Re: (Score:1)
Re: (Score:2)
This is why "Free Software" is even a thing; specifically, because Open Source licenses were all over the map, and most of them did not preserve freedom for users. It's not concerned with freedom for developers; in fact, it restricts developer freedom in order to increase user freedom.
These are just bare assertions, and the parade of horribles used to support the claim has remained theoretical, even in this modern age where most open source is not GPL.
The assertion gets weaker and weaker as time goes by.
Re: (Score:1)