Richard Stallman Shares His Concerns About GitHub's Copilot -- and About GitHub (gnu.org) 45
destinyland writes: A newly-released video at GNU.org shows an hour-long talk given by free software advocate Richard Stallman for the BigBlueBotton open source conference (which was held online last July). After a 14-minute clip from an earlier speech, Stallman answers questions from the audience — and the first question asked Stallman for his opinion about the AI Copilot [automated pair programming tool] developed for Microsoft's GitHub in collaboration with AI research and deployment company OpenAI.
Stallman's response?
There are many legal questions about Copilot whose answers I don't know, and maybe nobody knows. And it's likely some of theo depend on the country you're in [because of the copyright laws in those countries.] In the U.S. we won't be able to have reliable answers until there are court cases about it, and who knows how many years it'll take for those court cases to arise and be finally decided. So basically what we have is a gigantic amount of uncertainty.
Now the next thing is, what about morally? What can I say morally about Copilot? Well the basic idea seems okay. Why shouldn't a program be able to give you hints like that?
But there is one pitfall, which is that if you follow those hints, you might end up putting a substantial block of code copied from a GPL-covered program, written by someone else, or one hint after another after another after another — it adds up to a substantial amount of code, perhaps, with very little change, perhaps. And then you've infringed the GPL by releasing that code, unless your program is covered by the same versions — plural — of the GPL, in which case it would be permitted. But you might not even know that. Copilot might not tell you — it doesn't endeavor to inform you. So you're likely not to know. Which means Copilot is leading users — some of its users — into a pitfall. Well, they should fix it so it doesn't do that.
But basically, what can you expect from GitHub? GitHub gives people inadequate advice about what it means to choose a license. They tell you you can choose GPL version 2 or GPL version 3. I think they don't tell you that really you could choose GPL version 2 only, or GPL version 2 or later, or GPL version 3 only, or GPL version 3 or later — and those are four different choices. They give users different permissions over the future. So it's important to make each program say clearly which choice covers it. And GitHub doesn't tell you how to do that.
It doesn't tell you that you need to do that. Because the way you do that is with a licensed notice that is supposed to be in every source file. It's unreliable to put just one statement in a free program and say "This program is covered by such-and-such license." What happens if somebody copies one of the files into some other program which says it's covered by a different license? Now that program has been inaccurately mis-licensed, which is illegal and is going to mislead users. So any self-respecting — any repository that wants to be honest has to explain these things, not just tell people to make the licensing of each piece of code clear, but help users do so — make it easy.
So GitHub has had this enormous problem for all of its existence, and Copilot has the similar — a basically, vaguely similar sort of problem, in the same area. It's not exactly the same problem. I don't think that copying a snippet of a few lines of code infringes any license. I think it's de minimus. But I'm not a lawyer.
Stallman's response?
There are many legal questions about Copilot whose answers I don't know, and maybe nobody knows. And it's likely some of theo depend on the country you're in [because of the copyright laws in those countries.] In the U.S. we won't be able to have reliable answers until there are court cases about it, and who knows how many years it'll take for those court cases to arise and be finally decided. So basically what we have is a gigantic amount of uncertainty.
Now the next thing is, what about morally? What can I say morally about Copilot? Well the basic idea seems okay. Why shouldn't a program be able to give you hints like that?
But there is one pitfall, which is that if you follow those hints, you might end up putting a substantial block of code copied from a GPL-covered program, written by someone else, or one hint after another after another after another — it adds up to a substantial amount of code, perhaps, with very little change, perhaps. And then you've infringed the GPL by releasing that code, unless your program is covered by the same versions — plural — of the GPL, in which case it would be permitted. But you might not even know that. Copilot might not tell you — it doesn't endeavor to inform you. So you're likely not to know. Which means Copilot is leading users — some of its users — into a pitfall. Well, they should fix it so it doesn't do that.
But basically, what can you expect from GitHub? GitHub gives people inadequate advice about what it means to choose a license. They tell you you can choose GPL version 2 or GPL version 3. I think they don't tell you that really you could choose GPL version 2 only, or GPL version 2 or later, or GPL version 3 only, or GPL version 3 or later — and those are four different choices. They give users different permissions over the future. So it's important to make each program say clearly which choice covers it. And GitHub doesn't tell you how to do that.
It doesn't tell you that you need to do that. Because the way you do that is with a licensed notice that is supposed to be in every source file. It's unreliable to put just one statement in a free program and say "This program is covered by such-and-such license." What happens if somebody copies one of the files into some other program which says it's covered by a different license? Now that program has been inaccurately mis-licensed, which is illegal and is going to mislead users. So any self-respecting — any repository that wants to be honest has to explain these things, not just tell people to make the licensing of each piece of code clear, but help users do so — make it easy.
So GitHub has had this enormous problem for all of its existence, and Copilot has the similar — a basically, vaguely similar sort of problem, in the same area. It's not exactly the same problem. I don't think that copying a snippet of a few lines of code infringes any license. I think it's de minimus. But I'm not a lawyer.