The other day, GitHub released the technical preview of Copilot, an "AI pair programmer" that was trained on publicly available code repositories, including those hosted on their platform.

The FAQ section on "protecting originality" claims that Copilot regurgitates verbatim copies of training inputs about 0.1% of the time, but since its release there have been dozens of examples of it doing exactly that. Lots of people, including myself, are wondering how this can be legal: things like GNU GPL restrictions, and even the attribution requirements in MIT or BSD licenses, seem clearly violated.

Along these lines, the Copilot FAQ says, "Training machine learning models on publicly available data is considered fair use across the machine learning community." One common opinion seems to be that Copilot isn't really creating derivative works per se; it's just taking input sequences and producing output ones via some specific mathematical transformations. Or, something like: the model just needs more training to stop producing verbatim copies of copyrighted code.

On the other side, I've seen a few people claim that Copilot is a technique for corporations to circumvent copyleft restrictions. Related: did GitHub train the model on their own enterprise code? Considering the answer is obviously no, why don't individuals get the privilege of opting out?
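The "mathematical transformation" framing can be made concrete: mechanically, a language model is just a function from an input token sequence to a probability distribution over the next token, applied repeatedly. Here is a minimal sketch of that idea with toy bigram weights invented purely for illustration; Copilot's actual model is a large neural network, not a lookup table:

```python
import math

# Toy bigram "weights" (made up for illustration only).
WEIGHTS = {
    ("def", "add"): 2.0, ("add", "("): 3.0, ("(", "a"): 2.5,
    ("a", ","): 2.0, (",", "b"): 2.5, ("b", ")"): 2.0,
    (")", ":"): 3.0, (":", "return"): 2.0,
}
VOCAB = ["def", "add", "(", "a", ",", "b", ")", ":", "return"]

def softmax(scores):
    """Turn raw scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def next_token(context):
    """Score every vocab token given the last context token, pick the best."""
    prev = context[-1]
    scores = [WEIGHTS.get((prev, tok), 0.0) for tok in VOCAB]
    probs = softmax(scores)
    return VOCAB[max(range(len(VOCAB)), key=probs.__getitem__)]

def generate(prompt, steps):
    """Repeatedly apply the transformation: tokens in, next token out."""
    context = list(prompt)
    for _ in range(steps):
        context.append(next_token(context))
    return context

print(generate(["def", "add"], 6))
# → ['def', 'add', '(', 'a', ',', 'b', ')', ':']
```

Note that whether this picture has any legal weight is exactly what's in dispute: a photocopier is also "just a transformation" of its input, and that framing says nothing about what the weights have memorized.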