Maybe that is the future then: not distinguishing by license type for anything? Then either your code is publicly accessible and can be used in any way by anyone, or it is not accessible at all?

@Erik @tindall this happened to Flickr images, regardless of CC license type, wrt the training set for facial recognition. Is data mining a breach of copyright? Seems not. Is ML output creative output worthy of copyright? Seems not.

@ton @Erik @tindall Copilot has been known to generate existing snippets verbatim, so it has the original code encoded in its knowledge base. Facial recognition AI training (which I do not condone, FYI) presumably does not store the raw images after training, if it's not running afoul of copyright law. Either that, or it is and the regulators don't give a shit, like many suspect will be the case with GitHub.

@AgreeableLandscape @Erik @tindall afair the IBM facial recognition training set based on Flickr did not contain images; it contained values describing each image (facial measurements and color tones) and metadata pointing to the image. If code is getting regurgitated verbatim, the trained model might indeed be a copyright breach, and I suppose specific output might be too, depending on clip size and originality of the snippet.

@AgreeableLandscape @Erik @tindall another question I haven't seen much discussion on: if the ML model is neither in breach nor quoting entire chunks, then any output fails the originality requirement for copyright and is public domain. IANAL, but that's bound to have consequences for the originality and copyright of anyone using Copilot heavily?

