Because condensation is literally part of the definition of derivative, and basically the weights are a condensed form of the input data. It's some sort of lossy compression, if you look at it from the right point of view.
Summarization and translation are also clearly derivative.
The definition of transformative I found:
- add something new (context of other books I guess, this one might pass)
- with a further purpose or different character (further purpose clearly yes)
- do not substitute for the original use of the work (this one I find difficult. In the case of books, probably. In the case of github, it aims to replace quite a few aspects of it)
Is it compression though? It's a model of the way our language works, and it contains a body of knowledge.
If I read 200 physics textbooks, I couldn't recall any of them exactly, but I could write another book... that would contain largely the same knowledge.
The same is true of AI: it's not "compression", it's "learning" (by recording statistics about relationships between pieces of information) -- it can't (usually) recreate whole works; it'll make stuff up, alter it, and what have you. How much of the sum total of the internet and all books can you compress into an n-gigabyte model? I've got an 8 gigabyte file here that seems remarkably smart, but it can't recite the contents of most books to me, though it knows their famous quotes and some excerpts.
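A rough back-of-the-envelope calculation makes this concrete. Note the corpus size below is an assumed figure for illustration, not something stated in this thread:

```python
# Back-of-the-envelope: could an 8 GB model store its training text verbatim?
# The corpus size is an ASSUMPTION for illustration, not a measured figure.

corpus_tb = 10   # assumed: ~10 TB of training text
model_gb = 8     # the 8 GB model mentioned above

corpus_bytes = corpus_tb * 1e12
model_bytes = model_gb * 1e9

# How many bytes of source text per byte of model weights
ratio = corpus_bytes / model_bytes
print(f"Implied 'compression' ratio: {ratio:.0f}:1")

# Good lossless text compressors manage somewhere around 4:1 on natural
# language. Anything on the order of 1000:1 cannot be storing the text
# itself; most of the corpus is necessarily discarded, with only
# statistical structure retained.
```

Even if the assumed corpus size is off by an order of magnitude in either direction, the ratio stays far beyond what lossless text compression can achieve, which is the intuition behind "it's learning, not compression."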
The "compression" argument does seem to have some merit, but it also seems thin in the end and, in my opinion, unlikely to hold up in court. We'll see.
I do think a very strong, probably winning argument could be made in court that it is transformative, but that's the nature of this whole debate, we don't know yet with certainty.
You're right though, that's arguably still up for debate, but I think the precedent for transformative works is pretty well attested.