Some hero should scrape it.

lambertsimnel · on Feb 8, 2020

Would that be legal? I wonder who holds the copyright to user-contributed data in Duolingo (assuming it's copyrightable).

If some of the user-generated content isn't copyrightable, or was contributed by users willing and able to share it with a FOSS project, could just that data be scraped, or would it be too difficult to identify?

bayesian_horse · on Feb 10, 2020

One way is to get Premium and download the course. I haven't looked at it, but I assume they haven't bothered to do any copy protection on those data packages. Not sure if they contain account-bound watermarks of any kind.