However, he did 'break into' an MIT switch closet to run 'keepgrabbing.py' over a 1 Gbit/s connection. He wasn't just downloading 100 KiB PDFs, either. He downloaded at least two million documents. The indictment isn't clear on exactly how many, and it sounds like he downloaded a lot more than 2 million, to boot. Not all of JSTOR's documents are neat 100 KiB PDFs: a substantial portion are scanned images (1+ MiB PDFs) from old journals. So, we're looking at the TB range of data.
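For anyone who wants to check the "TB range" figure, here is a rough back-of-envelope sketch. It assumes the 2 million document floor and brackets the total between two extremes: every document a 100 KiB PDF, or every document a 1 MiB scan. The function names are mine, and the real mix of sizes is unknown, so treat this as an estimate, not a measurement.

```python
KIB = 1024
MIB = 1024 * KIB


def estimate_bytes(doc_count, avg_doc_bytes):
    """Total size in bytes for doc_count documents of a given average size."""
    return doc_count * avg_doc_bytes


def to_tb(num_bytes):
    """Convert bytes to decimal terabytes (10**12 bytes)."""
    return num_bytes / 10**12


# Assumption: exactly 2 million documents (the "at least" figure).
docs = 2_000_000

low = estimate_bytes(docs, 100 * KIB)   # every document is a small PDF
high = estimate_bytes(docs, 1 * MIB)    # every document is a 1 MiB scan

print(f"all 100 KiB PDFs: {to_tb(low):.2f} TB")
print(f"all 1 MiB scans:  {to_tb(high):.2f} TB")
```

That gives roughly 0.2 TB at the low end and 2.1 TB at the high end. Since the scans are "1+ MiB" and the count is probably well above 2 million, the real total could plausibly land above that upper figure.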
This is not to say that his intentions were ignoble...
So there are at least 2 million scientific documents that publishers are profiting from withholding.
I'm not generally anti-copyright, but I believe the profits publishers make on scientific publishing are unconscionable - not only do they impede progress, but in many cases (e.g., medical research) they cost lives.