It's a pity that robots.txt doesn't let you specify what the crawler can do with the resources it's allowed to fetch.
I think that if such a feature (or something similar, like a "License" header) had been standardized early enough, a few issues regarding crawling and search engines would be moot, or at least easier to solve automatically.