You're conflating crawling with querying/ranking in a weird way. And: grep - are you serious?
(Yes, you also name-dropped PageRank for some odd reason.)
The thing is, though: You can't easily outsource the crawling and then do the querying/ranking in-house. The inverted index and the various other data structures you need are inherently tied to the data structures in the crawler output. That is a very large amount of data, and it changes often.
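To make that coupling concrete, here is a toy sketch in Python. It is not how any real engine does it; the class name `InvertedIndex`, the whitespace tokenizer, and the example URLs are all made up for illustration. The point is only that every re-crawled page forces an update to the same structure that queries read from.

```python
from collections import defaultdict


def tokenize(text):
    # Naive tokenizer (assumption): lowercase and split on whitespace.
    return text.lower().split()


class InvertedIndex:
    """Toy index: maps each term to the set of URLs that contain it."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> set of URLs
        self.doc_terms = {}               # URL -> terms indexed for that URL

    def update(self, url, text):
        # Called for every page the crawler fetches or re-fetches.
        new_terms = set(tokenize(text))
        old_terms = self.doc_terms.get(url, set())
        # Drop postings for terms that vanished from the page.
        for term in old_terms - new_terms:
            self.postings[term].discard(url)
            if not self.postings[term]:
                del self.postings[term]
        # Add postings for terms that are new on the page.
        for term in new_terms - old_terms:
            self.postings[term].add(url)
        self.doc_terms[url] = new_terms

    def search(self, query):
        # AND query: return URLs that contain every query term.
        terms = tokenize(query)
        if not terms:
            return set()
        result = None
        for term in terms:
            urls = self.postings.get(term, set())
            result = set(urls) if result is None else result & urls
        return result


index = InvertedIndex()
index.update("http://a.example/", "cheap flights to Oslo")
index.update("http://b.example/", "cheap hotels in Oslo")
before = index.search("cheap oslo")

# The crawler revisits page A and the content has changed.
index.update("http://a.example/", "train tickets to Bergen")
after = index.search("cheap oslo")
```

Even in this toy version, the query side cannot answer correctly unless it sees every crawl update, which is why splitting the two across companies is awkward at web scale.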
The outsourcing that is being done is at the "search query to results" level. That is why this is so disappointing.