FWIW, the distinct proxy business can be solved in one of two ways, depending on...

Jake232 · on March 11, 2014

Thanks, good to know regarding the proxy. There were a couple of other little things that just didn't work the way I wanted though (I honestly don't remember them now).

I've since built private libraries on top of requests that let me do everything in a trivial amount of time, so I prefer this approach, which gives me more control.

I think if I were going to write a long-running spider, I'd probably look into Scrapy again beforehand.